Math 22B Lecture Notes
Evan Smothers

Read me first: The following are my lecture notes for Math 22B. They will mostly follow Boyce and DiPrima, and will be very similar to what is seen in lecture. However, there may be some extra examples and proofs in here that we didn’t have time to get to in class. Since they are mostly intended as a reference for me during lecture, many examples may lack the details seen in class (as well as strange notes that I like to leave myself). Additionally, it is possible that there are some typos in here, so if you see one please let me know and I’ll be happy to fix it. I will try to update these once or twice a week so that they stay relatively current.

This edition includes everything through §7.9.

Now with fun clickable table of contents!


Contents

§1.1: Basic Mathematical Models, Direction Fields
§1.2: Solutions of Some Differential Equations
§1.3: Classification of Differential Equations
§2.1: Linear Equations and the Method of Integrating Factors
§2.2: Separable Equations
§2.3: Modeling with First Order Equations
§2.4: Linear vs Nonlinear Equations
§2.5: Autonomous Equations and Population Dynamics
§2.7: Numerical Approximations and Euler’s Method
§2.8: The Existence and Uniqueness Theorem
§2.9: First Order Difference Equations
§3.1: Homogeneous Equations with Constant Coefficients
§3.2: Solutions of Linear Homogeneous Equations; the Wronskian
§3.3: Complex Roots of the Characteristic Equation
§3.4: Repeated Roots; Reduction of Order
§3.5: Nonhomogeneous Equations; Method of Undetermined Coefficients
§3.6: Variation of Parameters
§3.7: Mechanical and Electrical Vibrations
§3.8: Forced Vibrations
§6.1: Definition of the Laplace Transform
§6.2: Solution of Initial Value Problems
§7.1: Introduction to Systems of First Order Linear Equations
§7.2: Remember Matrices?
§7.3: Systems of Linear Algebraic Equations; Linear Independence, Eigenvalues, Eigenvectors
§7.4: Basic Theory of Systems of First Order Linear Equations
§7.5: Homogeneous Linear Systems with Constant Coefficients
§7.6: Complex Eigenvalues
§7.7: Fundamental Matrices
§7.8: Repeated Eigenvalues
§7.9: Nonhomogeneous Linear Systems


§1.1: Basic Mathematical Models, Direction Fields

Where do differential equations come from? In any number of the sciences (physics, biology, engineering, etc.) one is given some sort of system, which is represented by a bunch of information.

Example 1 (biology): suppose we are interested in a population of zebras living in a particular region in northern Kenya. Past field studies have shown that the zebras reproduce at a rate proportional to the current population. In addition, the zebras are part of a predator-prey relationship with lions, who kill k zebras each month on average.

Next, we must decide what information we would like to ascertain. In this case, we would like to know how this population of zebras will fare in the future. To do this, we need to make a mathematical model. We can choose one month as a unit of time, let p denote the population of zebras, and choose r for the proportionality constant (aka growth rate) determining how fast they reproduce. Then we have the equation

Change in population per month = (zebras born per month) − (zebras killed per month)

or mathematically

dp/dt = rp − k

This is a differential equation, and we can use it to determine the population of zebras at a later time.

Example 2 (physics): Consider an object falling through the air near sea level. Suppose we are interested in the velocity of said object, which we denote by v. Newton’s second law says F = ma, where F is the force exerted on the object, m is its mass (measured in kilograms), and a is its acceleration (measured in meters/second²). We can rewrite a = dv/dt, so we obtain the equation

F = m dv/dt


In addition, let’s consider the forces acting on the object: the force of gravity is mg (here g = 9.8 m/s²). We can choose an orientation for our problem: in this case we choose v to represent downward velocity, so that the force due to gravity is a positive one (draw a picture). In addition, the object experiences drag force, the strength of which depends on the object we are considering. For example, a bowling ball falls much faster than a signed photograph of Pete Weber, since the second object experiences much more air resistance while falling. For our purposes we can assume the drag is proportional to the velocity: say the object has drag coefficient γ, which is measured in kg/s. Then our model becomes

Total force = (Force due to gravity) − (drag force)

or mathematically

m dv/dt = mg − γv

which we can also rewrite as

dv/dt = g − (γ/m)v

Comment: in the previous examples, the quantities r, k, m, g, γ are referred to as parameters. It is important to distinguish between these and the variables p, v, t. By modifying our parameters appropriately, we can make our two models applicable to a wide range of situations.

So far, we have derived two differential equations to model two very different physical phenomena. How do we use these to determine information about each system at a later time? We can proceed either qualitatively or quantitatively. First, the qualitative method: use direction fields (also referred to as slope fields).

To draw a direction field: think of the dependent variable as a function of time. In the falling object example, this means writing v = v(t). Notice that by plugging in any value of v to the differential equation, we obtain a corresponding value for dv/dt. Interpreting dv/dt as the slope of the tangent line to the graph v(t), we can draw a vector field in the tv-plane by evaluating

[dt/dt, dv/dt] = [1, g − (γ/m)v]

at each point. Example: when v = 1 we get the vector [1, g − γ/m].

Suppose we take γ = 1 kg/s and m = 5 kg. Then we get the direction field

Figure 1. Direction field for dv/dt = 9.8 − v/5
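Direction fields like this one are easy to generate on a computer. Here is a minimal MATLAB sketch (the grid spacing and plot ranges are illustrative choices, not taken from the original figure):

    % Direction field for dv/dt = 9.8 - v/5 in the tv-plane
    [t, v] = meshgrid(0:0.5:10, 0:4:70);   % grid of points
    dt = ones(size(t));                    % first component of each vector is 1
    dv = 9.8 - v/5;                        % second component is the slope from the ODE
    quiver(t, v, dt, dv)                   % draw the field of tangent vectors
    xlabel('t'), ylabel('v')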

Looking at this direction field, there is a special type of solution that we can pick out. When dv/dt = 0, the lines on the slope field will be purely horizontal. In addition, v will be constant in time. As such, we call any such solution an equilibrium solution. In this case, we can see analytically (by setting dv/dt = 0 in the differential equation) that the equilibrium solution is given by v = 49. Physically, this corresponds to the terminal velocity of our object.


Figure 2. Direction field for falling object with equilibrium solution

Next, we draw a direction field for our zebra population. Taking k = 10 and r = 0.5, our differential equation becomes

dp/dt = 0.5p − 10

with slope field

Figure 3. Direction field for dp/dt = 0.5p − 10

Question: what is the equilibrium solution this time?

1.1 Suggested Problems: 5,7,9,21


§1.2: Solutions of Some Differential Equations

Next, we turn to quantitative methods for solving differential equations.

Example 1 revisited: Returning to the zebra problem:

dp/dt = rp − k

Important trick: If we have a function f(t), then by the Chain Rule d/dt (ln |f|) = (df/dt)/f (as long as f ≠ 0, so both sides are defined). To exploit this trick in our problem, rewrite the equation as

(dp/dt)/(p − k/r) = r

Then we apply the above trick to f = p − k/r and get

d/dt ln(|p − k/r|) = r

We can integrate both sides of this equation with respect to t:

ln(|p − k/r|) = rt + C

where C is a constant of integration. Remember we want to find p. The next trick: use e^(ln x) = x. Taking exponentials of each side:

|p − k/r| = e^(rt+C)
p − k/r = ±e^(rt+C)
p = k/r ± e^C e^(rt)

C is an undetermined constant, so we can rewrite our solution more nicely as

p = k/r + Ĉe^(rt)

where Ĉ = ±e^C.

A comment on constants: In the future we will manipulate undetermined constants more freely, simply writing C without the hat for any constant depending only on C (usually we do this to make the solution look nicer, so there is some freedom in how we do this).


So we have found a family of solutions to our differential equation, one for each value of C. Note that p = k/r is the equilibrium solution, which corresponds to C = 0. Returning to the model at hand, if C > 0, the population grows exponentially in time, whereas if C < 0 the population will die out and the solution even becomes negative. (This shows that we have to take care in interpreting solutions derived from our model.)

What was the original goal of our model? We wanted to be able to describe the population at some later time. With a whole family of solutions, how do we know which one to choose? We need more information; as it stands, the problem is underdetermined. We need to know the population of zebras at some starting time to know how the population will develop (a starting population of 10000 zebras will behave much differently than a starting population of 3 zebras). Restating this idea mathematically, we need an initial condition. Combining this with our differential equation, the problem at hand becomes known as an initial value problem.

Back to the zebra problem: suppose the zebra population at a particular time is measured to be 25 zebras and we want to know the population 1 month later. In addition, we’ll use the values k = 10 and r = 0.5 from before. The usual convention is to take the initial condition to be at time t = 0 (though other initial conditions are possible). Mathematically, we can say that p(0) = 25 zebras and we want to know p(1). We use this information to solve for C in the above solution. Plugging in:

p(0) = k/r + Ce^(r·0) = 20 + C = 25

so that C = 5 and our solution is

p(t) = 20 + 5e^(t/2)

Thus the population after 1 month will be p(1) = 20 + 5e^(1/2) ≈ 28 zebras.

We can also plot our solution on the direction field from before, and we call the curve an integral curve. Note the way it follows the flow of the vectors. We can also represent the general solution (one without an initial condition) by a family of integral curves starting at all possible initial conditions.


Figure 4. Integral curve for zebra population with initial value p(0) = 25

Example 2: Returning to the falling object problem, we had the differential equation

dv/dt = g − (γ/m)v

Using the same trick as before:

(dv/dt)/(v − mg/γ) = −γ/m

which can be rewritten as

d/dt ln(|v − mg/γ|) = −γ/m

Integrating:

ln(|v − mg/γ|) = −(γ/m)t + C

Taking the exponential of both sides:

|v − mg/γ| = e^(C−(γ/m)t)
v − mg/γ = ±e^(C−(γ/m)t)
v = mg/γ ± e^(C−(γ/m)t)


Substituting in a different C, we can rewrite this as

v = mg/γ + Ce^(−(γ/m)t)

If we take m = 5 kg, γ = 1 kg/s as before and plug in g = 9.8 m/s², we get the family of solutions

v = 49 + Ce^(−t/5)

Suppose we are dropping the object from a height of 100 meters and want to know its speed when it hits the ground. Since the object starts at rest, the initial condition is v(0) = 0 m/s. Plugging in:

v(0) = 49 + C = 0

so that C = −49. Thus the solution for v is

v(t) = 49(1 − e^(−t/5))

To find the speed when the object hits the ground we need a function describing the object’s position x(t). However, we know

v(t) = dx/dt = 49(1 − e^(−t/5))

and we have an initial condition of x(0) = 0 (because of the orientation we chose initially). Integrating:

x(t) = 49t + 245e^(−t/5) + C

Using the initial condition:

x(0) = 245 + C = 0

so that C = −245. To find the time when the object hits the ground, we set x(T) = 100 and solve for T:

49T + 245e^(−T/5) − 245 = 100

Solving numerically: T ≈ 5.3131, and so the velocity at time of impact is v(5.3131) ≈ 32.07 m/s.
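This last numerical step is easy to reproduce; here is a quick MATLAB sketch using the position formula derived above:

    x = @(t) 49*t + 245*exp(-t/5) - 245;   % position of the falling object
    T = fzero(@(t) x(t) - 100, 5)          % root near t = 5 gives T ≈ 5.3131
    v = 49*(1 - exp(-T/5))                 % velocity at impact, about 32.07 m/s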

1.2 Suggested Problems: 1,5,9


§1.3: Classification of Differential Equations

Within the broad class of differential equations, there are several important distinctions that separate various equations from one another.

Number of variables: So far all equations have been for functions of just one variable: t. Any differential equation for a function of one variable is called an ordinary differential equation. These are the main focus of 22B. However, there are also differential equations for functions of more than one variable. In this case, all derivatives are in fact partial derivatives, and so the equations are called partial differential equations. Examples of each:

d²θ/dt² + (g/L)θ = 0

is an ordinary differential equation describing the small-angle approximation of a pendulum’s motion.

∂²u/∂x² = c² ∂²u/∂t²

is a partial differential equation known as the wave equation, describing the motion of a vibrating string whose position u(x, t) is a function of both space and time.

Number of equations: So far, we have considered only one equation at a time. However, sometimes physical systems can be described by more than one equation. As a simple example, we can view the description of a falling object’s position and velocity as a system of two equations:

dx/dt = v
dv/dt = g − (γ/m)v

Other systems of equations are Maxwell’s equations in electromagnetics and the Navier-Stokes equations from fluid dynamics. These are both systems of partial differential equations.

Order: The order of a differential equation is the order of the highest derivative appearing in it. The equations from earlier are both first order; the wave equation is second order. In higher-order ordinary differential equations (say for a function y(t)), it is often easiest to write y′ instead of dy/dt, especially for higher-order derivatives. Example:

y′′′′ + y′′y + 2ty = 0

is a fourth-order ordinary differential equation. A general nth-order differential equation can be written as

F(t, y(t), y′(t), ..., y^(n)(t)) = 0

We assume that such an equation can be solved for the highest derivative, that is, that we can write it in the form

y^(n)(t) = f(t, y(t), ..., y^(n−1)(t))

Linear vs. Nonlinear: Suppose we are given the differential equation

F(t, y(t), y′(t), ..., y^(n)(t)) = 0

This equation is linear if F is a linear function of y, y′, ..., y^(n). Recall that a function F(x) is linear if F(x1 + cx2) = F(x1) + cF(x2) for any number c. Examples:

y′′ + y = 0

is linear, but

(y′)² + 2y = 0

is nonlinear. The small-angle approximation for a pendulum mentioned above is a linearization of the nonlinear equation

d²θ/dt² + (g/L) sin θ = 0

A general linear ordinary differential equation of order n takes the form

a0(t)y^(n) + a1(t)y^(n−1) + ... + an(t)y = g(t)


Important Comments/Questions:

What does it mean to solve a differential equation? Typically we say a solution to

y^(n)(t) = f(t, y(t), ..., y^(n−1)(t))

on the interval a < t < b is a function y where all the derivatives up to order n exist and satisfy the equation for each t between a and b. If we have a solution for some values of t (for example 0 < t < 1), then we say it is a local solution. If the solution is valid for all values of t (this can mean 0 ≤ t < ∞ or −∞ < t < ∞ depending on context) then we say it is a global solution.

Generally all the equations we consider will be for real-valued functions, but there are also differential equations for complex-valued functions. A particularly famous example is the Schrödinger equation of quantum mechanics, a second-order partial differential equation for a complex-valued function.

Two fundamental questions we ask about many differential equations are those of existence and uniqueness. That is, can we guarantee that a particular differential equation has a solution? In addition, if we know there is a solution, can we ensure that it is in fact the only one?

Finally, it is usually easy to check that a function satisfies a given differential equation: all you have to do is plug it in. However, to actually find such a function is often much harder, and we will devote much of the remainder of this course to techniques for finding solutions to many different types of differential equations. Keep in mind the various classifications discussed in this section, as they will often play an important role in determining what technique should be used to solve a problem.

1.3 Suggested Problems: 2,3,5,8,12,17,20,22,27

§2.1: Linear Equations and the Method of Integrating Factors

Chapter 2 focuses on first-order ordinary differential equations (ODEs), which we can write as


dy/dt = f(t, y)

In this section, we work with linear first-order ODEs. In the last chapter we only considered constant coefficients, but now we allow the coefficients to depend on time. The general form of such an equation is

P(t) dy/dt + Q(t)y = G(t)

However, assuming P(t) ≠ 0, we can divide by it and instead obtain the equivalent ODE

dy/dt + q(t)y = g(t)

where q(t) = Q(t)/P(t) and g(t) = G(t)/P(t). Equations of this form can be solved by using an integrating factor. To illustrate the idea, consider a particular example:

Example: Find the general solution of the equation

t³ dy/dt + 3t²y = 2t⁴

Important Trick: The left-hand side of this equation is an exact derivative (meaning it is dg/dt for some particular function g(t)). By the Product Rule

d/dt (t³y) = t³ dy/dt + 3t²y

so we can rewrite the equation as

d/dt (t³y) = 2t⁴

Compare this to the trick from section 1.2: even though the method is different, the end result is the same. The goal in each instance is to rewrite one side of the equation as a derivative, then integrate both sides. In this case, we obtain

t³y = (2/5)t⁵ + C

or

y = (2/5)t² + C/t³


This is the general solution to the ODE: if we were given an initial condition, we could use it to find the particular solution.

Unfortunately, not all first-order linear ODEs have a left-hand side that simplifies to a derivative of some quantity immediately. Returning to the general equation

dy/dt + q(t)y = g(t)

we would like to rewrite the left-hand side as an exact derivative. In the present state of our ODE, we cannot do this without further assumptions on q and g. Instead, we multiply the entire equation by a function µ(t), which we will determine later to make the left-hand side an exact derivative. Doing this:

µ(t) dy/dt + µ(t)q(t)y = µ(t)g(t)

The left-hand side will become an exact derivative if

dµ/dt = µ(t)q(t)    (1)

In this case we can rewrite the left-hand side as

µ(t) dy/dt + (dµ/dt)y = d/dt (µ(t)y)    (2)

Let’s temporarily assume that µ is positive, and solve (1) to determine what µ must be (it’ll turn out our assumption is valid automatically). Dividing both sides of (1) by µ:

d/dt (ln µ(t)) = q(t)

Integrating both sides:

ln(µ(t)) = ∫ q(t) dt + k

where k is some constant of integration. Taking the exponential of both sides:

µ(t) = e^(∫q(t)dt + k) = e^k e^(∫q(t)dt)

or renaming our constant

µ(t) = ke^(∫q(t)dt)


Let’s return to the original ODE. Using (2), we obtain for our choice of µ

d/dt (µ(t)y) = µ(t)g(t)

Integrating both sides:

µ(t)y = ∫ µ(t)g(t) dt + C

for some constant C. Dividing both sides by µ(t) and plugging in our choice of µ, we get

y = [∫ e^(∫q(t)dt) g(t) dt + C] / (ke^(∫q(t)dt)) = [∫ e^(∫q(t)dt) g(t) dt + C] / e^(∫q(t)dt)

where in the last equality we divide everything by k and absorb it into the constant C. Thus our final answer is

y = [∫ µ(t)g(t) dt + C] / µ(t)    for µ(t) = e^(∫q(t)dt)

Note that we do not need to worry about the constant of integration when we evaluate µ; we eliminated it in our derivation.

Example 1: Use an integrating factor to find the general solution of

ty′ + 2y = sin t

for t > 0. Step 1: put the equation in the proper form (make the coefficient of y′ equal 1). Dividing the equation by t, we obtain

y′ + (2/t)y = (sin t)/t

Step 2: find the integrating factor. In this case q(t) = 2/t, so from our equation

µ(t) = e^(∫(2/t)dt) = e^(2 ln t) = e^(ln(t²)) = t²

Step 3: Multiply the equation by µ(t). Doing this, our ODE becomes

t²y′ + 2ty = t sin t

which we can rewrite as

d/dt (t²y) = t sin t


Step 4: Integrate both sides (integrating by parts on the right-hand side):

t²y = −t cos t + sin t + C

Step 5: divide by the integrating factor to find y:

y = (−t cos t + sin t + C)/t²

Note that Steps 3-5 are essentially just plugging µ into the equation for y that we already derived, so memorizing the formula will suffice too. However, it’s a good idea to do things this way as practice for problems you’ll be seeing in the future.
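As an optional check (assuming the Symbolic Math Toolbox is available), MATLAB’s dsolve reproduces this general solution:

    syms y(t)
    ysol = dsolve(t*diff(y,t) + 2*y == sin(t))
    % ysol agrees with y = (-t cos t + sin t + C)/t^2, with C1 in place of C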

Example 2: Solve the initial value problem (IVP)

ty′ + 2y = t² − t + 1,  y(1) = 1/2,  t > 0

Step 1: put the equation in the desired form by dividing everything by t:

y′ + (2/t)y = t − 1 + 1/t

Step 2: find the integrating factor. Here we have q(t) = 2/t, so our integrating factor is the same as in Example 1:

µ(t) = t²

Step 3: multiply the ODE by µ(t):

t²y′ + 2ty = t³ − t² + t

or equivalently

d/dt (t²y) = t³ − t² + t

Step 4: Integrate both sides:

t²y = (1/4)t⁴ − (1/3)t³ + (1/2)t² + C

Step 5: divide by the integrating factor:

y = (1/4)t² − (1/3)t + 1/2 + C/t²

Step 6 (only for initial value problems): use the initial value to solve for C. Plugging in:


y(1) = 1/4 − 1/3 + 1/2 + C = 1/2

Solving for C we see that C = 1/12, and so the solution of our IVP is

y = (1/4)t² − (1/3)t + 1/2 + 1/(12t²)

2.1 Suggested Problems: 1,10,12,14,19

§2.2: Separable Equations

For this section, we make a change of notation: instead of writing the independent variable as t like before, we now choose it to be x (we’ll use t for something else later). Thus our first-order ODEs now take the form

dy/dx = f(x, y)

Next, we let M(x, y) = −f(x, y) and N(x, y) = 1 to rewrite the above differential equation as

M(x, y) + N(x, y) dy/dx = 0

Any first-order ODE may be written in this form, but now we make a restriction to solve a particular class of equations. Namely, we assume that M only depends on x and N only depends on y. Thus the assumption is that our equation has the form

M(x) + N(y) dy/dx = 0

Such an equation is called separable. This is because we can rewrite it in the differential form

M(x) dx = −N(y) dy

thus separating the x and y variables on opposite sides of the equals sign. Now some examples:

Example 1: Consider the differential equation

dy/dx = x²/(1 + y²)

This equation is separable, since we can write it as


−x² + (1 + y²) dy/dx = 0

To solve the equation, we want to write the left-hand side of the previous expression as d/dx of some function. Notice that by the Chain Rule

d/dx f(y) = f′(y) dy/dx

Applying this to f(y) = y + y³/3 we can write

(1 + y²) dy/dx = d/dx (y + y³/3)

In addition −x² = d/dx (−x³/3). Combining these, we obtain

d/dx (−x³/3 + y + y³/3) = 0

Note: this is a trick similar to those we’ve seen before. We have written the left-hand side of the equation as an exact derivative, and now we integrate (now with respect to x). Doing this:

−x³/3 + y + y³/3 = C

This gives us an equation for the integral curves of our equation, which we can now plot. Note: in this case we cannot explicitly solve for y as a function of x, though in other cases we may be able to do so. Either way, a computer can plot the implicitly-defined family of solutions we have derived.
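For instance, in MATLAB (R2016b or later) fimplicit plots a single integral curve, and looping over several values of C gives the family shown in Figure 5:

    fimplicit(@(x, y) -x.^3/3 + y + y.^3/3 - 1, [-4 4 -4 4])   % the curve with C = 1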

Before moving on to another example, let’s obtain a general formula for solutions to the equation

M(x) + N(y) dy/dx = 0

To do this, we need to first find antiderivatives for M and N (as we did in the last example). Let P be an antiderivative for M and Q an antiderivative for N. That is, P′(x) = M(x) and Q′(y) = N(y). Then just as in the preceding example, we have

d/dx (P(x) + Q(y)) = 0

so that the integral curves of solutions are given implicitly by the equation


Figure 5. Integral curves for dy/dx = x²/(1 + y²) for various values of C

P(x) + Q(y) = C    (3)

for various values of C. Now, suppose we wish to solve an initial value problem: that is, we’re given the initial condition

y(x0) = y0

Then our particular solution lies on whichever integral curve passes through the point (x0, y0). To find this curve, we first solve for C by plugging in x0 and y0 to our general solution:

C = P(x0) + Q(y0)

Plugging this value for C into (3) and rearranging yields

P(x) − P(x0) + Q(y) − Q(y0) = 0

Finally, we would like to express this equation in terms of our original functions M and N. Recall that P′(x) = M(x) and Q′(y) = N(y) and use the Fundamental Theorem of Calculus twice to obtain

∫_{x0}^{x} M(s) ds + ∫_{y0}^{y} N(s) ds = 0

This is still an implicit equation, but in the most general situation it is the best we can do. Only in particular cases can we expect to solve for y explicitly as a function of x.

Example 2: Use the methods we’ve seen so far to solve a specific initial value problem. Consider

dy/dx = (e^(−x) − e^x)/(3 + 4y),  y(0) = 1

First we rewrite the ODE in the standard form:

e^x − e^(−x) + (3 + 4y) dy/dx = 0

Next, take the antiderivative of both M(x) = e^x − e^(−x) and N(y) = 3 + 4y to obtain

d/dx (e^x + e^(−x) + 3y + 2y²) = 0

Integrating with respect to x, our general solution is given by

e^x + e^(−x) + 3y + 2y² = C

To find the solution corresponding to the initial condition y(0) = 1 we plug in x = 0 and y = 1 to the preceding equation to obtain

e⁰ + e⁰ + 3·1 + 2·1² = 7 = C

Thus our solution satisfies the implicit equation

e^x + e^(−x) + 3y + 2y² = 7

This is a quadratic equation in y, so we can solve it explicitly. Rewriting in a way to make this more apparent, we are searching for roots of the equation

2y² + 3y + (e^x + e^(−x) − 7) = 0

Using the quadratic formula, we know that the roots are given by

y = [−3 ± √(9 − 8(e^x + e^(−x) − 7))]/4 = −3/4 ± (1/4)√(65 − 8(e^x + e^(−x)))

This is multi-valued: in order to determine which root to choose, we should check which one satisfies the initial condition. Plugging in x = 0 and y = 1 we get 1 = −3/4 ± 7/4, so the solution must be

y = −3/4 + (1/4)√(65 − 8(e^x + e^(−x)))


Important Note: This solution is not everywhere-defined (at least as a real number); in fact if 8(e^x + e^(−x)) > 65 (which happens, say, for x > 3) then we no longer have a solution. This illustrates an important point: many differential equations (especially nonlinear ones) have solutions that only persist for some finite amount of time. This is an important consideration to take into account when using differential equations to model physical phenomena.

Example 3: Solve the initial value problem

y′ = (1 + 3x²)/(3y² − 6y),  y(0) = 1

Let’s solve the ODE using an alternative method. Separating variables:

(3y² − 6y) dy = (1 + 3x²) dx

Integrating each side yields

y³ − 3y² = x + x³ + C

Rearranging:

−x − x³ + y³ − 3y² = C

To find C, we plug in the initial condition y(0) = 1:

−0 − 0 + 1 − 3 = −2 = C

Thus the curve of our solution is given implicitly by the equation

y³ − x³ − 3y² − x + 2 = 0

On what interval is this solution valid? Notice that the denominator of the right-hand side of the original ODE vanishes when 3y² − 6y = 0, that is, when y = 0 or y = 2. If we plug each of these into the equation for the integral curve we just found, we obtain the two equations x³ + x = ±2. One solution (taking the + sign) is x = 1, so we know that there is a singularity in our solution (meaning dy/dx approaches infinity there) at this point. A plot of the implicit equation confirms this:

2.2 Suggested Problems: 2,3,9,16,23


Figure 6. Plot of y³ − x³ − 3y² − x + 2 = 0

§2.3: Modeling with First Order Equations

This section serves as an extension of §1.1, focusing on constructing mathematical models involving differential equations for various applications and subsequently solving the relevant ODEs to determine the physical behavior of the system. Let’s look at some examples:

Example 1: A tank with a capacity of 500 gal originally contains 200 gal of water with 100 lb of salt in solution. Water containing 1 lb of salt per gallon is entering at a rate of 3 gal/min and the mixture flows out of the tank at a rate of 2 gal/min. Find the amount of salt in the tank at any time prior to the moment the tank overflows. Find the concentration (in pounds per gallon) of salt at the moment of overflow. If the tank were infinite, what concentration would we approach?

First, draw a picture. First goal: find how the amount of salt is changing with respect to time. Using t to denote time (measured in minutes) and Q to denote the quantity of salt (measured in pounds), we have


(Change in salt per minute) = (Amount of salt entering tank per minute) − (Amount of salt leaving tank per minute)

(Amount of salt entering per minute) = (Amount of water entering per minute) · (Concentration of salt in water) = 3 gal/min · 1 lb/gal = 3 lb/min

(Amount of salt leaving tank per minute) = (Amount of water leaving per minute) · (Concentration of salt in water)

(Concentration of salt in water) = (Amount of salt)/(Amount of water) = Q/(200 + t)

since the net gain of water is 1 gal/min (here we assume that the tank hasn’t overflowed yet). Returning to the original equation, we obtain the ODE

dQ/dt = 3 − 2Q/(200 + t)

We can solve this using an integrating factor:

µ(t) = e^(∫ 2/(200+t) dt) = e^(2 ln(200+t)) = (200 + t)²

Multiplying the equation by the integrating factor gives

d/dt ((200 + t)² Q) = 3(200 + t)²

Integrating with respect to t:

(200 + t)² Q = (200 + t)³ + C

Solving for Q:


Q = 200 + t + C/(200 + t)²

To find C, we use the initial condition Q(0) = 100:

100 = 200 + C/200²

so that C = −4000000. Thus the amount of salt in the tank at time t (prior to overflow) is given by

Q(t) = 200 + t − 4000000/(200 + t)²

At the time of overflow, t = 300. Plugging in:

Q(300) = 500 − 4000000/500² = 484 lb

It follows that the concentration at the time of overflow is 484/500 lb/gal. If the tank were infinite, our solution would apply for all values of t, so to find the concentration we can take

lim_{t→∞} Q(t)/(200 + t) = lim_{t→∞} [1 − 4000000/(200 + t)³] = 1
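As a sanity check, the formula can be compared against a black-box numerical solver; a MATLAB sketch:

    f = @(t, Q) 3 - 2*Q./(200 + t);             % right-hand side of the mixing ODE
    [t, Q] = ode45(f, [0 300], 100);            % integrate from Q(0) = 100
    Qexact = 200 + t - 4000000./(200 + t).^2;   % solution derived above
    max(abs(Q - Qexact))                        % small: the two agree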

Example 2: Continuously compounded interest: A young entrepreneur borrows $8000 to start his own horse massage company. The lender charges interest at an annual rate of 10% compounded continuously and the entrepreneur makes continuous payments at a constant annual rate of k. The lender requires that the loan be paid off in three years, or else the entrepreneur has to give up his horse massage business and become a math TA instead. What payment rate will allow the equine enthusiast to continue following his dream? Also, how much interest must be paid during the three year period?

Use S to denote the amount of the loan (measured in dollars) and t to denote time (measured in years). Then we have

(Change in loan amount) = (Interest accumulated) − (Amount paid off)

The interest accumulated will be the interest rate times the current loan value, and the amount paid off is the constant annual rate k dollars per year. This gives the differential equation


dS/dt = 0.1S − k

Dividing both sides by S − 10k:

(dS/dt)/(S − 10k) = 0.1

Rewriting:

d/dt (ln |S − 10k|) = 0.1

Integrating with respect to t:

ln(|S − 10k|) = 0.1t + C

Exponentiating both sides:

|S − 10k| = e^(0.1t+C)

Solving for S and manipulating constants:

S = 10k + Ce^(0.1t)

Plugging in the initial condition S(0) = 8000 we see that C = 8000 − 10k. We want S(3) = 0. Plugging this in we obtain

0 = 10k + (8000 − 10k)e^(0.3)

Rearranging and doing some more algebra:

10k/(10k − 8000) = e^(0.3)

1 + 8000/(10k − 8000) = e^(0.3)

(10k − 8000)/8000 = 1/(e^(0.3) − 1)

k = 800/(e^(0.3) − 1) + 800 ≈ $3086.64 per year

Multiplying this amount by 3 and subtracting 8000, we see that the amount of interest accumulated is $1259.92.
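The arithmetic, in MATLAB:

    k = 800/(exp(0.3) - 1) + 800    % about 3086.64 dollars per year
    interest = 3*k - 8000           % about 1259.92 dollars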

Example 3: Escape Velocity: Suppose we project a body of mass m away from the earth in a direction perpendicular to the earth’s surface with initial velocity v0. Neglecting air resistance, but taking into account the variation of the strength of the earth’s gravitational field with the distance of the object from the earth, find the object’s velocity as a function of its distance from the earth. Find the initial velocity required to lift the body to some maximum height ξ, and find the object’s escape velocity: the least initial velocity for which the object will not return to the earth.

Setup: Orient our axes so the object is projected in the positive x-direction, and we’ll use x to denote the distance of the object from the earth (draw a picture).

Important information: the gravitational force (i.e. the weight) acting on the body is inversely proportional to the square of the distance from the object to the center of the earth. Using R to denote the radius of the earth (R ≈ 3959 miles) we have

w(x) = −k/(x + R)²

for some constant k (with a minus sign because the force pulls in the negative x-direction). Since w(0) = −mg, it follows that k = mgR² so

w(x) = −mgR²/(x + R)²

Since we neglect air resistance, this is the only force acting on the object, and the equation F = ma becomes

m dv/dt = −mgR²/(x + R)²

Now we have too many variables: x, v, and t. To fix this, let’s get rid of t (since it’s not important for our purposes anyway). Thinking of v as a function of x and x as a function of t, the Chain Rule gives

dv/dt = (dv/dx)(dx/dt) = (dv/dx)v

which gives the new equation

v dv/dx = −gR²/(x + R)²

This is a separable equation: we can write it as

d/dx (−gR²/(x + R) + v²/2) = 0

An integration gives


v²/2 = gR²/(x + R) + C

Even though we’ve switched our independent variable from t to x, our initial condition still reads v(0) = v0 (but now it means v|_{x=0} instead of v|_{t=0}). Using this initial condition we obtain C = v0²/2 − gR, so our solution is

v = ±√(v0² − 2gR + 2gR²/(x + R))

The plus sign occurs when the object is moving away from the earth, and the minus sign when it is falling back to earth. If the object reaches its maximum height ξ, then it satisfies v = 0, x = ξ in doing so. Plugging these in to our solution:

0 = √(v0² − 2gR + 2gR²/(ξ + R))

We can solve this for v0 to find the required initial velocity to lift the object to a maximum height of ξ:

v0 = √(2gR(1 − R/(ξ + R))) = √(2gR · ξ/(ξ + R))

For the escape velocity, we want to take the maximum height ξ to infinity. Doing this, we obtain

v0 = √(2gR)
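For a concrete number: plugging in g ≈ 9.8 m/s² and R ≈ 3959 miles ≈ 6.37 × 10⁶ m gives v0 = √(2gR) ≈ 11,200 m/s, or about 11.2 km/s.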

2.3 Suggested Problems: 2,7,18,20,29

§2.4: Linear vs Nonlinear Equations

Returning to questions posed in §1.3: when can we establish existence and uniqueness of solutions to general differential equations? In this section, we’ll only focus on first-order ODEs, but later on we’ll examine the question even more generally. It turns out the answer to this question depends heavily upon whether the differential equation being considered is linear or nonlinear.


First let’s focus on first-order linear ODEs, which in §2.1 we wrote in the following way:

y′ + p(t)y = g(t)

In this case, we only need p and g to be continuous to obtain existence and uniqueness of a solution of the IVP given by combining the previous ODE with any initial condition for y. More formally:

Theorem 1. Suppose p and g are continuous on an open interval I : α < t < β containing the point t = t0. Then there exists a unique function y = φ(t) that satisfies the differential equation

y′ + p(t)y = g(t)

for each t in I, and that also satisfies the initial condition

y(t0) = y0

Comments:

(1) We essentially already proved this theorem in §2.1 when we derived the formula for a solution using an integrating factor:

y(t) = (1/µ(t)) [∫_{t0}^{t} µ(s)g(s) ds + C]

In this problem t0 corresponds to the time of our initial value, and (normalizing µ so that µ(t0) = 1) we would choose C = y(t0) = y0.

(2) This theorem tells us that if p and g are continuous for all values of t, then we get a global solution to our IVP.

(3) Oftentimes, the largest interval on which p and g are continuous (containing t0) is called the interval of validity for the IVP, since it is the largest interval where we know there is a unique solution to the problem. For example: if p(t) = 1/(t − 1), g(t) = tan t, and the initial condition is given at t = 0, then the interval of validity is (−π/2, 1). If instead the initial condition is given at t = 3/2 (with the same choices of p and g), then the interval of validity is (1, π/2).

(4) Finally, it is important that the coefficient in front of y′ in the ODE is 1 in order for this result to hold. For example, consider the IVP

ty′ + y = 1,  y(0) = 0


As written, p(t) = g(t) = 1, so both are continuous. However, plugging t = 0 into the ODE gives y(0) = 1, which contradicts the given initial condition (so there can be no solution).

Next, we consider first-order ODEs that are possibly nonlinear. In this case, the ODE can be written in its most general form as

y′ = f(t, y)

In this case, we need stronger assumptions on f to ensure existence and uniqueness of a solution. Rather than just being continuous, we require that f and ∂f/∂y both be continuous. Formalizing this in a theorem:

Theorem 2. Let f and ∂f/∂y be continuous in some rectangle α < t < β, γ < y < δ containing the point (t0, y0). Then in some interval t0 − h < t < t0 + h contained in α < t < β there is a unique solution y = φ(t) of the initial value problem

y′ = f(t, y),  y(t0) = y0

Comments:

(1) In this case, we may not have an explicit formula for the solution as we did in the linear case. As a result, proving this result is quite a bit more difficult (and we will return to it in greater detail later in the chapter).

(2) If we apply this theorem to the linear problem given before, then f(t, y) = g(t) − p(t)y, so saying f and ∂f/∂y are continuous is exactly the same as saying p and g are continuous (just like in the other theorem).

(3) If we only assume that f is continuous (but not necessarily ∂f/∂y), then we get the existence of a solution (but we may lose uniqueness).

(4) Notice that in this case we only obtain existence and uniqueness over some time interval, not necessarily over the entire interval α < t < β as we did before. This is because the solution y may leave the interval γ < y < δ before t reaches the endpoints of its interval.

A corollary of this result is that the graphs of two solutions of the same ODE cannot cross at any point. If they did cross at some point (t1, y1), then the IVP with initial value given by y(t1) = y1 would have more than one solution, contradicting the uniqueness portion of the theorem.

Practice: Find the interval of validity for the IVP

(4 − t²)y′ + 2ty = 3t²,  y(1) = −3

Example 1: Can we guarantee existence and uniqueness of a solution to the IVP

y′ = √(1 − t² − y²),  y(0) = 1 ?

This ODE is nonlinear, so we need to apply Theorem 2. In this case, f(t, y) = √(1 − t² − y²). This function is continuous for t² + y² ≤ 1. In addition, ∂f/∂y = −y(1 − t² − y²)^(−1/2), which is continuous on t² + y² < 1. However, the point (0, 1) in the ty-plane is excluded from the second region, so we cannot apply the existence and uniqueness result. (In fact we can’t even guarantee existence, because we cannot enclose the initial condition in an open rectangle on which f is continuous.)

Example 2: Consider the IVP

dy/dt = y²,  y(0) = y0 > 0

Notice that the hypotheses of Theorem 2 apply, since f(t, y) = y² and ∂f/∂y = 2y are continuous for all values of t and y. Let’s solve the IVP explicitly. This is a separable equation, so we solve it using the methods of §2.2. Rewriting:

dy/y² = dt

Integrating both sides, we obtain:

−1/y = t + C

Solving for y:

y = −1/(t + C)

Plugging in the initial condition we obtain C = −1/y0, and so the solution to the IVP is

y = 1/(1/y0 − t)


Notice that this solution only exists for 0 < t < 1/y0 (it blows up as t approaches 1/y0), so we don’t have global existence (even though f and ∂f/∂y are continuous everywhere). This shows the importance of initial conditions in dictating the persistence of a solution of a nonlinear problem. More specifically, the smaller the magnitude of the initial condition, the longer the solution is likely to last: with y0 = 1 the solution y = 1/(1 − t) blows up at t = 1, while with y0 = 1/10 it survives until t = 10.

Example 3: Let’s try to solve the IVP

y′ = √y,  y(0) = 0

Again we have a nonlinear ODE, and in this case f is continuous but ∂f/∂y is discontinuous at y = 0. Again, this equation is separable, so solving it explicitly gives

dy/√y = dt

2√y = t + C

y = ((t + C)/2)²

Using the initial condition we see that C = 0, so the solution is

y = t²/4

However, another solution is given by y ≡ 0 (this can be checked by plugging this solution into the differential equation explicitly). So in this case we lose uniqueness of the solution, and in fact there is an infinite family of solutions to this particular IVP. This illustrates an important point: when solving linear first-order ODEs, we obtain a solution with a constant in it, giving us the general solution, i.e. all possible solutions of the ODE. However, with nonlinear problems this is not necessarily possible: in the preceding example there are many other solutions besides the general solution we obtained by separating variables.

2.4 Suggested Problems: 1,9,14,15,32


§2.5: Autonomous Equations and Population Dynamics

This section considers another special class of first-order ODEs. They are known as autonomous equations, and they are characterized by the fact that the independent variable does not appear explicitly. This means they have the general form

dy/dt = f(y)

These equations are separable, so the methods of §2.2 can be applied to them, but instead we will consider more qualitative methods to study the behavior of autonomous equations without necessarily solving them explicitly. Such equations are commonly seen in models of populations, so we will also investigate some of these applications.

Example: Suppose we have a population that grows at a rate proportional to the current number of organisms (now without the predators of the example from §1.1). Then the differential equation governing the behavior of this population is

dy/dt = ry

where r is the growth rate (or decline rate if r < 0). If our initial population is given by y(0) = y0, then the solution to this IVP is

y(t) = y0 e^(rt)

As discussed previously, this model is unrealistic over long time scales. What can we do to obtain a better model?

One questionable assumption we made is that the growth rate is constant regardless of the current population. It would make more sense that as a species becomes overpopulated its growth rate would slow due to competition for limited resources. Thus we can modify the preceding differential equation by writing

dy/dt = h(y)y

where h is some given function depending on the species, environment, etc. What will constitute a good choice for h? Ideally, we would still like it to approximate the fixed growth rate r when the population is relatively small, but as the population grows we would like it to decrease towards 0, and when we exceed the carrying capacity (i.e. the max allowable population) we would like h(y) to become negative to bring the population back down below said carrying capacity.

A particularly simple function satisfying the above requirements is h(y) = r − ay for some a > 0. In this case the differential equation becomes

dy/dt = (r − ay)y

and this equation is known as the logistic equation. If we write K = r/a we can equivalently write this ODE as

dy/dt = r(1 − y/K)y

In this case we call r the intrinsic growth rate, because it is the population’s growth rate prior to any limiting factors (i.e. scarcity of resources). To determine the meaning of K, we must first analyze the logistic equation a bit more. First, the equilibrium solutions: they are given by y = 0 and y = K.

To determine the behavior of other solutions to this equation, we use qualitative graphical methods. The first step for a general autonomous equation is to plot the function f(y) as a function of y. In our case:

Figure 7. Plot of f(y) = r(1 − y/K)y for r = 0.5 and K = 100

Notice that the graph intersects the horizontal axis precisely at the two equilibrium solutions. By symmetry the maximum occurs at y = K/2, and plugging in we see that f(K/2) = rK/4. Points where f > 0 correspond to values of y for which y is increasing, and points where f < 0 correspond to values where y is decreasing; moreover, the rate of increase is fastest at the maximum of f, which occurs when y = K/2. Using this information, we can qualitatively sketch some trajectories in the ty-plane. Solutions with initial value 0 < y0 < K will increase until they approach (but do not exceed) K, and solutions with y0 > K will decrease until they approach K (but always remain above the line y = K). Note that we ignore solutions with y0 < 0 since they are not physically meaningful. However, a mathematical question: what would happen if y0 < 0 based on the graph?

Comment: Note that solutions starting with y0 ≠ K never actually intersect the equilibrium solution y = K; they just come very close to it. In fact they cannot intersect it without contradicting the uniqueness portion of the theorem from §2.4.

Some other information: the graph of y(t) will be concave up when y′′ > 0. But

d²y/dt² = d/dt (dy/dt) = d/dt f(y) = f′(y)f(y)

so (for solutions with 0 < y < K, where f > 0) the graph of y(t) is concave up precisely when f is increasing, and concave down when f is decreasing. In addition, the inflection points occur when f′ = 0 (in this case at y = K/2).

Next, let’s solve the logistic equation exactly to make these statements more precise. Since it is separable, we can write (for y ≠ 0, K)

dy/((1 − y/K)y) = r dt

and integrate both sides (using partial fractions on the left-hand side). Doing this (and some algebra):

(1/y + (1/K)/(1 − y/K)) dy = r dt

ln |y| − ln |1 − y/K| = rt + C

If 0 < y < K, we can remove the absolute value signs and write

y/(1 − y/K) = Ce^(rt)

Plugging in the initial condition now gives

C = y0/(1 − y0/K) = Ky0/(K − y0)

Solving for y:

y = (1 − y/K)Ce^(rt)

y(1 + (C/K)e^(rt)) = Ce^(rt)

y = Ce^(rt)/(1 + (C/K)e^(rt)) = Ky0 e^(rt)/((K − y0) + y0 e^(rt)) = Ky0/((K − y0)e^(−rt) + y0)

Notice that for any initial value y0 > 0 we have lim_{t→∞} y(t) = K. In this situation, we say that the equilibrium solution y ≡ K is asymptotically stable. In contrast, the equilibrium solution y ≡ 0 is an unstable equilibrium solution, since even for starting values y0 very close to 0 and positive, we have that y(t) converges to K as t → ∞. The only way the solution stays near 0 is if y0 is exactly equal to 0.


Comment: We can also analyze the stability of an equilibrium solution without even looking at the graph of f. Suppose the equilibrium point occurs at y = y*. Then if f′(y*) < 0, the equilibrium solution y = y* is asymptotically stable. If f′(y*) > 0, it is unstable. (Note: if f′(y*) = 0, the test is inconclusive.)
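For the logistic equation f(y) = r(1 − y/K)y, we have f′(y) = r(1 − 2y/K), so f′(0) = r > 0 and f′(K) = −r < 0: the criterion confirms, without any graphing, that y ≡ 0 is unstable and y ≡ K is asymptotically stable.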

2.5 Suggested Problems: 3,4,7,10

§2.7: Numerical Approximations and Euler’s Method

Recall from §2.4 that even if a first-order nonlinear ODE is guaranteed to have a solution (say by the existence and uniqueness theorem), there is no guarantee of an explicit solution, or even an analytical method for determining some sort of implicit solution. As a result, it is important to be able to solve these problems in other ways. Though we may not be able to obtain an analytical solution that solves the equation exactly, in many real-world applications it is sufficient to solve the equation up to a very small error (very small relative to the problem at hand). Using this principle, we can solve differential equations with very high accuracy on a computer using Euler’s method as well as other numerical methods.

The main idea behind Euler’s method originates from the slope fields we drew back in Chapter 1. In that case we could use the slope field to draw qualitatively what solution curves might look like. We would like to make this method more accurate, and also be able to calculate our solutions in a straightforward manner.

For our purposes, we assume we are considering a general first-order ODE having the form

dy/dt = f(t, y)

Suppose we are given an exact solution y(t) of this differential equation up to time t0. In addition, suppose that we have the value y(t0) = y0. How can we use this value to get a good approximation of the solution near the point (t0, y0) without having to explicitly solve the equation? The answer comes from the definition of the derivative. Recall that

dy/dt |_{t=t0} = lim_{h→0} [y(t0 + h) − y(t0)]/h

Suppose instead of taking the limit, we fix h to be some very small value (our choice of h can depend on how accurate we want our approximation to be). Then we will have

dy/dt |_{t=t0} ≈ [y(t0 + h) − y(t0)]/h

But by our differential equation, dy/dt |_{t=t0} = f(t0, y0), so we can write

f(t0, y0) ≈ [y(t0 + h) − y(t0)]/h

or rearranging

y(t0 + h) ≈ y(t0) + hf(t0, y0) = y0 + hf(t0, y0)

Thus using the value of y at the point (t0, y0) and our differential equation is enough to “step forward” in time to obtain the value of y at a slightly later time t0 + h. In addition, if we denote our above approximation for y(t0 + h) by y1, then we can repeat the process to continue our approximate solution even further by writing

y(t0 + 2h) ≈ y1 + hf(t0 + h, y1)

Thus by repeating this process many times, we can continue the solution as far as we would like (assuming the behavior of f is not too bad). If we write the nth point in the iteration process as (t_n, y_n) and f(t_n, y_n) = f_n, then the formula for the (n+1)th point is given by

y_{n+1} = y_n + f_n · (t_{n+1} − t_n)

Notice that in this case t_{n+1} − t_n = h for all n. In addition, given an IVP

dy/dt = f(t, y),  y(t0) = y0

we may begin the process at the initial value to obtain an approximate solution without any additional information.

There is a more geometric way to think of this process: as a tangent line approximation (sometimes referred to as linear approximation). In solving for y(t0 + h) we have really just drawn the tangent line to the graph of y at (t0, y0) (which has slope dy/dt |_{t=t0} = f(t0, y0)) and solved for y(t0 + h) using the point-slope formula. (Draw a picture.) Notice that zooming in on any individual point always makes a graph look mostly like a straight line; this is why the tangent-line approximation is a good one for small h. However, as you zoom out, the curves in the graph appear once more, corresponding to higher-order derivatives. For this reason, the tangent line approximation becomes inaccurate if h is not small enough.

Figure 8. Tangent line approximation of y = ln x at x = 1


Next, let’s use Euler’s method to find an approximate solution to adifferential equation.

Example: Consider the IVP

y′ = 3 cos t − 2y, y(0) = 0

Find and plot the solution at t = 0.1, 0.2, 0.3, 0.4 using both a step size of h = 0.1 and h = 0.05. In this case f(t, y) = 3 cos t − 2y. First, with h = 0.1:

y(0.1) = y(0) + 0.1f(0, 0) = 0.3

y(0.2) = y(0.1) + 0.1f(0.1, 0.3) ≈ 0.5385

y(0.3) = y(0.2) + 0.1f(0.2, 0.5385) ≈ 0.7248

y(0.4) = y(0.3) + 0.1f(0.3, 0.7248) ≈ 0.8664

Next, with h = 0.05:

y(0.05) = y(0) + 0.05f(0, 0) = 0.15

y(0.1) = y(0.05) + 0.05f(0.05, 0.15) ≈ 0.2848

y(0.15) = y(0.1) + 0.05f(0.1, 0.2848) ≈ 0.4056

y(0.2) = y(0.15) + 0.05f(0.15, 0.4056) ≈ 0.5134

y(0.25) = y(0.2) + 0.05f(0.2, 0.5134) ≈ 0.6091

y(0.3) = y(0.25) + 0.05f(0.25, 0.6091) ≈ 0.6935

y(0.35) = y(0.3) + 0.05f(0.3, 0.6935) ≈ 0.7675

y(0.4) = y(0.35) + 0.05f(0.35, 0.7675) ≈ 0.8317

Let’s plot the approximations against the analytical solution to seehow they compare:


Figure 9. Euler approximations vs. analytical solution

What conclusions can we draw from this example? We can look at the maximum error in each approximation, viewed relative to the exact solution. With the larger step size h = 0.1, the maximum error is about 8%. With the h = 0.05 step size, the maximum error is only 4%. This illustrates an important phenomenon: the smaller the step size, the more accurate our approximate solution. However, we must also perform more calculations. This means there is a tradeoff between accuracy and computation time when using the Euler method. In applications to real-world problems, it is essential to choose a step size that will provide the necessary level of accuracy while still running in a reasonable amount of time on a computer.


Example 2: Let's illustrate the use of computers to numerically solve differential equations. Using MATLAB, we can implement a program that will solve an IVP with just one for loop. Consider

y′ = (y² + 2ty)/(3 + t²), y(1) = 2

To solve this using the Euler method, we write a for loop using the equation derived previously:

Figure 10. MATLAB Code for Euler’s Method

This function takes in step size h, starting time t0, ending time t1, and initial condition ic. The right-hand side is given in a separate MATLAB file called ex.m, which is called in the for loop.

Figure 11. MATLAB Code for Euler’s Method: right-hand side of ODE
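For reference, here is a minimal sketch of what such a pair of files could look like. This is only a sketch based on the description above (the function name euler and the file ex.m are as described; the exact code from lecture may differ slightly):

% euler.m -- forward Euler solver
% h: step size, t0: starting time, t1: ending time, ic: initial condition
function [t, y] = euler(h, t0, t1, ic)
    t = t0:h:t1;              % grid of time points
    y = zeros(size(t));
    y(1) = ic;
    for n = 1:length(t)-1
        y(n+1) = y(n) + h*ex(t(n), y(n));   % y_{n+1} = y_n + h f(t_n, y_n)
    end
end

% ex.m -- right-hand side of the ODE y' = (y^2 + 2ty)/(3 + t^2)
function dy = ex(t, y)
    dy = (y^2 + 2*t*y)/(3 + t^2);
end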

In addition, we can write a separate script to loop through several values for the step size h and plot the resulting solutions to this ODE, which yields

Figure 12. Solutions to dy/dt = (y² + 2ty)/(3 + t²), y(1) = 2

Notice the behavior of the numerical solutions near t = 3. It is reasonable to guess that blowup is occurring near this value of t. Oftentimes, standard numerical solvers lose effectiveness near such singularity points, and a great deal of study in numerical analysis is devoted to maintaining accurate solutions at points such as this.

2.7 Suggested Problems: 2a), 2b), 3a), 3b)

§2.8: The Existence and Uniqueness Theorem

In this section, we return to the (second) existence and uniqueness theorem stated in §2.4 and discuss its method of proof. First, notice that it is sufficient to consider the case when the initial condition is y(0) = 0. To see why this is the case, suppose we are given the initial value problem

y′(t) = f(t, y), y(t0) = y0

with the hypotheses of the theorem holding on the rectangle given by t0 − a ≤ t ≤ t0 + a, y0 − b ≤ y ≤ y0 + b. If the result holds for rectangles centered at the origin with initial condition y(0) = 0, then we can define t̄ = t − t0, ȳ(t̄) = y(t̄ + t0) − y0, and g(t̄, ȳ) = f(t̄ + t0, ȳ + y0). In this case g(t̄, ȳ(t̄)) satisfies the hypotheses of the theorem on the rectangle [−a, a] × [−b, b], so by assumption there exists a unique solution to the IVP

ȳ′(t̄) = g(t̄, ȳ(t̄)), ȳ(0) = 0

However, since ȳ′(t̄) = y′(t), we have

y′(t) = g(t̄, ȳ(t̄)) = f(t̄ + t0, ȳ(t̄) + y0) = f(t, y(t))

and in addition y(t0) = ȳ(0) + y0 = y0. With this fact in mind, we can restate the theorem from §2.4 in the following way:

Theorem 3. Suppose f and ∂f/∂y are continuous in a rectangle R: |t| ≤ a, |y| ≤ b. Then there is some interval |t| ≤ h ≤ a in which there is a unique solution to the IVP

y′ = f(t, y), y(0) = 0   (4)

Comment: In fact, we can weaken the hypotheses of this theorem a bit and still obtain existence and uniqueness. Specifically, instead of assuming that both f and ∂f/∂y are continuous, we can assume only that f is continuous and Lipschitz continuous in y, rather than requiring ∂f/∂y to be continuous. This means that there exists some C > 0 so that we have

|f(s, y1) − f(s, y2)| ≤ C|y1 − y2|

for any |s| ≤ a, |y1| ≤ b, |y2| ≤ b. For example, the function f(t, y) = |y| is Lipschitz continuous with C = 1, but ∂f/∂y is not continuous.

Theorem 3 Proof Sketch: First, we convert the differential equation (4) into an equivalent integral equation. By applying the Fundamental Theorem of Calculus, a function φ(t) solves (4) if and only if it solves

φ(t) = ∫₀ᵗ f(s, φ(s)) ds

Instead of proving the theorem using (4), we will show that there exists a unique solution to this integral equation, which by their equivalence will give us the desired result. How do we show this? We use

Picard iteration (also known as the method of successive approximations). The process is as follows: first we must provide a starting guess at a solution to the integral equation (it doesn't have to be a good guess, but we should at least choose it to satisfy the initial condition). Thus a reasonable choice is φ0(t) ≡ 0. Next, we plug φ0 into the integral equation, and define φ1 to be the resulting function. Explicitly,

φ1(t) = ∫₀ᵗ f(s, φ0(s)) ds

We then repeat this process, defining

φ2(t) = ∫₀ᵗ f(s, φ1(s)) ds

and more generally

φn+1(t) = ∫₀ᵗ f(s, φn(s)) ds

Put another way, if we define the map T (where T sends a continuous function to another continuous function) by

Tφ(t) = ∫₀ᵗ f(s, φ(s)) ds

then our goal is to find a fixed point of T: that is, a function φ with Tφ = φ. In addition, we can write φn = Tⁿφ0.

Now, we have a sequence of functions {φn} = {Tⁿφ0}. Note that each φn automatically satisfies the initial condition, and if for some n we have φn = φn+1, then φn is a solution to the IVP. This may not happen, so instead we should try to consider

lim_{n→∞} φn

Ideally, we would like this limit to exist and solve our IVP. To show this, we must check a few things.

First, how do we know φn exists for all n? Our assumption is that f and ∂f/∂y are continuous on |t| ≤ a, |y| ≤ b. Since this is a closed, bounded region, we know f is bounded on that rectangle: say |f(t, y)| ≤ M for all (t, y) in [−a, a] × [−b, b]. Using the differential equation, this means that |y′| ≤ M on the same rectangle. Since y(0) = 0, we know that |y| ≤ b as long as |t| ≤ b/M (this comes from the Mean Value Theorem). Thus if we define our map T for |t| ≤ h where h = min{a, b/M}, we can ensure that t and y stay inside the rectangle where f and ∂f/∂y are defined, thus guaranteeing that each φn exists.

The next question: does {φn(t)} converge? (Maybe skip this part.) Since ∂f/∂y is continuous on a closed bounded region, we can say that |∂f/∂y| ≤ K. In addition, by the Mean Value Theorem we have

|f(t, y1) − f(t, y2)| ≤ K|y1 − y2|   (5)

for any points (t, y1) and (t, y2) in our rectangle with the same t-coordinate. Applying this to y1 = φn(t) and y2 = φn−1(t) gives the inequality

|f(t, φn(t)) − f(t, φn−1(t))| ≤ K|φn(t) − φn−1(t)|   (6)

Next, we have

|φ1(t)| = |∫₀ᵗ f(s, φ0(s)) ds| ≤ M|t|   (7)

since |f(t, y)| ≤ M (from the last question). In addition,

|φ2(t) − φ1(t)| ≤ ∫₀ᵗ |f(s, φ1(s)) − f(s, φ0(s))| ds ≤ ∫₀ᵗ KM|s| ds = KM|t|²/2

by (6) and (7). Next, we use induction: suppose that |φn(t) − φn−1(t)| ≤ K^{n−1}M|t|ⁿ/n!. Then we have (using (6), (7), and the induction hypothesis)

|φn+1(t) − φn(t)| ≤ ∫₀ᵗ |f(s, φn(s)) − f(s, φn−1(s))| ds
                 ≤ ∫₀ᵗ K|φn(s) − φn−1(s)| ds
                 ≤ ∫₀ᵗ KⁿM|s|ⁿ/n! ds
                 = KⁿM|t|ⁿ⁺¹/(n + 1)!
                 ≤ MKⁿhⁿ⁺¹/(n + 1)!   (8)

since |t| ≤ h. The triangle inequality says that |a + b| ≤ |a| + |b| for any numbers a and b. To use this, we write

φn(t) = φ1(t) + (φ2(t) − φ1(t)) + ... + (φn(t) − φn−1(t))

and so

|φn(t)| ≤ |φ1(t)| + |φ2(t) − φ1(t)| + ... + |φn(t) − φn−1(t)|
        ≤ M|t| + KM|t|²/2 + ... + K^{n−1}M|t|ⁿ/n!
        ≤ (M/K)[Kh + (Kh)²/2! + ... + (Kh)ⁿ/n!]

using (7), (8), and |t| ≤ h. The infinite sum

Σ_{n=1}^{∞} (Kh)ⁿ/n!

converges (either by the ratio test, or the fact that, up to the constant term, it is the power series for e^{Kh}). As a result, we can also say that the series

φ1(t) + Σ_{n=2}^{∞} (φn(t) − φn−1(t))

converges. Since {φn(t)} can be written as the sequence of partial sums for a convergent series, it follows that {φn(t)} converges as well. Let's call the limit function φ, so that

φ(t) = lim_{n→∞} φn(t)


The next question: does the limit function φ satisfy the desired integral equation? First, note that the function φ is continuous because each φn was continuous and the sequence {φn} converges uniformly to φ (uniform convergence means that the rate of convergence does not depend on t, and it follows from something known as the Weierstrass M-test).

Aside: For an example of a sequence of continuous functions that converges to a discontinuous function, consider the functions fn defined on [0, 1] by fn(x) = xⁿ. Then fn converges to f, where f(x) = 0 for x in [0, 1), and f(1) = 1. This function is not continuous at x = 1, and one can check that the sequence fn does not converge uniformly.

Returning to the task at hand: we know for each n that we have

φn+1(t) = ∫₀ᵗ f(s, φn(s)) ds

Taking the limit as n → ∞ on both sides and using the continuity of f gives

φ(t) = lim_{n→∞} ∫₀ᵗ f(s, φn(s)) ds
     = ∫₀ᵗ lim_{n→∞} f(s, φn(s)) ds
     = ∫₀ᵗ f(s, φ(s)) ds

so that φ does in fact satisfy the integral equation, and thus solves the IVP. (Note: the exchange of the limit with the integral above is not permissible in general, but again because of the uniform convergence of {φn} to φ and the boundedness of f we are allowed to do it.)

Finally, is our solution unique? Suppose ψ is another solution. Then by the integral equation

φ(t) − ψ(t) = ∫₀ᵗ [f(s, φ(s)) − f(s, ψ(s))] ds

so that

|φ(t) − ψ(t)| ≤ ∫₀ᵗ |f(s, φ(s)) − f(s, ψ(s))| ds ≤ K ∫₀ᵗ |φ(s) − ψ(s)| ds

where we have used (5). Let's define F(t) = ∫₀ᵗ |φ(s) − ψ(s)| ds. Then by the preceding calculation we have

F′(t) − KF(t) ≤ 0, F(0) = 0

Multiplying this equation by e^{−Kt} gives

(d/dt)(e^{−Kt}F) ≤ 0

Since this holds for all |t| ≤ h, we can integrate it from 0 to t and keep the inequality (here we use that F(0) = 0):

e^{−Kt}F(t) ≤ 0

Since e^{−Kt} > 0, it follows that F(t) ≤ 0. But by definition F(t) ≥ 0, so F ≡ 0 and φ ≡ ψ. This shows our solution is unique.

Example: Let’s demonstrate how to find the first few iterations inthe above process explicitly. Consider the IVP:

y′ = −y/2 + t, y(0) = 0

Defining φ0(t) = 0, we have

φ1(t) = ∫₀ᵗ s ds = t²/2

φ2(t) = ∫₀ᵗ [−φ1(s)/2 + s] ds = ∫₀ᵗ [−s²/4 + s] ds = −t³/12 + t²/2

φ3(t) = ∫₀ᵗ [−φ2(s)/2 + s] ds = ∫₀ᵗ [s³/24 − s²/4 + s] ds = t⁴/96 − t³/12 + t²/2 = Σ_{k=1}^{3} (−1)^{k+1} t^{k+1} / [(k + 1)! 2^{k−1}]

More generally, if

φn−1(t) = Σ_{k=1}^{n−1} (−1)^{k+1} t^{k+1} / [(k + 1)! 2^{k−1}]

then we have

φn(t) = ∫₀ᵗ [Σ_{k=1}^{n−1} (−1)^{k+2} s^{k+1} / ((k + 1)! 2ᵏ) + s] ds = Σ_{k=1}^{n−1} (−1)^{k+2} t^{k+2} / [(k + 2)! 2ᵏ] + t²/2 = Σ_{k=1}^{n} (−1)^{k+1} t^{k+1} / [(k + 1)! 2^{k−1}]

so our limit function is given by

φ(t) = Σ_{k=1}^{∞} (−1)^{k+1} t^{k+1} / [(k + 1)! 2^{k−1}] = 4 Σ_{k=0}^{∞} (−1)ᵏ tᵏ / (k! 2ᵏ) − 4 + 2t = 4e^{−t/2} + 2t − 4
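We can also watch this convergence happen numerically. The following is a minimal sketch (the grid on [0, 2] and the use of cumtrapz for the integrals are arbitrary choices of mine, so the iterates carry a small quadrature error):

% Numerical Picard iteration for y' = -y/2 + t, y(0) = 0
t   = linspace(0, 2, 201);        % time grid on [0, 2]
phi = zeros(size(t));             % phi_0 = 0
for n = 1:8
    % phi_{n+1}(t) = integral from 0 to t of f(s, phi_n(s)) ds
    phi = cumtrapz(t, -phi/2 + t);
end
exact = 4*exp(-t/2) + 2*t - 4;    % the limit function found above
max(abs(phi - exact))             % small: the iterates approach phi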

2.8 Suggested Problems: 1,4,7,14

§2.9: First Order Difference Equations

Differential equations provide continuous models of physical phenomena. However, in this section, we instead explore discrete models: specifically, we look at some physical problems where they come up and we also study some of their properties.

Recall the population model studied in §2.5 as well as the model for continuously compounded interest seen in §2.3. Each of these models has a discrete analogue, which can make for a more realistic model depending on the problem at hand. The general discrete model we are interested in is given by

yn+1 = f(n, yn), n = 0, 1, 2, ...

Thus instead of a solution that is a continuous function of the variable t, we look for a solution sequence indexed by the natural numbers n. An equation of this form is called a first order difference equation. It is first order because yn+1 only depends on yn, not on any of the earlier terms in the sequence. As with differential equations, the difference equation will be linear if f is a linear function of yn; otherwise it will be nonlinear. In addition, we may supplement a difference equation with an initial condition y0 = α. Suppose that f only depends on yn, that is, our difference equation takes the form

yn+1 = f(yn), n = 0, 1, 2, ...

Then we have

y1 = f(y0)

y2 = f(y1) = f(f(y0))

and so on, so that

yn = fⁿ(y0)

Thus given an initial condition y0, we can find yn by iterating the map f n times. Most often, the questions that concern us in such a model are whether yn approaches a limit as n → ∞, and if it does, we would like to find that limit. Note that we can redefine the notion of an equilibrium solution for a difference equation by requiring yn+1 = yn, or equivalently f(yn) = yn, meaning yn is a fixed point of f (this necessarily implies that yk = yn for all k > n by the above formula).

Example 1: Suppose we measure the population of a certain species once every year, and we find that in year n + 1 the population is a constant multiple of the population in year n. If we denote this constant multiple by ρn, we obtain the linear first-order difference equation

yn+1 = ρnyn, n = 0, 1, 2, ...

Using the previous comments, we can solve for yn in terms of y0:

yn = ρn−1ρn−2···ρ1ρ0 y0   (9)

In a population model, we assume that ρn > 0 for all n, but this first-order difference equation may also be studied allowing for ρn ≤ 0 (note that if ρN = 0 for any N, then yn = 0 for all n > N by (9)).

However, studying (9) is too broad for our purposes, as we would need to be given an entire sequence ρn. To simplify matters, if ρn is independent of n, that is ρn = ρ for all n, then we obtain the equation

yn = ρⁿy0

In this model, there are three possible scenarios: if |ρ| < 1, the sequence converges to 0; if ρ = 1, the sequence stays fixed at y0; otherwise the limit does not exist. That is, the equilibrium solution yn ≡ 0 is asymptotically stable for |ρ| < 1 and unstable for |ρ| > 1.

Example 1 (continued): Suppose we modify the preceding model to allow for immigration/emigration of a species. If the net change in population due to immigration/emigration in year n is bn, then our model becomes


yn+1 = ρyn + bn, n = 0, 1, 2, ...

Iterating this map:

y1 = ρy0 + b0

y2 = ρ(ρy0 + b0) + b1 = ρ²y0 + ρb0 + b1

More generally:

yn = ρⁿy0 + ρ^{n−1}b0 + ρ^{n−2}b1 + ... + ρbn−2 + bn−1 = ρⁿy0 + Σ_{k=0}^{n−1} ρ^{n−1−k} bk

As before with ρ, we simplify things by supposing bn ≡ b ≠ 0 for all n. Plugging this into the solution above:

yn = ρⁿy0 + b Σ_{k=0}^{n−1} ρᵏ = ρⁿy0 + [(1 − ρⁿ)/(1 − ρ)]b = ρⁿ(y0 − b/(1 − ρ)) + b/(1 − ρ)

where we assume ρ ≠ 1. In this form, we determine that yn → b/(1 − ρ) if |ρ| < 1. If ρ = −1 or |ρ| > 1 the limit does not exist. Note that the equilibrium solution is y = b/(1 − ρ), so if |ρ| < 1 it is asymptotically stable.

If ρ = 1, then the iteration becomes

yn = y0 + nb

which diverges for b ≠ 0.

Example 2: Modeling loans with difference equations. Revisit the previous problem from §2.3 about equine massage: we have an $8000 loan that we would like to pay off in three years at a 10% annual rate, only now the interest accrues each month and the payments occur monthly. In this case, we can use a difference equation to model the amount remaining on the loan. Let p be the amount of the monthly payment and notice that the monthly interest rate is given by 0.1/12 = 1/120. Then our difference equation for month n + 1 in terms of month n is

yn+1 = (121/120)yn − p, n = 0, 1, 2, ..., y0 = 8000

This equation takes the same form as the second population model above (with ρ = 121/120, b = −p), so we can plug in the general formula to find the amount remaining on the loan after three years:

y36 = (121/120)³⁶ (8000 − 120p) + 120p

Setting this equal to 0 and solving for p, we find that p ≈ $258.14 gives the necessary monthly payment. Thus the amount paid per year is $3097.68, slightly more than the $3086.64 per year we found when everything accrued continuously.
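We can confirm this by iterating the difference equation directly; here is a minimal sketch (the small leftover balance reflects rounding p to the nearest cent):

% Loan balance by direct iteration: y_{n+1} = (121/120) y_n - p
p = 258.14;
y = 8000;
for n = 1:36
    y = (121/120)*y - p;      % accrue one month of interest, make a payment
end
y   % approximately 0: the loan is paid off after 36 months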

So far, we have only considered linear difference equations. As with differential equations, nonlinear difference equations in general exhibit much more complicated behavior than linear ones. As an example, let's look at

Example: Logistic Difference Equation: The logistic difference equation (the discrete version of the differential equation studied in §2.5) is given by

yn+1 = ρyn(1 − yn/k)

Compare this to

dy/dt = r(1 − y/K)y

To simplify the logistic difference equation, we can write un = yn/k, which gives the simpler difference equation

un+1 = ρun(1 − un)

with ρ our only parameter (in fact the value of ρ is very important for the behavior of this system). First, let's find the equilibrium solutions. Setting un+1 = un:

un = ρun(1 − un)

We can see un = 0 is one equilibrium solution. If un ≠ 0, we can divide by it, thus obtaining the other equilibrium solution

un = 1 − 1/ρ = (ρ − 1)/ρ

Next, we would like to determine whether these equilibrium solutions are stable or unstable. To do this, we must linearize about each solution. First, let's look at un = 0. Recall that our equation is

un+1 = ρun(1 − un) = ρun − ρun²

Now if un is very near 0, un² will be much smaller than un, so we neglect it. Thus our linearization at 0 is given by

un+1 = ρun

In this case, we can see that if |ρ| > 1 solutions starting near zero will get pushed away, so 0 will be unstable. If |ρ| < 1, linearized solutions will converge to the equilibrium solution at 0, and 0 will be stable. More generally, we can determine whether the difference equation

yn+1 = f(yn)

is stable or not at a fixed point (equilibrium solution) y = y* by evaluating f′(y*). If |f′(y*)| > 1, the fixed point is unstable, but if |f′(y*)| < 1 it is stable. Let's evaluate the stability of our other fixed point using this method: here f(un) = ρun − ρun² and f′(un) = ρ − 2ρun. Evaluating at our fixed point un = (ρ − 1)/ρ gives

f′((ρ − 1)/ρ) = ρ − 2(ρ − 1) = 2 − ρ

Thus the other fixed point will be stable if 1 < ρ < 3. If ρ > 3 or ρ < 1, it will be unstable. We could also obtain this result via linearization about the fixed point. To illustrate this: for un near (ρ − 1)/ρ we write

un = (ρ − 1)/ρ + vn

where vn is small. Plugging into the difference equation:

un+1 = ρ − 1 + ρvn − (ρ − 1)²/ρ − 2(ρ − 1)vn − ρvn²

Again we neglect the higher-order vn² term. Since ρ − 1 − (ρ − 1)²/ρ = (ρ − 1)/ρ, we obtain

un+1 = (ρ − 1)/ρ + (2 − ρ)vn

or, writing un+1 = (ρ − 1)/ρ + vn+1,

vn+1 = (2 − ρ)vn

This yields the same requirement: |2 − ρ| < 1.

What have we learned? As discussed previously, we are mostly interested in long-term behavior of solutions, so we would like to know where most of our solutions end up (i.e. if there is an attracting fixed point). If we treat ρ as a variable in its own right, gradually increasing it and examining the equilibrium solutions (and their stability) as a function of ρ, we have described the long-term behavior of solutions for ρ < 3. Note that we see an exchange of stability at ρ = 1, where the attracting fixed point jumps from 0 to (ρ − 1)/ρ. In addition, for ρ > 3 both equilibrium solutions become unstable. A reasonable question: what is the long-term behavior of solutions for these values of ρ? In fact, the behavior of solutions becomes increasingly complex: for 3 < ρ < 3.5 (roughly) solutions settle into a repeating two-cycle (that is, yn+2 = yn for n sufficiently large). If we increase ρ further, this two-cycle becomes a four-cycle, an eight-cycle, etc. This phenomenon is known as period doubling. It occurs ad infinitum at smaller and smaller intervals, ending around ρ = 3.57. At this point, chaos takes over: long-term behavior of solutions is very unpredictable and extremely sensitive to small changes in the initial value.
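All of the plots below can be generated with just a few lines of MATLAB; here is a minimal sketch (the values of rho and N here are the ones used in Figure 16):

% Iterating the logistic map u_{n+1} = rho*u_n*(1 - u_n)
rho = 3.1;
N   = 100;
u = zeros(1, N+1);
u(1) = 0.3;                    % initial value used in all the simulations
for n = 1:N
    u(n+1) = rho*u(n)*(1 - u(n));
end
plot(0:N, u, '.-')             % the tail settles into a two-cycle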

Let’s see this via some pictures: in all simulations we start withy0 = 0.3 and run up to yn for various values of ρ

Figure 13. ρ = 0.5, n = 10: 0 is an attracting fixed point


Figure 14. ρ = 2.1, n = 10: (ρ − 1)/ρ is an attracting fixed point


Figure 15. ρ = 2.99, n = 100: very near onset of two-cycle


Figure 16. ρ = 3.1, n = 100: Two-cycle appears


Figure 17. ρ = 3.5, n = 50: Appearance of a four-cycle


Figure 18. ρ = 3.7, n = 50: y0 = 0.3 and y0 = 0.301 have radically different behavior


Figure 19. Bifurcation map for logistic equation


2.9 Suggested Problems: 2,3,6,12,14

§3.1: Homogeneous Equations with Constant Coefficients

Chapter 3 focuses on second-order ODEs. The general second-order ODE takes the form

d²y/dt² = f(t, y, dy/dt)

In general, these are very hard to solve (even harder than general first-order ODEs). Thus in this chapter, we will focus only on linear second-order ODEs, which we can either write as

P(t)y′′ + Q(t)y′ + R(t)y = G(t)

or, if we assume P(t) ≠ 0 and divide by it:

y′′ + p(t)y′ + q(t)y = g(t)

In general, we are interested in the case where p, q, and g are continuous functions. Any equation not taking one of these forms is nonlinear, and we will ignore such equations for the most part in this chapter. Given a second-order ODE, we can pose an IVP just as we did with first order ODEs.

Important Difference: For an IVP involving a second-order equation, we now need two initial conditions rather than one. Intuitively, this is because we are solving a second-order ODE by integrating twice, so we obtain two integration constants rather than one. Thus we need two initial conditions in order to solve for both constants. A typical pair of initial conditions for a second-order ODE will be

y(t0) = y0, y′(t0) = y′0

for y0 and y′0 some given numbers.

We say that a second-order linear ODE is homogeneous if the function g(t) (or G(t)) is identically zero. If not, we say it is nonhomogeneous. Once we solve the homogeneous problem, we can always build solutions to the nonhomogeneous problem with an appropriate integral (we'll see this later). Thus for now, we'll just consider homogeneous equations. In addition, in this section we consider only constant-coefficient linear equations. Thus we are interested in solving equations of the form

ay′′ + by′ + cy = 0

where a, b, and c are constants.

Example: Solve the ODE

y′′ − y = 0

One function that clearly satisfies this is y1 = eᵗ. In addition, we can see that y2 = e^{−t} satisfies it as well. In fact, any function of the form

c1eᵗ + c2e^{−t}

where c1 and c2 are constants, will solve the ODE. This is an important property of linear homogeneous equations: linear combinations of solutions are again solutions. Suppose we supplement the ODE with the initial conditions

y(0) = 3, y′(0) = −5

Then plugging these into the general expression above we obtain the two equations

c1 + c2 = 3

c1 − c2 = −5

Solving these equations simultaneously gives c1 = −1 and c2 = 4. Thus the solution to the IVP

y′′ − y = 0, y(0) = 3, y′(0) = −5

is given by

y(t) = −eᵗ + 4e^{−t}

More generally, suppose we aim to solve

ay′′ + by′ + cy = 0

for a, b, and c given. If we plug in y(t) = e^{rt} (so that y′(t) = re^{rt} and y′′(t) = r²e^{rt}), we get

(ar² + br + c)e^{rt} = 0

Since e^{rt} ≠ 0, we can solve the given ODE by solving the characteristic equation:

ar² + br + c = 0

In attempting to solve this equation for suitable values of r, we find that there are three possible cases:

(1) It has two distinct real roots r1 ≠ r2

(2) It has one repeated real root r1

(3) It has two complex roots, which are complex conjugates of one another

For now, we only consider case (1), but we will come back to the other two possibilities. If r1 and r2 are the two real roots, then the general solution to the ODE is

y(t) = c1e^{r1 t} + c2e^{r2 t}

Suppose that we again want to solve an IVP with the initial conditions

y(t0) = y0, y′(t0) = y′0

Then evaluating our general solution and its derivative at t = t0:

c1e^{r1 t0} + c2e^{r2 t0} = y0

c1r1e^{r1 t0} + c2r2e^{r2 t0} = y′0

Solving for c1 and c2:

c1 = (r2y0 − y′0) / [(r2 − r1)e^{r1 t0}],   c2 = (r1y0 − y′0) / [(r1 − r2)e^{r2 t0}]

and so the specific solution to the IVP is given by

y(t) = [(r2y0 − y′0)/(r2 − r1)] e^{r1(t−t0)} + [(r1y0 − y′0)/(r1 − r2)] e^{r2(t−t0)}

Note that we require r1 ≠ r2 in order for these solutions to make sense.

Example: Find the general solution of

y′′ − 2y′ − 2y = 0

In this case, the characteristic equation is

r² − 2r − 2 = 0

Using the quadratic formula we find that the roots are

r = 1 ± √3

so the general solution is

y(t) = c1e^{(1+√3)t} + c2e^{(1−√3)t}

Example: Solve the IVP

y′′ + 8y′ − 9y = 0, y(1) = 1, y′(1) = 0

First, we obtain the general solution. The characteristic equation is

r² + 8r − 9 = 0

This has roots r = 1, r = −9. Thus the general solution is

y(t) = c1eᵗ + c2e^{−9t}

In addition,

y′(t) = c1eᵗ − 9c2e^{−9t}

Plugging in the initial conditions:

y(1) = c1e + c2e^{−9} = 1

y′(1) = c1e − 9c2e^{−9} = 0

The second equation gives c1 = 9c2e^{−10}. Plugging this back into the first equation we find 10c2e^{−9} = 1, or c2 = e⁹/10. This also means c1 = (9/10)e^{−1}. Thus the solution to our IVP is given by

y(t) = (9/10)e^{t−1} + (1/10)e^{9−9t}
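As a sanity check, we can compare this formula against a numerical solution in MATLAB; this is only a sketch (it rewrites the second-order equation as a first-order system, a trick we will see again in Chapter 7):

% Check the IVP solution of y'' + 8y' - 9y = 0, y(1) = 1, y'(1) = 0
roots([1 8 -9])                          % roots of r^2 + 8r - 9: 1 and -9
f = @(t, Y) [Y(2); -8*Y(2) + 9*Y(1)];    % Y = [y; y']
[t, Y] = ode45(f, [1 3], [1; 0]);
yexact = (9/10)*exp(t-1) + (1/10)*exp(9-9*t);
max(abs(Y(:,1) - yexact))                % small (within ode45's tolerance)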

3.1 Suggested Problems: 4,6,11,14,17,20,21

§3.2: Solutions of Linear Homogeneous Equations; the Wronskian

In this section, we discuss some important properties of second-order linear ODEs: namely their existence and uniqueness properties. In addition, we discuss in further detail the process of constructing linear combinations of solutions to IVPs.

First, some important notation: so far, we have been describing differential equations by explicitly writing the equation. However, we may think of differential equations more abstractly as well. In fact, given a second-order linear homogeneous ODE taking the form

y′′ + p(t)y′ + q(t)y = 0

we can think of the left-hand side of the ODE as an operator that takes in functions that are twice differentiable on some time interval I: α < t < β (where α and β may also be infinite), and gives us a new function. If we write

L[φ](t) = φ′′(t) + p(t)φ′(t) + q(t)φ(t)

then solving the preceding ODE is equivalent to solving L[φ] ≡ 0 (note that L[φ] is itself a function).

Example: Suppose p(t) = eᵗ, q(t) = sin t, and φ(t) = t². Then

L[φ](t) = (t²)′′ + eᵗ(t²)′ + (sin t)t² = 2 + 2teᵗ + t² sin t

Now, we can recast an IVP in the following way:

L[y](t) = 0, y(t0) = y0, y′(t0) = y′0

where t0 is in the interval I. A natural question: what can we say about existence and uniqueness of solutions to this IVP? We have the following theorem (stated for possibly nonhomogeneous second-order linear ODEs):

Theorem 4. Suppose p(t), q(t), and g(t) are continuous on the interval I containing t0. Then there is a unique solution to the IVP

y′′ + p(t)y′ + q(t)y = g(t), y(t0) = y0, y′(t0) = y′0

and the solution exists on all of I.

Comments: Compare this to the existence and uniqueness theorem for first-order linear ODEs: it is very similar. Just like in that case, we get existence and uniqueness on the entire interval on which p, q, and g are continuous. However, in this case there is no general formula for the solution, and the proof of the theorem is very involved, so we'll skip it.

Example: Find the longest time interval on which the solution to the IVP

(x − 2)y′′ + y′ + (x − 2)(tan x)y = 0, y(3) = 1, y′(3) = 2

is certain to exist. To do this, we must first put the ODE in the right form to apply the theorem:

y′′ + [1/(x − 2)]y′ + (tan x)y = 0

In this case p(x) = 1/(x − 2) is continuous for x ≠ 2 and q(x) is continuous for x ≠ (k + 1/2)π, where k is any integer. Thus the largest interval containing the initial condition on which p and q are both continuous is (2, 3π/2).

Now, returning to the homogeneous case: given two solutions to L[y] = 0 we would like to know whether we can generate more solutions (as we did in the last section). In fact, we can:

Theorem 5. (Principle of superposition) Suppose y1 and y2 both solve the ODE

L[y] = y′′ + p(t)y′ + q(t)y = 0   (10)

Then for any c1 and c2 the function c1y1 + c2y2 also solves (10).

Put another way, the operator L is a linear differential operator. Let's verify this explicitly.

Page 68: Evan Smothers€¦ · Read me rst: The following are my lecture notes for Math 22B. They will mostly follow Boyce and DiPrima, and will be very similar to what is seen in lecture.

68

Proof.

L[c1y1 + c2y2] = [c1y1 + c2y2]′′ + p[c1y1 + c2y2]′ + q[c1y1 + c2y2]
               = c1y′′1 + c2y′′2 + pc1y′1 + pc2y′2 + qc1y1 + qc2y2
               = c1[y′′1 + py′1 + qy1] + c2[y′′2 + py′2 + qy2]
               = c1L[y1] + c2L[y2]
               = 0

What did we show? Essentially, given two solutions to a linear second-order homogeneous ODE we can construct an entire family of solutions given by linear combinations of them. A natural question: does the family we just constructed include all solutions, or are there others? In addition, if this is our general solution to an ODE, can we always choose c1 and c2 to satisfy the initial conditions of an IVP? Suppose we have found two solutions to the ODE (and hence a whole family of solutions) and we want to solve the IVP with y(t0) = y0 and y′(t0) = y′0. Plugging in y = c1y1 + c2y2, we search for values of c1 and c2 satisfying the initial conditions:

c1y1(t0) + c2y2(t0) = y0

c1y′1(t0) + c2y′2(t0) = y′0

Recall linear algebra facts: there is a unique solution to this system for c1 and c2 precisely when the determinant

∣y1(t0)  y2(t0) ∣
∣y′1(t0) y′2(t0)∣ = y1(t0)y′2(t0) − y2(t0)y′1(t0) ≠ 0

In this case, we can solve for c1 and c2 explicitly using the preceding system of equations:

c1 = [y0y′2(t0) − y′0y2(t0)] / [y1(t0)y′2(t0) − y2(t0)y′1(t0)]

c2 = [y′0y1(t0) − y0y′1(t0)] / [y1(t0)y′2(t0) − y2(t0)y′1(t0)]

Thus as long as

W = ∣y1(t0)  y2(t0) ∣
    ∣y′1(t0) y′2(t0)∣ ≠ 0

both the above constants will exist and the solution c1y1 + c2y2 will solve the IVP. The determinant W is called the Wronskian. Let's summarize what we just found:

Theorem 6. Suppose y1 and y2 solve the ODE

L[y] = y′′ + p(t)y′ + q(t)y = 0

Then we can find a solution of the IVP with initial conditions

y(t0) = y0, y′(t0) = y′0

having the form

y(t) = c1y1(t) + c2y2(t)

if and only if the Wronskian

W(y1, y2)(t0) = y1(t0)y′2(t0) − y′1(t0)y2(t0) ≠ 0

Example: Find the Wronskian of the functions y1 = x, y2 = xeˣ. Since y′1 = 1 and y′2 = xeˣ + eˣ, the Wronskian is given by

W = ∣x  xeˣ      ∣
    ∣1  xeˣ + eˣ ∣ = x²eˣ + xeˣ − xeˣ = x²eˣ

Thus the Wronskian is nonzero for any x ≠ 0, so if y1 and y2 solve L[y] = 0 then we can always solve the usual IVP as long as the initial conditions are not given at x = 0.
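If you want to avoid hand computation, the Symbolic Math Toolbox can check this; a minimal sketch (assuming the toolbox is available):

% Wronskian of y1 = x, y2 = x*exp(x)
syms x
y1 = x;
y2 = x*exp(x);
W = simplify(y1*diff(y2, x) - diff(y1, x)*y2)   % returns x^2*exp(x)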

Finally, we can answer our previous question, showing that in fact all solutions to the ODE L[y] = 0 do take the form c1y1 + c2y2, as long as the Wronskian is not identically zero.

Theorem 7. Suppose y1 and y2 solve

L[y] = y′′ + p(t)y′ + q(t)y = 0

Then the family of solutions

y = c1y1(t) + c2y2(t)

where c1, c2 are arbitrary includes all solutions of the ODE if and only if the Wronskian of y1 and y2 is not identically zero (i.e. there exists some t0 with W(y1, y2)(t0) ≠ 0).

One half of the theorem follows from the existence and uniqueness result at the beginning of the section. Specifically, if φ solves the ODE L[φ] = 0 and t0 denotes a point where W(y1, y2)(t0) ≠ 0, then by the last theorem there are c1 and c2 so that c1y1 + c2y2 solves the IVP

L[y] = 0, y(t0) = φ(t0), y′(t0) = φ′(t0)

By the uniqueness portion of Theorem 4, φ = c1y1 + c2y2.

In view of this most recent theorem, the terminology of the last section involving general solutions having the form c1y1 + c2y2 is now fully justified (provided the Wronskian is not identically zero). In addition, if y1 and y2 solve the ODE and W(y1, y2) is not identically zero, then we call y1 and y2 a fundamental set of solutions.

Summary: If we want to find the general solution of a second-order linear ODE, all we have to do is find two solutions and ensure their Wronskian is not identically zero. Let's see some examples:

Example: Returning to the problems in the last section: suppose y1 = e^{r1 t} and y2 = e^{r2 t} both solve the ODE

y′′ + p(t)y′ + q(t)y = 0

(in the last section p and q were constants, so that r1 and r2 were found by solving the characteristic equation). Then the Wronskian is

W = ∣e^{r1 t}    e^{r2 t}    ∣
    ∣r1e^{r1 t}  r2e^{r2 t}  ∣ = (r2 − r1)e^{(r1+r2)t}

Since e^{(r1+r2)t} ≠ 0, we conclude that e^{r1 t} and e^{r2 t} form a fundamental set of solutions precisely when r1 ≠ r2 (which is the case we handled in the previous section).

Example: Do the functions y1 = x and y2 = xeˣ form a fundamental set of solutions of the ODE

x²y′′ − x(x + 2)y′ + (x + 2)y = 0, x > 0?

First, we must check that y1 and y2 actually solve the given ODE. Plugging in y1:

−x(x + 2) + (x + 2)x = 0

For y2, we have y′2 = eˣ + xeˣ and y′′2 = 2eˣ + xeˣ. Plugging in:

x²(2eˣ + xeˣ) − x(x + 2)(eˣ + xeˣ) + (x + 2)xeˣ
= 2x²eˣ + x³eˣ − x²eˣ − x³eˣ − 2xeˣ − 2x²eˣ + x²eˣ + 2xeˣ
= 0

Next, let’s find the Wronskian (wait we already did that in a pre-vious example: it’s W = x2ex). This is nonzero, so y1 and y2 are afundamental set of solutions.

Another question: given any differential equation, can we always find a fundamental set of solutions? The answer is yes for linear homogeneous second-order equations with continuous coefficients. As a theorem:

Theorem 8. Consider the ODE

L[y] = y′′ + p(t)y′ + q(t)y = 0

with p and q continuous on some open interval I. Let t0 be some point in I. If y1 solves the ODE with the initial conditions

y1(t0) = 1, y′1(t0) = 0

and y2 solves the ODE with initial conditions

y2(t0) = 0, y′2(t0) = 1

then y1 and y2 form a fundamental set of solutions of the ODE.

First, such y1 and y2 exist by the existence and uniqueness theorem at the beginning of the section. To show they're a fundamental set of solutions, check the Wronskian W(y1, y2)(t0) (it equals 1).

Comment: Essentially, this theorem tells us that we can decompose solutions to an ODE into those with nonzero initial position (y(t0) ≠ 0) and those with nonzero initial velocity (y′(t0) ≠ 0).

Example: Let’s find the fundamental set of solutions mentioned inthe preceding theorem to the equation

y′′ + 4y′ + 3y = 0

at t0 = 1. The characteristic equation is r2 + 4y + 3 = 0, which hassolutions r = −3, r = −1, with corresponding solutions e−3t and e−t.To find the desired fundamental solutions, we write the general solutionas

y(t) = c1e−3t + c2e

−t

For y1: we want y1(1) = 1 and y′1(1) = 0. Plugging in:

c1e^{−3} + c2e^{−1} = 1

−3c1e^{−3} − c2e^{−1} = 0

The second equation gives c2 = −3c1e^{−2}. Plugging back into the first equation yields −2c1e^{−3} = 1, so that c1 = −(1/2)e³. Plugging this into the second equation we see that c2 = (3/2)e.

For y2: we want y2(1) = 0 and y′2(1) = 1. Plugging in:

c1e^{−3} + c2e^{−1} = 0

−3c1e^{−3} − c2e^{−1} = 1

In this case, c1 = −c2e², so that c2 = (1/2)e and c1 = −(1/2)e³. Thus our fundamental set of solutions is given by

y1(t) = −(1/2)e^{3−3t} + (3/2)e^{1−t}

y2(t) = −(1/2)e^{3−3t} + (1/2)e^{1−t}
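A quick symbolic check that this pair really has W(y1, y2)(1) = 1; this is a sketch (assuming the Symbolic Math Toolbox):

% Wronskian of the fundamental pair at t0 = 1
syms t
y1 = -exp(3-3*t)/2 + 3*exp(1-t)/2;
y2 = -exp(3-3*t)/2 + exp(1-t)/2;
W = simplify(y1*diff(y2, t) - diff(y1, t)*y2);   % simplifies to exp(4-4*t)
subs(W, t, 1)                                    % returns 1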

As mentioned previously, it is often the case that we may not be able to explicitly represent solutions to the equation

L[y] = y′′ + p(t)y′ + q(t)y = 0

However, we may determine (up to a constant) the Wronskian of any two solutions y1 and y2 to this equation without knowing y1 or y2 explicitly: all we need is the equation. This result is known as Abel's Theorem:

Theorem 9. (Abel’s Theorem) Suppose y1 and y2 solve the ODE

L[y] = y′′ + p(t)y′ + q(t)y = 0

where p and q are continuous on the open interval I. Then the Wronskian is given by

W(y1, y2)(t) = ce^{−∫p(t) dt}

where c depends on y1 and y2, but not on t.

Proof. Since y1 and y2 solve the ODE we have

y′′1 + p(t)y′1 + q(t)y1 = 0

y′′2 + p(t)y′2 + q(t)y2 = 0

Multiplying the first equation by y2, the second by y1, and subtracting the first from the second yields

[y1y′′2 − y′′1y2] + p(t)[y1y′2 − y′1y2] = 0

But W = y1y′2 − y′1y2 and W′ = y1y′′2 − y′′1y2, so this equation is in fact

W′(t) + p(t)W(t) = 0

We can solve this equation for W using the method from §1.1, and we obtain the general solution

W(t) = ce^{−∫p(t) dt}

Benefits: We can now find the Wronskian up to a constant without actually solving the equation. In addition, since e^{−∫p(t) dt} ≠ 0, it follows that the Wronskian is only zero at a point when c = 0, and in this case it is zero everywhere. Thus all we need is to show that the Wronskian is nonzero at one point to determine that y1 and y2 form a fundamental set of solutions.

Example: Legendre’s equation, which arises when trying to solveLaplace’s equation in spherical coordinates, is given by

(1− x2)y′′ − 2xy′ + α(α + 1)y = 0

where α is some constant. We can rewrite this as

y′′ − 2x

1− x2y′ +

α(α + 1)

1− x2y = 0

We can now use Abel’s Theorem to find the Wronskian of any twosolutions up to a constant:

W (y1, y2)(t) = ce−

∫ −2x

1−x2dx

Using u-substitution, we let u = 1 − x2 so that du = −2x dx, and weobtain

W (y1, y2)(t) = ce−∫

1udu = ce− lnu = ce− ln(1−x2) =

c

1− x2Without knowing more about the specific solutions y1 and y2, this is

the most we can say about their Wronskian.

Looking ahead to the next section: if we have a second-order constant coefficient homogeneous ODE and the characteristic equation has complex roots, then our ODE will also have complex-valued solutions.

If a complex function y solves the ODE L[y] = 0, then both the real part and imaginary part of y satisfy the ODE as well. Formally:

Theorem 10. Suppose y(t) = u(t) + iv(t) solves the equation

L[y] = y′′ + p(t)y′ + q(t)y = 0

where p and q are real-valued functions. Then u and v are also solutions of the ODE: that is, L[u] = L[v] = 0.

This theorem follows immediately from the linearity of L:

L[y] = L[u + iv] = L[u] + iL[v] = 0

Since u, v, p, and q are all real, so too are L[u] and L[v]. Thus L[u] = Re(L[y]) and L[v] = Im(L[y]). Since L[y] = 0, it follows that L[u] = L[v] = 0.

Comment: If y = u + iv is a solution, then so too is ȳ = u − iv, since ȳ = y − 2iv, so ȳ is a linear combination of y and v, both of which solve the equation (by the previous theorem).

3.2 Suggested Problems: 3,5,9,11,14,16,18,22,24,30,36,39

§3.3: Complex Roots of the Characteristic Equation

In this section, we return to constant-coefficient homogeneous second-order ODEs taking the form

ay′′ + by′ + cy = 0

In §3.1 we dealt with the case where the characteristic equation

ar² + br + c = 0

had two real roots. In this section, we consider what happens when the characteristic equation has complex roots (which must be complex conjugates of one another). This happens precisely when the discriminant b² − 4ac < 0. In this case, we can write the roots of the characteristic equation as

r1 = λ + iµ, r2 = λ − iµ

where λ and µ are real numbers. If we use these roots to form the general solution as in §3.1, we obtain

Page 75: Evan Smothers€¦ · Read me rst: The following are my lecture notes for Math 22B. They will mostly follow Boyce and DiPrima, and will be very similar to what is seen in lecture.

75

y(t) = c1e^{(λ+iµ)t} + c2e^{(λ−iµ)t}

This raises the question: what does it mean to raise a number to an imaginary power? The answer comes from something called Euler's formula. Recall the power series for eˣ:

eˣ = Σ_{k=0}^{∞} xᵏ/k!

Suppose we instead consider an imaginary argument: that is, we investigate the power series for e^{iθ} where θ is real. Substituting, then splitting the sum into even and odd powers of θ:

e^{iθ} = Σ_{k=0}^{∞} (iθ)ᵏ/k!
       = Σ_{k=0}^{∞} (iθ)^{2k}/(2k)! + Σ_{k=0}^{∞} (iθ)^{2k+1}/(2k + 1)!
       = Σ_{k=0}^{∞} (−1)ᵏθ^{2k}/(2k)! + i Σ_{k=0}^{∞} (−1)ᵏθ^{2k+1}/(2k + 1)!
       = cos θ + i sin θ

Also, since cos(−θ) = cos θ and sin(−θ) = −sin θ, we have e^{−iθ} = cos θ − i sin θ, so that e^{−iθ} is the complex conjugate of e^{iθ}.
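As a quick numerical sanity check of Euler's formula in MATLAB (a sketch; any real θ works here):

theta = 0.7;
exp(1i*theta) - (cos(theta) + 1i*sin(theta))   % returns 0, up to roundoff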

Euler’s formula holds for any real θ, so it holds for θ = µt. Thismeans

e(λ+iµ)t = eλteiµt = eλt[cos(µt) + i sin(µt)]

Note that the exponential for complex values of r satisfies the usual laws of exponents, and it also satisfies

(d/dt)e^{rt} = re^{rt}

Next, let’s solve some ODEs and IVPs involving complex exponen-tials. But first, a comment: since all our ODEs have real-valued coeffi-cients, we would like our fundamental set of solutions to be real-valuedas well. In the earlier example, we chose

y1 = e(λ+iµ)t = eλt[cos(µt) + i sin(µt)]

y2 = e(λ−iµ)t = eλt[cos(µt)− i sin(µt)]

but notice that we could instead choose the fundamental set of solutions

y1 = e^{λt} cos(µt)

y2 = e^{λt} sin(µt)

(recall the theorem from §3.2 stating that the real and imaginary parts of a solution of a linear homogeneous second-order ODE with real coefficients are also solutions). Comment: technically, we need to evaluate the Wronskian of each of these pairs of solutions to officially verify that they form a fundamental set of solutions, but we'll do that in the examples.

Example: Find the general solution of the ODE

y′′ + 2y′ + 2y = 0

Then solve the IVP with initial conditions

y(π/4) = 2, y′(π/4) = −2

The characteristic equation is r² + 2r + 2 = 0, which using the quadratic formula gives the roots r = −1 ± i. This means two solutions are given by y1 = e^{−t} cos t and y2 = e^{−t} sin t. To check that they form a fundamental set of solutions, let's evaluate the Wronskian:

W(y1, y2)(t) = ∣ e^{−t} cos t                  e^{−t} sin t                 ∣
               ∣ −e^{−t} cos t − e^{−t} sin t   −e^{−t} sin t + e^{−t} cos t ∣
             = e^{−2t}(cos² t + sin² t) = e^{−2t}

Thus W is nonzero, so our general solution is given by

y = c1e^{−t} cos t + c2e^{−t} sin t

Next, let’s solve the IVP. Plugging in:

y(π/4) = c1e−π/4√

2

2+ c2e

−π/4√

2

2= 2

and

y′(t) = −c1e−t(cos t+ sin t) + c2e−t(cos t− sin t)

so that

y′(π/4) = −c1e−π/4√

2 = −2

Solving for c1 and c2, we find that c1 = c2 = √2 e^{π/4}, and so our solution to the IVP is

y(t) = √2 e^{π/4−t}[cos t + sin t]
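To see the damping in this solution, we can compare the formula against a numerical solution; this is only a sketch:

% y'' + 2y' + 2y = 0, y(pi/4) = 2, y'(pi/4) = -2, as a first-order system
f = @(t, Y) [Y(2); -2*Y(2) - 2*Y(1)];     % Y = [y; y']
[t, Y] = ode45(f, [pi/4 6], [2; -2]);
yexact = sqrt(2)*exp(pi/4 - t).*(cos(t) + sin(t));
plot(t, Y(:,1), t, yexact, '--')          % the two damped curves coincide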

Comments on the behavior of the solution: the cosines and sines cause the solution to oscillate with a period of 2π. The e^{−t} term causes damping of the oscillations over time. More generally, ODEs that have characteristic equations with complex roots λ ± iµ can fall into one of three categories:

(1) λ < 0: oscillations decrease in magnitude over time (damping)

(2) λ > 0: oscillations grow larger over time

(3) λ = 0 (roots are purely imaginary): oscillations remain at a fixed magnitude for all time

Thus the real part of the roots of the characteristic equation determines the magnitude of solutions, and in contrast the imaginary part determines the length of the period of oscillation.

In the previous example, we saw that the Wronskian of the sine and cosine solutions was nonzero. In fact, this is true in general: if the roots of our characteristic equation are λ ± iµ then the corresponding solutions are y1 = e^{λt} cos(µt) and y2 = e^{λt} sin(µt). Evaluating the Wronskian:

W(y1, y2)(t) = ∣ e^{λt} cos(µt)                     e^{λt} sin(µt)                    ∣
               ∣ λe^{λt} cos(µt) − µe^{λt} sin(µt)  λe^{λt} sin(µt) + µe^{λt} cos(µt) ∣
             = µe^{2λt}(cos²(µt) + sin²(µt)) = µe^{2λt}

Thus if µ ≠ 0, the Wronskian is nonzero, so y1 and y2 form a fundamental set of solutions (if µ = 0 we're choosing the wrong solutions, since the roots of the characteristic equation are real). It follows that the general solution for a characteristic equation with complex roots λ ± iµ always takes the form

y = c1e^{λt} cos(µt) + c2e^{λt} sin(µt)

Example: Solve the IVP

y′′ + 4y = 0, y(0) = 0, y′(0) = 1

The characteristic equation is r² + 4 = 0, which has roots r = ±2i (note that in this case the roots are purely imaginary). Then our general solution takes the form

y(t) = c1 cos(2t) + c2 sin(2t)

In this case, we have

y′(t) = −2c1 sin(2t) + 2c2 cos(2t)

Plugging in the initial conditions gives c1 = 0 and c2 = 1/2. Thus our solution is

y(t) = (1/2) sin(2t)

(Note that this is one of the fundamental solutions of the ODE y′′ + 4y = 0 as discussed in the last section.) In this case our solution oscillates with a period of π and its amplitude does not change over time.

3.3 Suggested Problems: 3,6,8,18,19,32

§3.4: Repeated Roots; Reduction of Order

So far we have determined methods for solving the equation

ay′′ + by′ + cy = 0

in the case that the characteristic equation has distinct real roots (b² − 4ac > 0) or two complex roots (b² − 4ac < 0). However, if b² − 4ac = 0, then the characteristic equation has a repeated root at r = −b/2a. This gives one solution: y1 = e^{−bt/2a}. However, a second-order ODE should have two independent solutions, so we need to find another one somehow.

Example: If we try to solve the ODE

y′′ − 4y′ + 4y = 0

then the characteristic equation has a repeated root at r = 2. Thus one solution is y1(t) = e^{2t}. Suppose we look for a second solution having the form y(t) = f(t)y1(t) = f(t)e^{2t}. Then

y′ = f′e^{2t} + 2fe^{2t} and y′′ = f′′e^{2t} + 4f′e^{2t} + 4fe^{2t}

Plugging into the differential equation gives

[(f′′ + 4f′ + 4f) − 4(f′ + 2f) + 4f]e^{2t} = f′′e^{2t} = 0

Since e^{2t} ≠ 0 we must have f′′(t) = 0, which means

f(t) = c1t + c2

But if we plug this into y, we find that

y = c1te^{2t} + c2e^{2t}

Thus the c1 term gives us a new solution. Let's check that the two form a fundamental set of solutions:

W = ∣ e^{2t}   te^{2t}           ∣
    ∣ 2e^{2t}  e^{2t} + 2te^{2t} ∣ = e^{4t} ≠ 0

so we have a fundamental set of solutions.

More generally: if the characteristic equation for the ODE

ay′′ + by′ + cy = 0

has a repeated root at r = −b/2a, then we follow the same process: let y(t) = f(t)e^{−bt/2a}, so that

y′ = f′e^{−bt/2a} − (bf/2a)e^{−bt/2a} and y′′ = f′′e^{−bt/2a} − (bf′/a)e^{−bt/2a} + (b²f/4a²)e^{−bt/2a}

Plugging into the ODE:

[(af′′ − bf′ + b²f/4a) + (bf′ − b²f/2a) + cf]e^{−bt/2a} = [af′′ + (c − b²/4a)f]e^{−bt/2a} = 0

However, since b² − 4ac = 0, we have c − b²/4a = 0, so this simplifies to f′′ = 0. Thus again f = c1 + c2t, and our solution is given by

y(t) = c1e^{−bt/2a} + c2te^{−bt/2a}

Again, we can check that they form a fundamental set of solutions by evaluating the Wronskian:

W = ∣ e^{−bt/2a}         te^{−bt/2a}                    ∣
    ∣ −(b/2a)e^{−bt/2a}  e^{−bt/2a} − (bt/2a)e^{−bt/2a} ∣ = e^{−bt/a} ≠ 0

Thus we now know the formula for a general solution to second-order linear homogeneous ODEs when the characteristic equation has a repeated root.

Example: Solve the IVP

9y′′ − 12y′ + 4y = 0, y(0) = 2, y′(0) = −1

The characteristic equation is 9r² − 12r + 4 = 0, which has a repeated root at r = 2/3. Thus we know the general solution takes the form

y(t) = c1e^{2t/3} + c2te^{2t/3}

y′(t) = (2/3)c1e^{2t/3} + c2(1 + 2t/3)e^{2t/3}

Plugging in the initial conditions gives c1 = 2 and c2 = −7/3, so our solution is

y(t) = 2e^{2t/3} − (7/3)te^{2t/3}

Notice that as t → ∞, this solution approaches −∞. However, if we change the initial condition for y′(0) to y′(0) = 3, we find that c2 = 5/3 and our solution instead approaches +∞. Thus the initial slope is very important in determining the long-time behavior of solutions when repeated roots are present.

The process we used to derive the general solution to equations with repeated roots is an example of a more general technique known as reduction of order. If we consider the general second-order linear homogeneous ODE

y′′ + p(t)y′ + q(t)y = 0

and we know that y1 solves the ODE, then we can generate another solution, much as we did before. Specifically, write y = f(t)y1(t); then

y′(t) = f′(t)y1(t) + f(t)y′1(t) and y′′(t) = f′′(t)y1(t) + 2f′(t)y′1(t) + f(t)y′′1(t)

Plugging into the ODE (and suppressing the t dependence of all functions):

(f′′y1 + 2f′y′1 + fy′′1) + p(f′y1 + fy′1) + qfy1 = 0

If we collect terms of the same order in f, we obtain

y1f′′ + (2y′1 + py1)f′ + (y′′1 + py′1 + qy1)f = 0

By assumption the coefficient of f is zero, since y1 solves the equation. Thus we have a first-order equation for g = f′:

y1g′ + (2y′1 + py1)g = 0

Rewriting:

g′/g = −(2y′1 + py1)/y1

ln |g| = −2 ln |y1| − ∫p(t) dt + C

g = C / [e^{∫p(t) dt} |y1(t)|²]

f = ∫ C / [e^{∫p(t) dt} |y1(t)|²] dt

Thus our new solution is given by

y(t) = f(t)y1(t) = y1(t) ∫ C / [e^{∫p(t) dt} |y1(t)|²] dt

In general, this formula is a bit unwieldy; let's demonstrate the process with an example:

Example: Given that the function y1(t) = 1/t solves the equation

t²y′′ + 3ty′ + y = 0, t > 0

find a second independent solution to the ODE. To do this, let y(t) = f(t)y1(t) = f(t)/t. Then

y′(t) = f′(t)/t − f(t)/t² and y′′(t) = f′′(t)/t − 2f′(t)/t² + 2f(t)/t³

Plugging into the ODE:

tf′′(t) − 2f′(t) + 2f(t)/t + 3f′(t) − 3f(t)/t + f(t)/t = 0

Simplifying and solving for f:

f′′(t)/f′(t) = −1/t

ln |f′(t)| = −ln t + C

f′(t) = Ct^{−1}

f(t) = C ln t

y(t) = (C ln t)/t
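We can double-check symbolically that ln t / t really solves the equation; a minimal sketch (assuming the Symbolic Math Toolbox):

% Verify that y = log(t)/t solves t^2 y'' + 3t y' + y = 0
syms t
y = log(t)/t;
simplify(t^2*diff(y, t, 2) + 3*t*diff(y, t) + y)   % returns 0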

3.4 Suggested Problems: 6,14,16,22,24,28

§3.5: Nonhomogeneous Equations; Method of Undetermined Coefficients

In the next two sections, we study two different methods for solving nonhomogeneous second-order linear ODEs. In general, the equations we consider will take the form

y′′ + p(t)y′ + q(t)y = g(t)   (11)

which can also be written as L[y] = g(t) using the terminology of §3.2. We will call

y′′ + p(t)y′ + q(t)y = 0

(or L[y] = 0) the corresponding homogeneous equation (or associated homogeneous equation). To solve the nonhomogeneous equation, we will see that we still need the general solution to the corresponding homogeneous equation, but we must also find a particular solution to the nonhomogeneous problem. To make this more precise, we need the following:

Great Fact: If Y1 and Y2 both solve the nonhomogeneous equationthen their difference Y1 − Y2 is a solution of the corresponding homo-geneous problem.

Why is this true? It follows from the fact that L is a linear operator,so

L[Y1 − Y2] = L[Y1]− L[Y2] = g(t)− g(t) = 0

We can say even more than this though. Namely, we can always find a fundamental set of solutions y1 and y2 to the corresponding homogeneous problem (from the stuff we did in §3.2). In addition, the uniqueness portion of the existence and uniqueness theorem says that since Y1 − Y2 solves the homogeneous problem it must be the case that

Y1 − Y2 = c1y1 + c2y2

as c1y1 + c2y2 is the general solution to the corresponding homogeneous problem, which we will often write as yc. What can we conclude from this? If we want to find the general solution of a nonhomogeneous second-order linear ODE, all we need is the general solution of the associated homogeneous problem yc and one particular solution of the nonhomogeneous equation, which we often write as yp. Then any other solution of the nonhomogeneous equation can be accounted for by simply changing the constants in our general solution of the associated homogeneous problem (this is what we just showed).

This gives the following general procedure for solving nonhomogeneous equations:

(1) Find the general solution of the corresponding homogeneous equation, yc

(2) Find one particular solution of the nonhomogeneous equation, yp

(3) The general solution is then y = yc + yp

We've already investigated step 1 of this process in some detail, and the next two sections give two different methods of completing step 2. While the method in the next section works more generally, the method in this section is more likely to give an explicit answer. The method we will study now is known as the method of undetermined coefficients.

The method of undetermined coefficients depends heavily on the nonhomogeneous term g(t). We examine g(t) and guess a solution to the equation that is similar to g(t), then use this guess to find an exact solution. Let's illustrate with an example:

Example: Find a particular solution of

y′′ − 2y′ − 3y = 3e^{2t}

In this case, we have g(t) = 3e^{2t}, so the solution we guess should look like this. In fact, we will guess y(t) = Ae^{2t}, where A is an undetermined coefficient. By plugging into the ODE, we should be able to find an appropriate value for A that will solve the equation. Computing:

y′(t) = 2Ae^{2t} and y′′(t) = 4Ae^{2t}


Plugging into the ODE:

4Ae^{2t} − 4Ae^{2t} − 3Ae^{2t} = 3e^{2t}

Next, we balance coefficients of e^{2t}, which gives the equation −3A = 3, so the appropriate choice is A = −1. Thus a particular solution to this equation is given by yp(t) = −e^{2t} (you can check by plugging it back in). We also know that the general solution to the associated homogeneous problem is given by

yc(t) = c1e^{3t} + c2e^{−t}

Combining these facts, the general solution to the given nonhomogeneous ODE is

y(t) = c1e^{3t} + c2e^{−t} − e^{2t}
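This is quick to double-check symbolically. A minimal sketch with Python's sympy (my own check, not part of the text):

import sympy as sp

t = sp.symbols('t')
yp = -sp.exp(2*t)  # the particular solution found above

# Plug yp into y'' - 2y' - 3y and compare against the forcing 3e^{2t}
residual = yp.diff(t, 2) - 2*yp.diff(t) - 3*yp - 3*sp.exp(2*t)
print(sp.simplify(residual))  # prints 0, so yp solves the equation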

This example shows that when g is an exponential, the method of undetermined coefficients does well. Another time we can use this method is when g involves sines and cosines. Let's see an example of this:

Example: Find a particular solution of the ODE

y′′ + 2y′ + 5y = 3 sin 2t

Again, we want to guess a solution; this time we expect it should involve A sin 2t. However, we must make a change: the y′ term in the ODE will yield a cos term, which shouldn't be in the final answer. Thus we need some way of canceling this term. Instead of just guessing a sine term, we guess y(t) = A sin 2t + B cos 2t. Computing derivatives gives

y′(t) = 2A cos 2t− 2B sin 2t and y′′(t) = −4A sin 2t− 4B cos 2t

Plugging into the ODE:

−4A sin 2t−4B cos 2t+4A cos 2t−4B sin 2t+5A sin 2t+5B cos 2t = 3 sin 2t

In this case, we have two undetermined coefficients, but we now also have two equations that must balance: namely, the coefficient of sin 2t and the coefficient of cos 2t. Thus we obtain

A − 4B = 3

4A + B = 0

The second equation gives B = −4A; plugging this into the first equation gives A = 3/17, so B = −12/17. Thus the particular solution to the nonhomogeneous ODE is

yp(t) = (3/17) sin 2t − (12/17) cos 2t


For your health: solve the associated homogeneous problem to form the general nonhomogeneous solution.

Polynomials also work well for undetermined coefficients problems.

Example: Find a particular solution to the nonhomogeneous ODE

y′′ − y′ − 2y = −2t + 4t^2

In this case, the function we should guess is y(t) = At^2 + Bt + C (even though the right-hand side doesn't contain any constant terms, they will come up when we take derivatives of terms involving t). Taking two derivatives:

y′(t) = 2At + B and y′′(t) = 2A

Plugging these into the ODE:

2A − 2At − B − 2At^2 − 2Bt − 2C = −2t + 4t^2

Note that by balancing coefficients of t^2, t, and 1 we again obtain the same number of equations as undetermined coefficients we guessed:

−2A = 4

−2A− 2B = −2

2A−B − 2C = 0

The first equation gives A = −2, and plugging this into the second equation gives B = 3. Plugging both of these into the third equation gives C = −7/2. Thus our particular solution is

yp(t) = −2t^2 + 3t − 7/2

Again, this can be combined with the general solution to the associated homogeneous problem to give the general solution of the nonhomogeneous problem:

y(t) = c1e^{2t} + c2e^{−t} − 2t^2 + 3t − 7/2

Using the previous three examples, we can solve a lot of nonhomogeneous ODEs formed by products and sums of exponentials, sines and cosines, and polynomials. For example, to find a particular solution of

y′′ − y′ − 2y = e^t − 2t + 4t^2


we can simply find a particular solution to

y′′ − y′ − 2y = e^t

by guessing an exponential, then add this answer to the particular solution we got for

y′′ − y′ − 2y = −2t + 4t^2

in the previous example. It is often much more convenient to break up an undetermined coefficients problem in this manner when possible.

We can also deal with products of the aforementioned functions. For example,

If g(t) = 2t cos t, guess y(t) = (At + B) cos t + (Ct + D) sin t

If g(t) = t^2e^{3t} cos 8t, guess y(t) = (At^2 + Bt + C)e^{3t} cos 8t + (Dt^2 + Et + F)e^{3t} sin 8t

In these cases, the algebra is more tedious but the process is the same. However, there is one important issue that can arise when using the method of undetermined coefficients. Let's see with an example:

Example: Find a particular solution of

y′′ − 2y′ − 3y = 12e^{3t}

Based on what we have discussed so far, an appropriate guess would be y(t) = Ae^{3t}. Let's see what happens:

y′(t) = 3Ae^{3t} and y′′(t) = 9Ae^{3t}

Plugging into the ODE:

9Ae^{3t} − 6Ae^{3t} − 3Ae^{3t} = 12e^{3t}

which leaves the equation 0 · A = 12. What went wrong? Notice that the general solution of the corresponding homogeneous problem is given by

yc(t) = c1e^{3t} + c2e^{−2t}

Thus the solution we guessed already solves the homogeneous problem, so it's impossible for it to solve the nonhomogeneous equation. How can we fix this? The solution is to multiply our guess by t. This is indicative of a more general procedure: anytime your guess contains a term that solves the homogeneous equation, you need to multiply your usual guess by t (and possibly even t^2, but always try multiplying by t first). For example, if g(t) = 12te^{3t}, then the usual guess would be y(t) = (At + B)e^{3t}, but because e^{3t} solves the homogeneous problem you should instead guess y(t) = (At^2 + Bt)e^{3t}. Back to the example at hand: we will now guess y(t) = Ate^{3t}. In this case,

y′(t) = Ae^{3t} + 3Ate^{3t} and y′′(t) = 6Ae^{3t} + 9Ate^{3t}

Plugging into the ODE:

6Ae^{3t} + 9Ate^{3t} − 2Ae^{3t} − 6Ate^{3t} − 3Ate^{3t} = 12e^{3t}

Notice that the terms involving te^{3t} all cancel (in essence because we've pushed the solution of the homogeneous problem up by a power of t), and we are left with the equation 4A = 12, so we have A = 3, and our general solution is

y(t) = c1e^{3t} + c2e^{−2t} + 3te^{3t}
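If sympy is available, dsolve reproduces this general solution, including the extra factor of t forced on the particular solution (a sketch; sympy's choice of constant names and grouping may differ):

import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')

# Forcing 12e^{3t} overlaps the homogeneous solution e^{3t},
# so the particular solution picks up a factor of t.
eq = sp.Eq(y(t).diff(t, 2) - 2*y(t).diff(t) - 3*y(t), 12*sp.exp(3*t))
print(sp.dsolve(eq, y(t)))
# equivalent to C1*exp(-t) + C2*exp(3*t) + 3*t*exp(3*t)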

Summary: We use the method of undetermined coefficients for functions involving sines, cosines, exponentials, and polynomials. To apply the method, the general procedure is as follows:

Suppose we want to solve

ay′′ + by′ + cy = g(t)

where g(t) involves sines, cosines, exponentials, and polynomials.

(1) Solve the associated homogeneous problem

ay′′ + by′ + cy = 0

(2) If g = g1 + g2 + ..., break up the problem by separately finding particular solutions to

ay′′ + by′ + cy = g1(t), ay′′ + by′ + cy = g2(t), ...

for each term in the sum.

(3) To find the particular solution yi corresponding to each gi, make sure that no part of gi is a solution of the associated homogeneous problem.

(a) If no part of gi solves the homogeneous problem, and gi takes the form

gi(t) = Pn(t)e^{αt} cos(βt)

where Pn(t) is a polynomial of degree n (and cos can also be replaced by sin), then your guess should be

yi(t) = (Ant^n + ... + A1t + A0)e^{αt} cos(βt) + (Bnt^n + ... + B1t + B0)e^{αt} sin(βt)


where the Ai and Bi are your undetermined coefficients (if gi is simpler than this, just remove the terms that it is missing from your guess).

(b) If some part of gi does solve the homogeneous problem, then multiply the above guess by t (you only have to multiply by t^2 when the characteristic equation has a repeated root).

(4) Once you find each yi, add them all together, along with the general solution to the associated homogeneous equation (which you found in step 1), to obtain the general solution to the nonhomogeneous problem.

3.5 Suggested Problems: 6,7,9,17,23a),27a),30

§3.6: Variation of Parameters

In this section, we learn a more general method for solving nonhomogeneous second-order linear ODEs, assuming that we know the solution to the corresponding homogeneous problem. Suppose we are trying to solve

y′′ + p(t)y′ + q(t)y = g(t)

for given p, q, and g. Then the corresponding homogeneous equation is given by

y′′ + p(t)y′ + q(t)y = 0

Note that we may not know how to solve this: so far we have only dealt with solving constant-coefficient equations. However, this method works for the larger class of linear equations. Suppose the general solution to the homogeneous equation is given by

yc(t) = c1y1(t) + c2y2(t)

In the last section, we solved the nonhomogeneous equation by adding a guessed term to the homogeneous solution; in this section we will instead multiply y1 and y2 by appropriate functions to solve the problem. Specifically, let's look for a solution to the nonhomogeneous equation taking the form

y(t) = u1(t)y1(t) + u2(t)y2(t)

where y1 and y2 are the fundamental set of solutions to the homogeneous problem. We would like to find out what u1 and u2 must be to solve the nonhomogeneous equation. To do this, we must substitute our guess into the nonhomogeneous equation. In this case, we obtain one equation for two unknowns, u1 and u2, which is an underdetermined system. Thus we can impose an additional condition on u1 and u2, chosen to simplify our calculations as much as possible. In this case, the condition we would like to impose is that

u′1y1 + u′2y2 = 0

The reason for doing this: to plug our guess into the equation, we must take two derivatives of u1y1 + u2y2. By assuming this, we simplify our calculations considerably. Using this assumption, let's compute:

y′(t) = u1′y1 + u1y1′ + u2′y2 + u2y2′ = u1y1′ + u2y2′

y′′(t) = u1′y1′ + u1y1′′ + u2′y2′ + u2y2′′

Plugging both of these into the ODE yields

u1′y1′ + u1y1′′ + u2′y2′ + u2y2′′ + p[u1y1′ + u2y2′] + q[u1y1 + u2y2] = g

If we collect like terms of u1 and u2, we obtain

u1[y1′′ + py1′ + qy1] + u2[y2′′ + py2′ + qy2] + u1′y1′ + u2′y2′ = g

Since y1 and y2 both solve the homogeneous problem, the first two terms are both zero. Thus we now have the system of two equations for u1′ and u2′:

u1′y1′ + u2′y2′ = g

u1′y1 + u2′y2 = 0

Solving this system for u′1 and u′2 gives

u1′ = −gy2/(y1y2′ − y1′y2) = −gy2/W(y1, y2)

u2′ = gy1/(y1y2′ − y1′y2) = gy1/W(y1, y2)

By integrating these equations, we obtain the functions u1 and u2:

u1(t) = −∫ y2(t)g(t)/W(y1, y2)(t) dt + c1, u2(t) = ∫ y1(t)g(t)/W(y1, y2)(t) dt + c2

This means that a particular solution yp(t) is given by

yp(t) = −y1(t)[∫ y2(t)g(t)/W(y1, y2)(t) dt + c1] + y2(t)[∫ y1(t)g(t)/W(y1, y2)(t) dt + c2]

Notice that when we add yp(t) to the general solution we can assume that c1 and c2 above are zero, since they only add terms of the form c1y1 + c2y2, which can be absorbed into the constants for the general solution to the corresponding homogeneous problem. Thus the general nonhomogeneous solution is

y(t) = c1y1(t) + c2y2(t) − y1(t) ∫ y2(t)g(t)/W(y1, y2)(t) dt + y2(t) ∫ y1(t)g(t)/W(y1, y2)(t) dt
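This formula translates directly into a few lines of sympy. The helper below is my own sketch of the boxed formula (with the constants c1, c2 dropped, as just discussed); applying it to the example worked next reproduces the answer found there:

import sympy as sp

def particular_solution(y1, y2, g, t):
    # Variation of parameters: yp = -y1*Int(y2*g/W) + y2*Int(y1*g/W)
    W = sp.simplify(y1*sp.diff(y2, t) - sp.diff(y1, t)*y2)  # Wronskian W(y1, y2)
    return sp.simplify(-y1*sp.integrate(y2*g/W, t) + y2*sp.integrate(y1*g/W, t))

t = sp.symbols('t')
yp = particular_solution(sp.exp(t), t*sp.exp(t), sp.exp(t)/(1 + t**2), t)
print(yp)  # -exp(t)*log(t**2 + 1)/2 + t*exp(t)*atan(t)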

Example: Find the general solution of the equation

y′′ − 2y′ + y = e^t/(1 + t^2)

First, we find the general solution to the corresponding homogeneous problem

y′′ − 2y′ + y = 0

This is given by

yc(t) = c1e^t + c2te^t

To find the nonhomogeneous solution, we can simply plug into the formula

yp(t) = −y1(t) ∫ g(t)y2(t)/W(y1, y2)(t) dt + y2(t) ∫ g(t)y1(t)/W(y1, y2)(t) dt

with y1 = e^t, y2 = te^t, and g(t) = e^t/(1 + t^2).

W(y1, y2)(t) = e^t · (t + 1)e^t − e^t · te^t = e^{2t}

yp(t) = −e^t ∫ t/(1 + t^2) dt + te^t ∫ 1/(1 + t^2) dt

Using u-substitution to evaluate the first integral (u = 1 + t^2, du = 2t dt):

yp(t) = −e^t ∫ 1/(2u) du + te^t arctan(t)

= −(1/2)e^t ln(1 + t^2) + te^t arctan(t)

This means that the general solution to the nonhomogeneous equation is given by

y(t) = c1e^t + c2te^t − (1/2)e^t ln(1 + t^2) + te^t arctan(t)

Example: Find the general solution of the nonhomogeneous equation

y′′ + 4y = 3 csc 2t, 0 < t < π/2


First, the solution to the corresponding homogeneous equation:

yc(t) = c1 cos 2t+ c2 sin 2t

This means that

W(y1, y2)(t) = cos 2t · 2 cos 2t − (−2 sin 2t) · sin 2t = 2

and so the particular solution is given by

yp(t) = −cos 2t ∫ (3/2) sin 2t csc 2t dt + sin 2t ∫ (3/2) cos 2t csc 2t dt

= −(3/2)t cos 2t + sin 2t ∫ (3/4)(1/u) du

= −(3/2)t cos 2t + (3/4) sin 2t ln(sin 2t)

where we have used u-substitution to evaluate the second integral with u = sin 2t, du = 2 cos 2t dt. Thus the general solution to the nonhomogeneous equation is given by

y(t) = c1 cos 2t + c2 sin 2t − (3/2)t cos 2t + (3/4) sin 2t ln(sin 2t)

3.6 Suggested Problems: 2,5,9,15,17

§3.7: Mechanical and Electrical Vibrations

In this section, we investigate some applications of second-order constant coefficient ODEs to physical systems. Specifically, we will be modeling the behavior of these systems with an IVP of the form

ay′′ + by′ + cy = g(t), y(0) = y0, y′(0) = y0′

We will see that multiple physical systems fit within the framework of this IVP, so by being able to solve one class of IVP we can in fact describe a wide array of different physical problems. First, we will investigate the behavior of a mass on a spring:

Mass on a Spring: Suppose we have a spring of length l hanging vertically, and we suspend an object of mass m from the bottom of the spring. The mass elongates the spring by some length, call it L (draw a picture). If we orient our coordinates so that the downward direction is positive and attempt to calculate the forces acting on the spring, we find two opposing forces. First, gravity acts on the object with a force of mg. In addition, the spring exerts an upward force Fs on the object to restore the mass to an equilibrium position. By Hooke's Law, this force is proportional to the displacement of the spring from its natural position without the mass (this is valid for small L, and comes about as a consequence of linearization). In math speak,

Fs = −kL

for some k > 0 (known as the spring constant), since the force pulls in the negative direction. If the mass is in equilibrium, these forces will balance, which gives

mg − kL = 0, or k = mg/L

Moving away from the equilibrium state, suppose we displace the mass by some distance. Then we can write

u(t) = Displacement of the spring from its equilibrium state at time t

where u > 0 when the spring is elongated. Using Newton's second law, we have

F = ma = mu′′(t)

What forces are acting on the object? The first two are the same as when the system is in equilibrium:

(1) The weight of the object pulls downward with a constant force of w = mg

(2) The spring force Fs is proportional to the elongation of the spring L + u (with proportionality constant k), and it pulls the spring towards its natural position. As a result,

Fs = −k(L+ u)

(3) There is always a force pulling in the opposite direction of the current direction the spring is moving, and this force is due to damping. This can be due to friction between the spring and other portions of the apparatus, resistance due to the medium in which the spring is moving, etc. We assume that this force is proportional to |u′(t)| and we write it as

Fd(t) = −γu′(t)

where γ is a positive constant. The assumption that the damping force is proportional to the velocity is not necessarily a valid one in all circumstances, but it is important in that it allows us to try to solve a linear ODE rather than a nonlinear one.

(4) Possible applied external forces, either due to the motion of the mount to which the spring is attached, or a force applied directly to the mass. Often this force is periodic and is denoted by F(t).

Combining all the above forces and using Newton’s second law gives

mu′′(t) = mg + Fs(t) + Fd(t) + F (t)

= mg − k[L+ u(t)]− γu′(t) + F (t)

Using that mg = kL, we can rewrite the above as

mu′′(t) + γu′(t) + ku(t) = F (t)

In addition, to formulate the problem completely, we need two initial conditions, which will be

u(0) = u0, u′(0) = v0

Then by our existence and uniqueness theorem, we can always solve the IVP (i.e. we can always determine the position and velocity of the mass at a later time). Let's see how to set up one of these problems:

Example 1: A mass weighing 10 lb stretches a spring 3 in. Suppose we pull the mass downwards by a foot and then release it. In addition, the spring is in a medium exerting a viscous resistance of 8 pounds when the mass has a velocity of 2 ft/s. Let's set up the initial value problem to describe the behavior of the mass. First, we should find m, γ, and k.

m = w/g = (10 lb)/(32 ft/s^2) = 5/16 lb·s^2/ft

γ = (8 lb)/(2 ft/s) = 4 lb·s/ft

k = (10 lb)/(1/4 ft) = 40 lb/ft

In addition, we have the initial conditions

u(0) = 1, u′(0) = 0

which supplement the ODE

(5/16)u′′ + 4u′ + 40u = 0

Note that there are no external forces present, so F (t) = 0.

Let's make some simplifying assumptions to study the behavior of the system. Continuing to assume there are no external forces, let's additionally assume that the damping in the system is negligible, in which case we can rewrite our ODE as

mu′′ + ku = 0

This has characteristic equation mr^2 + k = 0, with roots r = ±i√(k/m). It follows that the general solution of this ODE is given by

u(t) = A cos ω0t + B sin ω0t

where

ω0^2 = k/m

Recall the great fact

cos(θ − φ) = cos θ cosφ+ sin θ sinφ

We can use this to rewrite our solution u as

u = R cos(ω0t− δ) = R cos δ cosω0t+R sin δ sinω0t

Using the second equality, we can see that

R cos δ = A, R sin δ = B

which can be solved for R and δ for any A and B:

R = √(A^2 + B^2), tan δ = B/A

(where we must make sure to choose the proper branch of tangent based on the signs of sin δ and cos δ). We can plot the solution, and we will get a cosine wave beginning at R cos δ, with an amplitude (maximum displacement) of R. In addition, the period of the solution is given by

T = 2π/ω0 = 2π√(m/k)

The constant δ is known as the phase, and measures the displacement of the wave from a standard cosine wave. Finally, ω0 = √(k/m) (which has units of radians/time) is often referred to as the natural frequency of the vibration. Let's see an example:

Example 2: Suppose a mass weighing 20 lb stretches a spring by 6 in. In addition, we displace the mass an additional 6 in and set it into motion with an initial upward velocity of 1 ft/s. Find the position of the mass at any later time, and find the period, amplitude, and phase of the motion.

In this case, the mass of the object is m = 20 lb/(32 ft/s^2) and the spring constant is k = 20 lb/0.5 ft = 40 lb/ft. Combining this with the initial conditions, we obtain the IVP


u′′ + 64u = 0, u(0) = 0.5, u′(0) = −1

The general solution is given by

u(t) = A cos 8t+B sin 8t

and plugging in the initial conditions yields

A = 0.5

8B = −1

Thus the position at time t is given by

u(t) = (1/2) cos 8t − (1/8) sin 8t

Figure 20. Example 2 solution plot (no damping)

We can also rewrite this solution as

u(t) = R cos(8t− δ)

where

R = √(A^2 + B^2) = √(17/64), tan δ = −1/4


Since A > 0 and B < 0, the appropriate choice for δ is in the fourth quadrant, and δ ≈ −0.24 radians. The amplitude is given by R, the phase is given by δ, and the period is

T = 2π/ω0 = 2π/8 = π/4
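The branch bookkeeping for δ is exactly what the two-argument arctangent does, so a numerical check of Example 2 is immediate (a sketch with numpy; atan2 is not mentioned in the text, but it encodes the sign rules above):

import numpy as np

A, B = 0.5, -1/8          # coefficients from Example 2
R = np.hypot(A, B)        # R = sqrt(A^2 + B^2) = sqrt(17/64)
delta = np.arctan2(B, A)  # picks the branch from the signs of A and B

print(R, delta)  # approximately 0.5154 and -0.2450 (fourth quadrant)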

In this case, the oscillations do not decay over time (since there is no damping). Let's see what happens if we include damping. In this case, our general equation becomes

mu′′ + γu′ + ku = 0

with corresponding characteristic equation mr^2 + γr + k = 0. The roots are given by

r = (−γ ± √(γ^2 − 4km))/(2m)

As in previous sections, we can consider three cases depending on γ^2 − 4km; in any case we can see that our solution u converges to zero as t → ∞ (this occurs because γ, m, and k are all positive). However, we are most interested in the case where oscillation is occurring, i.e. when γ^2 − 4km < 0. In this case, the general solution is given by

u(t) = e^{−γt/2m}(A cos µt + B sin µt), µ = √(4km − γ^2)/(2m)

In this case, we can again rewrite our solution as

u(t) = Re^{−γt/2m} cos(µt − δ)

In this case, our solution is bounded by

|u(t)| ≤ Re^{−γt/2m}

The motion is not technically periodic, but we still often call µ the quasi frequency. Let's compare µ to the frequency in the case where there is no damping:

µ/ω0 = (√(4km − γ^2)/2m)/√(k/m) = √(1 − γ^2/(4km))

Now, one can check explicitly that the first two terms of the Taylor series for f(x) = √(1 − x) centered at x = 0 are given by

√(1 − x) = 1 − x/2 + ...


This is an accurate approximation if x is small. Applying this to µ/ω0, we conclude that if γ^2/(4km) is small, we have

µ/ω0 ≈ 1 − γ^2/(8km)

In other words, small damping slightly reduces the frequency of the oscillation. Based on the inverse relationship between the two, we should expect that it slightly increases the period. Let's check this explicitly: if Td = 2π/µ is the quasi period, then

Td/T = ω0/µ = (1 − γ^2/(4km))^{−1/2} ≈ 1 + γ^2/(8km)

where this time we use the first two terms of the Taylor series for f(x) = (1 − x)^{−1/2} centered at x = 0.

Let's focus more on the ratio γ^2/(4km). Since γ has units of lb·s/ft, k has units of lb/ft, and m has units of lb·s^2/ft, we see that this ratio is in fact dimensionless. In addition, by rewriting the roots of the characteristic equation as

r = (γ/2m)(−1 ± √(1 − 4km/γ^2))

we see that the three different types of solution can be categorized by whether the ratio γ^2/(4km) is less than, greater than, or equal to one. In addition, by our formula for the quasi frequency µ, we see that as γ → 2√(km), µ → 0 and Td → ∞, meaning that in this limit we lose oscillation and the general behavior of the solution changes (transitioning to an exponential with a negative real power). At γ = 2√(km) we say the solution is critically damped, and for γ > 2√(km) the system is overdamped, as the mass will only pass through the equilibrium position at most one time.

Example 3: A spring is stretched 10 cm by a force of 3 N. A mass of 2 kg is hung from the spring and is also attached to a viscous damper that exerts a force of 3 N when the velocity of the mass is 5 m/s. If the mass is pulled down 5 cm below its equilibrium position and given an initial downward velocity of 10 cm/s, determine its position u(t). Find the quasi frequency µ and the ratio of µ to the natural frequency of the corresponding undamped motion.

In this case: m = 2 kg, γ = (3 N)/(5 m/s) = 3/5 N·s/m, and k = (3 N)/(0.1 m) = 30 N/m, so our IVP is

2u′′ + (3/5)u′ + 30u = 0, u(0) = 0.05, u′(0) = 0.1

The characteristic equation 2r^2 + (3/5)r + 30 = 0 has roots

r = (−3/5 ± √(9/25 − 240))/4 = −3/20 ± (√5991/20)i

so our general solution is

u(t) = e^{−3t/20}[A cos((√5991/20)t) + B sin((√5991/20)t)]

Omitting tedious algebra, we solve for A and B using the initial conditions to find

u(t) = e^{−3t/20}[0.05 cos((√5991/20)t) + 0.0277772 sin((√5991/20)t)]

We can also write this in the form Re^{−3t/20} cos(µt − δ) with R and δ as before, in which case we obtain

u(t) = 0.0571977e^{−3t/20} cos((√5991/20)t − 0.50709)

√599120≈ 3.87007. The frequency

with no damping is ω0 =√k/m ≈ 3.87298. Thus the ratio between

the two isµ

ω0

=3.87007

3.87298≈ 0.99925


Figure 21. Example 3 plot (damping) with exponentialdecay envelope

3.7 Suggested Problems: 3,6,13,15,17

§3.8: Forced Vibrations

In this section, we consider systems in which a periodic external force is applied to a mass-spring system, using the techniques of §3.5. First, we consider an example where damping is present:

Example 1: Suppose we have a mass on a spring whose behavior is governed by the IVP

u′′ + u′ + 1.25u = 3 cos t, u(0) = 2, u′(0) = 3

In this case, the damping is given by γ = 1 (in whatever the appropriate units are for the problem) and F(t) = 3 cos t is the external force. The corresponding homogeneous equation is given by

u′′ + u′ + 1.25u = 0

with characteristic equation r^2 + r + 1.25 = 0, which has roots at r = −1/2 ± i. Thus the general solution of the homogeneous equation is given by

uc(t) = e^{−t/2}[c1 cos t + c2 sin t]


We find a particular solution to the nonhomogeneous equation using the method of undetermined coefficients. More precisely, we guess u(t) = A cos t + B sin t. Computing:

u′(t) = −A sin t + B cos t and u′′(t) = −A cos t − B sin t

Plugging into the ODE:

−A cos t − B sin t − A sin t + B cos t + 1.25A cos t + 1.25B sin t = 3 cos t

Balancing terms:

0.25A+B = 3

−A+ 0.25B = 0

The second equation gives A = 0.25B; plugging this into the first gives B = 48/17, and so A = 12/17. Thus a particular solution is

up(t) = (12/17) cos t + (48/17) sin t

Combining the two, the general solution of the nonhomogeneous problem is

u(t) = e^{−t/2}[c1 cos t + c2 sin t] + (12/17) cos t + (48/17) sin t

To find c1 and c2 we plug in the initial conditions. First,

u′(t) = e^{−t/2}[−(1/2)(c1 cos t + c2 sin t) + (c2 cos t − c1 sin t)] − (12/17) sin t + (48/17) cos t

Plugging in:

c1 + 12/17 = 2

−(1/2)c1 + c2 + 48/17 = 3

The solution to this system is given by c1 = 22/17, c2 = 14/17, so that the solution of the IVP is

u(t) = e^{−t/2}[(22/17) cos t + (14/17) sin t] + (12/17) cos t + (48/17) sin t

Comments: We can make a distinction between the solution of the corresponding homogeneous problem and the particular solution of the nonhomogeneous problem. Note that the homogeneous solution decays exponentially, so its influence only lasts for a short time. Because of this, we often call the solution uc(t) the transient solution. As the transient solution fades away, the system approaches its steady state (also called the forced response), which is given by the particular solution of the nonhomogeneous equation. Notice that the transient solution is the portion of the solution that allows us to satisfy the initial conditions. In this case, the transient solution dissipates, so the effect of the initial conditions does not persist over long time scales.

Figure 22. Example 1 plot (damping): blue is steadystate solution, red is transient solution

More generally, we can consider the equation

mu′′(t) + γu′(t) + ku = F0 cosωt

In this case, we can find the general solution to the associated homogeneous equation (much as in the last section); suppose the fundamental set of solutions is u1, u2. Then the general solution of the nonhomogeneous problem looks like

u(t) = c1u1(t) + c2u2(t) + A cos ωt + B sin ωt = uc(t) + up(t)

As in the last section, we can rewrite up(t) as

up(t) = R cos(ωt − δ)

and in this case we can find R and δ using the equations

R = F0/Δ, cos δ = m(ω0^2 − ω^2)/Δ, sin δ = γω/Δ

where

Δ = √(m^2(ω0^2 − ω^2)^2 + γ^2ω^2)


and ω0 = √(k/m) is the natural frequency of the unforced system in the absence of damping. Suppose we are interested in the behavior of the amplitude of our solution: specifically, how does it behave when we have low-frequency forcing? It turns out that as ω → 0 we have R → F0/k, which represents the static displacement the spring experiences under a force F0. For very high frequency excitations (ω → ∞) we have R → 0. For intermediate values of ω, the maximum amplitude occurs at

Rmax ≈ (F0/(γω0))(1 + γ^2/(8mk)), ωmax^2 = ω0^2(1 − γ^2/(2mk))

However, if γ^2/mk > 2, then R is a monotone decreasing function of ω (note that this change occurs at half the critical damping value γ^2/mk = 4).

Notice that if γ is small, then

Rmax ≈ F0/(γω0)

Put another way, when γ is very small, forcing at a frequency ω very near ω0 can produce a very large value of Rmax even for a small applied force. Thus forcing near the resonant frequency can lead to wild oscillations in many cases, and this phenomenon is known as resonance.
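A quick numerical sketch confirms the peak location formula (the parameter values are mine, chosen only to illustrate light damping; nothing here comes from the text):

import numpy as np

def R(omega, m, gamma, k, F0):
    # Steady-state amplitude R = F0/Delta from the formulas above
    Delta = np.sqrt(m**2*(k/m - omega**2)**2 + gamma**2*omega**2)
    return F0 / Delta

m, gamma, k, F0 = 1.0, 0.125, 1.0, 1.0
w = np.linspace(0.01, 2.0, 200001)
w_peak = w[np.argmax(R(w, m, gamma, k, F0))]

print(w_peak)                                 # numerically located maximum
print(np.sqrt((k/m)*(1 - gamma**2/(2*m*k))))  # omega_max from the formula, ~0.99608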

In addition, we could consider the way that the phase δ depends on ω. We can see that when ω is small, δ is close to 0, so the solution is close to being in phase with the forcing function. When ω = ω0, the solution lags behind the forcing function by π/2, and for ω very large, the solution becomes completely out of phase with the forcing function, differing by π.

Next, we consider forced vibrations when there is no damping present. In this case, our equation takes the form

mu′′ + ku = F0 cosωt

The form of our general solution depends on whether ω = ω0 (i.e. whether the forcing frequency is the same as the natural frequency). If they are different, then by guessing the solution u = A cos ωt we obtain (−ω^2m + k)A = F0, which can be rewritten as A = F0/(m(ω0^2 − ω^2)). This means the general solution is

u(t) = c1 cos ω0t + c2 sin ω0t + (F0/(m(ω0^2 − ω^2))) cos ωt


Suppose that the mass is initially at rest, so that u(0) = u′(0) = 0. Why do this? This way, all the energy in the system is coming from the external driving force. When this happens, we find that c2 = 0 and c1 = −F0/(m(ω0^2 − ω^2)). This means our solution is

u(t) = (F0/(m(ω0^2 − ω^2)))(cos ωt − cos ω0t)

Let’s use trig identities to rewrite this. Since

cos(A+B) = cosA cosB−sinA sinB, cos(A−B) = cosA cosB+sinA sinB

we have that

cos(A−B)− cos(A+B) = 2 sinA sinB

Applying this to A = (1/2)(ω0 + ω)t and B = (1/2)(ω0 − ω)t gives

u(t) = (2F0/(m(ω0^2 − ω^2))) sin((ω0 − ω)t/2) sin((ω0 + ω)t/2)

Specifically, we are interested in what happens when the natural frequency is near the forcing frequency. In this case, |ω0 − ω| is very small compared to |ω0 + ω|, so the first sine term in the above solution varies much more slowly than the second sine term. Thus we can describe the motion of our solution by two quantities: the “fast” frequency (ω0 + ω)/2 and the slowly varying amplitude, given by

(2F0/(m(ω0^2 − ω^2))) |sin((ω0 − ω)t/2)|

Oftentimes such motion is described as a pulse or a beat, and we will see why when studying the next problem.

see why when studying the next problem.

Example 2: Solve the IVP

u′′ + u = 0.5 cos 0.8t, u(0) = 0, u′(0) = 0

We can plug in the more general solution we just derived: here F0 = 0.5, ω = 0.8, ω0 = 1, and m = 1, so our solution is given by

u(t) = (1/0.36) sin 0.1t sin 0.9t

Plotting the solution:


Figure 23. Example 2 Plot: Forcing below resonantfrequency (no damping)

If we were to increase ω from 0.8 closer and closer to the resonant frequency, then the amplitude and rate of (fast) oscillation would increase, and each pulse would spread out. But what happens when ω = ω0?

Example 3: Solve the IVP

u′′ + u = 0.5 cos t, u(0) = 0, u′(0) = 0

This is the same ODE as the previous example, but now the forcing term is at the resonant frequency. We will see that this leads to drastically different behavior. The homogeneous problem has general solution

uc(t) = c1 sin t + c2 cos t

and the particular solution to the nonhomogeneous problem takes the form up(t) = At cos t + Bt sin t, so that

u′p(t) = −At sin t+ A cos t+Bt cos t+B sin t

u′′p(t) = −At cos t− 2A sin t+ 2B cos t−Bt sin t

Plugging into the ODE:

−At cos t− 2A sin t+ 2B cos t−Bt sin t+At cos t+Bt sin t = 0.5 cos t


Balancing coefficients, we see that A = 0 and B = 1/4, so the general solution of the nonhomogeneous problem is

u(t) = c1 sin t + c2 cos t + (1/4)t sin t

Plugging in the initial conditions gives c2 = 0 and c1 = 0, so that

u(t) = (1/4)t sin t

Let’s plot the solution:

Figure 24. Example 3 Plot: Forcing at resonant fre-quency (no damping)

We see that a spring described by this model will experience oscillations with amplitudes growing linearly in time. This occurs because the forcing term is at the resonant frequency of the spring. In addition, this means that such a model cannot be valid over very long time scales, as the oscillations become arbitrarily large. One method for fixing this problem involves the introduction of a second independent time variable to describe the behavior of the solution over long times. This is a technique in the field of asymptotic analysis known as the method of multiple scales, and allows us to construct a solution to the ODE that will remain bounded.

3.8 Suggested Problems: 2,6,10,15


§6.1: Definition of the Laplace Transform

Oftentimes in practice, the differential equations we would like to solve involve functions with discontinuities (or in extreme cases objects that are not even well-defined as functions). For example, suppose we send an electrical impulse of magnitude 1 on the time interval 0 ≤ t ≤ 1 and then shut it off instantaneously. In this case, the impulse can be represented using the step function

f(t) = { 1,  0 ≤ t ≤ 1
         0,  t > 1

which is clearly discontinuous at t = 1. In this case, none of the methods from Chapter 3 would apply to solve ODEs involving this function, as we assumed from the start that all functions were continuous. As another example, in physics the delta function is defined by

δ(x) = { ∞,  x = 0
         0,  x ≠ 0

This is often used to represent a point source at the origin. Suppose we want to solve −u′(x) = δ(x): then none of the methods we have learned so far will be sufficient, but this is still an important problem to understand from the standpoint of physics (we are saying that we have a point source potential and would like to find the conservative force generating it).

Enter the Laplace transform: an integral transform that we can apply to piecewise continuous functions that will help us in solving ODEs when the constituent functions are not necessarily continuous. Piecewise continuous means that a function f is continuous except at finitely many points, and the only types of discontinuity f has are jump discontinuities (draw a picture). Before discussing the Laplace transform and other integral transforms at length, we need a review of improper integrals.

An improper integral over an unbounded region is defined as a limit of integrals over finite regions. Explicitly,

∫_a^∞ f(t) dt := lim_{A→∞} ∫_a^A f(t) dt

provided that the limit on the right-hand side exists. If the limit exists, then we say the improper integral converges, and if not we say it diverges.

Example: Check if the integrals

∫_0^∞ e^{rt} dt, ∫_0^∞ 1 dt, ∫_1^∞ (1/t^p) dt

converge or diverge (here r ≠ 0). If they converge, what do they converge to? First,

lim_{A→∞} ∫_0^A e^{rt} dt = lim_{A→∞} (1/r)e^{rt}|_0^A = lim_{A→∞} (1/r)(e^{rA} − 1)

Based on this, we see that the improper integral exists provided r < 0, and equals −1/r. Otherwise, it diverges. In the second case,

lim_{A→∞} ∫_0^A 1 dt = lim_{A→∞} t|_0^A = lim_{A→∞} A

so the improper integral diverges. For the final improper integral: if p = 1 we have

lim_{A→∞} ∫_1^A (1/t) dt = lim_{A→∞} ln t|_1^A = lim_{A→∞} (ln A − ln 1)

and this limit does not exist. If p ≠ 1,

lim_{A→∞} ∫_1^A (1/t^p) dt = lim_{A→∞} (1/((1 − p)t^{p−1}))|_1^A = lim_{A→∞} (1/(1 − p))(1/A^{p−1} − 1)

If p > 1, this limit exists and so the integral converges to 1/(p − 1). If p ≤ 1 the integral diverges. (Compare this to the convergence of the series Σ_{n=1}^∞ 1/n^p: it's the same.)

Back to piecewise continuous functions: how do we integrate one? If the function f is piecewise continuous on the interval [α, β] with discontinuities at the points t1, ..., tn (with α < t1 < t2 < ... < tn < β), then we have

∫_α^β f(t) dt = ∫_α^{t1} f(t) dt + ∫_{t1}^{t2} f(t) dt + ... + ∫_{t_{n−1}}^{t_n} f(t) dt + ∫_{t_n}^β f(t) dt

In other words, to integrate a piecewise continuous function, we just break it up into a bunch of continuous functions that we know how to integrate, then add all the pieces together.


Combining the previous two concepts: suppose we would like to check if the improper integral of a piecewise continuous function f(t) converges. If we don't know anything else about f(t) (or can't evaluate its integral directly) then it can be hard to answer this question. However, if we can compare f(t) to a function whose behavior we know, then we can determine whether the improper integral of f converges. Making this precise:

Theorem 11. Suppose f(t) is piecewise continuous for t ≥ a. If |f(t)| ≤ g(t) for some positive function g whenever t ≥ M for some positive number M, and ∫_a^∞ g(t) dt converges, then ∫_a^∞ f(t) dt also converges.
Conversely, if f(t) ≥ g(t) for some positive g whenever t ≥ M and ∫_a^∞ g(t) dt diverges, then so does ∫_a^∞ f(t) dt.

Comment: Notice that this is basically the comparison theorem for infinite series, but applied to improper integrals instead of infinite sums and piecewise continuous functions instead of sequences of terms. (Again, draw a picture)

Finally, we can return to integral transforms. An integral transform is informally a map T : {functions} → {functions}. Explicitly, we can write an integral transform as

F(s) = (Tf)(s) = ∫_α^β K(s, t)f(t) dt

where the function K(s, t) is known as the kernel of the integral transform. In this chapter, we will examine the Laplace transform, which has kernel K(s, t) = e^{−st}, but there are many other integral transforms that can help in solving differential equations that we won't consider: the Fourier transform and the Hilbert transform are just a couple examples. The Laplace transform is defined as

L{f(t)} = F(s) = ∫_0^∞ e^{−st}f(t) dt

provided that the improper integral converges.

How does the Laplace transform help us in solving ODEs? We have seen that solutions of linear ODEs very often come from exponential functions, and in essence taking the Laplace transform of a function is decomposing it into its constituent exponential pieces. To solve an ODE using the Laplace transform, the process is usually


(1) Given an IVP for f(t), apply the Laplace transform to obtain a new equation to solve for F(s), the Laplace transform of f (in fact, this equation will be an algebraic one, rather than a differential equation, so it is easier to solve)

(2) Solve the algebraic equation to find F

(3) Apply the inverse Laplace transform to find f(t) from F (s).

To use this process, we need F(s) := L{f(t)} to exist, meaning the given improper integral converges. How can we ensure this happens? The answer is the following:

Theorem 12. Suppose f is piecewise continuous on the interval 0 ≤ t ≤ A for any A > 0, and in addition suppose that for some constants K, M, and a (where K and M are positive) we have that |f(t)| ≤ Ke^{at} whenever t ≥ M. Then the Laplace transform L{f(t)} = F(s) exists for s > a.

Why is this true? It follows from the comparison theorem for improper integrals given before. More specifically, for t ≥ M we have that |e^{−st}f(t)| ≤ Ke^{(a−s)t}. If s > a, then this exponential has a negative power. Choosing g(t) = Ke^{(a−s)t}, we see that the integral of g converges (from our first example), and also |e^{−st}f(t)| ≤ g(t) for t ≥ M. Then the comparison theorem tells us that ∫_0^∞ f(t)e^{−st} dt must also converge.

What functions satisfy these conditions? For those we are considering: the answer is almost all of them. However, there are plenty of functions whose Laplace transform is undefined. For example: f(t) = e^{t^2} grows too fast. Functions whose Laplace transform is defined can also be called piecewise continuous functions of exponential order.

Let’s compute some Laplace transforms:

Example: Compute the Laplace transform of f(t) = 1.

F(s) = L{1} = lim_{A→∞} ∫_0^A e^{−st} dt = lim_{A→∞} −(1/s)e^{−st}|_0^A = lim_{A→∞} −(1/s)(e^{−sA} − 1) = 1/s

Thus F(s) = 1/s for s > 0.


Example: Compute the Laplace transform of f(t) = e^{at}.

F(s) = L{e^{at}} = lim_{A→∞} ∫_0^A e^{−st}e^{at} dt = lim_{A→∞} ∫_0^A e^{(a−s)t} dt = lim_{A→∞} (1/(a − s))e^{(a−s)t}|_0^A

= lim_{A→∞} (1/(a − s))(e^{(a−s)A} − 1) = 1/(s − a)

provided that s > a (otherwise the integral diverges).

Example: Let's compute the Laplace transform of a piecewise continuous function. Let

f(t) = { t,  0 ≤ t < 1
         1,  t ≥ 1

Computing L{f(t)}:

F(s) = ∫_0^∞ e^{−st}f(t) dt = ∫_0^1 te^{−st} dt + lim_{A→∞} ∫_1^A e^{−st} dt

Using integration by parts on the first integral (with u = t, dv = e^{−st} dt) we obtain

F(s) = −(1/s)te^{−st}|_0^1 + ∫_0^1 (1/s)e^{−st} dt − lim_{A→∞} (1/s)e^{−st}|_1^A

= −(1/s)e^{−s} − (1/s^2)e^{−st}|_0^1 − lim_{A→∞} (1/s)(e^{−sA} − e^{−s})

= (1 − e^{−s})/s^2

as long as s > 0.

Example: Let

f(t) = { 1,  0 ≤ t < 1
         k,  t = 1
         0,  t > 1

where k is some constant (draw a picture). This choice of f can often represent a pulse of voltage in an electrical circuit. Then

F(s) = lim_{A→∞} ∫_0^A e^{−st}f(t) dt = ∫_0^1 e^{−st} dt = −(1/s)e^{−st}|_0^1 = (1 − e^{−s})/s

Note that F does not depend on k: this means that we can change a function at a single point without affecting its Laplace transform (this is true of all integral transforms, since integration ignores the behavior of a function at an individual point).


Example: Find the Laplace transform of f(t) = cos at for any a.

F(s) = L{cos(at)} = lim_{A→∞} ∫_0^A e^{−st} cos at dt

Integrating by parts (u = e^{−st}, dv = cos at dt):

F(s) = lim_{A→∞} (1/a)e^{−st} sin at|_0^A + (s/a) ∫_0^A e^{−st} sin at dt = lim_{A→∞} (s/a) ∫_0^A e^{−st} sin at dt

for s > 0. Integrating by parts again with u = e^{−st}, dv = sin at dt:

F(s) = −lim_{A→∞} (s/a^2)e^{−st} cos at|_0^A − (s^2/a^2) ∫_0^A e^{−st} cos at dt = s/a^2 − (s^2/a^2)F(s)

Solving for F (s):

F(s)[1 + s^2/a^2] = s/a^2

F(s) = s/(s^2 + a^2)

Finally, note that the Laplace transform is a linear transform (as are all integral transforms having the form given in this section). This means that for any functions f and g and any number c we have

L{f(t) + cg(t)} = L{f(t)}+ cL{g(t)}

This can often be used to make calculations involving the Laplace transform much simpler. For example:

Example: Compute the Laplace transform of f(t) = 3e^{2t} + 5 cos 6t. Using what we computed already:

F(s) = L{3e^{2t} + 5 cos 6t}(s) = 3L{e^{2t}}(s) + 5L{cos 6t}(s) = 3/(s − 2) + 5s/(s^2 + 36)

for s > 2 (note that we need each piece to be defined, so we must take the smaller domain of definition between the two component parts).
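All of the transforms computed in this section can be checked against sympy's built-in laplace_transform (a sketch; noconds=True suppresses the convergence conditions like s > a, and the printed form may differ slightly):

import sympy as sp

t, s = sp.symbols('t s', positive=True)
a = sp.symbols('a', positive=True)

print(sp.laplace_transform(sp.S(1), t, s, noconds=True))      # 1/s
print(sp.laplace_transform(sp.exp(a*t), t, s, noconds=True))  # 1/(s - a)
print(sp.laplace_transform(sp.cos(a*t), t, s, noconds=True))  # s/(s**2 + a**2)
print(sp.laplace_transform(3*sp.exp(2*t) + 5*sp.cos(6*t), t, s, noconds=True))
# 3/(s - 2) + 5*s/(s**2 + 36)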

6.1 Suggested Problems: 4,8,11,16,27,29


§6.2: Solution of Initial Value Problems

This section shows in detail how the Laplace transform can be used to solve IVPs for constant coefficient ODEs. The key property of the Laplace transform is that it converts the operation of differentiation into multiplication (up to a constant). In this way, we can represent the Laplace transform of f′ using the Laplace transform of f, which means we can convert an ODE for f into an algebraic equation for L{f}. Making this precise, we need the theorem:

Theorem 13. Suppose f is continuous and f′ is piecewise continuous on any interval 0 ≤ t ≤ A. In addition, suppose that there exist constants K, a, and M with |f(t)| ≤ Ke^{at} whenever t ≥ M. Then L{f′(t)} exists for s > a and

L{f′(t)} = sL{f(t)} − f(0)    (12)

Proof. We would like to evaluate

lim_{A→∞} ∫_0^A e^{−st}f′(t) dt

To do this, we first fix A. Then if the points of discontinuity of f′ are given by t1, t2, ..., tn, we have

∫_0^A e^{−st}f′(t) dt = ∫_0^{t1} e^{−st}f′(t) dt + ∫_{t1}^{t2} e^{−st}f′(t) dt + ... + ∫_{tn}^A e^{−st}f′(t) dt

If we integrate each term by parts using u = e^{−st}, dv = f′(t) dt, we get

∫_0^A e^{−st}f′(t) dt = e^{−st}f(t)|_0^{t1} + e^{−st}f(t)|_{t1}^{t2} + ... + e^{−st}f(t)|_{tn}^A + s[∫_0^{t1} e^{−st}f(t) dt + ∫_{t1}^{t2} e^{−st}f(t) dt + ... + ∫_{tn}^A e^{−st}f(t) dt]

= e^{−sA}f(A) − f(0) + s ∫_0^A e^{−st}f(t) dt

where we use the continuity of f to cancel all the evaluation terms (and also combine all the piecewise integrals into one). Taking the limit as A → ∞ in the above equation gives

L{f′(t)} = sL{f(t)} − f(0) + lim_{A→∞} e^{−sA}f(A)

But if A ≥ M then |f(A)| ≤ Ke^{aA}, so that |e^{−sA}f(A)| ≤ Ke^{A(a−s)}, and the limit exists and equals zero for s > a, in which case we have

L{f′(t)} = sL{f(t)} − f(0)


By repeatedly applying this formula, we can obtain similar formulas for higher-order derivatives of f. For example:

L{f′′(t)} = sL{f′(t)} − f′(0) = s[L{f(t)} − f(0)] − f′(0) = s^2L{f(t)} − sf(0) − f′(0)

More generally, we could show using induction that if f, f′, ..., f^{(n−1)} are all continuous and f^{(n)} is piecewise continuous with the bounds |f^{(j)}(t)| ≤ Ke^{at} for 0 ≤ j ≤ n − 1 (and K, M, a as in the previous theorem), then L{f^{(n)}(t)} exists for s > a and is given by

L{f^{(n)}(t)} = s^nL{f(t)} − s^{n−1}f(0) − ... − sf^{(n−2)}(0) − f^{(n−1)}(0)

Keeping these formulas in mind, we can now use the Laplace transform to solve some IVPs.

Example 1: Let’s use the Laplace transform to solve the IVP

y′′ − y′ − 6y = 0, y(0) = 1, y′(0) = −1

Notice that we already know how to solve this using the methods of Chapter 3. In fact, the solution is

y(t) = (1/5)e^{3t} + (4/5)e^{−2t}

Let's make it harder than it has to be. Applying the Laplace transform to the whole equation (remember that it's linear):

L{y′′} − L{y′} − 6L{y} = 0

Since

L{y′} = sL{y} − y(0) = sL{y} − 1

L{y′′} = s^2L{y} − sy(0) − y′(0) = s^2L{y} − s + 1

we can rewrite this equation as

(s^2 − s − 6)L{y} − s + 2 = 0

L{y} = (s − 2)/(s^2 − s − 6) = (s − 2)/((s − 3)(s + 2))

Using partial fractions: write

L{y} = A/(s − 3) + B/(s + 2)


which yields the equation for A and B

A(s+ 2) +B(s− 3) = s− 2

Plugging in s = −2 gives B = 4/5 and plugging in s = 3 gives A = 1/5. Thus we have

L{y} = (1/5)/(s − 3) + (4/5)/(s + 2)

But recall that

L{e^{at}} = 1/(s − a)

If we use L^{−1} to denote the inverse of the Laplace transform (i.e. L^{−1}{g} = f if and only if L{f} = g), then L^{−1} is also linear, so

y = L^{−1}L{y} = L^{−1}[(1/5)/(s − 3) + (4/5)/(s + 2)] = (1/5)L^{−1}{1/(s − 3)} + (4/5)L^{−1}{1/(s + 2)} = (1/5)e^{3t} + (4/5)e^{−2t}

which matches with the solution we gave at the beginning.

More generally, we can take the Laplace transform of any constant coefficient second-order ODE having the form

ay′′ + by′ + cy = f(t)

to obtain

a[s^2Y(s) − sy(0) − y′(0)] + b[sY(s) − y(0)] + cY(s) = F(s)

where Y (s) = L{y(t)} and F (s) = L{f(t)}. This can be solved for Y :

Y(s) = ((as + b)y(0) + ay′(0))/(as^2 + bs + c) + F(s)/(as^2 + bs + c)

At this point, all that remains is to invert the Laplace transform to find y(t).
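sympy can carry out the partial fraction and inversion steps for us. Here is a sketch that redoes Example 1 through the general formula above (apart does the partial fractions; the Heaviside(t) factor in the output just records that the result is valid for t ≥ 0):

import sympy as sp

t, s = sp.symbols('t s')

# Example 1: a, b, c = 1, -1, -6 with y(0) = 1, y'(0) = -1 and F(s) = 0
Y = ((s - 1)*1 + 1*(-1)) / (s**2 - s - 6)

print(sp.apart(Y, s))  # 1/(5*(s - 3)) + 4/(5*(s + 2))
print(sp.inverse_laplace_transform(Y, s, t))
# exp(3*t)*Heaviside(t)/5 + 4*exp(-2*t)*Heaviside(t)/5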

Benefits: The Laplace transform makes solving linear constant coefficient ODEs easier for several reasons. First, we don't have to deal with initial conditions separately. As in the last example, we just plug them in when evaluating our algebraic equations, rather than using them to solve for unknown constants. In addition, we can handle a nonhomogeneous equation using the exact same process (as was shown above); there is no need to learn a whole new technique to deal with forcing terms. We can also use the method for higher-order equations by capitalizing on the general formula for L{f^{(n)}(t)} given earlier.

In a way, both processes of solving the IVP in Example 1 are equivalent. When using the Laplace transform, we obtained the characteristic equation of the ODE in the denominator, which will always happen. Thus no matter what, we still factor the characteristic equation (in order to apply the method of partial fractions). This can be one source of difficulty in solving higher-order equations using the Laplace transform (though we can always get a computer to approximate roots of algebraic equations for us).

What else can cause trouble when using this method of solving ODEs? In the example, we solved for L{y} and obtained a particular function we already knew as the Laplace transform of an exponential. However, many other ODEs have solutions that cannot be represented in terms of such simple functions. In these cases the final step of inverting the transform can prove extremely difficult. (There is an explicit formula for L^{−1}{F}, but we won't go into it here.)

In addition, in our method of using the Laplace transform to solve the ODE, at no point did we show that our solution was actually unique (which is something we were able to do using the other methods for second order ODEs in Chapter 3). This leads to the question: if f and g are continuous functions with L{f} = L{g}, is it necessarily true that f = g? What about if f and g are only piecewise continuous? It turns out that the answer to the first question is yes, while the answer to the second question is no (example in §6.1).

Example: Solve the IVP

y′′ + ω^2y = cos 2t, ω^2 ≠ 4, y(0) = 1, y′(0) = 0

First, we take the Laplace transform of the equation:

L{y′′} + ω^2L{y} = L{cos 2t}

Using the Laplace transform of cos at computed in the last section and writing Y(s) = L{y}:

s^2Y(s) − sy(0) − y′(0) + ω^2Y(s) = s/(s^2 + 4)

(s^2 + ω^2)Y(s) − s = s/(s^2 + 4)

Y(s) = (s^3 + 5s)/((s^2 + 4)(s^2 + ω^2))


Using partial fractions, we write

Y(s) = (As + B)/(s^2 + 4) + (Cs + D)/(s^2 + ω^2)

This gives

(As + B)(s^2 + ω^2) + (Cs + D)(s^2 + 4) = s^3 + 5s

which yields the system of four equations

A + C = 1
B + D = 0
Aω^2 + 4C = 5
Bω^2 + 4D = 0

The solution to this system is A = 1/(ω^2 − 4), B = 0, C = (ω^2 − 5)/(ω^2 − 4), D = 0. Then computing the inverse Laplace transform gives

y(t) = (1/(ω^2 − 4))L^{−1}{s/(s^2 + 4)} + ((ω^2 − 5)/(ω^2 − 4))L^{−1}{s/(s^2 + ω^2)}

= (1/(ω^2 − 4)) cos 2t + ((ω^2 − 5)/(ω^2 − 4)) cos ωt

Let's see how to solve a higher-order ODE using the Laplace transform:

Example: Solve the IVP

y^{(4)} − y = 0, y(0) = 1, y′(0) = 0, y′′(0) = 1, y′′′(0) = 0

Taking the Laplace transform of the ODE with Y (s) = L{y(t)} gives

s^4Y(s) − s^3y(0) − s^2y′(0) − sy′′(0) − y′′′(0) − Y(s) = 0

Plugging in the initial conditions and rearranging:

Y(s) = (s^3 + s)/(s^4 − 1) = s(s^2 + 1)/((s^2 + 1)(s^2 − 1)) = s/(s^2 − 1)

Using partial fractions, write

Y(s) = A/(s + 1) + B/(s − 1)

A and B must satisfy B(s + 1) + A(s − 1) = s. Plugging in s = −1 gives A = 1/2. Plugging in s = 1 gives B = 1/2. It follows that

y(t) = (1/2)[L^{−1}(1/(s + 1)) + L^{−1}(1/(s − 1))] = (e^{−t} + e^t)/2 = cosh t


6.2 Suggested Problems: 4,8,12,14,19,23,34,38


§7.1: Introduction to Systems of First Order Linear Equations

In this chapter, we learn how to solve systems of ODEs with more than one equation. This is a very general class of equations, as we can always transform higher-order ODEs into systems of first-order ODEs. Aside from this, we can model a wide variety of physical problems using systems of ODEs. In particular, many problems involving electrical circuits can be modeled in this way. In addition, many mechanics problems can be cast as systems of ODEs:

Example: A more complicated spring-mass system. Consider two masses m1 and m2 sitting on a frictionless surface, acted on by external forces F1(t) and F2(t), with three springs: one holding m1 to a wall, one attaching m1 to m2, and one holding m2 to an opposite wall. Suppose these springs have spring constants k1, k2, and k3 respectively. Here m1 is to the left of m2, and we orient our coordinates so that the direction of increasing x is to the right (draw a picture). If we would like to describe the position of m1 by using x1(t) and the position of m2 using x2(t), then a similar process to §3.7 will yield

m1x′′1(t) = −k1x1 + k2(x2 − x1) + F1(t) = −(k1 + k2)x1 + k2x2 + F1(t)

m2x′′2(t) = −k3x2 − k2(x2 − x1) + F2(t) = k2x1 − (k2 + k3)x2 + F2(t)

Example: Suppose we return to the original mass-spring system from Chapter 3, which is given by the second-order ODE

mu′′ + γu′ + ku = F(t)

Then we can rewrite this single second-order ODE as a system of two first-order ODEs. Letting v = u′ denote the velocity of the mass, we get the following system for u and v:

u′ = v

v′ = −(k/m)u − (γ/m)v + (1/m)F(t)

More generally, if we have the nth order ODE

y⁽ⁿ⁾ = F(t, y, y′, ..., y⁽ⁿ⁻¹⁾)


we can follow a similar process: let x1 = y, x2 = y′, x3 = y′′, ..., xn = y⁽ⁿ⁻¹⁾. In this case we obtain the system for x1, ..., xn

x′1 = x2
x′2 = x3
⋮
x′n−1 = xn
x′n = F(t, x1, x2, ..., xn)

This shows that we can always go from an nth order ODE to a system of n first-order ODEs (provided we can solve for the highest derivative explicitly, as we did here). Though this may seem somewhat contrived and not necessarily useful, this is not the case. Writing higher-order ODEs in this way has many advantages. For one, most numerical solvers are only equipped to deal with first-order ODEs. However, this method allows us to recast any higher-order ODE as a system of first-order equations that programs like MATLAB can solve.

Another advantage: we saw in going from Chapter 2 to Chapter 3 that as we raise the order of the ODE we're studying, we have a much more difficult time obtaining solutions to some of the more general equations of that order. However, by treating higher-order ODEs as systems of first-order equations, we now only need the techniques of first-order ODEs, along with ... linear algebra ... to solve such ODEs.

Note that in the first spring example given, the ODEs still weren't first-order, they were second-order. However, each second-order equation is in turn two first-order equations, so we can recast the two-mass, three-spring problem as a system of four first-order ODEs if we want.

Let’s give a general system of n ODEs, so we know what they looklike in their natural form. The general system looks like n copies of asingle nth order equation:

x′1 = F1(t, x1, x2, ..., xn)

x′2 = F2(t, x1, x2, ..., xn)

...

x′n = Fn(t, x1, x2, ..., xn)


As in Chapter 1, we can make precise the notion of a solution of the system of ODEs given above. In this case, we require n differentiable functions of t defined on some time interval I : α < t < β. We can also form an IVP in a similar way: in this case requiring n initial conditions corresponding to the n first-order ODEs, usually taking the form

x1(t0) = x1⁰, x2(t0) = x2⁰, ..., xn(t0) = xn⁰     (13)

A geometric interpretation of systems: before, we could plot solutions y(t) in the plane, treating t as the independent variable. However, with n distinct equations, we may now think of a solution as an n-dimensional vector (x1(t), x2(t), ..., xn(t)) that traces out a curve in n-dimensional space parametrized by t (so that t is no longer explicit in the graph of our solution). However, we can still trace the flow of time by moving along a particle path in the forward direction. If solving an IVP, the initial value gives us a starting point for such a trajectory.

As always, we are concerned about existence and uniqueness. When does the general system with initial conditions given in (13) have a unique solution? The answer is very similar to the general case of a single first-order ODE. First, we require that each of the functions F1, ..., Fn be continuous. However, now each Fi is defined on (n+1)-dimensional space, so instead of a 2-D rectangle, we require the hypotheses to hold on some (n+1)-dimensional box (i.e. α < t < β, α1 < x1 < β1, ..., αn < xn < βn) containing the initial condition. In addition, the partial derivatives of each Fi with respect to all the xj must be continuous on the same region: explicitly ∂Fi/∂xj, i, j = 1, ..., n. In this case, there is a unique solution on some time interval t0 − h ≤ t ≤ t0 + h.

Note that this means we can also find a unique solution of any higher-order ODE by first converting it to a system of first-order ODEs and then applying this result.

In the above case, it is possible that we are dealing with nonlinear systems of ODEs, which is why the theorem says nothing explicit about the existence time. However, in this chapter we refine our attention to linear systems of ODEs, which take the form


x′1 = p11(t)x1 + ...+ p1n(t)xn + g1(t)

x′2 = p21(t)x1 + ...+ p2n(t)xn + g2(t)

...

x′n = pn1(t)x1 + ...+ pnn(t)xn + gn(t)

As before, all the coefficients of every xi depend on t alone, to ensure the system is linear. The functions gi(t) determine whether or not the system is homogeneous. If all of them are zero, we have a homogeneous system of ODEs. If any of them are nonzero, the system is nonhomogeneous.

In the case of linear systems, we again have a theorem for existence and uniqueness (similar to the existence and uniqueness theorem for a single linear first-order ODE from Chapter 2). It says:

Theorem 14. Suppose the functions pij for i, j = 1, ..., n and gi for i = 1, ..., n are continuous on some open interval I : α < t < β. Then there is a unique solution on all of I to the system of ODEs given above that also satisfies the initial conditions (13), where we can choose t0 to be any point in I.

Just as before, for linear equations we can explicitly find the interval of existence and uniqueness by checking where all the component functions are continuous, so we can make sense of the interval of existence in the same way as before.

7.1 Suggested Problems: 3,4,6,7,14,16

§7.2: Remember Matrices?

In this section, we'll review some great facts from 22A. A general m×n matrix A looks like

A = ( a11 a12 . . . a1n ; a21 a22 . . . a2n ; ⋮ ; am1 am2 . . . amn )

(here and below, semicolons separate the rows of a matrix).


In general, we can consider matrices whose elements are complex numbers, and in this case we can define the conjugate Ā of a matrix A by taking the complex conjugate of each element of A, i.e. if ajk = u + iv, then (Ā)jk = u − iv. In addition, we can define the transpose of A, denoted Aᵀ, as the n×m matrix satisfying (Aᵀ)jk = (A)kj. The adjoint of A is given by the conjugate transpose, i.e. (Ā)ᵀ, and is often denoted by A*.

Example: If

A = ( 2−i  3 ; 4+i  −1−i )

then

Ā = ( 2+i  3 ; 4−i  −1+i ),   Aᵀ = ( 2−i  4+i ; 3  −1−i ),   A* = ( 2+i  4−i ; 3  −1+i )

Since we care about systems of linear ODEs, it follows from the representation in the last section that we will mostly be considering square n×n matrices (which will represent the functions pij), as well as n×1 column vectors (these will represent the nonhomogeneous terms gi, i = 1, ..., n, as well as the solution x1, ..., xn). When we transpose an n×1 column vector x we obtain a 1×n row vector.

Potentially Important Facts:

(1) When are two matrices (or vectors) equal? When they're the same size and all of their corresponding elements are equal.

(2) The zero matrix 0 has zeroes for all its entries (not much more to say than that)

(3) We can add two matrices A and B by adding each of their elements. Explicitly: (A + B)jk = (A)jk + (B)jk

(4) We can multiply a matrix by a number c, and (cA)jk = c(A)jk.

(5) Subtraction: to find A − B, multiply B by c = −1 above, then take A + (−B).

(6) Multiplication of two matrices: suppose we want to find the product AB of the matrices A and B. We can do this whenever the number of columns of A is the same as the number of rows of B. Explicitly, if A is an m×n matrix and B is an n×p matrix, then C = AB is an m×p matrix defined by

cij = Σₖ aikbkj   (sum over k = 1, ..., n)

Matrix multiplication is associative (so A(BC) = (AB)C) and distributive (A(B + C) = AB + AC), but not necessarily commutative: in general AB ≠ BA. Commuting matrices are special; for instance, simultaneously diagonalizable matrices always commute (probably not going to get into that).

Example: Suppose

A = ( −2 −1 1 ; 4 1 1 ; 4 −1 −3 ),   B = ( 2 1 −1 ; 0 5 −2 ; 3 −3 −2 )

Then

AB = ( −1 −10 2 ; 11 6 −8 ; −1 8 4 )  ≠  ( −4 0 6 ; 12 7 11 ; −26 −4 6 ) = BA
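A quick check in numpy (my addition, not part of the original notes) confirms the arithmetic and the failure of commutativity:

```python
import numpy as np

A = np.array([[-2, -1, 1], [4, 1, 1], [4, -1, -3]])
B = np.array([[2, 1, -1], [0, 5, -2], [3, -3, -2]])

print(A @ B)                          # matches AB above
print(B @ A)                          # matches BA above
print(np.array_equal(A @ B, B @ A))   # False: AB != BA
```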

(7) Multiplication of vectors x and y, each having n components: this varies depending on whether we're considering a real or complex vector space.

(a) If real: we have the usual dot product:

xᵀy = Σᵢ xiyi   (sum over i = 1, ..., n)

which is a symmetric bilinear form (meaning xᵀy = yᵀx, and it is linear as a function of both x and y individually).

(b) If complex: we have the inner product (or scalar product)

(x, y) = Σᵢ xiȳi

which can be related to the dot product by (x, y) = xᵀȳ. This is sometimes referred to as a sesquilinear form, meaning that it is linear in the first argument, but linear up to a complex conjugate in the second argument. Its properties are (with the bar denoting complex conjugation):

(x, y) = the conjugate of (y, x),   (x, y + z) = (x, y) + (x, z)

(αx, y) = α(x, y),   (x, αy) = ᾱ(x, y)


The point of the inner product: the inner product of a vector with itself is always a nonnegative number:

(x, x) = Σᵢ |xi|²

The norm of a vector: ||x|| = √(x, x) measures the magnitude of x. In addition, if (x, y) = 0, then we say x and y are orthogonal.

Example: Suppose

x = (i, 2+i, −3)ᵀ,   y = (2i−1, i−3, 4)ᵀ

Then

xᵀy = (i)(2i−1) + (2+i)(i−3) + (−3)(4) = −21 − 2i

(x, y) = (i)(−2i−1) + (2+i)(−i−3) + (−3)(4) = −15 − 6i

xᵀx = i² + (2+i)² + (−3)² = 11 + 4i

(x, x) = (i)(−i) + (2+i)(2−i) + (−3)(−3) = 15
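The same computations in numpy (my addition; note that numpy's vdot conjugates its first argument, whereas our convention conjugates the second, so we spell the sum out):

```python
import numpy as np

x = np.array([1j, 2 + 1j, -3])
y = np.array([2j - 1, 1j - 3, 4])

print(x @ y)                 # x^T y = -21 - 2j (no conjugation)
print(np.sum(x * y.conj()))  # (x, y) = -15 - 6j
print(x @ x)                 # x^T x = 11 + 4j
print(np.sum(x * x.conj()))  # (x, x) = 15, always real and nonnegative
```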

(8) Identity matrix: the n×n identity matrix is given by

I = ( 1 0 . . . 0 ; 0 1 . . . 0 ; ⋮ ; 0 0 . . . 1 )

For any matrix A we have AI = IA = A

(9) Inverses: If we have an n×n matrix A, and there is another n×n matrix B with AB = BA = I, then A is invertible (or nonsingular) and we write B = A⁻¹. If A has an inverse, it is unique. If no such matrix exists, we say A is singular (or noninvertible). When does a square matrix have an inverse? Precisely when its determinant is nonzero. Ways of computing the inverse of a matrix: using cofactor matrices (we won't do this), or using elementary row operations on A. Elementary row operations include:

(a) Swapping two rows

(b) Multiplying a row by some nonzero constant

(c) Adding a multiple of one row to another row

The process of applying elementary row operations is called row reduction or Gaussian elimination. A matrix A is invertible precisely when we can row reduce it to the identity matrix. This gives a way of computing A⁻¹:

Example: Find the inverse of the matrix

A = ( 1 1 −1 ; 2 −1 1 ; 1 1 2 )

To do this, we augment A with the 3×3 identity matrix, then reduce the left-hand side to the identity matrix using elementary row operations (here we use ri to denote the ith row of our matrix). What remains in the right half will be our inverse.

( 1 1 −1 | 1 0 0 ; 2 −1 1 | 0 1 0 ; 1 1 2 | 0 0 1 )

r2 → r2 − 2r1:

( 1 1 −1 | 1 0 0 ; 0 −3 3 | −2 1 0 ; 1 1 2 | 0 0 1 )

r2 → −(1/3)r2:

( 1 1 −1 | 1 0 0 ; 0 1 −1 | 2/3 −1/3 0 ; 1 1 2 | 0 0 1 )

r3 → r3 − r1:

( 1 1 −1 | 1 0 0 ; 0 1 −1 | 2/3 −1/3 0 ; 0 0 3 | −1 0 1 )

r3 → (1/3)r3:

( 1 1 −1 | 1 0 0 ; 0 1 −1 | 2/3 −1/3 0 ; 0 0 1 | −1/3 0 1/3 )

r2 → r2 + r3:

( 1 1 −1 | 1 0 0 ; 0 1 0 | 1/3 −1/3 1/3 ; 0 0 1 | −1/3 0 1/3 )

r1 → r1 + r3:

( 1 1 0 | 2/3 0 1/3 ; 0 1 0 | 1/3 −1/3 1/3 ; 0 0 1 | −1/3 0 1/3 )

r1 → r1 − r2:

( 1 0 0 | 1/3 1/3 0 ; 0 1 0 | 1/3 −1/3 1/3 ; 0 0 1 | −1/3 0 1/3 )

Now that the left-hand half is the identity matrix, we see that

A⁻¹ = ( 1/3 1/3 0 ; 1/3 −1/3 1/3 ; −1/3 0 1/3 )
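As a sanity check (my addition, not the notes'), numpy agrees:

```python
import numpy as np

A = np.array([[1.0, 1, -1], [2, -1, 1], [1, 1, 2]])
print(np.linalg.inv(A))
# [[ 0.333  0.333  0.   ]
#  [ 0.333 -0.333  0.333]
#  [-0.333  0.     0.333]]
```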

Matrix Functions: In solving linear systems of ODEs, we will at times be considering vectors and matrices that are functions of the independent variable t. For this reason, it is important to make sense of the notion of derivatives of vectors and matrices that are functions of t. Suppose we are considering

x(t) = (x1(t), ..., xn(t))ᵀ

and

A(t) = ( a11(t) a12(t) . . . a1n(t) ; a21(t) a22(t) . . . a2n(t) ; ⋮ ; am1(t) am2(t) . . . amn(t) )

First, A is continuous at t = t0 if all the aij are continuous there (the same is true for x, which is just a special case of A having only one column). When is A differentiable at t = t0? Same answer: when all the aij are differentiable there. In that case, we have

(dA/dt)ij = daij/dt

Similarly, we can integrate a matrix function:

(∫ₐᵇ A(t) dt)ij = ∫ₐᵇ aij(t) dt

Important point in all of this: the derivative of a matrix is still a matrix. The integral of a matrix is also a matrix. Otherwise, we compute everything one element at a time using the normal process of differentiation and integration for scalar-valued functions.

Useful differentiation rules for matrices: suppose A and B are matrices that depend on t, but C is a constant matrix (it doesn't depend on t). Then we have

d/dt (CA) = C dA/dt

d/dt (A + B) = dA/dt + dB/dt

d/dt (AB) = A dB/dt + (dA/dt) B

Note that in these cases, the order of multiplication matters, since matrix multiplication is not commutative.
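For instance (my illustration, assuming sympy is available), symbolic matrices can be differentiated and integrated entrywise:

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[sp.sin(t), t**2],
               [1, sp.exp(t)]])

print(A.diff(t))               # entrywise derivative
print(A.integrate((t, 0, 1)))  # entrywise definite integral
```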

7.2 Suggested Problems: 4,8,16,20,21c),21d),22,25


§7.3: Systems of Linear Algebraic Equations; Linear Independence, Eigenvalues, Eigenvectors

In this section: more linear algebra review. First, a discussion on solving systems of linear equations. In general, a system of n equations for n unknowns takes the form

a11x1 + a12x2 + ... + a1nxn = b1
⋮
an1x1 + an2x2 + ... + annxn = bn

which can also be written more compactly in vector-matrix notation as Ax = b. Here A and b are given, and the aim is to solve for x. If b = 0, the system is homogeneous, otherwise it is nonhomogeneous.

When can we solve such a system uniquely? Precisely when the matrix A is invertible. In this case, the solution is given by x = A⁻¹b. In particular, this means the only solution to Ax = 0 for A invertible must be x = 0.

If A is singular, then there are nonzero solutions x to the equation Ax = 0. The equation Ax = b is then only solvable if b is orthogonal to the nullspace of A*. In this case we can write the solutions of Ax = b as x = x⁽⁰⁾ + ξ, where x⁽⁰⁾ is a particular solution of Ax = b and ξ is the general solution of the homogeneous system.

One method of solving systems of linear equations involves using augmented matrices. Demonstrating by example:

Example: Solve the system of equations

x1 − x3 = 0
3x1 + x2 + x3 = 1
−x1 + x2 + 2x3 = 2

First, we can recast this as the matrix-vector equation Ax = b with

A = ( 1 0 −1 ; 3 1 1 ; −1 1 2 ),   b = (0, 1, 2)ᵀ


Forming the augmented matrix and row reducing:

( 1 0 −1 | 0 ; 3 1 1 | 1 ; −1 1 2 | 2 )

r2 → r2 − 3r1:

( 1 0 −1 | 0 ; 0 1 4 | 1 ; −1 1 2 | 2 )

r3 → r3 + r1:

( 1 0 −1 | 0 ; 0 1 4 | 1 ; 0 1 1 | 2 )

r3 → r3 − r2:

( 1 0 −1 | 0 ; 0 1 4 | 1 ; 0 0 −3 | 1 )

At this point, we can stop row reducing, as the final row of the matrix tells us that x3 = −1/3. From here we can back-solve. Plugging into the second row gives x2 − 4/3 = 1, so x2 = 7/3. Finally, plugging both of these into the first row yields x1 = −1/3, so our solution is

x = (−1/3, 7/3, −1/3)ᵀ
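numpy solves the same system directly (my check, not part of the notes):

```python
import numpy as np

A = np.array([[1.0, 0, -1], [3, 1, 1], [-1, 1, 2]])
b = np.array([0.0, 1, 2])
print(np.linalg.solve(A, b))   # [-0.333..., 2.333..., -0.333...]
```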

Example: Let’s see what happens when we try to solve the system

x1 + 2x2 − x3 = 2

2x1 + x2 + x3 = 1

x1 − x2 + 2x3 = −1

In this case, we will have more than just one solution, since there isa dependence between the equations. Specifically, if we subtract thefirst equation from the second one, we obtain the third one. Let’s usethe same process as in the previous example. Forming the augmentedmatrix: 1 2 −1 2

2 1 1 11 −1 2 −1

r2 7→r2−2r1−−−−−−→

1 2 −1 20 −3 3 −31 −1 2 −1

r3 7→r3−r1−−−−−−→

1 2 −1 20 −3 3 −30 −3 3 −3

r3 7→r3−r2−−−−−−→

1 2 −1 20 −3 3 −30 0 0 0

The second row allows us to solve for x2, by writing x2 = x3 + 1.Plugging this into the first equation gives x1 + 2(x3 + 1) − x3 = 2,or x1 = −x3. Thus if we write x3 = c we can represent the generalsolution to this equation as

x =

−cc+ 1c


where c is a free variable. In this case, we have a one-dimensional subspace of solutions (since there is one free parameter), but in general we can have two-dimensional subspaces or higher.

In the previous example, we found more than one solution to the given system of equations. This is because there was a nontrivial relationship between the rows of the matrix A. More precisely, if we have a set of vectors {x⁽¹⁾, ..., x⁽ᵏ⁾}, then the vectors are linearly dependent if there are constants c1, ..., ck, not all equalling zero, so that

c1x⁽¹⁾ + ... + ckx⁽ᵏ⁾ = 0

If no such choice of constants exists, then the vectors are linearly independent. Suppose we have n of these vectors, each having n components, denoting the ith component of x⁽ʲ⁾ by xi⁽ʲ⁾. Notice that the above relationship holds if and only if we have

( x1⁽¹⁾ x1⁽²⁾ . . . x1⁽ⁿ⁾ ; ⋮ ; xn⁽¹⁾ xn⁽²⁾ . . . xn⁽ⁿ⁾ ) (c1, c2, ..., cn)ᵀ = 0

A nonzero solution (c1, ..., cn) of this equation exists precisely when the determinant of the matrix on the left is zero. So the vectors are linearly dependent exactly when that determinant vanishes, and linearly independent exactly when it is nonzero. Thus to find out whether x⁽¹⁾, ..., x⁽ⁿ⁾ are linearly independent, we simply need to form the matrix with columns x⁽ⁱ⁾ and find its determinant.

Example: Are the vectors

x⁽¹⁾ = (2, 1, 0)ᵀ,   x⁽²⁾ = (0, 1, 0)ᵀ,   x⁽³⁾ = (−1, 2, 0)ᵀ

linearly independent? To check this, we first form the matrix from the x⁽ⁱ⁾:

X = ( 2 0 −1 ; 1 1 2 ; 0 0 0 )

Since the last row of this matrix is zero, we can see that the determinant will be zero without actually evaluating it. Thus these three vectors are linearly dependent. Let's find a linear dependence relation. This entails solving

c1x⁽¹⁾ + c2x⁽²⁾ + c3x⁽³⁾ = 0

for c1, c2, c3. The first component gives 2c1 − c3 = 0, which yields c3 = 2c1. Plugging this into the equation c1 + c2 + 2c3 = 0 gives c2 = −5c1. Thus setting c1 = 1, we see that one particular dependence relation is given by

x⁽¹⁾ − 5x⁽²⁾ + 2x⁽³⁾ = 0

Some more useful facts we can glean from the determinant: the columns (or rows) of a matrix A are linearly independent if and only if det A ≠ 0. In addition, we have the formula det AB = (det A)(det B).

We can also extend the notion of linear independence to functions of an independent variable t, saying that the functions x⁽¹⁾(t), ..., x⁽ᵏ⁾(t) are linearly dependent on α < t < β if there are constants c1, ..., ck, not all equalling zero, so that

c1x⁽¹⁾(t) + ... + ckx⁽ᵏ⁾(t) = 0

for all t in the interval. If no such constants exist, they are linearly independent. In other words, to be linearly dependent on the entire interval, they must be linearly dependent at every single point of the interval (with the same choice of constants throughout).

Eigenvalues and Eigenvectors: Much of the analysis of systems of ODEs involves finding eigenvalues and eigenvectors of a matrix, which we can superpose to build general solutions. In essence, an eigenvector of a matrix is a vector whose direction is fixed by that matrix: the matrix simply scales it by some amount (the eigenvalue). Typically, given a matrix A we write Ax = λx to mean that x is an eigenvector with eigenvalue λ. We can rewrite this equation as (A − λI)x = 0, where I is the identity matrix. This has nonzero solutions if and only if

det(A − λI) = 0

which is the main equation we use to find eigenvalues of a matrix A. Once we do this, we can find the corresponding eigenvectors. Let's see the process with an example:

Example: Find the eigenvalues and eigenvectors of

A = ( 5 −1 ; 3 1 )

To find the eigenvalues, we set det(A − λI) = 0 and solve for λ:

det(A − λI) = (5 − λ)(1 − λ) + 3 = λ² − 6λ + 8 = (λ − 4)(λ − 2)

Thus we see the eigenvalues are given by λ1 = 4 and λ2 = 2. To find the associated eigenvectors, we simply plug each λ into A − λI and solve (A − λI)x = 0. First with λ1:

( 1 −1 ; 3 −3 ) (x1, x2)ᵀ = 0

We can see from this equation that x1 = x2, so the eigenvector associated with λ1 is

x⁽¹⁾ = (1, 1)ᵀ

Notice that any multiple of this eigenvector will still be an eigenvector with eigenvalue λ1. For λ2:

( 3 −1 ; 3 −1 ) (x1, x2)ᵀ = 0

In this case, we obtain x2 = 3x1, so our second eigenvector is

x⁽²⁾ = (1, 3)ᵀ

or any nonzero multiple of this vector.
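numpy's eig carries out the same computation numerically (my check; eig returns unit-length eigenvectors, so they differ from ours by a scaling factor):

```python
import numpy as np

A = np.array([[5.0, -1], [3, 1]])
vals, vecs = np.linalg.eig(A)
print(vals)   # [4. 2.]
print(vecs)   # columns proportional to (1, 1) and (1, 3)
```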

Comments: Eigenvectors of a matrix are determined only up to a constant (this is a consequence of the linearity of the matrix A: if it stretches a particular eigenvector by some fixed number, then it will do the same for any multiple thereof). Thus sometimes we can normalize an eigenvector x, scaling it so that ||x|| = 1.

Notice that in searching for eigenvalues of a 2×2 matrix, we obtained a quadratic equation. More generally, to find the eigenvalues of an n×n matrix we need to find the roots of an nth degree polynomial equation, known as the characteristic equation. In doing this, it is possible there will be repeated roots. If some eigenvalue λ appears k times when factoring the characteristic equation, we say it has algebraic multiplicity k. Every eigenvalue has an associated eigenvector, and if λ has algebraic multiplicity k it can have between 1 and k (inclusive) linearly independent eigenvectors. The number of linearly independent eigenvectors of λ is referred to as its geometric multiplicity.


Example: Find the eigenvalues and eigenvectors of

A = ( 3 2 4 ; 2 0 2 ; 4 2 3 )

Computing, by expanding the determinant along the first row:

det(A − λI) = (3 − λ)|−λ 2 ; 2 3−λ| − 2|2 2 ; 4 3−λ| + 4|2 −λ ; 4 2|
            = (3 − λ)(λ² − 3λ − 4) − 2(−2 − 2λ) + 4(4 + 4λ)
            = −λ³ + 6λ² + 15λ + 8

Finding roots of cubics is hard in general, but in this case we can guess λ = −1 to find the first root. In addition, since the derivative of this polynomial with respect to λ is −3λ² + 12λ + 15, and −1 is a root of that as well, we can conclude that it is a repeated root. Thus the characteristic polynomial factors as (λ + 1)²(a − λ), and by matching up the constant terms we see that a = 8. So we have an eigenvalue λ1 = −1 of algebraic multiplicity 2, and another eigenvalue λ2 = 8 of algebraic multiplicity 1 (which we sometimes call a simple eigenvalue). Next, we compute the eigenvectors associated with each eigenvalue. For λ1:

( 4 2 4 ; 2 1 2 ; 4 2 4 ) (x1, x2, x3)ᵀ = 0

The general solution to this is 2x1 + x2 + 2x3 = 0, which is a subspace of dimension two. If we write x2 = −2x1 − 2x3, then we can write the general solution as

(x1, x2, x3)ᵀ = (x1, −2x1 − 2x3, x3)ᵀ = x1(1, −2, 0)ᵀ + x3(0, −2, 1)ᵀ

Thus λ1 has geometric multiplicity two, with the two distinct eigenvectors

x⁽¹⁾ = (1, −2, 0)ᵀ,   x⁽²⁾ = (0, −2, 1)ᵀ

Finally, the eigenvector for λ2 is found by solving:

( −5 2 4 ; 2 −8 2 ; 4 2 −5 ) (x1, x2, x3)ᵀ = 0

Notice that the above matrix satisfies r2 = −2r1 − 2r3, so we only need to consider r1 and r3. If we replace r3 with r3 − r1 we obtain r3 = (9 0 −9). This gives x3 = x1; if we plug this into r1 we see that −x1 + 2x2 = 0, or x1 = 2x2. Thus we can represent our eigenvector as

x⁽³⁾ = (2, 1, 2)ᵀ

Finally, we distinguish a special class of matrices: self-adjoint matrices, which have the property that A* = A. These matrices have a number of useful properties:

(1) They only have real eigenvalues.

(2) They can always be diagonalized by a change-of-basis transformation.

(3) They have n linearly independent eigenvectors, which form a basis for the underlying vector space.

(4) Eigenvectors associated to distinct eigenvalues are orthogonal.
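Conveniently, the matrix from the last example is real and symmetric (hence self-adjoint), so we can see these properties in action with numpy's eigh, which is specialized to self-adjoint matrices (my illustration, not part of the notes):

```python
import numpy as np

A = np.array([[3.0, 2, 4], [2, 0, 2], [4, 2, 3]])
vals, vecs = np.linalg.eigh(A)
print(vals)           # [-1. -1.  8.]: real eigenvalues
print(vecs.T @ vecs)  # identity matrix: the eigenvectors are orthonormal
print(vecs[:, 2])     # proportional to (2, 1, 2), up to sign
```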

7.3 Suggested Problems: 2,10,14,18,23,31

§7.4: Basic Theory of Systems of First Order Linear Equations

With all the linear algebra out of the way, we can finally return to the original task of studying systems of first-order linear differential equations. Note: this section is almost identical to §3.2, only now instead of considering one second-order ODE, we're applying the results to systems of n first-order ODEs. Recall that the system we're interested in takes the form


x′1 = p11(t)x1 + ...+ p1n(t)xn + g1(t)

...

x′n = pn1(t)x1 + ...+ pnn(t)xn + gn(t)

We will see that the behavior of a system of n linear equations is very similar to a single nth-order equation. Thus solutions of systems of two equations should be linear superpositions of two (possibly complex) exponentials, as we saw previously in Chapter 3. First, we should rewrite the above system in matrix-vector form:

x′ = P(t)x + g(t)

where

x = (x1, ..., xn)ᵀ,   g(t) = (g1(t), ..., gn(t))ᵀ,   P(t) = ( p11(t) . . . p1n(t) ; ⋮ ; pn1(t) . . . pnn(t) )

As usual, we assume P and g are continuous (i.e. all their components are continuous). As with our study of second-order ODEs, we will find that it is sufficient to first consider the homogeneous case (g ≡ 0), then use this to help us build particular solutions to nonhomogeneous systems.

Notation: In general, a system of n equations will have n linearly independent solutions, so we need to enumerate them in some fashion. However, we are already using subscripts for the components of a vector. Thus we will often write x⁽¹⁾, x⁽²⁾, etc. to denote different solution vectors to a given system of ODEs. We can also write xij(t) = xi⁽ʲ⁾(t) to refer to the ith component of the jth solution vector, so that the jth solution looks like

x⁽ʲ⁾(t) = (x1j(t), ..., xnj(t))ᵀ

Principle of Superposition: Suppose we're considering the homogeneous problem x′ = Px. As in Chapter 3, we will find that linear combinations of solutions are again solutions. Explicitly, suppose that x⁽¹⁾, ..., x⁽ᵏ⁾ are solutions of the system x′ = Px. Then for any constants c1, ..., ck we have that the vector

x = c1x⁽¹⁾ + ... + ckx⁽ᵏ⁾

is also a solution of x′ = Px. Why is this? It is because both differentiation and matrix-vector multiplication are linear operators. Explicitly:

x′ = c1x⁽¹⁾′ + ... + ckx⁽ᵏ⁾′ = c1Px⁽¹⁾ + ... + ckPx⁽ᵏ⁾ = P(c1x⁽¹⁾ + ... + ckx⁽ᵏ⁾) = Px

so that x also solves the system of ODEs. As in Chapter 3, we can ask the same question: can we find all solutions this way? The answer is yes.

Great fact: Suppose we find n solutions x⁽¹⁾, ..., x⁽ⁿ⁾ to the equation x′ = Px (where x ∈ Rⁿ and P is an n×n matrix), and in addition suppose that x⁽¹⁾, ..., x⁽ⁿ⁾ are linearly independent. Then any solution of x′ = Px can be expressed as

x = c1x⁽¹⁾ + ... + cnx⁽ⁿ⁾

i.e. x⁽¹⁾, ..., x⁽ⁿ⁾ form a fundamental set of solutions and the above x is the general solution.

Why is this? To say that x⁽¹⁾, ..., x⁽ⁿ⁾ are linearly independent at a value of t is the same as saying that the matrix formed with x⁽ⁱ⁾ in the ith column has nonzero determinant at that t value. Explicitly,

det(X) = det( x11(t) . . . x1n(t) ; ⋮ ; xn1(t) . . . xnn(t) ) ≠ 0

The above is the Wronskian of x⁽¹⁾, ..., x⁽ⁿ⁾, sometimes written as W(x⁽¹⁾, ..., x⁽ⁿ⁾). Notice that in the case n = 2, x2 = x′1, we obtain the Wronskian we are familiar with from Chapter 3.

To prove the great fact, we use the preceding observation about the Wronskian. Suppose φ(t) is some solution of the ODE on α < t < β and fix some point t0 in this interval. We would like to show φ(t0) is a linear combination of the x⁽ⁱ⁾: once we do this we can apply the uniqueness portion of the existence and uniqueness result for IVPs with linear systems of ODEs. Suppose ξ = φ(t0). Then saying φ(t0) is a linear combination of the x⁽ⁱ⁾ is the same as saying there are constants c1, ..., cn so that

c1x⁽¹⁾(t0) + ... + cnx⁽ⁿ⁾(t0) = ξ

However, this can be rewritten as the matrix-vector equation

X(t0) (c1, ..., cn)ᵀ = ξ

where X is the matrix given above when introducing the Wronskian. However, this equation is uniquely solvable for c = (c1, ..., cn)ᵀ given ξ precisely when det X ≠ 0, i.e. when W[x⁽¹⁾, ..., x⁽ⁿ⁾] ≠ 0, which we already showed occurs given that the x⁽ⁱ⁾ are linearly independent. Thus we can find the constants c1, ..., cn to express φ as a linear combination of the x⁽ⁱ⁾ at the value t = t0. However, both φ and c1x⁽¹⁾ + ... + cnx⁽ⁿ⁾ solve the IVP x′ = P(t)x, x(t0) = ξ, so by the uniqueness theorem they must be equal on the whole interval α < t < β. Since φ was arbitrary, it follows that any solution of x′ = P(t)x can be expressed as a linear combination of x⁽¹⁾, ..., x⁽ⁿ⁾.

Another Great Fact: If x⁽¹⁾, ..., x⁽ⁿ⁾ are solutions of x′ = P(t)x on α < t < β, then either W[x⁽¹⁾, ..., x⁽ⁿ⁾] is identically zero or it never vanishes. Note that this is just the result from Chapter 3 translated to systems of equations. We can also obtain a higher-dimensional analogue of Abel's formula from Chapter 3:

W(t) = c e^(∫ tr P(t) dt)

where c is a constant depending on the set of solutions being considered, and tr P denotes the trace of the matrix P, given by tr P(t) = p11(t) + p22(t) + ... + pnn(t).

More on Fundamental Sets of Solutions: Let

e⁽¹⁾ = (1, 0, ..., 0)ᵀ,   e⁽²⁾ = (0, 1, ..., 0)ᵀ,   ...,   e⁽ⁿ⁾ = (0, 0, ..., 1)ᵀ

and suppose x⁽¹⁾, ..., x⁽ⁿ⁾ solve x′ = Px with initial conditions

x⁽¹⁾(t0) = e⁽¹⁾, ..., x⁽ⁿ⁾(t0) = e⁽ⁿ⁾

Then x⁽¹⁾, ..., x⁽ⁿ⁾ form a fundamental set of solutions.

Again, this is just like what we saw in Chapter 3. In general, such a set of solutions may not be easy to find. However, in essence this says that we may decompose any set of initial conditions into conditions on each component, then find the general solution by taking linear combinations. Note that the Wronskian of these solutions at t0 is 1, since in this case X is just the identity matrix.

One final result that will be useful when we deal with matrices having complex eigenvalues: if x solves x′ = P(t)x and x is complex-valued, say x = u + iv, and in addition each component of P(t) is continuous and real-valued, then u and v solve u′ = P(t)u and v′ = P(t)v. The proof of this is almost identical to the proof seen in Chapter 3 of the analogous result.

7.4 Suggested Problems: 2a),2b),2c),5,6

§7.5: Homogeneous Linear Systems with Constant Coefficients

In this section, we begin explicitly solving linear systems of ODEs, and we refine our focus to homogeneous systems with constant coefficients. In this case, we can express the system in vector-matrix form as

x′ = Ax

where A is the matrix of coefficients (which do not depend on time). As it turns out, we can extend the techniques from Chapter 3 involving linear combinations of exponentials to solve these systems, but we must combine these techniques with some knowledge of the behavior of eigenvectors and eigenvalues of A from linear algebra. Let's start with a simple example:

Example 1: Suppose we want to solve the system

x′ = ( 3 0 ; 0 4 ) x

If we write x in component form, we obtain the system of two equations

x′1 = 3x1
x′2 = 4x2

This system is decoupled, because the equation for x1 doesn't depend on x2 (and vice versa). Thus we can solve each equation independently. The general solution to the equation for x1 is

x1 = c1e^(3t)


and likewise the general solution for x2 is

x2 = c2e^(4t)

Thus the general solution of this system is given by

x = (c1e^(3t), c2e^(4t))ᵀ = c1(1, 0)ᵀe^(3t) + c2(0, 1)ᵀe^(4t)

We can check that we have found a fundamental set of solutions by evaluating the Wronskian:

W(x⁽¹⁾, x⁽²⁾)(t) = det( e^(3t) 0 ; 0 e^(4t) ) = e^(7t) ≠ 0

The last form we’ve written the solution in is the most enlightening forfuture problems. In this particular case, the matrix A had eigenvalues

λ1 = 3 and λ2 = 4 with corresponding eigenvectors v1 =

(10

)and

v2 =

(01

). It will turn out that more generally we can always con-

struct solutions to systems of ODEs as linear combinations of eigen-vectors multiplied by exponentials.

Suppose we return to the general equation x′ = Ax. When is the vector x = ξe^(rt) a solution to the equation (for ξ a fixed vector)? In this case we have x′ = rξe^(rt), so we're searching for a vector ξ and a scalar r so that

Aξe^(rt) = rξe^(rt)

or equivalently

Aξ = rξ

However, this is the same as saying that ξ is an eigenvector of A with corresponding eigenvalue r. Thus in the general case, we should try to find the eigenvectors and eigenvalues of the matrix A, and then we can build our general solution as

x = c1v1e^(λ1t) + ... + cnvne^(λnt)

where the vi are eigenvectors with corresponding eigenvalues λi.

We will see that the behavior of these systems will depend heavily on the eigenvalues of A: whether they are real or complex, as well as whether their real parts are positive or negative. This begs the question: how can we visualize solutions of systems of differential equations? For systems of n equations, this becomes pretty much impossible whenever n > 3. However, when n = 2 we can study the behavior of solutions by


plotting a slope field. However, unlike in Chapter 1, we will not have a t-axis: instead there will be an x1-axis and an x2-axis (the two components of our solution vector) and the time variable will be implicit in our plot: specifically, t serves as a parameter for the plot of the solution (x1(t), x2(t)) in the x1x2-plane. We often call such a visualization a phase plane. We may also plot families of solutions for given initial conditions as before in the phase plane to obtain what we call a phase portrait. Let's see this in action:

Example 2: Consider the 2-D system

x′ = ( 1 1 ; 4 −2 ) x

First, let's look at the phase plane. At some point (x1, x2) in the phase plane the tangent vector to a solution will be given by

A (x1, x2)ᵀ = (x1 + x2, 4x1 − 2x2)ᵀ

We can now plot this vector field to obtain the phase plane:

Figure 25. Example 2 Phase Plane

Based on this picture, we see that solutions to the equation seem to move toward the line x1 = x2, which they then follow towards either ±∞. Let's solve the system explicitly. Searching for eigenvalues of A:

det(A − λI) = (1 − λ)(−2 − λ) − 4 = λ² + λ − 6 = 0

which has roots λ1 = 2, λ2 = −3. Next, we find the eigenvectors. First, for λ1:

( −1 1 ; 4 −4 ) (x1, x2)ᵀ = 0   =⇒   x1 = x2   =⇒   v1 = (1, 1)ᵀ

For λ2:

( 4 1 ; 4 1 ) (x1, x2)ᵀ = 0   =⇒   x2 = −4x1   =⇒   v2 = (1, −4)ᵀ

Putting these facts together, we can conclude that the general solution of the system is given by

x(t) = c1(1, 1)ᵀe^(2t) + c2(1, −4)ᵀe^(−3t) = (c1e^(2t) + c2e^(−3t), c1e^(2t) − 4c2e^(−3t))ᵀ

Let's plot the eigenvectors on the phase plane:

Figure 26. Example 2 Phase Plane with eigenvectors

What can we learn from this plot? Starting at a given initial condition in the phase plane, we move in the direction of v2 towards the line given by v1, at which point we follow v1 towards either ±∞ (depending on where we start). This demonstrates several important points:

(1) Just as with a single ODE, we can search for equilibrium solutions x′ = 0, i.e. Ax = 0. As long as det A ≠ 0, this can only happen at x = 0. If A is singular, there may be other nonzero equilibrium solutions (we'll see an example of this).

(2) Again, just as with a single ODE, we can classify the equilibrium solution at 0 as stable or unstable. However, in two dimensions things become a little more complicated. In the previous example, the eigenvector v1 pulled solutions away from 0, whereas v2 pulled solutions towards 0. This is because of the exponentials associated with each eigenvector: along v1 solutions behave like e^(2t), while along v2 they behave like e^(−3t). Thus whether the eigenvalues are positive or negative plays a major role in the stability of the equilibrium solution. If we focus on 2-D systems for the time being, we find that there are several possible cases:

(a) Both eigenvalues are complex and are complex conjugates of one another (we'll come back to this later)

(b) There is a repeated eigenvalue (again, we'll come back to this)

(c) There are two distinct real eigenvalues, in which case we have a few possibilities:

(i) Both eigenvalues are negative, in which case the origin is a stable node (meaning all trajectories of solutions approach it)

(ii) Both eigenvalues are positive, in which case the origin is an unstable node

(iii) One eigenvalue is negative and one is positive, in which case the origin is a saddle node, as in the previous example. In this case, solutions follow the eigenvector with positive eigenvalue to ±∞ unless they start exactly on the other eigenvector, in which case they will approach the origin.


(iv) One of the eigenvalues is zero, in which case the matrix is not invertible and we obtain an entire line of equilibrium solutions, the stability of which depends on the sign of the other eigenvalue.

In the previous example, we needed a computer to plot the phase plane. However, if we find the eigenvalues and eigenvectors of A first, we can use the above knowledge to give a good general picture of the phase plane without using a computer.
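For reference, here is roughly how such a phase-plane plot can be produced (my sketch, assuming numpy and matplotlib are available; the notes do not say what plotting tool was used):

```python
import numpy as np
import matplotlib.pyplot as plt

A = np.array([[1.0, 1], [4, -2]])

# sample the vector field x' = Ax on a grid and draw the arrows
x1, x2 = np.meshgrid(np.linspace(-4, 4, 20), np.linspace(-4, 4, 20))
u = A[0, 0]*x1 + A[0, 1]*x2
v = A[1, 0]*x1 + A[1, 1]*x2
plt.quiver(x1, x2, u, v)
plt.xlabel('x1')
plt.ylabel('x2')
plt.show()
```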

Example 3: Find the general solution of

x′ = ( −2 1 ; 1 −2 ) x

How do solutions behave as t → ∞? First, we find the eigenvalues and eigenvectors of the matrix A:

det(A − λI) = (−2 − λ)² − 1 = λ² + 4λ + 3 = 0

which has roots λ1 = −1, λ2 = −3. Finding the eigenvectors:

( −1 1 ; 1 −1 ) (x1, x2)ᵀ = 0   =⇒   x1 = x2   =⇒   v1 = (1, 1)ᵀ

( 1 1 ; 1 1 ) (x1, x2)ᵀ = 0   =⇒   x1 = −x2   =⇒   v2 = (1, −1)ᵀ

Thus the general solution is given by

x = c1(1, 1)ᵀe^(−t) + c2(1, −1)ᵀe^(−3t) = (c1e^(−t) + c2e^(−3t), c1e^(−t) − c2e^(−3t))ᵀ

We can see explicitly in the above solution that x → 0 as t → ∞. However, we can also plot the eigenvectors and use this to analyze the behavior of solutions. In this case, both eigenvalues are negative, so the origin is a stable node. Solutions move towards the origin by first moving in the direction of v2, then moving along v1. This is because λ2 = −3 is the eigenvalue with larger absolute value, so the behavior of e^(−3t), which occurs along v2, initially dominates the behavior of e^(−t). However, once the solution comes very close to v1, the weaker eigenvector takes over in pulling the solution towards the origin. The above discussion is demonstrated in the following plot (which we can now in theory reproduce with a fair amount of accuracy without using a computer).


Figure 27. Example 3 Phase Plane

Notice that we now put arrows on the eigenvectors pointing towards the origin to indicate the direction of the flow along each eigenvector. If one of the eigenvalues were positive, we would place arrows pointing away from the origin. We could also write the eigenvalues above each eigenvector explicitly in this portrait to indicate which one has a stronger pull on the solution.

Comment: Even though so far we have only found general solutions of systems, we can solve IVPs just as we did before. For example, in the previous problem suppose we are given the initial condition x(0) = (1, 0)ᵀ. Then c1 and c2 must satisfy c1 + c2 = 1, c1 − c2 = 0, which is a system that we can solve for c1 and c2 (here c1 = c2 = 1/2) to obtain an exact solution to the given IVP.

As discussed previously, the nature of our solution may depend on whether A has a repeated eigenvalue. However, if A does have a repeated eigenvalue (say of algebraic multiplicity k), then the form of our solution only changes if the geometric multiplicity of that eigenvalue is strictly less than k; otherwise the methods of this section still apply. Let's see this in an example:


Example 4: Find the general solution of the system

x′ = ( 3 2 4 ; 2 0 2 ; 4 2 3 ) x

Notice that we already found the eigenvalues and eigenvectors of this matrix in the last example of §7.3: λ1 = −1 has algebraic and geometric multiplicity 2 and its associated eigenvectors are given by

v1 = (1, −2, 0)ᵀ   and   v2 = (0, −2, 1)ᵀ

In addition, λ2 = 8 is a simple eigenvalue (multiplicity 1) with corresponding eigenvector

v3 = (2, 1, 2)ᵀ

In this case, we have a repeated root in our characteristic equation. However, since the geometric multiplicity of the repeated eigenvalue is the same as the algebraic multiplicity, we still have a full set of linearly independent eigenvectors, and can write our general solution as

x = c1(1, −2, 0)ᵀe^(−t) + c2(0, −2, 1)ᵀe^(−t) + c3(2, 1, 2)ᵀe^(8t)
  = (c1e^(−t) + 2c3e^(8t), (−2c1 − 2c2)e^(−t) + c3e^(8t), c2e^(−t) + 2c3e^(8t))ᵀ

7.5 Suggested Problems: 1,2,8,11,16,26

§7.6: Complex Eigenvalues

In this section, we continue to consider the problem of solving a constant-coefficient system of ODEs of the form x′ = Ax, where as before we consider A to be real-valued. In the previous section, we only considered the case where A had real eigenvalues. However, if the eigenvalues of A are complex, then our solution will behave differently. Let's introduce some of the basic ideas through an example:

Example 1: Suppose we consider the system

x′ = ( −1 −4 ; 1 −1 ) x


Finding the eigenvalues of A:

det(A − λI) = (−1 − λ)² + 4 = λ² + 2λ + 5 = 0

Using the quadratic formula, we find that the roots are given by λ1 = −1 + 2i and λ2 = −1 − 2i. We can also find the associated eigenvectors: for λ1

( −2i −4 ; 1 −2i ) (x1, x2)ᵀ = 0   =⇒   x1 = 2ix2   =⇒   v1 = (2i, 1)ᵀ

and for λ2

( 2i −4 ; 1 2i ) (x1, x2)ᵀ = 0   =⇒   x1 = −2ix2   =⇒   v2 = (−2i, 1)ᵀ

This means that our fundamental set of solutions is given by (we can check the Wronskian to verify this explicitly if we wish)

x⁽¹⁾(t) = (2i, 1)ᵀe^((−1+2i)t),   x⁽²⁾(t) = (−2i, 1)ᵀe^((−1−2i)t)

One useful thing to notice: when we have two complex eigenvalues (which are automatically complex conjugates of one another), their eigenvectors will also be complex conjugates of one another, provided A is real. Why is this? If λ is an eigenvalue of A with eigenvector x, then λ̄x̄ = the conjugate of λx = the conjugate of Ax = Ax̄ (using that A is real), so that λ̄ is an eigenvalue of A with eigenvector x̄.

As we did in Chapter 3, we can choose the real and imaginary parts of x⁽¹⁾ and x⁽²⁾ as our solutions. Computing:

x⁽¹⁾(t) = (2i, 1)ᵀe^(−t)[cos 2t + i sin 2t] = (−2e^(−t) sin 2t, e^(−t) cos 2t)ᵀ + i(2e^(−t) cos 2t, e^(−t) sin 2t)ᵀ

Notice that x⁽²⁾(t) is the complex conjugate of x⁽¹⁾(t), so we don't gain anything new by looking at the real and imaginary parts of x⁽²⁾(t). Now, we can apply the theorem from §7.4 to instead choose the fundamental set of solutions

y⁽¹⁾(t) = (−2e^(−t) sin 2t, e^(−t) cos 2t)ᵀ,   y⁽²⁾(t) = (2e^(−t) cos 2t, e^(−t) sin 2t)ᵀ

Thus just as with second-order ODEs, in the case of complex roots of the characteristic equation it is best to express our solution as a linear combination of sines and cosines, rather than complex exponentials. We could check explicitly using the Wronskian that y⁽¹⁾ and y⁽²⁾ do in fact form a fundamental set of solutions to the system (left as an exercise for the reader). Thus the general solution can be more conveniently written as

x(t) = c1(−2e^(−t) sin 2t, e^(−t) cos 2t)ᵀ + c2(2e^(−t) cos 2t, e^(−t) sin 2t)ᵀ
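numpy reproduces the complex conjugate eigenvalue/eigenvector pair (my check, with the first eigenvector rescaled so its second component is 1):

```python
import numpy as np

A = np.array([[-1.0, -4], [1, -1]])
vals, vecs = np.linalg.eig(A)
print(vals)                      # [-1.+2.j -1.-2.j]
print(vecs[:, 0] / vecs[1, 0])   # [0.+2.j 1.+0.j], i.e. (2i, 1)
```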

How do solutions of systems with complex eigenvalues behave? Complex exponentials cause rotation in the phase plane, and the sign of the real part of λ determines the stability of the origin (which still remains an equilibrium solution). Just as before, if the real part is negative, we have a stable spiral. If the real part is positive, it's an unstable spiral. If the roots are purely imaginary, then we have a center, where all solutions oscillate without any growth or decay (similar to the behavior of a mass on a spring in the absence of damping or forcing terms). Let's plot the phase plane of our solution along with a few trajectories:

Figure 28. Example 1 Phase Plane with Solution Trajectories

Clearly, the solutions are all converging towards the origin while rotating in a counterclockwise motion, hence why this type of solution is referred to as a stable spiral. If we wanted to, we could also plot x1 or x2 individually as functions of t to see the rate of convergence to 0. Note that the motion could also be clockwise depending on the matrix A.

As before, we can recreate a rough sketch of this phase portrait without using a computer. We simply need to know: a) that the eigenvalues are complex, b) whether their real part is positive, negative, or zero, and c) the direction of rotation. To find the direction of rotation, we can just plug in a simple choice of x and see what Ax is. In the above example, if we plug in x = (1, 0)ᵀ we see that Ax = (−1, 1)ᵀ, which shows that the motion is counterclockwise.

More generally, let’s find the fundamental solutions for a system withtwo complex eigenvalues r = λ + iµ and r = λ − iµ and eigenvectorsξ and ξ. If we write the eigenvector in terms of its real and imaginaryparts as ξ = a + ib then one of our fundamental solutions is

x(1)(t) = ξert = (a + ib)e(λ+iµ)t = (a + ib)eλt [cosµt+ i sinµt]

= eλt[a cosµt− b sinµt] + ieλt[a sinµt+ b cosµt]

so we can in general write our fundamental set of solutions as

y(1)(t) = eλt[a cosµt− b sinµt], y(2)(t) = eλt[a sinµt+ b cosµt]

We now have a fairly good description of the different types of equilibrium solutions that may occur at the origin in a 2×2 constant-coefficient homogeneous system:

(1) Eigenvalues are real with opposite signs =⇒ saddle

(2) Eigenvalues are real with same signs =⇒ stable/unstable node

(3) Eigenvalues complex with nonzero real part =⇒ stable/unstable spiral

These are the most frequently occurring, but we have left a few off. Repeated eigenvalues, zero eigenvalues, and purely imaginary eigenvalues are all possibilities, but occur primarily as special cases (and as transitions between the states listed above).

We can study the behavior of systems as functions of the entries of the matrix A, as the next example illustrates:

Example: How does the behavior of the system

x′ = ( 0 −5 ; 1 α ) x

depend on α? In this case, the characteristic polynomial is given by −λ(α − λ) + 5 = λ² − αλ + 5 = 0. We can use the quadratic formula


to find the roots:

λ1,2 = (α ± √(α² − 20))/2

Thus if |α| > √20 we have two distinct real roots. If α < −√20, then both eigenvalues are negative, so 0 is a stable node. If α > √20 then both eigenvalues are positive, so 0 is an unstable node. When α = ±√20 we have a repeated eigenvalue, which we haven't yet described (but we will). For −√20 < α < √20 the eigenvalues are complex. If −√20 < α < 0, their real part is negative and 0 is a stable spiral. If 0 < α < √20 their real parts are positive, so 0 is an unstable spiral. Finally, at α = 0 the eigenvalues are purely imaginary, resulting in a center.

Now, let’s return to the two-mass, three-spring system from before.In the absence of damping and external forces, the equations for themotions of the masses are given by

m1u′′1 = −(k1 + k2)u1 + k2u2

m2u′′2 = k2u1 − (k2 + k3)u2

We can convert this into a system of 4 first-order ODEs if we write x1 = u1, x2 = u2, x3 = u′1, x4 = u′2, in which case we have

x′1 = x3
x′2 = x4
x′3 = −((k1 + k2)/m1)x1 + (k2/m1)x2
x′4 = (k2/m2)x1 − ((k2 + k3)/m2)x2

or in vector-matrix form

x′ = ( 0 0 1 0 ; 0 0 0 1 ; −(k1+k2)/m1 k2/m1 0 0 ; k2/m2 −(k2+k3)/m2 0 0 ) x

Since this is a 4×4 matrix, finding eigenvalues and eigenvectors without the assistance of a computer is painful. Suppose we choose the values m1 = 2, m2 = 9/4, k1 = 1, k2 = 3, and k3 = 15/4. In this case, after having a computer find the eigenvalues and eigenvectors, we arrive at the general solution

x = c1(3 cos t, 2 cos t, −3 sin t, −2 sin t)ᵀ + c2(3 sin t, 2 sin t, 3 cos t, 2 cos t)ᵀ
  + c3(3 cos 2t, −4 cos 2t, −6 sin 2t, 8 sin 2t)ᵀ + c4(3 sin 2t, −4 sin 2t, 6 cos 2t, −8 cos 2t)ᵀ

Notice that the above solution tells us that the eigenvalues of the matrix for these values of m1, m2, k1, k2, k3 are given by ±i and ±2i. Thus the behavior of this solution is a higher-dimensional analogue of a center. Specifically, the trajectory of any solution (which would be represented in 4-dimensional phase space) is a closed loop, since x(t + 2π) = x(t) for any value of t. Thus the trajectories of solutions are in essence four-dimensional analogues of ellipses, which are extremely difficult to visualize. However, a few comments are in order:

(1) The two pairs of complex eigenvalues correspond to the two dif-ferent rates of oscillation present in the solution: one oscillationoccurs with a frequency 1 (which is represented by the first twoterms) and the other oscillation occurs with a frequency of 2.In general, solutions of the system will contain a superpositionof these two rates of oscillation. However, for particular sets ofinitial conditions we can obtain more coherent behavior.

(2) Notice that the first two terms in the general solution are the terms oscillating with frequency 1, whereas the final two terms oscillate at the higher frequency. If c3 = c4 = 0, so that the entire system oscillates at the lower frequency, we notice a couple of interesting traits. For one, since the components of each of the first two terms satisfy x2 = (2/3)x1 and x4 = (2/3)x3, it follows that the amplitude of the second mass's motion is always two-thirds the amplitude of the first mass's motion. On top of that, the two masses oscillate in phase with one another.

(3) The final two terms oscillate at the higher frequency, so supposing instead that c1 = c2 = 0, we have x2 = −(4/3)x1 and x4 = −(4/3)x3. Thus in this case, the second mass has the larger amplitude, and the two masses are completely out of phase with one another.
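We can confirm the eigenvalues ±i, ±2i for these parameter values with numpy (my check, not part of the notes):

```python
import numpy as np

m1, m2, k1, k2, k3 = 2, 9/4, 1, 3, 15/4
A = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [-(k1 + k2)/m1, k2/m1, 0, 0],
              [k2/m2, -(k2 + k3)/m2, 0, 0]])
print(np.linalg.eigvals(A))   # +/-1j and +/-2j, up to rounding
```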

Tangent: In each of the previous two cases we considered, we made simplifying assumptions on the constants in our general solution. When do these assumptions actually occur? It turns out that c3 = c4 = 0 only when x2(0) = (2/3)x1(0) and x4(0) = (2/3)x3(0) (and similarly for the higher frequency oscillation). In these particular cases, the system begins in a fundamental mode of vibration, and once initially excited in this state the system remains in it at all later times. In the case of a vibrating string, these fundamental modes are referred to as harmonics. By shortening the wavelength of one of these modes, one can excite higher modes, which have shorter wavelengths. This type of behavior of a vibrating string is what allows guitar and bass players to create artificial harmonics. For instance, an open A string on guitar vibrates at 110 Hz, half the frequency of an A note an octave higher (really, this is true on any instrument). By placing one's finger on the right portion of a guitar fretboard (precisely halfway between the ends of the string), one can halve the wavelength and thus double the frequency to play a note one octave higher. Artificially shortening the string by other intervals can create even higher-order harmonics. For instance, by shortening the string to a third of its normal length, you can obtain a note that is an octave plus a perfect fifth higher than the fundamental. This is because the frequency ratio between a note and the note a perfect fifth higher is almost exactly 2:3, but there is a slight difference. In fact, the ratio used to be exact hundreds of years ago in the tuning system known as Pythagorean tuning, but this led to some disastrous consequences (google "wolf fifths" to see the horror). In theory, one can continue this process, shortening the string ad infinitum to obtain higher and higher harmonics; however, the higher the modes, the harder they are to excite.

7.6 Suggested Problems: 1,5,7,10,17,28

§7.7: Fundamental Matrices

So far, we have written general solutions to systems of ODEs just as we did in Chapter 3: as linear combinations of the fundamental solutions. However, we can also express general solutions using matrix-vector notation. To illustrate this point: suppose we are interested in the system

x′ = P(t)x


and find a fundamental set of solutions x⁽¹⁾, ..., x⁽ⁿ⁾. Then our general solution is given by

x(t) = c1x⁽¹⁾(t) + ... + cnx⁽ⁿ⁾(t) = Ψ(t)c

where c = (c1, ..., cn)ᵀ and Ψ(t) is the matrix whose jth column is the solution vector x⁽ʲ⁾(t):

Ψ(t) = ( x1⁽¹⁾(t) . . . x1⁽ⁿ⁾(t) ; ⋮ ; xn⁽¹⁾(t) . . . xn⁽ⁿ⁾(t) )

Here, c is just the vector of constants. However, the matrix Ψ formed by the column vectors x⁽ⁱ⁾ plays a special role, so we call Ψ a fundamental matrix. As long as the x⁽ⁱ⁾ form a fundamental set of solutions, its columns are linearly independent, and hence it is invertible.

Example: Find a fundamental matrix of

$$\mathbf{x}' = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\mathbf{x}$$

The eigenvalues of the above matrix are $\lambda_1 = 1$, $\lambda_2 = -1$. The associated eigenvectors are given by $\mathbf{v}_1 = \binom{1}{1}$ and $\mathbf{v}_2 = \binom{1}{3}$. Thus a fundamental set of solutions is given by

$$\mathbf{x}^{(1)} = \binom{1}{1}e^t = \binom{e^t}{e^t}, \qquad \mathbf{x}^{(2)} = \binom{1}{3}e^{-t} = \binom{e^{-t}}{3e^{-t}}$$

It follows that a fundamental matrix is given by

$$\Psi(t) = \begin{pmatrix} e^t & e^{-t} \\ e^t & 3e^{-t} \end{pmatrix}$$
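Quick numerical sanity check (a NumPy sketch, not part of the notes): we can rebuild this fundamental matrix from the eigen-decomposition and confirm that its columns really do solve the system.

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [3.0, -2.0]])

# NumPy normalizes eigenvectors to unit length, so the columns of V are
# scalar multiples of (1,1) and (1,3) above -- which is fine, since any
# nonzero multiple of an eigenvector works.
eigvals, V = np.linalg.eig(A)

def Psi(t):
    # Column j of the fundamental matrix is v_j * exp(lambda_j * t)
    return V * np.exp(eigvals * t)

# Each column x(t) should satisfy x' = A x; check by central differences.
t, h = 0.7, 1e-6
dPsi = (Psi(t + h) - Psi(t - h)) / (2 * h)
print(np.allclose(dPsi, A @ Psi(t), atol=1e-5))  # True
```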

We can use the notion of a fundamental matrix to help us more easily solve IVPs. Returning to our previous discussion: we saw that the general solution of $\mathbf{x}' = \mathbf{P}(t)\mathbf{x}$ was given by $\mathbf{x} = \Psi(t)\mathbf{c}$ for $\mathbf{c}$ an arbitrary constant vector. However, suppose we supplement the system with the initial condition $\mathbf{x}(t_0) = \mathbf{x}_0$. In this case, we must have

x(t0) = Ψ(t0)c = x0


Recall that $\Psi$ must be invertible, so we can multiply this equation by $\Psi^{-1}$ on both sides to obtain

c = Ψ−1(t0)x0

If we plug this value of $\mathbf{c}$ into our equation for $\mathbf{x}$ we obtain the formula for the solution of the IVP

x(t) = Ψ(t)Ψ−1(t0)x0

Thus in principle, given an IVP we only need to carry out the following four steps (a code sketch follows the list):

(1) Find a fundamental set of solutions

(2) Form the fundamental matrix (and find its inverse)

(3) Solve for c using Ψ−1

(4) Multiply by Ψ to find x
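Here is a minimal NumPy sketch of the four steps above for the example matrix (the initial condition and evaluation time are made up for illustration):

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [3.0, -2.0]])
t0, x0 = 0.0, np.array([1.0, 0.0])   # hypothetical initial condition

# Steps (1)-(2): fundamental set from the eigen-decomposition, assembled into Psi
eigvals, V = np.linalg.eig(A)
Psi = lambda t: V * np.exp(eigvals * t)

# Step (3): solve Psi(t0) c = x0 for c (equivalent to c = Psi(t0)^{-1} x0)
c = np.linalg.solve(Psi(t0), x0)

# Step (4): multiply by Psi(t) to evaluate the solution at any time t
x = lambda t: Psi(t) @ c
print(x(1.0))
```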

Recall that we can make sense of differentiating a matrix (simply by differentiating each of its entries individually). In this case, differentiating $\Psi$ and using the fact that its columns are made up of solutions to the system $\mathbf{x}' = \mathbf{P}(t)\mathbf{x}$, we see that $\Psi$ must also solve

Ψ′ = P(t)Ψ

Modification: Recall from §7.4 that given an IVP with initial condition given at $t_0$ we can always find a fundamental set of solutions satisfying the initial conditions

$$\mathbf{x}^{(1)}(t_0) = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad \mathbf{x}^{(n)}(t_0) = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}$$

In this case, we can see that the fundamental matrix formed by the $\mathbf{x}^{(i)}$ is just the identity matrix at $t_0$. This is an especially nice choice of fundamental matrix for solving an IVP, so we often distinguish it from any other fundamental matrix by writing it as $\Phi(t)$, so $\Phi(t_0) = I$. This simplifies the formula for our solution $\mathbf{x}$, allowing us to write

x(t) = Φ(t)Φ−1(t0)x0 = Φ(t)x0

since $\Phi^{-1}(t_0)$ is just the identity matrix. Note that if we compare the two different forms of the solution for $\mathbf{x}$, we find a relationship between


the special fundamental matrix $\Phi$ and any other fundamental matrix $\Psi$, namely

$$\Phi(t) = \Psi(t)\Psi^{-1}(t_0) \tag{14}$$

Example: Let's return to the previous example. When searching for a solution to

$$\mathbf{x}' = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\mathbf{x}$$

we found a fundamental matrix

$$\Psi(t) = \begin{pmatrix} e^t & e^{-t} \\ e^t & 3e^{-t} \end{pmatrix}$$

However, suppose we consider an IVP with initial time $t = 0$. In this case, we can find the matrix $\Phi(t)$ previously discussed by requiring that $\Phi(0) = I$. We can do this one of two ways: either search for linear combinations $\mathbf{y}^{(1)}$ and $\mathbf{y}^{(2)}$ of the fundamental set of solutions we found previously that satisfy $\mathbf{y}^{(1)}(0) = \binom{1}{0}$, $\mathbf{y}^{(2)}(0) = \binom{0}{1}$. Alternatively, we can use formula (14). Proceeding using the second option, we have

$$\Psi(0) = \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix} \implies \Psi^{-1}(0) = \begin{pmatrix} 3/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix}$$

$$\Phi(t) = \Psi(t)\Psi^{-1}(0) = \begin{pmatrix} e^t & e^{-t} \\ e^t & 3e^{-t} \end{pmatrix}\begin{pmatrix} 3/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix} = \begin{pmatrix} \frac{3}{2}e^t - \frac{1}{2}e^{-t} & -\frac{1}{2}e^t + \frac{1}{2}e^{-t} \\ \frac{3}{2}e^t - \frac{3}{2}e^{-t} & -\frac{1}{2}e^t + \frac{3}{2}e^{-t} \end{pmatrix}$$

Exercise to the reader: check that the first method gives the same result. Now we can solve any IVP with initial condition $\mathbf{x}(0) = \mathbf{x}_0$ very easily: the solution is given by

$$\mathbf{x}(t) = \begin{pmatrix} \frac{3}{2}e^t - \frac{1}{2}e^{-t} & -\frac{1}{2}e^t + \frac{1}{2}e^{-t} \\ \frac{3}{2}e^t - \frac{3}{2}e^{-t} & -\frac{1}{2}e^t + \frac{3}{2}e^{-t} \end{pmatrix}\mathbf{x}_0$$
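As a quick check of this formula (a sketch, not from the notes), we can build $\Phi$ from $\Psi$ numerically, confirm $\Phi(0) = I$, and then solving any IVP is a single matrix-vector product:

```python
import numpy as np

def Psi(t):
    return np.array([[np.exp(t), np.exp(-t)],
                     [np.exp(t), 3.0 * np.exp(-t)]])

Phi = lambda t: Psi(t) @ np.linalg.inv(Psi(0.0))

print(np.allclose(Phi(0.0), np.eye(2)))   # True: Phi(0) = I
x0 = np.array([2.0, -1.0])                # any initial condition
print(Phi(0.5) @ x0)                      # solution of the IVP at t = 0.5
```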

Suppose we return to considering constant-coefficient systems, and that we want to solve the IVP

x′ = Ax, x(0) = x0

In the case n = 1, this is just the IVP

x′ = ax, x(0) = x0

where a is some constant. In this case, the solution would be given by

x(t) = x0eat


For $n \ge 1$, we have seen that the solution to the above system is given by

x = Φ(t)x0

By analogy with the one-dimensional case, it would be reasonable to guess that $\Phi(t)$ might be some sort of exponential function. But what does this mean exactly? The answer comes from power series: recall that

$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$

where the above series converges for all values of $x$. Plugging in $x = at$ (as in the case of the 1-dimensional example) gives

$$e^{at} = \sum_{n=0}^{\infty} \frac{a^n t^n}{n!}$$

What if instead of a constant $a$ we consider the $n \times n$ matrix $A$? In this case, we could attempt to define the matrix exponential

$$e^{At} = \sum_{n=0}^{\infty} \frac{A^n t^n}{n!}$$

where we compute each term $A^n$ using matrix multiplication (note that we define $A^0 = I$, so the first term in the sum is just the identity matrix). Each term in the sum is an $n \times n$ matrix, so when we add them all up we still have an $n \times n$ matrix. However, does the sum converge? The answer is yes, and it relies on an estimate of the form

$$\left\lVert \sum_{n=0}^{\infty} \frac{A^n t^n}{n!} \right\rVert \le \sum_{n=0}^{\infty} \frac{\lVert A^n \rVert t^n}{n!} \le \sum_{n=0}^{\infty} \frac{\lVert A \rVert^n t^n}{n!} = e^{\lVert A \rVert t}$$

where $\lVert A \rVert$ denotes the matrix norm of $A$ (actually there are many of these, but in the above estimate it doesn't matter which one we use). The point is, we can bound the matrix norm of the infinite sum by $e^{\lVert A \rVert t}$, which is just some number, and it follows that the sum indeed converges, so the matrix exponential is well-defined.
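To see the convergence concretely, here is a sketch comparing partial sums of the defining series against SciPy's built-in matrix exponential (the 20-term cutoff is an arbitrary choice that happens to be plenty here):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, -1.0],
              [3.0, -2.0]])
t = 0.5

# Partial sums of sum_n (At)^n / n!, building each term from the previous one
S = np.eye(2)
term = np.eye(2)
for n in range(1, 20):
    term = term @ (A * t) / n    # now equals (At)^n / n!
    S = S + term

print(np.allclose(S, expm(A * t)))  # True: the series converges to e^{At}
```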

In addition, defining the matrix exponential in this way means that it retains many of the useful properties of the scalar exponential that we are accustomed to. Namely, suppose we try to differentiate $e^{At}$ with respect to $t$. This can be done by differentiating the power series given


above term-by-term.

$$\frac{d}{dt}e^{At} = \sum_{n=1}^{\infty} \frac{A^n n t^{n-1}}{n!} = \sum_{n=1}^{\infty} \frac{A^n t^{n-1}}{(n-1)!} = A\sum_{n=1}^{\infty} \frac{A^{n-1} t^{n-1}}{(n-1)!} = A\sum_{n=0}^{\infty} \frac{A^n t^n}{n!} = Ae^{At}$$

Another useful property of the matrix exponential is that evaluating at $t = 0$ gives the identity matrix. This fact also follows from the power series definition of $e^{At}$. Put another way

eAt|t=0 = I

Combining these two facts, we see that the matrix exponential solves the IVP

$$\frac{d}{dt}e^{At} = Ae^{At}, \qquad e^{At}\big|_{t=0} = I$$

However, the fundamental matrix $\Phi$ discussed previously also satisfies the same IVP:

Φ′(t) = AΦ, Φ(0) = I

It follows from the uniqueness of solutions to IVPs (applied to matrix equations) that in fact $\Phi = e^{At}$, so the matrix exponential gives another way of calculating the fundamental matrix discussed previously. In other words, the solution to the IVP

$$\mathbf{x}' = A\mathbf{x}, \qquad \mathbf{x}(0) = \mathbf{x}_0$$

is given by

$$\mathbf{x}(t) = e^{At}\mathbf{x}_0$$

A couple of other properties of the matrix exponential that extend ideas of the usual exponential function:

eAteAs = eA(s+t), e−At = (eAt)−1

We have seen that in theory given an IVP of the form

x′ = Ax, x(t0) = x0

we need only compute the matrix exponential of $A$ to be able to solve the IVP for any initial condition. However, in practice this is quite difficult. Why? We must take all positive powers of $A$ and add them all up as an infinite sum, which is generally computationally infeasible. If we can diagonalize the matrix $A$, however, it will make the process


much easier. By diagonalizing $A$, we are essentially decoupling the system of differential equations up to some similarity transformation. A quick reminder on how diagonalizing a matrix works:

First, $A$ is diagonalizable if it has $n$ linearly independent eigenvectors (this is guaranteed, for instance, if all the eigenvalues of $A$ are distinct). Suppose the eigenvectors are given by $\mathbf{v}^{(1)}, \dots, \mathbf{v}^{(n)}$ and $A$ has eigenvalues $\lambda_1, \dots, \lambda_n$. If

$$T = \begin{pmatrix} v_1^{(1)} & \dots & v_1^{(n)} \\ \vdots & & \vdots \\ v_n^{(1)} & \dots & v_n^{(n)} \end{pmatrix}, \qquad D = \begin{pmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & \lambda_n \end{pmatrix}$$

then we have the similarity transformation

D = T−1AT

so that the matrix whose columns are the eigenvectors of $A$ can diagonalize $A$ into a matrix with its eigenvalues on the diagonal. Notice that the above similarity transformation leaves the eigenvalues of $A$ fixed, but reorients the axes so that the eigenvectors become the standard basis vectors. For this reason, matrices like $T$ are often referred to as change-of-basis matrices. Note: if $A$ has fewer than $n$ linearly independent eigenvectors, this process will not work.

Example: Diagonalize the matrix

$$A = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}$$

From what we saw in previous examples, the eigenvalues of $A$ are given by $\lambda_1 = 1$, $\lambda_2 = -1$, with $\mathbf{v}_1 = \binom{1}{1}$, $\mathbf{v}_2 = \binom{1}{3}$, so in this case the similarity transformation is given by

$$\begin{pmatrix} 3/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
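This similarity transformation is easy to verify numerically (a NumPy sketch):

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [3.0, -2.0]])
T = np.array([[1.0, 1.0],    # columns are the eigenvectors (1,1) and (1,3)
              [1.0, 3.0]])

D = np.linalg.inv(T) @ A @ T
print(np.round(D, 12))       # [[ 1.  0.]
                             #  [ 0. -1.]]
```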

Let's use the process of diagonalization to aid us in solving IVPs with the matrix exponential. Considering the general system

x′ = Ax

under the assumption that $A$ is diagonalizable, let's consider a transformed version of the given system. Specifically, suppose $\mathbf{x} = T\mathbf{y}$,


where $T$ is the matrix of eigenvectors of $A$ (since $A$ has a full set of eigenvectors, we can always do this). In this case, we may rewrite the original system as

Ty′ = ATy

If we multiply both sides of this equation by T−1 we obtain

y′ = T−1ATy = Dy

where $D$ is the diagonal matrix of eigenvalues of $A$. Since $D$ is diagonal, we can easily compute

$$D^k = \begin{pmatrix} \lambda_1^k & 0 & \dots & 0 \\ 0 & \lambda_2^k & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & \lambda_n^k \end{pmatrix}$$

It follows from the definition of the matrix exponential that

$$e^{Dt} = \begin{pmatrix} e^{\lambda_1 t} & 0 & \dots & 0 \\ 0 & e^{\lambda_2 t} & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & e^{\lambda_n t} \end{pmatrix}$$

which gives the fundamental matrix for the ODE $\mathbf{y}' = D\mathbf{y}$. To get back the fundamental matrix for $\mathbf{x}$, we simply invert the transformation, multiplying by the matrix $T$, which gives

$$\Psi(t) = \begin{pmatrix} v_1^{(1)}e^{\lambda_1 t} & \dots & v_1^{(n)}e^{\lambda_n t} \\ \vdots & & \vdots \\ v_n^{(1)}e^{\lambda_1 t} & \dots & v_n^{(n)}e^{\lambda_n t} \end{pmatrix}$$

as a fundamental matrix for the original system. Let's demonstrate the process with the same example.

Example: If

$$A = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}$$

then

$$D = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}$$

so that a fundamental matrix is given by

$$\Psi(t) = \begin{pmatrix} e^t & e^{-t} \\ e^t & 3e^{-t} \end{pmatrix}$$


Compare this to the fundamental matrix found previously using alternative methods.

In general, if $D$ is a diagonal matrix with entries $d_1, \dots, d_n$ on the diagonal, then $e^{Dt}$ is easy to compute: it's another diagonal matrix, with diagonal entries $e^{d_1 t}, \dots, e^{d_n t}$. However, for a general matrix it is often much harder to compute the matrix exponential; by exploiting certain properties of the matrix $A$, though, we can sometimes do it.

Example: Compute the matrix exponential of

$$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

This matrix is not diagonal, so we can't use the preceding observation. However, $A^2 = I$. Using this fact, we can see that $A^n = A$ if $n$ is odd and $A^n = I$ if $n$ is even. Thus

$$e^{At} = \sum_{n=0}^{\infty} \frac{A^n t^n}{n!} = \begin{pmatrix} \sum_{n=0}^{\infty} \frac{t^{2n}}{(2n)!} & \sum_{n=0}^{\infty} \frac{t^{2n+1}}{(2n+1)!} \\[4pt] \sum_{n=0}^{\infty} \frac{t^{2n+1}}{(2n+1)!} & \sum_{n=0}^{\infty} \frac{t^{2n}}{(2n)!} \end{pmatrix} = \begin{pmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{pmatrix}$$
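We can check this closed form against SciPy's matrix exponential (a sketch):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
t = 1.3

closed_form = np.array([[np.cosh(t), np.sinh(t)],
                        [np.sinh(t), np.cosh(t)]])
print(np.allclose(expm(A * t), closed_form))  # True
```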

7.7 Suggested Problems: 4,6,10,15,16

§7.8: Repeated Eigenvalues

So far in our attempts to solve the equation $\mathbf{x}' = A\mathbf{x}$, we have dealt with both real and complex roots of the characteristic equation. However, in all the examples we have seen so far, the matrix $A$ has had a full set of $n$ linearly independent eigenvectors. In this section, we deal with the case that $A$ has repeated eigenvalues.

We discussed previously that even if $A$ has repeated eigenvalues, it may still have a full set of eigenvectors. In this case, the methods of §7.5 still apply (see, for instance, Example 4 of §7.5). However, if $A$ has an eigenvalue $\lambda$ of algebraic multiplicity $k$ and the geometric multiplicity of $\lambda$ is strictly less than $k$, we need another method. Let's


demonstrate with an example:

Example:

$$\mathbf{x}' = \begin{pmatrix} -3/2 & 1 \\ -1/4 & -1/2 \end{pmatrix}\mathbf{x}$$

Searching for eigenvalues of $A$:

$$\begin{vmatrix} -3/2-\lambda & 1 \\ -1/4 & -1/2-\lambda \end{vmatrix} = \lambda^2 + 2\lambda + 1$$

which has a double root at $\lambda = -1$. Looking for eigenvectors:

$$\begin{pmatrix} -1/2 & 1 \\ -1/4 & 1/2 \end{pmatrix}$$

which gives the lone equation $x_1 = 2x_2$, and thus just one eigenvector:

$\mathbf{v}_1 = \binom{2}{1}$. Thus we have found one solution

$$\mathbf{x}^{(1)} = \binom{2}{1}e^{-t}$$

but to obtain a fundamental set we need a second independent solution. One reasonable guess would be to follow the method of Chapter 3 and multiply our first solution by $t$. Let's see if that works: if

x(2) = v1te−t

where $\mathbf{v}_1$ is the eigenvector we just found, then plugging into the differential equation gives

$$\mathbf{v}_1 e^{-t} - \mathbf{v}_1 t e^{-t} = A\mathbf{v}_1 t e^{-t}$$

However, if we balance coefficients of $e^{-t}$, we see it must be the case that $\mathbf{v}_1 = \mathbf{0}$. This is bad. How can we fix it? We need something to balance the $\mathbf{v}_1 e^{-t}$ term on the other side of the equation: so we'll add a term of the form $\boldsymbol{\eta}e^{-t}$, where $\boldsymbol{\eta}$ is a vector to be determined. Thus the second solution we want to guess is in fact

x(2) = v1te−t + ηe−t

Suppose we plug this guess into the system. First,

$$\mathbf{x}^{(2)\prime} = \mathbf{v}_1 e^{-t} - \mathbf{v}_1 t e^{-t} - \boldsymbol{\eta} e^{-t}$$

In addition

$$A\mathbf{x}^{(2)} = A\mathbf{v}_1 t e^{-t} + A\boldsymbol{\eta} e^{-t} = A\boldsymbol{\eta} e^{-t} - \mathbf{v}_1 t e^{-t}$$

Now, in order to solve the equation, we must require that η satisfy

v1e−t − ηe−t = Aηe−t


or if we cancel out all the $e^{-t}$ factors

v1 − η = Aη

We can rewrite this as

$$(A + I)\boldsymbol{\eta} = \mathbf{v}_1$$

This is the equation we need to solve. Writing $A + I$ and $\mathbf{v}_1$ explicitly:

$$\begin{pmatrix} -1/2 & 1 \\ -1/4 & 1/2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$$

This gives the equation $x_2 - \frac{1}{2}x_1 = 2$, or $x_2 = 2 + \frac{1}{2}x_1$. Thus we can write

$$\boldsymbol{\eta} = \begin{pmatrix} x_1 \\ 2 + \frac{1}{2}x_1 \end{pmatrix} = x_1\begin{pmatrix} 1 \\ 1/2 \end{pmatrix} + \begin{pmatrix} 0 \\ 2 \end{pmatrix}$$

However, notice that the vector $\binom{1}{1/2}$ is just a multiple of $\mathbf{v}_1$, so when we plug it into our guess for $\mathbf{x}^{(2)}$, we'll just be getting terms we already found in $\mathbf{x}^{(1)}$. Thus we can drop it from $\boldsymbol{\eta}$ and instead our second independent solution is given by

$$\mathbf{x}^{(2)} = \binom{2}{1}te^{-t} + \binom{0}{2}e^{-t}$$

and our general solution is given by

$$\mathbf{x}(t) = c_1\binom{2}{1}e^{-t} + c_2\left[\binom{2}{1}te^{-t} + \binom{0}{2}e^{-t}\right]$$
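Numerically, a generalized eigenvector can be found with a least-squares solve, since $A - \lambda I$ is singular; the particular $\boldsymbol{\eta}$ returned may differ from the one above by a multiple of $\mathbf{v}$, which is harmless. A sketch:

```python
import numpy as np

A = np.array([[-1.5, 1.0],
              [-0.25, -0.5]])
lam = -1.0
v = np.array([2.0, 1.0])

# (A - lam*I) is singular, so use least squares to pick out one solution eta
eta, *_ = np.linalg.lstsq(A - lam * np.eye(2), v, rcond=None)

# Verify that x2(t) = (v*t + eta) e^{lam*t} solves x' = A x at some time t
t = 0.4
x2 = (v * t + eta) * np.exp(lam * t)
dx2 = (v + lam * (v * t + eta)) * np.exp(lam * t)   # product rule
print(np.allclose(dx2, A @ x2))  # True
```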

In terms of phase plane analysis, when there is a repeated eigenvalue the origin is called an improper node. Just as with regular nodes and spirals, the improper node can be stable or unstable, depending on the sign of $\lambda$. In the above case, it is stable since $\lambda < 0$. How do trajectories behave when a system exhibits an improper node? They still approach 0 if the node is stable and infinity if it is unstable. But there is only one eigenvector now, so it's harder to describe the manner in which they approach. In fact, they behave quite a bit like spirals, only they approach the axis of the eigenvector. Here's a phase portrait of the previous example with some sample trajectories:


Figure 29. Improper Node Phase Portrait

As with the other cases, we can (roughly) recreate this picture without the use of a computer. Once we know the eigenvector and eigenvalue, we know what vector solutions end up moving along, as well as whether they move towards or away from 0. In addition, the trajectories rotate towards the direction of the eigenvector, and all we must do is find the direction of rotation. To do this, we can just plug in a nice point and observe what happens. For instance, in the preceding

example, if we let $\mathbf{x} = \binom{0}{1}$, then $A\mathbf{x} = \binom{1}{-1/2}$, which is down and to the right. Thus in this case the solutions swirl in a clockwise motion.

The previous example demonstrates the more general process for dealing with matrices without a full set of eigenvectors. Specifically, if we have an eigenvalue $\lambda$ of algebraic multiplicity two, but $\lambda$ only has one eigenvector $\mathbf{v}$, then our first fundamental solution (as always) is given by $\mathbf{x}^{(1)} = \mathbf{v}e^{\lambda t}$. We can find a second independent solution by solving

(A− λI)η = v

(This is exactly what we did in the example). In this case, our second independent solution is given by

x(2) = vteλt + ηeλt


Oftentimes we say that $\boldsymbol{\eta}$ is a generalized eigenvector with eigenvalue $\lambda$ if it solves

(A− λI)η = v

This is because we have

(A− λI)2η = 0

so in a sense η is a “second-order” eigenvector.

Just as with equations with only simple eigenvalues, we can find a fundamental matrix for systems where $A$ has repeated eigenvalues. In the case of the previous example, we had

$$\mathbf{x}^{(1)} = \binom{2}{1}e^{-t}, \qquad \mathbf{x}^{(2)} = \binom{2}{1}te^{-t} + \binom{0}{2}e^{-t}$$

so the corresponding fundamental matrix is given by

$$\Psi(t) = \begin{pmatrix} 2e^{-t} & 2te^{-t} \\ e^{-t} & te^{-t} + 2e^{-t} \end{pmatrix} = e^{-t}\begin{pmatrix} 2 & 2t \\ 1 & t+2 \end{pmatrix}$$

In addition, we can compute the fundamental matrix $\Phi(t)$ satisfying $\Phi(0) = I$ (aka the matrix exponential) by computing

$$\Phi(t) = \Psi(t)\Psi^{-1}(0) = e^{-t}\begin{pmatrix} 2 & 2t \\ 1 & t+2 \end{pmatrix}\begin{pmatrix} 1/2 & 0 \\ -1/4 & 1/2 \end{pmatrix} = e^{-t}\begin{pmatrix} 1 - \frac{t}{2} & t \\ -\frac{t}{4} & 1 + \frac{t}{2} \end{pmatrix}$$

Finally, we return to the discussion of diagonalizing a matrix $A$. Recall that $A$ can only be diagonalized if it has a full set of linearly independent eigenvectors. However, in this section we deal with matrices where that is not the case. Ideally, we would like an analogous transformation that simplifies the matrix $A$ into an "almost-diagonal" form even if it isn't truly diagonalizable. This almost-diagonal form is known as Jordan normal form. A matrix is in Jordan normal form if it has all its eigenvalues on the diagonal (just as when we diagonalize a matrix) and no other nonzero entries, except for ones in certain positions on the superdiagonal (the diagonal just above the main diagonal). We can put a non-diagonalizable matrix in Jordan normal form through the use of generalized eigenvectors.

Example: Returning to the matrix

$$A = \begin{pmatrix} -3/2 & 1 \\ -1/4 & -1/2 \end{pmatrix}$$


we saw that A had the repeated eigenvalue λ = −1 with eigenvector

$\mathbf{v} = \binom{2}{1}$ and generalized eigenvector $\boldsymbol{\eta} = \binom{x_1}{(1/2)x_1 + 2}$. If we choose $x_1 = 0$, we can form the matrix of generalized eigenvectors (note: we choose $x_1 = 0$ so we get a fundamentally different vector not containing pieces of the original eigenvector)

$$T = \begin{pmatrix} 2 & 0 \\ 1 & 2 \end{pmatrix}$$

and we can check that

$$T^{-1}AT = \begin{pmatrix} 1/2 & 0 \\ -1/4 & 1/2 \end{pmatrix}\begin{pmatrix} -3/2 & 1 \\ -1/4 & -1/2 \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 1 & 2 \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix} =: J$$

gives the Jordan normal form of the matrix $A$. The one appears above the diagonal whenever there is an eigenvector missing.
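SymPy can compute Jordan normal forms directly, which gives a quick check (a sketch; the transformation matrix it returns may differ from our $T$, since generalized eigenvectors are not unique):

```python
from sympy import Matrix, Rational

A = Matrix([[Rational(-3, 2), 1],
            [Rational(-1, 4), Rational(-1, 2)]])

P, J = A.jordan_form()   # decomposition A = P * J * P**-1
print(J)                 # Matrix([[-1, 1], [0, -1]])
```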

Example: Suppose $A$ is $5 \times 5$ with three distinct eigenvalues $\lambda_1, \lambda_2, \lambda_3$, each having just one eigenvector, but $\lambda_1$ and $\lambda_2$ have algebraic multiplicity two. Then the Jordan normal form of $A$ would be given by

$$\begin{pmatrix} \lambda_1 & 1 & 0 & 0 & 0 \\ 0 & \lambda_1 & 0 & 0 & 0 \\ 0 & 0 & \lambda_2 & 1 & 0 \\ 0 & 0 & 0 & \lambda_2 & 0 \\ 0 & 0 & 0 & 0 & \lambda_3 \end{pmatrix}$$

Finally, we can use a similar process to that seen at the end of §7.7

to solve $\mathbf{x}' = A\mathbf{x}$ involving diagonalizing $A$, but now we may use the Jordan normal form if $A$ is not diagonalizable. Specifically, if we write $\mathbf{x} = T\mathbf{y}$ for $T$ our generalized eigenvector matrix, then the above system becomes

Ty′ = ATy

or multiplying by $T^{-1}$ on the left:

y′ = Jy

From here, we can compute $e^{Jt}$ (see the homework problem on computing the exponential of a $2 \times 2$ Jordan block) to find $\mathbf{y}$, then multiply by $T$ to find $\mathbf{x}$.

7.8 Suggested Problems: 1,5,10,15,17


§7.9: Nonhomogeneous Linear Systems

In this section, we show several different methods of solving nonhomogeneous linear systems of the form

x′ = P(t)x + g(t)

Just as in the case of second-order linear ODEs, we can find the general solution of the preceding system by combining the general solution of the corresponding homogeneous system $\mathbf{x}' = \mathbf{P}(t)\mathbf{x}$ with a particular solution of the nonhomogeneous system. In particular, if $\mathbf{x}^{(1)}, \dots, \mathbf{x}^{(n)}$ form a fundamental set of solutions of the homogeneous problem and $\mathbf{v}$ is a particular solution of the nonhomogeneous equation, then the general solution of the nonhomogeneous equation is given by

$$\mathbf{x}(t) = c_1\mathbf{x}^{(1)}(t) + \dots + c_n\mathbf{x}^{(n)}(t) + \mathbf{v}(t)$$

In this section, we see four different methods of solving such equations. They are

(1) Diagonalization

(2) Undetermined Coefficients

(3) Variation of Parameters

(4) Laplace Transform

We have already encountered most of these techniques in some capacity, and we will merely adjust them to solve systems of ODEs rather than a single equation. First, the most unfamiliar of the bunch:

Diagonalization: If we consider the case that the matrix $\mathbf{P}$ is constant-coefficient, we have the system

x′ = Ax + g(t)

If $A$ is diagonalizable, we may use a similarity transformation to decouple the system (much like we have in the preceding two sections). Let $T$ be the matrix of the eigenvectors of $A$. Since $T$ is invertible, we can write $\mathbf{x} = T\mathbf{y}$ and obtain the system

Ty′ = ATy + g(t)

Multiplying by T−1 on both sides gives

y′ = T−1ATy + T−1g(t) = Dy + h(t)

where $\mathbf{h}(t) = T^{-1}\mathbf{g}(t)$ and $D$ is the diagonal matrix of the eigenvalues of $A$. This is now an uncoupled system of equations: we can write it


in component form as

y′i(t) = λiyi + hi(t)

where $\lambda_i$ is the eigenvalue corresponding to the $i$th eigenvector of $A$. Each of these equations can be solved individually using the integrating factor method: letting $\mu(t) = e^{-\lambda_i t}$ gives the solution

$$y_i(t) = e^{\lambda_i t}\int_{t_0}^{t} e^{-\lambda_i s}h_i(s)\, ds + c_i e^{\lambda_i t}$$

for $c_i$ some constant. This gives the solution $\mathbf{y}$, and to obtain $\mathbf{x}$ we use the relation $\mathbf{x} = T\mathbf{y}$. In doing this, the constant terms in the above solution give the general solution of the homogeneous equation and the integral becomes a particular solution of the nonhomogeneous equation. Let's do an example using this method:

Example: Find the general solution of the equation

$$\mathbf{x}' = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\mathbf{x} + \binom{e^t}{t}$$

We already found the eigenvalues and eigenvectors of the matrix $A$ in

§7.7: they are $\lambda_1 = 1$, $\lambda_2 = -1$, $\mathbf{v}_1 = \binom{1}{1}$, $\mathbf{v}_2 = \binom{1}{3}$. This means

that the matrices of eigenvectors and eigenvalues are given by

$$T = \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}, \qquad D = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

Thus if $\mathbf{x} = T\mathbf{y}$, we obtain the following system for $\mathbf{y}$:

$$\mathbf{y}' = D\mathbf{y} + T^{-1}\mathbf{g}(t) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\mathbf{y} + \begin{pmatrix} 3/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix}\binom{e^t}{t}$$

or in component form

$$y_1' = y_1 + \frac{3}{2}e^t - \frac{1}{2}t$$

$$y_2' = -y_2 - \frac{1}{2}e^t + \frac{1}{2}t$$

If we solve each of these using the integrating factor method, we obtain

$$y_1 = e^t\int \left(\frac{3}{2} - \frac{1}{2}te^{-t}\right) dt = \frac{3}{2}te^t + \frac{1}{2}(t+1) + c_1 e^t$$

$$y_2 = e^{-t}\int \left(-\frac{1}{2}e^{2t} + \frac{1}{2}te^{t}\right) dt = -\frac{1}{4}e^t + \frac{1}{2}(t-1) + c_2 e^{-t}$$


and so we can write y in vector form as

$$\mathbf{y} = \begin{pmatrix} \frac{3}{2}te^t + \frac{1}{2}(t+1) + c_1 e^t \\ -\frac{1}{4}e^t + \frac{1}{2}(t-1) + c_2 e^{-t} \end{pmatrix}$$

To find $\mathbf{x}$, we just multiply $\mathbf{y}$ by $T$:

$$\mathbf{x} = \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} \frac{3}{2}te^t + \frac{1}{2}(t+1) + c_1 e^t \\ -\frac{1}{4}e^t + \frac{1}{2}(t-1) + c_2 e^{-t} \end{pmatrix} = \begin{pmatrix} \frac{3}{2}te^t + \frac{1}{2}(t+1) + c_1 e^t - \frac{1}{4}e^t + \frac{1}{2}(t-1) + c_2 e^{-t} \\ \frac{3}{2}te^t + \frac{1}{2}(t+1) + c_1 e^t - \frac{3}{4}e^t + \frac{3}{2}(t-1) + 3c_2 e^{-t} \end{pmatrix}$$

$$= c_1\binom{1}{1}e^t + c_2\binom{1}{3}e^{-t} + \binom{3/2}{3/2}te^t - \binom{1/4}{3/4}e^t + \binom{1}{2}t - \binom{0}{1}$$
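As a sanity check on the arithmetic, the particular part of this solution (set $c_1 = c_2 = 0$) can be verified numerically (a finite-difference sketch, not from the notes):

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [3.0, -2.0]])
g = lambda t: np.array([np.exp(t), t])

# The c1 = c2 = 0 part of the general solution found above
xp = lambda t: (np.array([1.5, 1.5]) * t * np.exp(t)
                - np.array([0.25, 0.75]) * np.exp(t)
                + np.array([1.0, 2.0]) * t
                - np.array([0.0, 1.0]))

# Check x' = A x + g by central finite differences
t, h = 0.6, 1e-6
dxp = (xp(t + h) - xp(t - h)) / (2.0 * h)
print(np.allclose(dxp, A @ xp(t) + g(t), atol=1e-5))  # True
```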

Undetermined Coefficients: Just as in Chapter 3, we may use the method of undetermined coefficients to obtain particular solutions to the system

x′ = Ax + g(t)

As before, this method only works if the nonhomogeneous term is composed of sines, cosines, polynomials, and exponentials. The method is almost identical to the one from Chapter 3, except for a couple of key differences. First, our undetermined coefficients are now vectors rather than numbers. Second, if part of the nonhomogeneous term solves the homogeneous problem, we need to change the guess we make in a slightly different way. Before, if the nonhomogeneous term was $e^{\lambda t}$ where $\lambda$ was a (simple) root of the characteristic equation, we would guess $te^{\lambda t}$. However, now the appropriate guess is $\mathbf{a}te^{\lambda t} + \mathbf{b}e^{\lambda t}$, where $\mathbf{a}$ and $\mathbf{b}$ are undetermined vectors. In other words, we don't simply multiply our guess by $t$; we must keep the lower-order term around too. The fact that we need to do this is closely related to the issues that arose in §7.8 when we tried to deal with repeated eigenvalues by multiplying our first fundamental solution by $t$. Let's demonstrate the method using the same example as before:

Example: Solve

$$\mathbf{x}' = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\mathbf{x} + \binom{e^t}{t}$$

using the method of undetermined coefficients. The nonhomogeneous term contains a first-degree polynomial, so part of our guess will be $\mathbf{a}t + \mathbf{b}$. To deal with the exponential term in $\mathbf{g}$, the starting guess


would be $\mathbf{c}e^t$; however, $\lambda = 1$ is an eigenvalue. Combining this fact with the previous discussion, we see that we must instead guess $\mathbf{c}te^t + \mathbf{d}e^t$. Putting it all together, our guess for a particular solution is given by

x = at+ b + ctet + det

Computing:

x′ = a + cet + ctet + det

and

$$A\mathbf{x} + \binom{e^t}{t} = A\mathbf{a}t + A\mathbf{b} + A\mathbf{c}te^t + A\mathbf{d}e^t + \binom{1}{0}e^t + \binom{0}{1}t$$

If we balance coefficients, we obtain the system:

$$\mathbf{0} = A\mathbf{a} + \binom{0}{1}, \qquad \mathbf{a} = A\mathbf{b}$$

$$\mathbf{c} + \mathbf{d} = A\mathbf{d} + \binom{1}{0}, \qquad \mathbf{c} = A\mathbf{c}$$

The final equation says that c is an eigenvector of A with eigenvalue 1,

which, using our knowledge of the eigenvectors of $A$, means $\mathbf{c} = \binom{k}{k}$.

The first equation can be written as

$$\begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix}$$

which gives $\mathbf{a} = \binom{1}{2}$. Then the second equation can be expressed as

$$\begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

which yields $\mathbf{b} = \binom{0}{-1}$. Given what we know about $\mathbf{c}$, the third equation can be rewritten as

$$(A - I)\mathbf{d} = \begin{pmatrix} 1 & -1 \\ 3 & -3 \end{pmatrix}\begin{pmatrix} d_1 \\ d_2 \end{pmatrix} = \begin{pmatrix} k-1 \\ k \end{pmatrix}$$

Notice that this tells us that $k = 3(k-1)$, or $k = 3/2$. This means $d_1 - d_2 = 1/2$, so

$$\mathbf{d} = \binom{d_2 + 1/2}{d_2} = d_2\binom{1}{1} + \binom{1/2}{0}$$


The first term of $\mathbf{d}$ just gives a solution of the homogeneous equation, so we can leave it out of our particular solution and choose $\mathbf{d} = \binom{1/2}{0}$.

Thus we have the particular solution

$$\mathbf{x} = \binom{1}{2}t - \binom{0}{1} + \binom{3/2}{3/2}te^t + \binom{1/2}{0}e^t$$

and combining this with the general solution of the homogeneous equation we obtain

$$\mathbf{x} = c_1\binom{1}{1}e^t + c_2\binom{1}{3}e^{-t} + \binom{1}{2}t - \binom{0}{1} + \binom{3/2}{3/2}te^t + \binom{1/2}{0}e^t$$

This is exactly the solution we got using the diagonalization technique, but there is one difference: the coefficient in front of $e^t$ is not the same. However, notice that the difference between the two is a multiple of the eigenvector $\binom{1}{1}$ present in the general solution of the homogeneous equation, so the two solutions are still the same: they will just have different values of $c_1$.

Variation of Parameters: Next, we examine how to use variation of parameters to solve more general systems of the form

x′ = P(t)x + g(t)

where $\mathbf{P}$ is not necessarily diagonalizable. As before, the method of variation of parameters is very general, but also can be difficult to apply explicitly. When introducing this method for second-order equations, we assumed we already had a fundamental set of solutions. In this case, we assume that a fundamental matrix of the corresponding homogeneous system $\mathbf{x}' = \mathbf{P}(t)\mathbf{x}$ has already been found, and we denote this matrix by $\Psi(t)$ as usual. Previously, we looked for a weighted combination of the fundamental solutions of the homogeneous problem as a particular solution of the nonhomogeneous equation. In this case, we search for a solution of the form

x = Ψ(t)u(t)

for $\mathbf{u}$ to be determined. If we differentiate this equation and plug into the system we obtain

$$\mathbf{x}' = \Psi'(t)\mathbf{u}(t) + \Psi(t)\mathbf{u}'(t) = \mathbf{P}(t)\Psi(t)\mathbf{u}(t) + \mathbf{g}(t)$$

However, $\Psi$ is a fundamental matrix, so $\Psi'(t) = \mathbf{P}(t)\Psi(t)$. Thus the above equation can be simplified to

Ψ(t)u′(t) = g(t)


Since Ψ is a fundamental matrix, it is invertible, so we can solve for u′:

u′(t) = Ψ−1(t)g(t)

and we can solve for u by integrating:

$$\mathbf{u}(t) = \int \Psi^{-1}(t)\mathbf{g}(t)\, dt + \mathbf{c}$$

where $\mathbf{c}$ is some constant vector. In this case, we can multiply by $\Psi$ to obtain the general solution

$$\mathbf{x} = \Psi(t)\mathbf{c} + \Psi(t)\int_{t_0}^{t} \Psi^{-1}(s)\mathbf{g}(s)\, ds$$

where the first term gives the general solution of the corresponding homogeneous equation (as we saw in §7.7) and the second term is the particular solution found using variation of parameters. Here, $t_0$ is some appropriate choice of lower bound for integration. Specifically, if we're given the IVP with initial condition $\mathbf{x}(t_0) = \mathbf{x}_0$, then we have the solution

$$\mathbf{x}(t) = \Psi(t)\Psi^{-1}(t_0)\mathbf{x}_0 + \Psi(t)\int_{t_0}^{t} \Psi^{-1}(s)\mathbf{g}(s)\, ds$$

If we choose the particular fundamental matrix $\Phi(t)$ satisfying $\Phi(t_0) = I$, then we can rewrite the solution as

$$\mathbf{x}(t) = \Phi(t)\mathbf{x}_0 + \Phi(t)\int_{t_0}^{t} \Phi^{-1}(s)\mathbf{g}(s)\, ds$$
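For a constant-coefficient system we can take $\Phi(t) = e^{At}$ and evaluate the particular-solution integral by numerical quadrature; here is a sketch, cross-checked against a direct numerical solve of the same IVP (the time interval is arbitrary):

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad_vec, solve_ivp

A = np.array([[2.0, -1.0],
              [3.0, -2.0]])
g = lambda s: np.array([np.exp(s), s])

# x_p(t) = Phi(t) * integral_0^t Phi(s)^{-1} g(s) ds with Phi(t) = e^{At},
# i.e. the particular solution satisfying x_p(0) = 0
def xp(t):
    integral, _ = quad_vec(lambda s: expm(-A * s) @ g(s), 0.0, t)
    return expm(A * t) @ integral

# Cross-check by integrating x' = Ax + g, x(0) = 0, directly
sol = solve_ivp(lambda t, x: A @ x + g(t), (0.0, 1.0), np.zeros(2),
                rtol=1e-10, atol=1e-12)
print(np.allclose(xp(1.0), sol.y[:, -1]))  # True
```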

Example: Let's find the general solution of the same system as before using variation of parameters:

$$\mathbf{x}' = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\mathbf{x} + \binom{e^t}{t}$$

In this case, a fundamental matrix is given by

$$\Psi(t) = \begin{pmatrix} e^t & e^{-t} \\ e^t & 3e^{-t} \end{pmatrix}$$

Computing:

$$\Psi^{-1}(t) = \begin{pmatrix} \frac{3}{2}e^{-t} & -\frac{1}{2}e^{-t} \\ -\frac{1}{2}e^{t} & \frac{1}{2}e^{t} \end{pmatrix}$$

$$\Psi^{-1}(t)\mathbf{g}(t) = \begin{pmatrix} \frac{3}{2}e^{-t} & -\frac{1}{2}e^{-t} \\ -\frac{1}{2}e^{t} & \frac{1}{2}e^{t} \end{pmatrix}\binom{e^t}{t} = \begin{pmatrix} \frac{3}{2} - \frac{1}{2}te^{-t} \\ -\frac{1}{2}e^{2t} + \frac{1}{2}te^{t} \end{pmatrix}$$


Thus

$$\int \Psi^{-1}(t)\mathbf{g}(t)\, dt = \begin{pmatrix} \frac{3}{2}t + \frac{1}{2}(t+1)e^{-t} \\ -\frac{1}{4}e^{2t} + \frac{1}{2}(t-1)e^{t} \end{pmatrix} + \mathbf{c}$$

and

$$\mathbf{x}(t) = \begin{pmatrix} e^t & e^{-t} \\ e^t & 3e^{-t} \end{pmatrix}\mathbf{c} + \begin{pmatrix} e^t & e^{-t} \\ e^t & 3e^{-t} \end{pmatrix}\begin{pmatrix} \frac{3}{2}t + \frac{1}{2}(t+1)e^{-t} \\ -\frac{1}{4}e^{2t} + \frac{1}{2}(t-1)e^{t} \end{pmatrix}$$

$$= c_1\binom{1}{1}e^t + c_2\binom{1}{3}e^{-t} + \begin{pmatrix} \frac{3}{2}te^t + \frac{1}{2}(t+1) - \frac{1}{4}e^t + \frac{1}{2}(t-1) \\ \frac{3}{2}te^t + \frac{1}{2}(t+1) - \frac{3}{4}e^t + \frac{3}{2}(t-1) \end{pmatrix}$$

$$= c_1\binom{1}{1}e^t + c_2\binom{1}{3}e^{-t} + \binom{3/2}{3/2}te^t - \binom{1/4}{3/4}e^t + \binom{1}{2}t - \binom{0}{1}$$

which agrees with the solution we found previously.

Laplace Transform: We may also use the Laplace transform to find solutions of nonhomogeneous systems. The extension of the Laplace transform to vector-valued functions merely involves taking the Laplace transform of each individual component, and forming a vector from the result. It follows that for vectors $\mathbf{x}$ we have the differentiation formula

L{x′(t)}(s) = sL{x(t)}(s)− x(0)

We can apply this formula to solve systems of ODEs using a similar process to that of Chapter 6.

Example: Let’s solve

$$\mathbf{x}' = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\mathbf{x} + \binom{e^t}{t}$$

one final time using the Laplace transform. Writing this equation as

x′ = Ax + g(t)

and taking the Laplace transform gives (with X(s) = L{x}(s))

sX(s)− x(0) = AX(s) + G(s)

where $G(s) = \mathcal{L}\{\mathbf{g}\}(s)$. Notice that we use the fact that $\mathcal{L}\{A\mathbf{x}\} = A\mathcal{L}\{\mathbf{x}\}$, since the Laplace transform is linear. Using that $\mathcal{L}\{e^t\} = \frac{1}{s-1}$ and $\mathcal{L}\{t\} = \frac{1}{s^2}$, we have

$$G(s) = \begin{pmatrix} \frac{1}{s-1} \\[2pt] \frac{1}{s^2} \end{pmatrix}$$


We can rewrite the above equation as

(sI−A)X(s) = G(s) + x(0)

Let's suppose $\mathbf{x}(0) = \mathbf{0}$ to simplify the calculation. Then we can solve for $X$:

X(s) = (sI−A)−1G(s)

In our case:

$$sI - A = \begin{pmatrix} s-2 & 1 \\ -3 & s+2 \end{pmatrix}, \qquad (sI - A)^{-1} = \begin{pmatrix} \frac{s+2}{s^2-1} & \frac{-1}{s^2-1} \\[2pt] \frac{3}{s^2-1} & \frac{s-2}{s^2-1} \end{pmatrix}$$

(noting that $\det(sI - A) = s^2 - 1$), and so

$$X(s) = \begin{pmatrix} \frac{s+2}{s^2-1} & \frac{-1}{s^2-1} \\[2pt] \frac{3}{s^2-1} & \frac{s-2}{s^2-1} \end{pmatrix}\begin{pmatrix} \frac{1}{s-1} \\[2pt] \frac{1}{s^2} \end{pmatrix} = \begin{pmatrix} \frac{s+2}{(s-1)(s^2-1)} - \frac{1}{s^2(s^2-1)} \\[2pt] \frac{3}{(s-1)(s^2-1)} + \frac{s-2}{s^2(s^2-1)} \end{pmatrix}$$

From this point, we can perform the standard partial fraction decomposition to invert the Laplace transform and find the solution $\mathbf{x}(t)$; however, the process is a very lengthy one. For this reason, the Laplace transform is a useful method for solving systems of ODEs when a computer is readily available for inverting the transform, but in most cases one of the previous three methods will be more feasible to compute by hand.
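For completeness, here is a sketch of how one might let SymPy do the inversion (the result typically carries Heaviside(t) factors, which are just 1 for t > 0):

```python
from sympy import symbols, Matrix, eye, inverse_laplace_transform, simplify

s, t = symbols('s t', positive=True)
A = Matrix([[2, -1],
            [3, -2]])
G = Matrix([1 / (s - 1), 1 / s**2])

X = (s * eye(2) - A).inv() * G   # X(s), assuming x(0) = 0
x = X.applyfunc(lambda entry: inverse_laplace_transform(entry, s, t))
print(simplify(x))
```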

7.9 Suggested Problems: 3,4,6,9,14