Chapter 3

Vector space stuff

3.1 Vector spaces

The key to understanding, and using, Fourier's claim is to take the perspective that functions "behave like vectors." In order to make this precise, let's first review the properties of vectors that you learned in the vector calculus course.

The key properties of vectors are that they can be added and scaled:

• If $v, w$ are vectors, then we can add them to obtain a new vector $v + w$.

• If $v$ is a vector and $\alpha$ is a number, then we can rescale $v$ by $\alpha$ to obtain a new vector $\alpha v$.

The two operations, adding and scaling, satisfy a long list of properties (see Exercise 3.1 below) that basically mean that adding and scaling are compatible in the way you think they should be. For example, $\alpha(v + w) = \alpha v + \alpha w$, etc. A collection of objects that can be added and scaled in a compatible way is called a vector space. The numbers that are used to rescale the objects in the vector space are called scalars. In this class we consider vector spaces with both real scalars and, in §3.4 below, complex scalars.

Example 3.1.1. The collection of all continuous functions with domain $[0, L]$ forms a vector space. To see this note that if $f$ and $g$ are continuous functions, then the function $f + g$, defined by $(f + g)(x) = f(x) + g(x)$, is a continuous function. Similarly, if $f$ is a continuous function and $\alpha$ is a real number, then $\alpha f$ is also a continuous function.

It is common to give vector spaces a mathematical symbol as a name. The vector space of vectors in $n$-dimensional space is given the symbol $\mathbb{R}^n$. The vector space of continuous functions with domain $[0, L]$ is given the symbol $C([0, L])$.

In the vector calculus course, you used the scaling property of vectors to parametrize lines in $\mathbb{R}^n$. For example, the line in $\mathbb{R}^2$ passing through $p = (1, 3)$ and parallel to $v = \langle 4, -2\rangle$ can be parametrized by the path

$$x(t) = p + tv = (1 + 4t,\ 3 - 2t). \tag{3.1.1}$$

If instead we wanted a path that oscillated about the point $p$ in the direction of $v$ with frequency of one oscillation per unit time, then the desired path is

$$x(t) = p + \sin(2\pi t)\, v = (1 + 4\sin(2\pi t),\ 3 - 2\sin(2\pi t)).$$


We can animate this oscillating path with the following Sage code.

c = animate([point((1 + 4*sin(2*pi*t), 3 - 2*sin(2*pi*t)))
             for t in srange(0, 1, 0.05)],
            xmin=-4, ymin=-2, xmax=6, ymax=6, figsize=[4, 3])
c.show(delay=25)

An analogous construction can be done in the vector space $C([0, L])$. Remember, though, that "points" and "vectors" in this setting are actually functions. Thus a "path" is really a function $u(t, x)$ that for each fixed $t$ is a vector (i.e., a function) in $C([0, L])$. Thus we can parametrize the line passing through the "point" $f(x) = e^x$ that is parallel to the "vector" $\psi(x) = \cos\left(\frac{3\pi}{L}x\right)$ with the path

$$u(t, x) = f(x) + t\,\psi(x) = e^x + t\cos\left(\frac{3\pi}{L}x\right).$$

If instead we wanted a path that oscillated about the function $f$ in the direction of $\psi$ with frequency of one oscillation per unit time, then the desired path is

$$u(t, x) = f(x) + \sin(2\pi t)\,\psi(x) = e^x + \sin(2\pi t)\cos\left(\frac{3\pi}{L}x\right).$$

We can animate this oscillating path with the following Sage code, where we have set $L = \pi$.

var('x')
u = animate([exp(x) + sin(2*pi*t)*cos(3*x) for t in srange(0, 1, 0.05)],
            xmin=0, ymin=-2, xmax=pi, ymax=6, figsize=[4, 3])
u.show(delay=25)

From this last example we see that we can interpret the standing wave solutions to the wave equation as oscillating paths in the vector space of functions. In particular, a standing wave of the form $u(t, x) = A(t)\,\psi(x)$ is interpreted as being an oscillating path whose "direction" is determined by $\psi(x)$.

We now introduce two important concepts in the study of vector spaces, "independence" and "basis." We motivate these concepts with the following example.

Example 3.1.2. Consider the vectors $u, v, w \in \mathbb{R}^3$ given by

$$u = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \qquad v = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \qquad w = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}.$$

We make two claims:

1. Any vector $x \in \mathbb{R}^3$ can be constructed by adding together multiples of $u, v, w$, and

2. there is only one combination of $u, v, w$ that will add up to form $x$.

To verify the first claim means to find scalars $\alpha, \beta, \gamma$ such that

$$x = \alpha u + \beta v + \gamma w.$$

This means that we want to solve the equation

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \alpha \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + \beta \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + \gamma \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}.$$


We can write this equation as a system of three equations in three unknowns $(\alpha, \beta, \gamma)$:

$$x_1 = \alpha + \beta + \gamma, \qquad x_2 = 2\gamma, \qquad x_3 = \alpha - \beta.$$

We can systematically solve this by adding/subtracting multiples of these equations to eliminate the variables on the right. The result is that we must have

$$\alpha = \frac{1}{2}x_1 - \frac{1}{4}x_2 + \frac{1}{2}x_3, \qquad \beta = \frac{1}{2}x_1 - \frac{1}{4}x_2 - \frac{1}{2}x_3, \qquad \gamma = \frac{1}{2}x_2.$$

(Here is some Sage code that does the computation using matrix algebra.) This computation shows that for any possible vector $x$ we can choose scalars $\alpha, \beta, \gamma$ in such a way that we can build $x$ from $u, v, w$. This addresses the first claim.
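For instance, the following minimal sketch (the variable names are ours) carries out the matrix-algebra computation in Sage, solving for $(\alpha, \beta, \gamma)$ symbolically:

var('x1, x2, x3')
A = matrix(QQ, [[1, 1, 1], [0, 0, 2], [1, -1, 0]])   # the columns of A are u, v, w
x = vector([x1, x2, x3])
show(A.inverse() * x)   # (1/2*x1 - 1/4*x2 + 1/2*x3, 1/2*x1 - 1/4*x2 - 1/2*x3, 1/2*x2)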

The second claim is also verified by the computation above. Note that we don't have any choice in what $\alpha, \beta, \gamma$ are. Thus for any given $x$, there is a unique way to construct it from $u, v, w$.

It is useful, however, to consider the second claim without the first claim. This is because there are circumstances where we have a list of vectors that can't necessarily be used to build any vector, but for which any construction that is possible is necessarily unique. So, suppose that there is a vector $x$ for which there are two possible combinations, one given by numbers $\alpha, \beta, \gamma$ and another given by $\hat\alpha, \hat\beta, \hat\gamma$:

$$x = \hat\alpha u + \hat\beta v + \hat\gamma w, \qquad x = \alpha u + \beta v + \gamma w.$$

Subtracting these two equations gives

$$0 = \underbrace{(\hat\alpha - \alpha)}_{\alpha}\, u + \underbrace{(\hat\beta - \beta)}_{\beta}\, v + \underbrace{(\hat\gamma - \gamma)}_{\gamma}\, w.$$

In other words, we have a combination of the vectors $u, v, w$, given by the numbers $\alpha, \beta, \gamma$, that adds up to zero. This means that $\alpha, \beta, \gamma$ must satisfy the system of equations

$$0 = \alpha + \beta + \gamma, \qquad 0 = 2\gamma, \qquad 0 = \alpha - \beta.$$

It is easy to deduce that the only solution to this system is $\alpha = 0$, $\beta = 0$, $\gamma = 0$, which means that the two combinations of $u, v, w$ are not, in fact, different.

Notice that in this alternate approach to the second claim, it suffices to consider which combinations of $u, v, w$ make the zero vector.

Example 3.1.2 motivates a number of definitions. We define a linear combination of the vectors $v_1, \ldots, v_k$ to be any sum of the form

$$\alpha_1 v_1 + \cdots + \alpha_k v_k,$$


where $\alpha_1, \ldots, \alpha_k$ are scalars. Thus in Example 3.1.2, we found that any vector $x \in \mathbb{R}^3$ can be expressed as a linear combination of $u, v, w$.

A collection of vectors $v_1, \ldots, v_k$ is called (linearly) independent if the only linear combination of those vectors that yields the zero vector is the combination where each of the scalars is zero. In Example 3.1.2, we found that the vectors $u, v, w$ are linearly independent.

Example 3.1.3. Consider the functions $p_0(x) = 1$, $p_1(x) = x$, $p_2(x) = x^2$ as elements of the vector space $C([0, 1])$. Linear combinations of $p_0, p_1, p_2$ are polynomials of degree at most 2. In this vector space, the zero vector is the function that is identically zero. It is easy to see that the only linear combination of $p_0, p_1, p_2$ that yields this zero function is

$$(0)p_0 + (0)p_1 + (0)p_2.$$

Thus $p_0, p_1, p_2$ are linearly independent.

We end this section by addressing a technical, but important, point. Notice that the definition of linear combination only deals with finite sums of vectors. This is important because with infinite sums of vectors we need to worry about convergence of the sum. For example, we might want to extend the list of polynomials in Example 3.1.3 to the infinite list

$$p_0(x) = 1, \quad p_1(x) = x, \quad \ldots, \quad p_k(x) = x^k, \quad \ldots$$

of functions in C([0, 1]). We also know that the exponential function exp(x) = ex is in C([0, 1]) and

that it has the Taylor series

exp(x) = 1 + x+1

2x2 +

1

3!x3 + . . .

= p0(x) + p1(x) +1

2p2(x) +

1

3!p3(x) + . . . .

It is tempting to view this series expansion of the exponential function as a sort of "infinite linear combination" of the functions $p_0, p_1, p_2, \ldots$. However, it is easy to construct infinite sums of these functions that do not converge to a continuous function. For example, the geometric series

$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots = p_0(x) + p_1(x) + p_2(x) + p_3(x) + \cdots$$

does not converge to a function in $C([0, 1])$. Consequently, we do not use the phrase "linear combination" when we are talking about an infinite sum. Instead, we use the word "series" when considering an infinite sum.

In the exercises you show that the various collections of functions are vector spaces. Before stating the exercises we introduce a bit of notation and terminology.

Functions that can be differentiated as many times as we want are called smooth. The collection of all smooth functions with domain $\Omega$ is given the symbol $C^\infty(\Omega)$. For example, the exponential function is in the collection $C^\infty(\mathbb{R})$, but the absolute value function is not in $C^\infty(\mathbb{R})$.

The absolute value function, however, only has "trouble" at one spot, namely at $x = 0$. Functions that are smooth except at a finite number of points are called piecewise smooth. The collection of piecewise smooth functions with domain $\Omega$ is given the symbol $C^\infty_{\mathrm{pw}}(\Omega)$. You might wonder what the derivative of the absolute value function is. At most places, of course, the derivative is simply the slope. But at $x = 0$ the derivative is not defined. For convenience, when there is a "jump" in a piecewise smooth function, we define the value at the jump to be zero.

Finally, we say that a piecewise smooth function $f$ with domain $\Omega$ is square integrable if

$$\int_\Omega |f|^2\, dV$$

is finite; here $dV$ represents the length/area/volume element that is appropriate for the region $\Omega$. For example, the absolute value function is square integrable on the domain $\Omega = [-1, 1]$, in which case $dV = dx$. The collection of piecewise smooth functions with domain $\Omega$ that are square integrable is given the symbol $L^2(\Omega)$.¹

Exercise 3.1. Look up the definition of vector space online. What are all the properties that adding and scaling must satisfy?

Exercise 3.2.

1. For each real number $p$ define the function $f_p(x) = x^p$ on the domain $(0, 1)$. For which $p$ is $f_p$ in $L^2((0, 1))$?

2. Use the fact that $(f - g)^2 \ge 0$ to show that $2fg \le f^2 + g^2$ for all $f, g$. Subsequently show that $L^2(\Omega)$ is a vector space.

Exercise 3.3.

1. Let $L^2_0(\mathbb{R})$ be the collection of all square integrable functions with domain $\mathbb{R}$ and satisfying the infinite-string boundary condition (2.3.5). Show that $L^2_0(\mathbb{R})$ is a vector space.

2. Let $L^2_P([-L, L])$ be the collection of all square integrable functions with domain $[-L, L]$ that satisfy the periodic boundary condition (2.3.6). Explain why $L^2_P([-L, L])$ is a vector space.

3. Let $L^2_D([0, L])$ be the collection of all square integrable functions with domain $[0, L]$ that satisfy the Dirichlet boundary condition (2.3.7). Explain why $L^2_D([0, L])$ is a vector space.

4. Let $L^2_N([0, L])$ be the collection of all square integrable functions with domain $[0, L]$ that satisfy the Neumann boundary condition (2.3.8). Explain why $L^2_N([0, L])$ is a vector space.

Exercise 3.4. Show that the collection of solutions $u(t)$ to the differential equation

$$\frac{d^2u}{dt^2} + 14u = 0$$

forms a vector space. How many linearly independent solutions can you find?

Exercise 3.5. We say that a vector space is finite dimensional if there is a maximum number of vectors in any collection of linearly independent vectors. If there is no maximum, then we say that the vector space is infinite dimensional.

1. Explain why $\mathbb{R}^2$ is finite dimensional.

¹The spaces $L^2(\Omega)$ defined here can be extended to the Lebesgue space $L^2(\Omega)$. However, this requires the development of the Lebesgue theory of integration, which is beyond the scope of this course.


2. Explain why $C([0, 1])$ is infinite dimensional.

3. Is the collection of solutions to the ODE in Exercise 3.4 finite dimensional or infinite dimensional?


3.2 Inner product spaces

In the vector calculus course, lengths and angles defined by vectors in $\mathbb{R}^n$ were determined using the dot product. In this section we generalize the notion of a dot product to other vector spaces. Before we do, let's recall the basic properties of the dot product for vectors in $\mathbb{R}^n$:

• Symmetry property. We have $v \cdot w = w \cdot v$ for all vectors $v, w$.

• Positivity property. We have $v \cdot v \ge 0$ for all vectors $v$.

• Definite property. We have $v \cdot v = 0$ if and only if $v = 0$.

These last two properties are often combined into a single "positive definite property." It is convenient to think of the dot product as a function that takes in two vectors and returns a scalar. Using this perspective, we define an inner product on a real vector space $V$ to be a function that takes in two vectors $u, v$ and returns a number, which we give the symbol $\langle u, v\rangle$, having the following properties:

• Linearity property. We have $\langle \alpha_1 v_1 + \alpha_2 v_2, w\rangle = \alpha_1\langle v_1, w\rangle + \alpha_2\langle v_2, w\rangle$ for all scalars $\alpha_1, \alpha_2$ and for all vectors $v_1, v_2, w$ in $V$.

• Symmetry property. We have $\langle v, w\rangle = \langle w, v\rangle$ for all $v, w$ in $V$.

• Positivity property. We have $\langle v, v\rangle \ge 0$ for all $v$ in $V$.

• Definite property. We have $\langle v, v\rangle = 0$ if and only if $v = 0$.

A (real) vector space together with an inner product is called an inner product space. The following example is very important.

Example 3.2.1. For the vector space $L^2(\Omega)$ we define the $L^2$ inner product by

$$\langle u, v\rangle = \int_\Omega u\, v\, dV. \tag{3.2.1}$$

It is easy to see that the properties of integration imply the linearity and symmetry properties. The positivity property follows from the fact that $u^2 \ge 0$ for any function $u$. The definite property follows from considering each smooth piece of the function $u$ independently, and using our convention that functions in $L^2(\Omega)$ are zero along "jumps".

The inner product (3.2.1) is referred to in the rest of these notes as the standard inner product.

For vectors in $\mathbb{R}^n$ we define the norm (or magnitude) of the vector using the dot product. We can similarly define the norm of a vector $u$, which we denote by $\|u\|$, by

$$\|u\| = \langle u, u\rangle^{1/2}.$$

Example 3.2.2. Consider the functions $u(x) = 1 - x^2$ and $v(x) = \cos(\pi x)$ in $L^2([0, 1])$. We have

$$\|u\| = \langle u, u\rangle^{1/2} = \left(\int_0^1 (1 - x^2)(1 - x^2)\, dx\right)^{1/2} = \sqrt{\frac{8}{15}}$$

and

$$\|v\| = \left(\int_0^1 \cos^2(\pi x)\, dx\right)^{1/2} = \frac{1}{\sqrt{2}}.$$
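As a quick sanity check, these two norms can be reproduced in Sage (a sketch; the variable names are ours):

var('x')
u = 1 - x^2
v = cos(pi*x)
show(sqrt(integral(u^2, (x, 0, 1))))   # sqrt(8/15)
show(sqrt(integral(v^2, (x, 0, 1))))   # sqrt(1/2)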

We can also use the inner product to define the angle between two vectors. If both $u$ and $v$ are nonzero vectors in an inner product space, then the angle between them is defined to be the number $\theta$ such that

$$\cos\theta = \frac{\langle u, v\rangle}{\|u\|\,\|v\|}.$$

Example 3.2.3. The angle between the functions $u, v$ in Example 3.2.2 must satisfy

$$\cos\theta = \frac{\langle u, v\rangle}{\|u\|\,\|v\|} = \frac{\sqrt{15}}{2}\int_0^1 (1 - x^2)\cos(\pi x)\, dx = \frac{\sqrt{15}}{\pi^2}.$$

Thus $\theta \approx 1.17$ radians.

Two non-zero vectors are said to be orthogonal if the angle between them is $\pi/2$. In other words, two (non-zero) vectors $u, v$ are orthogonal if $\langle u, v\rangle = 0$.

Example 3.2.4. Consider the functions $u, v$ from Example 3.2.2 as well as the function $w(x) = x - 3/8$. Direct computation shows that $u$ and $w$ are orthogonal, but $u$ and $v$ are not.
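Both the orthogonality claim and the angle computation are easy to verify in Sage (a sketch; the helper name `ip` is ours, standing for the standard inner product on $L^2([0,1])$):

var('x')
u = 1 - x^2; v = cos(pi*x); w = x - 3/8
ip = lambda f, g: integral(f*g, (x, 0, 1))
print(ip(u, w))                                        # 0, so u and w are orthogonal
print(arccos(ip(u, v) / sqrt(ip(u, u)*ip(v, v))).n())  # approximately 1.17 radians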

The standard inner product (3.2.1) is not the only possible inner product one can define for vector spaces of functions. Consider functions defined on the domain $\Omega$. If $w$ is a function on $\Omega$ such that $w(x) \ge 0$, with the value zero (if present) only permitted along the boundary of $\Omega$, then

$$\langle u, v\rangle_w = \int_\Omega u\, v\, w\, dV$$

is an inner product for functions with domain $\Omega$. We refer to $\langle\cdot,\cdot\rangle_w$ as a weighted inner product and refer to $w$ as the weight function. The corresponding norm is denoted $\|\cdot\|_w$. The collection of piecewise smooth functions $u$ with domain $\Omega$ such that $\|u\|_w$ is finite is given the symbol $L^2_w(\Omega)$, and is referred to as a weighted $L^2$ space.

Example 3.2.5 (Hermite inner product). Consider the domain $\Omega = \mathbb{R}$ and the weight function $w_H(x) = e^{-x^2}$. We call the resulting inner product the Hermite inner product² and give it the symbol $\langle\cdot,\cdot\rangle_H$. In particular, we have

$$\langle u, v\rangle_H = \int_{-\infty}^{\infty} u(x)\, v(x)\, e^{-x^2}\, dx.$$

Note that the function $u(x) = 1$ has finite norm with respect to the Hermite inner product. In fact,

$$\|u\|_H = \left(\int_{-\infty}^{\infty} e^{-x^2}\, dx\right)^{1/2} = \pi^{1/4}.$$

²The inner product is named after the mathematician Charles Hermite.


The function $v(x) = x$ also has finite norm. Furthermore, $\langle u, v\rangle_H = 0$, meaning that these two functions are orthogonal. (The easiest way to see this is that $u$ is even, $v$ is odd, and the weight function is even. Since we are integrating an odd function over a symmetric domain, the integral is zero.)
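Both facts can be checked in Sage (a sketch; `wH` is our name for the weight function):

var('x')
wH = exp(-x^2)                              # the Hermite weight function
show(sqrt(integral(1*wH, (x, -oo, oo))))    # pi^(1/4), the norm of u(x) = 1
show(integral(x*wH, (x, -oo, oo)))          # 0, so the functions 1 and x are orthogonal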

Exercise 3.6. Let $\Omega$ be the unit disk in $\mathbb{R}^2$. The functions $u(x, y) = x$, $v(x, y) = x^2 + y^2$, $w(x, y) = e^{x^2 + y^2}$ are elements of $L^2(\Omega)$.

1. Compute the norms of $u$, $v$, and $w$.

2. Compute $\langle u, v\rangle$.

3. Compute $\langle v, w\rangle$.

Exercise 3.7.

1. Let $A, B$ be two numbers. Use algebra to show that $(A - B)^2 \ge 0$ implies that $AB \le \frac{1}{2}(A^2 + B^2)$.

2. Suppose $v, w$ are elements of an inner product space. Show that

$$|\langle v, w\rangle| \le \frac{1}{2}\left(\|v\|^2 + \|w\|^2\right).$$

3. Suppose $v, w$ are elements of an inner product space. Show that

$$\left|\left\langle \frac{v}{\|v\|}, \frac{w}{\|w\|}\right\rangle\right| \le 1.$$

Conclude that $|\langle v, w\rangle| \le \|v\|\,\|w\|$. This important fact is known as the Cauchy–Bunyakovsky–Schwarz inequality.


Exercise 3.8. We define $l^2(\mathbb{N})$ to be the set of all sequences $A = \{a_n\}_{n \in \mathbb{N}} = \{a_1, a_2, a_3, \ldots\}$ such that

$$\|A\|^2 = \sum_{n=1}^{\infty} |a_n|^2$$

converges.

1. Show that $l^2(\mathbb{N})$ is a vector space.

2. What inner product makes $l^2(\mathbb{N})$ an inner product space?

3. How should one define $l^2(\mathbb{Z})$?

Exercise 3.9.

1. Show that the eigenfunctions $\psi_k$ defined by (2.3.13) are orthogonal to one another with respect to the standard inner product in $L^2([0, L])$.

2. Compute $\|\psi_k\|$. Does the answer depend on $k$?

Exercise 3.10. In this exercise we work in $L^2_H(\mathbb{R})$, defined using the Hermite inner product from Example 3.2.5. We define the following functions:

$$H_0(x) = 1, \quad H_1(x) = 2x, \quad H_2(x) = 4x^2 - 2, \quad H_3(x) = 8x^3 - 12x.$$

1. Show that each of the functions $H_k$ are orthogonal to one another.

2. Compute the norm of each of the functions $H_k$.


3.3 The best approximation problem

A list of vectors in an inner product space is called an orthogonal collection if each vector is orthogonal to all of the other vectors. For example:

• The vectors

$$i = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad j = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad k = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$$

form an orthogonal collection in $\mathbb{R}^3$.

• In Exercise 3.9 you showed that the Dirichlet eigenfunctions (2.3.13) are orthogonal in $L^2([0, L])$.

We now introduce the best approximation problem:

Suppose $v_1, \ldots, v_n$ is an orthogonal collection of vectors in an inner product space, and let $u$ be some other vector. What linear combination of the vectors $v_1, \ldots, v_n$ best approximates $u$?

More precisely, how can we choose scalars $\alpha_1, \ldots, \alpha_n$ so that the vector

$$\hat u = \alpha_1 v_1 + \cdots + \alpha_n v_n$$

best approximates $u$ in the sense that $\|u - \hat u\|$ is as small as possible?

To understand this problem a bit better, let's look at two simple examples in $\mathbb{R}^3$.

Example 3.3.1. Consider the orthogonal collection that contains only one vector

$$v = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

Let $u$ be the vector

$$u = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}.$$

In this setting, the best approximation problem asks us to find which vector $\alpha v$ is closest to $u$.

We can easily visualize what is happening here. Vectors of the form $\alpha v$ all lie in the line passing through $(0, 0, 0)$ and with direction given by $v$. As we move along this line we can compute the distance to the point $(1, 2, 3)$. The best approximation problem asks us to find the point with the smallest distance.

We can solve this using methods from first-semester calculus. Define the function $f$ by

$$f(\alpha) = \|\alpha v - u\|^2.$$

If we have a minimizer for $f$ then we have a value of $\alpha$ where the square of the norm is smallest, and hence where the norm is smallest. We compute

$$f(\alpha) = \begin{pmatrix} \alpha - 1 \\ \alpha - 2 \\ \alpha - 3 \end{pmatrix} \cdot \begin{pmatrix} \alpha - 1 \\ \alpha - 2 \\ \alpha - 3 \end{pmatrix} = (\alpha - 1)^2 + (\alpha - 2)^2 + (\alpha - 3)^2.$$


It is easy to compute $f'(\alpha)$ and then solve $0 = f'(\alpha)$ for $\alpha$. The result is $\alpha = 2$. Thus the vector of the form $\alpha v$ that best approximates $u$ is the vector

$$\hat u = 2v = \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix}.$$

Example 3.3.2. Consider now the orthogonal collection

$$v_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}.$$

(You should quickly check that this is an orthogonal collection!) Again, let $u$ be the vector

$$u = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}.$$

We would like to find scalars $\alpha_1, \alpha_2$ such that

$$\hat u = \alpha_1 v_1 + \alpha_2 v_2$$

is such that $\|u - \hat u\|$ is as small as possible. Motivated by the previous example, we define a function $f$ by

$$f(\alpha_1, \alpha_2) = \|\alpha_1 v_1 + \alpha_2 v_2 - u\|^2.$$

We can now use the optimization methods from vector calculus to find the values of $\alpha_1, \alpha_2$ that minimize $f$. To do this we compute

$$f(\alpha_1, \alpha_2) = (\alpha_1 + \alpha_2 - 1)^2 + (\alpha_1 - \alpha_2 - 2)^2 + (\alpha_1 - 3)^2.$$

Thus

$$\operatorname{grad} f = \begin{pmatrix} 6\alpha_1 - 12 \\ 4\alpha_2 + 2 \end{pmatrix}.$$

Setting $\operatorname{grad} f = 0$, we find that the optimal values are

$$\alpha_1 = 2, \qquad \alpha_2 = -\frac{1}{2}.$$

Consequently the linear combination of $v_1, v_2$ that best approximates $u$ is

$$\hat u = 2v_1 - \frac{1}{2}v_2 = \begin{pmatrix} 3/2 \\ 5/2 \\ 2 \end{pmatrix}.$$
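As a check, the same optimization can be carried out symbolically in Sage (a sketch; the names a1, a2 stand for $\alpha_1, \alpha_2$):

var('a1, a2')
f = (a1 + a2 - 1)^2 + (a1 - a2 - 2)^2 + (a1 - 3)^2
print(solve([diff(f, a1) == 0, diff(f, a2) == 0], a1, a2))
# [[a1 == 2, a2 == (-1/2)]]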

The two examples above show us how to address the best approximation problem in general. Suppose we have orthogonal vectors $v_1, \ldots, v_n$ and also a vector $u$ that we want to approximate. We define a function $f$ by

$$f(\alpha_1, \ldots, \alpha_k, \ldots, \alpha_n) = \|\alpha_1 v_1 + \cdots + \alpha_k v_k + \cdots + \alpha_n v_n - u\|^2.$$


Since the vectors $v_k$ are orthogonal to one another, we have

$$f(\alpha_1, \ldots, \alpha_k, \ldots, \alpha_n) = \|u\|^2 + \alpha_1^2\|v_1\|^2 + \cdots + \alpha_k^2\|v_k\|^2 + \cdots + \alpha_n^2\|v_n\|^2 - 2\alpha_1\langle v_1, u\rangle - \cdots - 2\alpha_k\langle v_k, u\rangle - \cdots - 2\alpha_n\langle v_n, u\rangle.$$

Thus the gradient of $f$ is

$$\operatorname{grad} f = \begin{pmatrix} 2\alpha_1\|v_1\|^2 - 2\langle v_1, u\rangle \\ \vdots \\ 2\alpha_k\|v_k\|^2 - 2\langle v_k, u\rangle \\ \vdots \\ 2\alpha_n\|v_n\|^2 - 2\langle v_n, u\rangle \end{pmatrix}. \tag{3.3.1}$$

Consequently it is easy to see that in order to minimize $f$, and hence in order to minimize $\|u - \hat u\|$, we should choose

$$\alpha_1 = \frac{\langle v_1, u\rangle}{\|v_1\|^2}, \quad \ldots, \quad \alpha_k = \frac{\langle v_k, u\rangle}{\|v_k\|^2}, \quad \ldots, \quad \alpha_n = \frac{\langle v_n, u\rangle}{\|v_n\|^2}.$$

In other words, the linear combination of the $v_1, \ldots, v_n$ that best approximates the vector $u$ is

$$\hat u = \frac{\langle v_1, u\rangle}{\|v_1\|^2}v_1 + \cdots + \frac{\langle v_k, u\rangle}{\|v_k\|^2}v_k + \cdots + \frac{\langle v_n, u\rangle}{\|v_n\|^2}v_n.$$

One should interpret this formula in the following way. The vectors $v_1, \ldots, v_n$ describe $n$ different orthogonal directions inside the vector space. The quantity

$$\frac{\langle v_k, u\rangle}{\|v_k\|^2}\, v_k$$

represents that part of the vector $u$ that lies in direction $v_k$. (The technical mathematics term is "the projection of $u$ onto the line spanned by $v_k$.") Thus the best approximation is accomplished by finding the portion of $u$ lying in the directions $v_1, \ldots, v_n$ and then summing.
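This recipe is short enough to express directly in code. The following Sage sketch (the names `best_approx` and `ip` are ours) projects $u$ onto each orthogonal direction and sums, checked here against Example 3.3.2 with the dot product on $\mathbb{R}^3$:

def best_approx(u, vs, ip):
    # Sum the projections of u onto each of the mutually orthogonal vectors in vs.
    result = 0 * vs[0]                        # the zero vector of the right size
    for v in vs:
        result += (ip(v, u) / ip(v, v)) * v   # the part of u in the direction v
    return result

u = vector([1, 2, 3])
vs = [vector([1, 1, 1]), vector([1, -1, 0])]
print(best_approx(u, vs, lambda a, b: a.dot_product(b)))   # (3/2, 5/2, 2)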

Of course, there may be portions of $u$ that are orthogonal to all of the vectors $v_1, \ldots, v_n$. Therefore the vector $\hat u$ might not be equal to $u$. The difference $u - \hat u$ is precisely that part of $u$ that is orthogonal to all of the vectors $v_1, \ldots, v_n$.

The resolution of the best approximation problem has a very important consequence: it tells us how to choose the constants $\alpha_k$ in (2.4.3). In Exercise 3.9 you showed that the Dirichlet eigenfunctions $\psi_k$ are orthogonal, and have norm $\|\psi_k\|^2 = \frac{\pi}{2}$. If we want the finite sum

$$f_n(x) = \sum_{k=1}^{n} \alpha_k\,\psi_k(x) \tag{3.3.2}$$

to best approximate the function $f$, then we should choose

$$\alpha_k = \frac{\langle \psi_k, f\rangle}{\|\psi_k\|^2}. \tag{3.3.3}$$

We call these numbers the Fourier coefficients of the function $f$.
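For concreteness, here is a small Sage sketch of formula (3.3.3) with $L = \pi$, using the fact noted below that the Dirichlet eigenfunctions are the sine functions $\psi_k(x) = \sin(kx)$; the case $f(x) = 1$ reproduces Example 3.3.4:

var('x, k')
assume(k, 'integer')
f(x) = 1
alpha_k = integrate(sin(k*x)*f(x), (x, 0, pi)) / (pi/2)   # divide by the norm squared
show(alpha_k.simplify_full())   # should reduce to 2*(1 - (-1)^k)/(pi*k)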


It is important to note that we have not shown that choosing $\alpha_k$ according to (3.3.3) necessarily implies that $f_n \to f$ as $n \to \infty$, as the Fourier hypothesis conjectures, but if there is to be any hope of convergence, then this is how the constants must be chosen. The convergence of $f_n$ to $f$ is discussed below.

With this information in hand, let’s revisit Fourier’s example from §II.

Example 3.3.3. The function that we want to approximate is the function $\varphi$ given in (2.4.4). Here we take $L = \pi$. A lengthy, but straightforward, computation shows that

$$\langle \varphi, \psi_k\rangle = \frac{1 - (-1)^k}{k^2}\sin(k\alpha).$$

Recall from Exercise 3.9 that $\|\psi_k\|^2 = \frac{\pi}{2}$. Thus we choose

$$\alpha_k = \begin{cases} \dfrac{4}{k^2\pi}\sin(k\alpha) & \text{if $k$ is odd,} \\[1ex] 0 & \text{if $k$ is even,} \end{cases}$$

which are precisely the values of $\alpha_k$ that yield (2.4.5).

Here is another simple example.

Example 3.3.4. Consider the function $f(x) = 1$ on the domain $[0, \pi]$. We compute

$$\langle \psi_k, f\rangle = \frac{1 - (-1)^k}{k}$$

and thus the Fourier coefficients are

$$\alpha_k = \begin{cases} \dfrac{4}{k\pi} & \text{if $k$ is odd,} \\[1ex] 0 & \text{if $k$ is even.} \end{cases}$$

Since we only need to consider odd values of $k$, set $k = 2l + 1$, where $l = 0, 1, 2, 3, \ldots$. Thus the function $f$ is best approximated by

$$f_n(x) = \sum_{l=0}^{n} \frac{4}{\pi(2l+1)}\sin\left((2l+1)x\right).$$

The Sage code

var('x,l,n')
n = 10
f(x) = sum((4/(pi*(2*l+1)))*sin((2*l+1)*x), l, 0, n)
plot(f, (x, 0, pi), figsize=[4, 2])

yields the following plot of this approximation when n = 10:


This does a reasonable job of approximating the constant function f(x) = 1. Play around with thecode and see what happens when n is large. Does the approximation improve?

Our resolution to the best approximation problem gave us a very simple way to approximate a function $f$ by the Dirichlet eigenfunctions: simply compute the Fourier coefficients $\alpha_k$ and construct (3.3.2). The limit as $n \to \infty$ of (3.3.2) is called the Fourier sine series (because the Dirichlet eigenfunctions $\psi_k$ are sine functions). To indicate that the Fourier sine series does the best job possible at approximating the function $f$, we write

$$f \sim \sum_{k=1}^{\infty} \alpha_k\,\psi_k.$$

The examples above, and the exercises below, provide evidence that the Fourier sine series does converge to the function $f$, although Example 3.3.4 leads us to suspect that something funny is happening at the endpoints. Details about the convergence are addressed in §?? below.

The best approximation method here works any time we have an inner product space and a collection of orthogonal vectors in that space. The Dirichlet eigenfunctions $\psi_k$ are only one possibility. This raises some questions. Were we lucky that the Dirichlet eigenfunctions were orthogonal? Or is there some "reason" that we ended up with an orthogonal collection? Is there a systematic way to construct orthogonal collections of functions? One approach to generating orthogonal collections is addressed in Exercise 3.12. A more general theory appears in Chapter ?? below.

Exercise 3.11.

1. Find Fourier coefficients for the "square wave" function

$$f(x) = \begin{cases} 1 & \text{if } 0 \le x < \frac{\pi}{2}, \\ -1 & \text{if } \frac{\pi}{2} < x \le \pi. \end{cases}$$

(According to our convention, what should the value of $f$ at the jump be in order to have $f$ in $L^2([0, \pi])$? Does this value affect the Fourier coefficients?) Use Sage to generate a plot of the resulting approximation.


2. Find Fourier coefficients for the function

$$f(x) = \begin{cases} x & \text{if } 0 \le x < \frac{\pi}{2}, \\ 0 & \text{if } x = \frac{\pi}{2}, \\ x - \pi & \text{if } \frac{\pi}{2} < x \le \pi. \end{cases}$$

Use Sage to generate a plot of the resulting approximation.

Exercise 3.12. In this exercise you explore a method to generate an orthogonal collection of polynomials in $L^2([-1, 1])$. The polynomials $P_0, P_1, P_2, P_3, \ldots$ are recursively constructed according to the following conditions:

• $P_k$ is a polynomial of degree $k$,

• $P_k$ is orthogonal to $P_0, \ldots, P_{k-1}$,

• $P_k(1) = 1$.

1. Explain why we must have $P_0(x) = 1$. Compute $\|P_0\|^2$.

2. Now suppose that $P_1(x) = a_0 + a_1 x$. Show that the condition $\langle P_1, P_0\rangle = 0$ implies that $a_0 = 0$. Conclude that $P_1(x) = x$. Compute $\|P_1\|^2$.

3. Suppose that $P_2 = a_0 + a_1 x + a_2 x^2$. Show that the conditions

$$\langle P_2, P_0\rangle = 0, \qquad \langle P_2, P_1\rangle = 0, \qquad P_2(1) = 1$$

imply that $P_2(x) = \frac{3}{2}x^2 - \frac{1}{2}$. Compute $\|P_2\|^2$.

4. Find $P_3(x)$. Show that $\|P_3\|^2 = \frac{2}{7}$.

Exercise 3.13 (Requires Exercise 3.12). Let $f(x) = e^x$ and let $P_k$ be the polynomials from Exercise 3.12. Find the constants $\alpha_0, \ldots, \alpha_3$ such that the polynomial

$$\hat f(x) = \alpha_0 P_0(x) + \alpha_1 P_1(x) + \alpha_2 P_2(x) + \alpha_3 P_3(x)$$

best approximates $f$ in $L^2([-1, 1])$.

Exercise 3.14 (Requires Exercise 3.10). In this exercise we work in $L^2_H(\mathbb{R})$, defined using the Hermite inner product (see Exercise 3.10). Let $f(x) = e^x$.

1. Show that $f$ is an element of $L^2_H(\mathbb{R})$.

2. Find the constants $\alpha_0, \ldots, \alpha_3$ such that the polynomial

$$\hat f(x) = \alpha_0 H_0(x) + \alpha_1 H_1(x) + \alpha_2 H_2(x) + \alpha_3 H_3(x)$$

best approximates $f$ in $L^2_H(\mathbb{R})$.

You might find the following Sage code helpful:

var('x')
a = integral((2*x)*exp(x)*exp(-x^2), (x, -Infinity, Infinity))
show(a)


3.4 Complex inner product spaces

We now introduce inner product spaces involving complex numbers. Recall that complex numbers take the form $a + ib$, where both $a$ and $b$ are real numbers and where $i^2 = -1$. We now introduce some vocabulary regarding complex numbers. The real part of the complex number $a + ib$ is the real number $a$; the imaginary part is the real number $b$.

The complex conjugate of a complex number is the complex number formed by everywhere replacing $i$ by $-i$. The complex conjugate of a complex number is denoted with an overbar. Thus the conjugate of $z = a + ib$ is $\bar z = a - ib$.

The modulus of a complex number $z$ is the non-negative real number $(\bar z z)^{1/2}$ and is given the same symbol as absolute value. Thus the modulus of $z = a + ib$ is $|z| = \sqrt{a^2 + b^2}$. The modulus represents the extent to which a complex number is zero: we have $|z| = 0$ exactly when $z = 0$.

An important identity of complex numbers is Euler's formula:

$$e^{i\theta} = \cos\theta + i\sin\theta.$$

Thus the real part of $e^{i\theta}$ is $\cos\theta$ and the imaginary part is $\sin\theta$. Note that for any complex number $z$ we have

$$z = |z|e^{i\theta}$$

for some number $\theta \in [0, 2\pi)$.

Example 3.4.1. Consider the complex number $z = -\sqrt{3} + 3i$. The real part of $z$ is $-\sqrt{3}$ and the imaginary part is $3$. The modulus is $|z| = \sqrt{12}$ and we can write

$$z = \sqrt{12}\, e^{i\frac{2\pi}{3}}.$$

A complex vector space is a vector space where the scalars are complex numbers. Since all of the usual algebraic operations are possible with complex numbers, complex vector spaces behave exactly the same way that real vector spaces do.

Example 3.4.2. The vector space $\mathbb{C}^2$, which consists of column vectors having two complex entries, is a complex vector space. For instance, the vector

$$v = \begin{pmatrix} 1 + i \\ -3i \end{pmatrix}$$

is in $\mathbb{C}^2$. If we scale $v$ by the scalar $2 - i$ we obtain

$$(2 - i)v = \begin{pmatrix} 3 + i \\ -3 - 6i \end{pmatrix}.$$

While the definition of complex vector spaces is essentially the same as for real vector spaces, the definition of complex inner product spaces is slightly different. This is because we need to make use of complex conjugates in order to compute the modulus of complex numbers. Thus the definition of complex inner products involves complex conjugates. In particular, a function $\langle\cdot,\cdot\rangle$ that takes in two elements of a complex vector space and gives out a complex number is a complex inner product if it


• is conjugate linear in the first entry and linear in the second entry³, meaning

$$\langle \alpha v, w\rangle = \bar\alpha\langle v, w\rangle, \qquad \langle v, \alpha w\rangle = \alpha\langle v, w\rangle,$$
$$\langle v_1 + v_2, w_1 + w_2\rangle = \langle v_1, w_1\rangle + \langle v_1, w_2\rangle + \langle v_2, w_1\rangle + \langle v_2, w_2\rangle;$$

• is conjugate symmetric, meaning that

$$\langle v, w\rangle = \overline{\langle w, v\rangle};$$

• is positive definite, meaning that

$$\langle v, v\rangle \ge 0,$$

with equality only when $v = 0$. (Technical note: the conjugate-symmetry property implies that $\langle v, v\rangle$ is real, thus it makes sense to discuss this inequality.)

Example 3.4.3. We can define a complex inner product on $\mathbb{C}^2$ by

$$\left\langle \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}, \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}\right\rangle = \bar v_1 w_1 + \bar v_2 w_2.$$
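In Sage this inner product can be computed directly (a sketch; the second vector $w$ is chosen only for illustration). Note the conjugate on the first slot:

v = vector([1 + I, -3*I])
w = vector([2, 1 - I])
ip = sum(vi.conjugate() * wi for vi, wi in zip(v, w))   # conjugate-linear in the first entry
print(ip)   # I + 5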

We note that all of the definitions that rely on inner products are exactly the same in the complex case as in the real case. For example, we say that two vectors $v, w$ are orthogonal if $\langle w, v\rangle = 0$.

We now introduce the example of a complex inner product that is most important for this class. Fix some $L > 0$ and consider all piecewise smooth functions that have domain $[-L, L]$ and give complex numbers as outputs. Functions with complex outputs are called complex-valued and take the form

$$u(x) = u_{\mathrm{real}}(x) + i\, u_{\mathrm{im}}(x),$$

where $u_{\mathrm{real}}$ and $u_{\mathrm{im}}$ are functions with real outputs. A complex-valued function $u$ is called piecewise smooth if the functions $u_{\mathrm{real}}$ and $u_{\mathrm{im}}$ are each piecewise smooth real functions. The derivative of such a function is computed by

$$u'(x) = u_{\mathrm{real}}'(x) + i\, u_{\mathrm{im}}'(x).$$

We say that a complex-valued piecewise-smooth function $u$ is square integrable, and thus in $L^2(\Omega)$, if

$$\int_\Omega |u|^2\, dV < \infty,$$

where $|u|^2 = \bar u u$ is the square of the modulus of $u$ and where $dV$ is the appropriate length/area/volume element. For functions $v, w$ in $L^2(\Omega)$ we define the inner product

$$\langle v, w\rangle = \int_\Omega \bar v\, w\, dV.$$

Thus, just as in the case of real-valued functions, $L^2(\Omega)$ is the collection of all functions whose norm (defined by the integral inner product) is finite.

³Note that there is no consensus about whether the conjugate linear slot should be first or second.


Example 3.4.4. Consider the domain $\Omega = [0, \pi]$. The function $v(x) = e^{ix}$ is complex-valued with real part $v_{\mathrm{real}}(x) = \cos(x)$ and imaginary part $v_{\mathrm{im}}(x) = \sin(x)$. We have $\bar v(x) = e^{-ix}$ and

$$\|v\|^2 = \langle v, v\rangle = \int_0^\pi e^{-ix}e^{ix}\, dx = \int_0^\pi dx = \pi.$$

Thus $\|v\| = \sqrt{\pi}$ and $v$ is in $L^2([0, \pi])$. Note that

$$v'(x) = ie^{ix} = i\cos(x) - \sin(x).$$

Similarly, we see that the function $w(x) = ix - 5$ is in $L^2([0, \pi])$. We ask: are the functions $v$ and $w$ orthogonal? We compute the inner product of $v$ and $w$ to be

$$\langle v, w\rangle = \int_0^\pi e^{-ix}(ix - 5)\, dx = \left[-(x - i)e^{-ix} - 5i\, e^{-ix}\right]_0^\pi = \pi + 8i.$$

Thus the two functions are not orthogonal.
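The integral can be checked in Sage (a sketch):

var('x')
show(integral(exp(-I*x)*(I*x - 5), (x, 0, pi)))   # pi + 8*I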

We conclude this section with the following note about notation.

Remark 3.4.1.

1. In physics you sometimes see a star used to indicate the complex conjugate, writing $z^*$ instead of $\bar z$. Also, physicists use the so-called "bra-ket" notation $\langle\cdot|\cdot\rangle$ for inner products. Finally, physicists sometimes write "$d^3x$" for the volume element $dV$, which they place before the integrand rather than after it. The result of all this is that the following things all describe the same (three dimensional) integral:

$$\text{Math:}\quad \langle u, v\rangle = \int_\Omega \bar u\, v\, dV \qquad\qquad \text{Physics:}\quad \langle u|v\rangle = \int_\Omega d^3x\, u^* v.$$

2. In mathematics, one often sees complex inner products with the complex conjugate on the second, rather than the first, entry. Under this convention, complex inner products are conjugate linear in the second entry and linear in the first entry.

Exercise 3.15. In this exercise you work in $L^2([-L, L])$ for some $L > 0$. Consider the functions

$$\phi_k(x) = e^{i\frac{k\pi}{L}x}, \quad \text{where } k = \ldots, -2, -1, 0, 1, 2, \ldots.$$

1. Show that the functions $\phi_k$ are in $L^2([-L, L])$. What is $\|\phi_k\|$?

2. Show that the functions $\phi_k$ satisfy the periodic boundary conditions (2.3.6).

3. Show that the functions $\phi_k$ are orthogonal to one another.


3.5 Periodic Fourier series

We now consider the best approximation problem in complex vector spaces. Suppose we have an orthogonal collection of complex vectors $v_1, \ldots, v_n$ and want to choose complex numbers $\alpha_1, \ldots, \alpha_n$ so that the sum

$$\hat u = \alpha_1 v_1 + \cdots + \alpha_n v_n$$

best approximates the vector $u$. Proceeding as in §3.3, we consider the function

$$f(\alpha_1, \ldots, \alpha_n) = \|u - (\alpha_1 v_1 + \cdots + \alpha_n v_n)\|^2,$$

which we want to minimize. Note that we must take care when we expand this expression, because the inner product being used to define the norm is conjugate linear in the first entry. Thus

$$f(\alpha_1, \ldots, \alpha_n) = \|u\|^2 - \bar\alpha_1\langle v_1, u\rangle - \cdots - \bar\alpha_n\langle v_n, u\rangle - \alpha_1\overline{\langle v_1, u\rangle} - \cdots - \alpha_n\overline{\langle v_n, u\rangle} + |\alpha_1|^2\|v_1\|^2 + \cdots + |\alpha_n|^2\|v_n\|^2.$$

Now we need to be careful because $\alpha_1, \ldots, \alpha_n$ are complex numbers. Thus in order to use methods of multivariable calculus for optimization we need to convert the problem to a problem of real variables. We can do this by writing $\alpha_1 = a_1 + ib_1$, etc., and then setting

$$\frac{\partial f}{\partial a_k} = 0 \quad\text{and}\quad \frac{\partial f}{\partial b_k} = 0 \tag{3.5.1}$$

for each $k = 1, \ldots, n$.

For computational simplicity we compute, using the chain rule, in terms of the variables $\alpha_k = a_k + ib_k$ and $\bar\alpha_k = a_k - ib_k$. Using the chain rule we have

$$\frac{\partial f}{\partial a_k} = \frac{\partial f}{\partial \alpha_k} + \frac{\partial f}{\partial \bar\alpha_k}, \qquad \frac{\partial f}{\partial b_k} = i\left(\frac{\partial f}{\partial \alpha_k} - \frac{\partial f}{\partial \bar\alpha_k}\right).$$

Thus (3.5.1) is equivalent to the condition that

$$\frac{\partial f}{\partial \alpha_k} = 0 \quad\text{and}\quad \frac{\partial f}{\partial \bar\alpha_k} = 0$$

for each $k = 1, \ldots, n$.

In order to compute the derivatives of $f$, note that for each $k$ we have $|\alpha_k|^2 = \bar\alpha_k\alpha_k$. Thus

$$\frac{\partial f}{\partial \alpha_k} = -\overline{\langle v_k, u\rangle} + \bar\alpha_k\|v_k\|^2, \qquad \frac{\partial f}{\partial \bar\alpha_k} = -\langle v_k, u\rangle + \alpha_k\|v_k\|^2.$$

Setting each of these equal to zero we conclude that the optimal choice of $\alpha_k$ is

$$\alpha_k = \frac{\langle v_k, u\rangle}{\|v_k\|^2}.$$


Thus the best approximation of $u$ is given by

$$\hat u_n = \frac{\langle v_1, u\rangle}{\|v_1\|^2}v_1 + \cdots + \frac{\langle v_k, u\rangle}{\|v_k\|^2}v_k + \cdots + \frac{\langle v_n, u\rangle}{\|v_n\|^2}v_n.$$

It is important to note the order in the inner product: in the complex setting the order matters because the inner product is conjugate symmetric, not symmetric.

For the purposes of this course, the most important example of orthogonal functions in a complex inner product space are the functions

$$\phi_k(x) = e^{i\frac{k\pi}{L}x}, \qquad k = \ldots, -2, -1, 0, 1, 2, \ldots \tag{3.5.2}$$

introduced in Exercise 3.15. The approximation of a function determined by the functions (3.5.2) is called the (periodic) Fourier series⁴. Given a function $u$ in $L^2([-L, L])$, the Fourier series is constructed by first computing the Fourier coefficients

$$c_k = \frac{\langle \phi_k, u\rangle}{\|\phi_k\|^2} = \frac{1}{2L}\int_{-L}^{L} e^{-i\frac{k\pi}{L}y}\, u(y)\, dy. \tag{3.5.3}$$

These coefficients are used to form the associated Fourier series

$$u = \sum_{k=-\infty}^{\infty} c_k\,\phi_k.$$

Example 3.5.1. Consider the function $u(x) = x$. We compute the Fourier coefficients

$$c_k = \frac{1}{2L}\int_{-L}^{L} e^{-i\frac{k\pi}{L}y}\, y\, dy = \begin{cases} (-1)^k\dfrac{L}{k\pi}\, i & \text{if } k \ne 0, \\[1ex] 0 & \text{if } k = 0. \end{cases}$$
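As a check, Sage can evaluate this integral symbolically once $k$ is declared to be a nonzero integer (a sketch, mirroring the snippet used in Example 3.5.2 below):

var('y, L, k')
assume(L > 0)
assume(k, 'integer'); assume(k != 0)
ck = (1/(2*L)) * integrate(exp(-I*k*pi*y/L)*y, (y, -L, L))
show(ck.simplify_full())   # should reduce to I*(-1)^k*L/(pi*k)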

Thus the Fourier series for $u$ is

$$u(x) = \sum_{k=-\infty}^{-1} (-1)^k\frac{L}{k\pi}\, i\, e^{i\frac{k\pi}{L}x} + \sum_{k=1}^{\infty} (-1)^k\frac{L}{k\pi}\, i\, e^{i\frac{k\pi}{L}x}.$$

Notice that we can re-index the first sum to obtain an expression that only involves real-valued functions:

$$u(x) = \sum_{k=1}^{\infty} (-1)^k\frac{L}{k\pi}\, i\left(e^{i\frac{k\pi}{L}x} - e^{-i\frac{k\pi}{L}x}\right) = \sum_{k=1}^{\infty} (-1)^{k+1}\frac{2L}{k\pi}\sin\left(\frac{k\pi}{L}x\right).$$

The following Sage code generates a plot of the function $u$ (dashed black line) and the $n = 10$ partial sum of the Fourier series; here we have set $L = \pi$ for the purposes of plotting.

var('x,k,n')
n = 10
f(x) = x
fplot = plot(f, (x, -pi, pi), figsize=[4, 2], color='black', linestyle="dashed")
fs(x) = sum((2*((-1)^(k+1))/k)*sin(k*x), k, 1, n)
fsplot = plot(fs, (x, -pi, pi), figsize=[4, 4])
show(fplot + fsplot)

⁴Here the word periodic is in parentheses because this is the "default" Fourier series. If someone refers to Fourier series without specifying boundary conditions, they mean the complex periodic series.


The resulting graphic is

Example 3.5.2. Consider the function $v(x) = x^2$. Using the Sage code

var('y,k,L')
ck = integrate(exp(-I*k*pi*y/L)*y^2, (y, -L, L))
show(ck)

we compute the Fourier coefficients

$$c_k = \frac{1}{2L}\int_{-L}^{L} e^{-i\frac{k\pi}{L}y}\, y^2\, dy = \begin{cases} (-1)^k\dfrac{2L^2}{\pi^2 k^2} & \text{if } k \ne 0, \\[1ex] \dfrac{L^2}{3} & \text{if } k = 0. \end{cases}$$

Thus the Fourier series for $v$ is

$$v(x) = \frac{L^2}{3} + \sum_{k \ne 0} (-1)^k\frac{2L^2}{\pi^2 k^2}\, e^{i\frac{k\pi}{L}x}.$$

Re-indexing the sum, we can write this as

$$v(x) = \frac{L^2}{3} + \sum_{k=1}^{\infty} (-1)^k\frac{4L^2}{k^2\pi^2}\cos\left(\frac{k\pi}{L}x\right).$$

The following Sage code generates a plot of the function $v$ (dashed black line) and the $n = 10$ partial sum of the Fourier series; here we have set $L = \pi$.

var('x,k,n')
n = 10
f(x) = x^2
fplot = plot(f, (x, -pi, pi), figsize=[4, 2], color='black', linestyle="dashed")
fs(x) = pi^2/3 + sum(((4*(-1)^k)/(k^2))*cos(k*x), k, 1, n)
fsplot = plot(fs, (x, -pi, pi), figsize=[4, 3])
show(fplot + fsplot)

The resulting graphic is

There are two important things to note about Examples 3.5.1 and 3.5.2. The first is that in both cases we were able to express the Fourier series as a sum of real functions: in the first case the sum is in terms of sine functions; in the second case it is in terms of cosine functions. In general, if the function $u$ we start with is real-valued, then the Fourier series can be expressed in a manner which makes apparent that it, too, is real-valued. See Exercise 3.17 for a further exploration of this.

The second important thing to note about Examples 3.5.1 and 3.5.2 is the behavior at the endpoints of the domain. In Example 3.5.2, the Fourier series approximation is quite good, even when we only plot the first 10 terms. In Example 3.5.1, the behavior at the endpoints is a bit strange. To better understand what's going on here, notice that the functions (3.5.2) are all $2L$-periodic, meaning that $\phi_k(x + 2L) = \phi_k(x)$ for any real number $x$. Thus the Fourier series approximation of any function is also $2L$-periodic. To see this, consider the following graphic, which contains the plots of the Fourier series approximations from the two examples, now shown on the domain $[-3\pi, 3\pi]$:

The functions that these plots are approximating are the periodic extensions of the functions $u(x) = x$ and $v(x) = x^2$, which are obtained by extending the domain from $[-L, L]$ to all real numbers under the condition that the functions are $2L$-periodic. The plots of these periodically extended functions are

The Sage code used to generate these plots is:

var('x')
f1(x) = piecewise([((-3*pi, -pi), x + 2*pi), ((-pi, pi), x), ((pi, 3*pi), x - 2*pi)])
f1plot = plot(f1, (x, -3*pi, 3*pi), exclude=(-pi, pi))
f2(x) = piecewise([((-3*pi, -pi), (x + 2*pi)^2), ((-pi, pi), x^2), ((pi, 3*pi), (x - 2*pi)^2)])
f2plot = plot(f2, (x, -3*pi, 3*pi))
show(f1plot, figsize=[4, 3])
#show(f2plot, figsize=[4, 3])

These latter plots shed light on the behavior at the endpoints seen in Example 3.5.1. The Fourier series approximation is actually approximating the periodic extension of the function $u(x) = x$, which is not continuous at $x = \pm L$. Rather, the periodic extension jumps at $x = \pm L$, and thus the Fourier series approximation does as well.

Exercise 3.16. Find the (periodic) Fourier series for the following functions. Have Sage make a plot of the approximations.

1. The "square"

$$u(x) = \begin{cases} -1 & \text{if } -L < x < 0, \\ 1 & \text{if } 0 < x < L. \end{cases}$$

2. The "triangle"

$$u(x) = \begin{cases} L + x & \text{if } -L < x < 0, \\ L - x & \text{if } 0 < x < L. \end{cases}$$

3. The "sawtooth"

$$u(x) = \begin{cases} x + L & \text{if } -L < x < 0, \\ x & \text{if } 0 < x < L. \end{cases}$$

4. The "pulse of size a"

$$u(x) = \begin{cases} 0 & \text{if } |x| > a, \\ \dfrac{1}{2a} & \text{if } |x| < a, \end{cases}$$

where $a$ is some positive number less than $L$.


Exercise 3.17. Suppose that $u$ is a real-valued function in $L^2([-L, L])$ and that

$$u(x) = \sum_{k=-\infty}^{\infty} c_k\, e^{i\frac{k\pi}{L}x}$$

is the corresponding Fourier series.

1. Show that

$$u(x) = a_0 + \sum_{k=1}^{\infty} a_k\cos\left(\frac{k\pi}{L}x\right) + \sum_{k=1}^{\infty} b_k\sin\left(\frac{k\pi}{L}x\right),$$

where

$$a_0 = \frac{1}{2L}\int_{-L}^{L} u(y)\, dy$$

and, for $k = 1, 2, 3, \ldots$,

$$a_k = \frac{1}{L}\int_{-L}^{L} u(y)\cos\left(\frac{k\pi}{L}y\right) dy, \qquad b_k = \frac{1}{L}\int_{-L}^{L} u(y)\sin\left(\frac{k\pi}{L}y\right) dy.$$

2. Find the relationship between the coefficients $a_k, b_k$ and the complex Fourier coefficients $c_k$.