THE FUNDAMENTALS OF FOURIER ANALYSIS

KIRIL DATCHEV

1. The heated ring problem

1.1. Introduction. Two centuries ago, Fourier discovered general principles of analysis, now called Fourier analysis, that led to the solution of many important physical problems. The following is one of the simplest of these problems.

A thin ring is heated in some way and then insulated. How does the heat move through the ring?

Figure 1. The heated ring, with the angle θ marked.

Let u(t, θ) denote the temperature at time t, at a position θ radians along the circumference of the ring from some fixed point. This means u is periodic with respect to θ:

u(t, θ) = u(t, θ + 2π), for all θ and t. (1)

Let f(θ) denote the initial temperature at each position θ, so that

u(0, θ) = f(θ), for all θ. (2)

We wish to find u(t, θ), the distribution of heat at time t, given f(θ), the starting distribution of heat.

For any θ, the rate of change of the total heat in the arc from any point a to θ is equal to the flux F of heat into the endpoints of the arc:

\[ \partial_t \int_a^\theta u(t,\phi)\,d\phi = F(t,a) - F(t,\theta). \]

Differentiating with respect to θ gives

∂tu(t, θ) = −∂θF (t, θ).

Date: November 23, 2021. Please email any comments or corrections to [email protected].

If the material of the ring is uniform, then heat flows from hotter regions to colder ones at a rate proportional to the temperature gradient:

\[ F(t,\theta) = -\kappa\,\partial_\theta u(t,\theta), \]

where κ > 0 is a constant. For simplicity we assume κ = 1, giving the heat equation:

\[ \partial_t u(t,\theta) = \partial_\theta^2 u(t,\theta). \tag{3} \]

1.2. Exercise. By changing variables, more general problems can be reduced to this one.

a) Let κ and L be given positive constants, and let g be a given function of period L. Find constants a, b, and a function f of period 2π such that u(t, θ) solves (1), (2), (3), if and only if v(t, x) = u(at, bx) solves

\[ v(t,x) = v(t,x+L), \qquad v(0,x) = g(x), \qquad \partial_t v = \kappa\,\partial_x^2 v. \tag{4} \]

b) Let τ be a given constant and λ be a given positive constant. Find a function ℓ(t) such that v(t, x) solves (4) if and only if w(t, x) = τ + ℓ(t)v(t, x) solves (see footnote 1)

\[ w(t,x) = w(t,x+L), \qquad w(0,x) = \tau + g(x), \qquad \partial_t w = \kappa\,\partial_x^2 w - \lambda(w-\tau). \tag{5} \]

1.3. Particular solutions. We begin our analysis by finding the simplest solutions of the differential equation (3) which obey the periodicity condition (1). The latter suggests

\[ u(t,\theta) = T(t)\cos n\theta, \quad \text{for some integer } n. \]

Plugging into (3) gives

\[ T'(t)\cos n\theta = -n^2 T(t)\cos n\theta, \quad \text{which is solved by } u(t,\theta) = Ce^{-n^2 t}\cos n\theta. \]

Replacing cos nθ by sin nθ works out similarly. Since u1 + u2 solves (1) and (3) whenever u1 and u2 do, we get the following more general solution:

\[ u(t,\theta) = c_0 + a_1 e^{-t}\cos\theta + a_2 e^{-4t}\cos 2\theta + \cdots + a_N e^{-N^2 t}\cos N\theta + b_1 e^{-t}\sin\theta + b_2 e^{-4t}\sin 2\theta + \cdots + b_N e^{-N^2 t}\sin N\theta, \]

where c0, a1, ..., bN are any constants. To satisfy the initial condition (2) we must have

\[ f(\theta) = c_0 + \sum_{n=1}^{N} (a_n\cos n\theta + b_n\sin n\theta). \]

Functions of this form are called trigonometric polynomials. We can use product-to-sum identities for cosine and sine to show that a product of trigonometric polynomials is a trigonometric polynomial, justifying the term ‘polynomial’.
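The passage from a trigonometric polynomial f to the corresponding solution u can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names heat_solution and heat_residual, the dictionary format for the coefficients, and the sample coefficients are all hypothetical and not part of the notes. The residual check uses central differences, so it is small but not exactly zero.

```python
import math

def heat_solution(c0, a, b):
    """Return u(t, theta) for the initial data
    f(theta) = c0 + sum_n (a[n] cos(n theta) + b[n] sin(n theta)),
    where a and b are dicts mapping n >= 1 to coefficients."""
    def u(t, theta):
        total = c0
        for n, an in a.items():
            total += an * math.exp(-n * n * t) * math.cos(n * theta)
        for n, bn in b.items():
            total += bn * math.exp(-n * n * t) * math.sin(n * theta)
        return total
    return u

def heat_residual(u, t, theta, h=1e-3):
    """Central-difference estimate of u_t - u_theta_theta at (t, theta)."""
    u_t = (u(t + h, theta) - u(t - h, theta)) / (2 * h)
    u_thth = (u(t, theta + h) - 2 * u(t, theta) + u(t, theta - h)) / (h * h)
    return u_t - u_thth

# Hypothetical sample coefficients: f = 1 + 2 cos(theta) - 0.5 cos(3 theta) + 1.5 sin(2 theta)
u = heat_solution(1.0, {1: 2.0, 3: -0.5}, {2: 1.5})
print(abs(heat_residual(u, 0.3, 1.1)))  # small, limited by the step h
```

Each mode is damped by its own factor e^{-n²t}, so the high-frequency part of the initial data disappears first.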

Footnote 1. The differential equation in (5) corresponds to the case where, instead of being insulated, the ring is placed in an environment of temperature τ. Then λ measures the rate at which heat flows between the ring and its environment, λ = 0 reducing back to the case where the ring is insulated.


1.4. Examples.

(1) Let f(θ) = 2 sin(θ + π/3). By the angle addition formula,

\[ f(\theta) = 2\sin\theta\cos(\pi/3) + 2\sin(\pi/3)\cos\theta = \sqrt{3}\cos\theta + \sin\theta, \]

giving a1 = √3 and b1 = 1. The corresponding solution of the heated ring problem is

\[ u(t,\theta) = e^{-t}\sqrt{3}\cos\theta + e^{-t}\sin\theta = e^{-t}f(\theta). \]

(2) Let f(θ) = cos²θ + sin²2θ. Using cos 2α = cos²α − sin²α to reduce both powers gives

\[ f(\theta) = \tfrac12 + \tfrac12\cos 2\theta + \tfrac12 - \tfrac12\cos 4\theta, \]

so c0 = 1, a2 = 1/2, and a4 = −1/2. Thus

\[ u(t,\theta) = 1 + \tfrac12 e^{-4t}\cos 2\theta - \tfrac12 e^{-16t}\cos 4\theta. \]

1.5. Exercises.

(1) Find c0, a1, ..., bN and solve the heated ring problem for f(θ) = 4 cos(θ − π/4), and more generally for f(θ) = A + B cos(mθ − ϕ), where A, B, and ϕ are given constants and m is a given integer.

(2) Use power reduction formulas to solve the heated ring problem for f(θ) = sin⁴θ + cos⁴θ.

(3) Let g(x, y) = 2x + y. Sketch the 3-dimensional graphs of the plane z = g(x, y) and the cylinder x² + y² = 1 together, and sketch separately their intersections with the plane z = 0. Solve the heated ring problem for f(θ) = g(cos θ, sin θ). For each time t ≥ 0, let Mt(x, y) be the point on the circle x² + y² = 1 where the temperature is the highest, and let mt(x, y) be the point where it is the lowest. Find Mt(x, y) and mt(x, y) for all t ≥ 0, and sketch the set of all these points on both graphs. Do the same problem with g(x, y) = 2x² − 1.

1.6. Series solutions. Having dealt with trigonometric polynomials, we now consider functions of the form

\[ f(\theta) = c_0 + \sum_{n=1}^{\infty} (a_n\cos n\theta + b_n\sin n\theta). \]

We will learn later how to expand very general functions f in such series, known as Fourier series, but for now we look at some special examples based on familiar power series. We use complex numbers (see Appendix A), including in particular Euler’s rule:

\[ e^{i\theta} = \cos\theta + i\sin\theta, \qquad \cos\theta = \frac{e^{i\theta}+e^{-i\theta}}{2}, \qquad \sin\theta = \frac{e^{i\theta}-e^{-i\theta}}{2i}. \]

Applying the geometric series

\[ \frac{1}{1-x} = \sum_{n=0}^{\infty} x^n, \]

with x replaced by $\tfrac12 e^{i\theta}$, gives

\[ \frac{1}{1-\frac12 e^{i\theta}} = \sum_{n=0}^{\infty} 2^{-n}e^{in\theta} = 1 + \sum_{n=1}^{\infty} 2^{-n}\cos(n\theta) + i\sum_{n=1}^{\infty} 2^{-n}\sin(n\theta). \]

To separate the left hand side into real and imaginary parts, we write it as

\[ \frac{2}{2-e^{i\theta}} = \frac{2}{2-e^{i\theta}}\cdot\frac{2-e^{-i\theta}}{2-e^{-i\theta}} = \frac{4-2\cos\theta+2i\sin\theta}{4-4\cos\theta+1}, \]

giving

\[ \frac{4-2\cos\theta}{5-4\cos\theta} = 1+\sum_{n=1}^{\infty} 2^{-n}\cos(n\theta), \qquad \frac{2\sin\theta}{5-4\cos\theta} = \sum_{n=1}^{\infty} 2^{-n}\sin(n\theta). \]

Thus, the solutions to the heated ring problem for f(θ) = (4 − 2 cos θ)/(5 − 4 cos θ) and f(θ) = 2 sin θ/(5 − 4 cos θ) are respectively

\[ u(t,\theta) = 1 + \sum_{n=1}^{\infty} 2^{-n} e^{-n^2 t}\cos(n\theta), \quad\text{and}\quad u(t,\theta) = \sum_{n=1}^{\infty} 2^{-n}e^{-n^2 t}\sin(n\theta); \]

the periodicity (1) and initial condition (2) are immediate, and the differential equation (3) follows by termwise differentiation (see footnote 2).
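As a numerical sanity check of the cosine identity above, one can compare a partial sum of the series with the closed form. This is a sketch only; the function names and the cutoff of 60 terms are arbitrary choices made here, not taken from the notes.

```python
import math

def partial_sum(theta, terms=60):
    """Partial sum 1 + sum_{n=1}^{terms} 2^{-n} cos(n theta)."""
    return 1.0 + sum(2.0 ** -n * math.cos(n * theta) for n in range(1, terms + 1))

def closed_form(theta):
    """The claimed sum (4 - 2 cos theta) / (5 - 4 cos theta)."""
    return (4 - 2 * math.cos(theta)) / (5 - 4 * math.cos(theta))

for theta in (0.0, 1.0, 2.5):
    print(theta, partial_sum(theta), closed_form(theta))
```

Because the coefficients decay like 2^{-n}, the neglected tail is below 2^{-60}, so the two columns agree to machine precision.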

1.7. Exercise. Find the real and imaginary parts of $e^{e^{i\theta}}$, and use the power series $e^x = \sum_{n=0}^{\infty} x^n/n!$ to find their Fourier series and the corresponding solutions to the heated ring problem.

1.8. Towards more general solutions. We now turn to more general functions. We wish to write an arbitrary function f as a limit of trigonometric polynomials,

\[ S_N(\theta) = c_0 + \sum_{n=1}^{N} (a_n\cos n\theta + b_n\sin n\theta), \tag{6} \]

as N → ∞, and then write u as a limit of the corresponding solutions. In light of Euler’s rule, if we allow complex numbers, this is equivalent to writing

\[ S_N(\theta) = \sum_{n=-N}^{N} c_n e^{in\theta}. \tag{7} \]

Our first step is to make SN approximate a given f as closely as possible, for a given N. Then the corresponding solution uN will have an initial value which approximates the desired initial value. Later we will take the limit as N → ∞.

1.9. Exercise. Find the equations relating the an, bn, and cn that make (6) and (7) equivalent.

2. Least square approximation

We will choose c0, a1, ..., bN in (6), or equivalently c−N, ..., cN in (7), so that SN approximates f in the sense of least squares. Before doing this, we apply the least square method in the simpler setting of planar and spatial vectors, and in the simpler case of approximations using a small number of vectors.

2.1. Approximating one planar vector by another. Let u = (u1, u2) and v = (v1, v2) be two planar vectors, with v ≠ (0, 0). For which choice of a constant a does av approximate u as closely as possible?

To solve this, we look at the length squared of the vector u − av. It is given by

\[ |u-av|^2 = (u-av)\cdot(u-av) = |u|^2 - 2a\,u\cdot v + a^2|v|^2, \tag{8} \]

where we used the dot product u · v = u1v1 + u2v2 and the length $|u| = \sqrt{u\cdot u}$. We wish to minimize |u − av|² with respect to a, and we do this by completing the square with respect to a:

\[ |u-av|^2 = |u|^2 - \frac{(u\cdot v)^2}{|v|^2} + \Big(\frac{u\cdot v}{|v|} - a|v|\Big)^2. \tag{9} \]

Hence |u − av|² is minimized when

\[ a = \frac{u\cdot v}{|v|^2}, \qquad |u-av|^2 = |u|^2 - \frac{(u\cdot v)^2}{|v|^2} = |u|^2 - a^2|v|^2. \tag{10} \]

Geometrically, this choice of a corresponds to letting av be the projection of u onto the line through v: note that u − av and v are orthogonal, i.e. (u − av) · v = 0. The error formula given by the second equation of (10) can also be obtained from the Pythagorean theorem: with a given by the first equation of (10), the approximating vector av and the error vector u − av are the legs of a right triangle whose hypotenuse is the vector u.

Footnote 2. See Chapter 24 of [Spi], specifically Corollary 3 on page 506 and Theorem 4 on page 507.

2.2. Example. Let u = (4, 2) and v = (1, 1). Then the best choice of a is

\[ a = \frac{(4,2)\cdot(1,1)}{(1,1)\cdot(1,1)} = 3. \]

If instead v = (1, −2), then the best choice is

\[ a = \frac{(4,2)\cdot(1,-2)}{(1,-2)\cdot(1,-2)} = 0. \]

Figure 2. The vectors from Example 2.2.
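The formulas in (10) are easy to evaluate mechanically. The following sketch uses exact rational arithmetic; the helper names dot and best_multiple are hypothetical, introduced here for illustration, and the inputs are the two cases of Example 2.2.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

def best_multiple(u, v):
    """Return (a, error) with a = (u.v)/|v|^2 minimizing |u - a v|^2,
    and error = |u|^2 - a^2 |v|^2 as in (10)."""
    a = Fraction(dot(u, v), dot(v, v))
    return a, dot(u, u) - a * a * dot(v, v)

print(best_multiple((4, 2), (1, 1)))    # a = 3, error = 2
print(best_multiple((4, 2), (1, -2)))   # a = 0, error = 20
```

In the second case u is orthogonal to v, so the best multiple is the zero vector and the error is all of |u|² = 20.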

2.3. Exercises.

(1) Find the choice of a such that the distance between a(1, 2) and (−2, −1) is as small as possible. Sketch (−2, −1), a(1, 2), and (−2, −1) − a(1, 2).

(2) Let u = (1, 3), and v = (2, c), where c is a constant. Let a be chosen to minimize |u − av|. For which values of c is a maximal and for which is it minimal? Sketch u, av, and u − av in those cases.

(3) Other ways to measure the error will give other answers, generally harder to compute. Let the ‘city block’ distance between u = (u1, u2) and v = (v1, v2) be given by |u1 − v1| + |u2 − v2|. With u = (1, 3) and v = (2, 1), find a such that the city block distance between u and av is as small as possible. Sketch u, av, and u − av.

The same formulas work for spatial vectors. Indeed, let u = (u1, u2, u3) and v = (v1, v2, v3) be two such vectors, with v ≠ (0, 0, 0). Then (8) and (9) still give the length squared of u − av, and (10) still gives the optimal choice of a and the resulting error formula, but this time with the three-dimensional dot product u · v = u1v1 + u2v2 + u3v3.


2.4. Example. Let u = (1, 2, 3) and v = (1, 1, 1). Then the best choice of a is

\[ a = \frac{(1,2,3)\cdot(1,1,1)}{(1,1,1)\cdot(1,1,1)} = 2. \]

2.5. Exercise. Find the choice of a such that a(1, 2, 3) approximates (1, 1, 1) as closely as possible.

2.6. Approximating one real-valued function by another. We next adapt these formulas to functions on an interval (A,B). We consider first real-valued functions and constants, using the integral inner product

\[ \langle f,g\rangle = \int_A^B fg, \tag{11} \]

to replace the dot product. The corresponding notion of distance between two functions is the mean square distance:

\[ \|f-g\| = \langle f-g, f-g\rangle^{1/2} = \Big(\int_A^B |f-g|^2\Big)^{1/2}. \]

Now we ask, given functions f and g, for which choice of a constant a does ag approximate f as closely as possible, in the sense of minimizing the mean square distance ‖f − ag‖?

As in (8), we write out the distance squared between f and ag:

\[ \|f-ag\|^2 = \langle f-ag, f-ag\rangle = \|f\|^2 - 2a\langle f,g\rangle + a^2\|g\|^2, \tag{12} \]

and completing the square gives

\[ \|f-ag\|^2 = \|f\|^2 - \frac{\langle f,g\rangle^2}{\|g\|^2} + \Big(\frac{\langle f,g\rangle}{\|g\|} - a\|g\|\Big)^2. \]

Hence ‖f − ag‖² is minimized when

\[ a = \frac{\langle f,g\rangle}{\|g\|^2}, \qquad \|f-ag\|^2 = \|f\|^2 - \frac{\langle f,g\rangle^2}{\|g\|^2} = \|f\|^2 - a^2\|g\|^2, \tag{13} \]

which is the analogue of (10).
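For polynomials the inner product (11) can be computed exactly, which makes (13) easy to experiment with. The sketch below represents a polynomial by its list of coefficients [c0, c1, ...]; this representation and the function names are assumptions made for illustration only. The sample inputs are f(x) = x² and g(x) = x on the intervals (0, 2) and (−1, 1).

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += Fraction(pi) * qj
    return out

def inner(p, q, A, B):
    """Exact integral of p*q over (A, B), i.e. the inner product (11)."""
    return sum(c * (Fraction(B) ** (k + 1) - Fraction(A) ** (k + 1)) / (k + 1)
               for k, c in enumerate(poly_mul(p, q)))

def best_coefficient(f, g, A, B):
    """Return (a, squared error) as given by (13)."""
    a = inner(f, g, A, B) / inner(g, g, A, B)
    return a, inner(f, f, A, B) - a * a * inner(g, g, A, B)

x, x2 = [0, 1], [0, 0, 1]
print(best_coefficient(x2, x, 0, 2))    # a = 3/2, error = 2/5
print(best_coefficient(x2, x, -1, 1))   # a = 0,   error = 2/5
```

The same squared error 2/5 means very different things in the two cases, because ‖x²‖² is 32/5 on (0, 2) but only 2/5 on (−1, 1).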

2.7. Example. Let (A,B) = (0, 2), f(x) = x², and g(x) = x. Then

\[ a = \frac{\langle x^2,x\rangle}{\langle x,x\rangle} = \frac{\int_0^2 x^3}{\int_0^2 x^2} = \frac32, \]

and the error is

\[ \|x^2 - (3/2)x\|^2 = \int_0^2 x^4 - (3/2)^2\int_0^2 x^2 = 2/5, \]

which is small compared to $\|x^2\|^2 = \int_0^2 x^4 = 32/5$. If instead (A,B) = (−1, 1), then

\[ a = \frac{\langle x^2,x\rangle}{\langle x,x\rangle} = \frac{\int_{-1}^{1} x^3}{\int_{-1}^{1} x^2} = 0. \]

In this case the error $\|x^2 - 0\|^2 = \int_{-1}^{1} x^4 = 2/5$ is no longer small compared to ‖x²‖² = 2/5.

Figure 3. Graphs of the functions from Example 2.7.

We now compare the mean square distance to other natural notions of distance between two functions, the maximum distance

\[ \max_{[A,B]} |f-g| \]

and the mean distance

\[ \int_A^B |f-g|. \]

To compare to the maximum distance we write

\[ \|f-g\| = \Big(\int_A^B |f-g|^2\Big)^{1/2} \le \Big(\int_A^B 1\Big)^{1/2}\max_{[A,B]}|f-g| = (B-A)^{1/2}\max_{[A,B]}|f-g|. \]

To compare with the mean distance we note that (13) implies ‖f‖² ≥ ⟨f, g⟩²/‖g‖², which we rewrite as the Cauchy–Schwarz inequality:

\[ |\langle f,g\rangle| \le \|f\|\|g\|, \quad\text{or}\quad \Big|\int_A^B fg\Big| \le \Big(\int_A^B f^2\Big)^{1/2}\Big(\int_A^B g^2\Big)^{1/2}. \tag{14} \]

Applying (14) with f replaced by |f − g| and g replaced by 1 gives a relationship between mean square distance and mean distance:

\[ \int_A^B |f-g| \le (B-A)^{1/2}\Big(\int_A^B |f-g|^2\Big)^{1/2} = (B-A)^{1/2}\|f-g\|. \]

2.8. Example. Let f(x) = sgn(x), i.e. f(x) = x/|x| if x ≠ 0 and f(0) = 0. Let us approximate f on [−1, 2] by g = 1. Then ‖f − ag‖ is minimized when

\[ a = \frac{\langle f,g\rangle}{\|g\|^2} = \frac{\int_{-1}^{0} (-1) + \int_0^2 1}{\int_{-1}^{2} 1} = \frac13. \]

To minimize $\int_{-1}^{2} |f - bg|$ we write, for b ∈ [−1, 1],

\[ \int_{-1}^{2} |f-bg| = \int_{-1}^{0} (1+b) + \int_0^2 (1-b) = b + 1 + 2 - 2b = 3 - b, \]

which is minimized when b = 1. To minimize $\max_{[-1,2]} |f - cg|$, we write

\[ \max_{[-1,2]} |f-cg| = \max(|1+c|, |1-c|), \]

which is minimized when c = 0.
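The three answers of Example 2.8 can be reproduced by a brute-force search over candidate constants. This is a rough numerical sketch, not a method from the notes: the grid of candidates with spacing 1/100, the midpoint rule with 600 cells, and the function names are all assumptions chosen for illustration.

```python
def sgn(x):
    """Sign function: -1, 0 or 1."""
    return (x > 0) - (x < 0)

def distances(c, n=600):
    """Midpoint-rule estimates on [-1, 2] of the squared mean square
    distance, the mean distance, and the maximum distance from sgn to c."""
    h = 3.0 / n
    xs = [-1.0 + (k + 0.5) * h for k in range(n)]
    diffs = [abs(sgn(x) - c) for x in xs]
    return sum(d * d for d in diffs) * h, sum(diffs) * h, max(diffs)

candidates = [k / 100 for k in range(-100, 101)]
best = [min(candidates, key=lambda c: distances(c)[i]) for i in range(3)]
print(best)  # expected to be close to 1/3, 1 and 0 respectively
```

The three notions of distance select three different constants, which is the point of the example.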

2.9. Exercises.

(1) Find the choice of a which minimizes $\int_{-\pi}^{\pi} |x - a\sin x|^2\,dx$. (Hint: For the numerator, integrate by parts. For the denominator, use a half-angle formula.)

(2) Give another proof of (14), observing that for every t > 0 we have $(tf - t^{-1}g)^2 \ge 0$, and hence $2\int fg \le t^2\int f^2 + t^{-2}\int g^2$, and making the optimal choice of t.

(3) Give yet another proof of (14) by expanding $\iint (f(x)g(y) - f(y)g(x))^2\,dx\,dy \ge 0$.


2.10. Approximating one complex-valued function by another. We next adapt these formulas to complex-valued functions. We replace the integral inner product of (11) with

\[ \langle f,g\rangle = \int_A^B f\bar g. \tag{15} \]

Then, for given functions f and g, we look for the choice of constant c such that cg approximates f as closely as possible, in the sense of minimizing the mean square distance ‖f − cg‖. We now have

\[ \|f-cg\|^2 = \|f\|^2 - \bar c\langle f,g\rangle - c\langle g,f\rangle + |c|^2\|g\|^2. \]

Since $\bar c\langle f,g\rangle = \overline{c\langle g,f\rangle}$, completing the square with respect to c gives

\[ \|f-cg\|^2 = \|f\|^2 - \frac{|\langle f,g\rangle|^2}{\|g\|^2} + \Big|\frac{\langle f,g\rangle}{\|g\|} - c\|g\|\Big|^2. \]

Hence ‖f − cg‖² is minimized when

\[ c = \frac{\langle f,g\rangle}{\|g\|^2}, \qquad \|f-cg\|^2 = \|f\|^2 - \frac{|\langle f,g\rangle|^2}{\|g\|^2} = \|f\|^2 - |c|^2\|g\|^2. \tag{16} \]


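2.11. Example. Let f(x) = x, let g(x) = e^{inx} where n ∈ N is given, and let (A,B) = (−π, π). Then

\[ c = \frac{\langle f,g\rangle}{\|g\|^2} = \frac{\int_{-\pi}^{\pi} x e^{-inx}\,dx}{\int_{-\pi}^{\pi} 1\,dx} = \frac{1}{-2\pi i n}\Big( x e^{-inx}\Big|_{-\pi}^{\pi} - \int_{-\pi}^{\pi} e^{-inx}\,dx\Big) = \frac{\cos n\pi}{-in} = \frac{(-1)^n i}{n}, \]

where the last step uses 1/(−i) = i.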
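The coefficient in Example 2.11 can be checked numerically, which is a useful guard against sign slips with the conjugate in (15). The sketch below uses a plain midpoint rule; the function name and the number of steps are arbitrary choices made here for illustration.

```python
import cmath
import math

def coefficient(n, steps=20000):
    """Midpoint-rule estimate of <x, e^{inx}> / ||e^{inx}||^2 on (-pi, pi),
    using the inner product (15) with the conjugate on the second factor."""
    h = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += x * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)

for n in (1, 2, 3):
    print(n, coefficient(n), (-1) ** n * 1j / n)
```

The two printed values in each row should agree up to the discretization error of the midpoint rule.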

2.12. Approximating with two orthogonal vectors. Next, let u, v, and w be three vectors, either planar or spatial, with |v|² and |w|² both different from 0, and with v orthogonal to w, i.e. v · w = 0. For which choice of constants a and b does av + bw approximate u as closely as possible?

Writing out the length squared of the vector u − av − bw, and using v · w = 0, gives:

\[ |u-av-bw|^2 = (u-av-bw)\cdot(u-av-bw) = |u|^2 - 2a\,u\cdot v - 2b\,u\cdot w + a^2|v|^2 + b^2|w|^2. \tag{17} \]

Completing the square with respect to both a and b gives

\[ |u-av-bw|^2 = |u|^2 - \frac{(u\cdot v)^2}{|v|^2} - \frac{(u\cdot w)^2}{|w|^2} + \Big(\frac{u\cdot v}{|v|} - a|v|\Big)^2 + \Big(\frac{u\cdot w}{|w|} - b|w|\Big)^2. \]

Hence |u − av − bw|² is minimized when

\[ a = \frac{u\cdot v}{|v|^2}, \qquad b = \frac{u\cdot w}{|w|^2}, \qquad |u-av-bw|^2 = |u|^2 - a^2|v|^2 - b^2|w|^2. \tag{18} \]

Note that the choice of a is the same as the one in (10). Thus, to compute the best approximation using two orthogonal vectors together, we can compute the best approximation using each of them individually and then add. As in (10), the error formula in (18) can be obtained by applying the Pythagorean theorem to the right triangle with hypotenuse u and legs av + bw, u − av − bw.

2.13. Example. Let u = (1, 2, 3), v = (1, 1, 1), and w = (c, 1, 1). Then w is orthogonal to v when 0 = v · w = c + 1 + 1, i.e. when c = −2. Then the best choice of a is 2 as in Example 2.4, and the best choice of b is

\[ b = \frac{(1,2,3)\cdot(-2,1,1)}{(-2,1,1)\cdot(-2,1,1)} = \frac12. \]

That gives

\[ a(1,1,1) + b(-2,1,1) = \big(1, 2\tfrac12, 2\tfrac12\big). \]


Figure 4. The vectors from Example 2.13.
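The rule "project onto each orthogonal vector separately, then add" from (18) can be sketched as follows. The function name project and its return format are hypothetical choices made here; the code refuses inputs that are not mutually orthogonal, since (18) does not apply to them.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

def project(u, basis):
    """Best approximation of u by a combination of mutually orthogonal
    vectors; returns (coefficients, approximation, squared error)."""
    for i, v in enumerate(basis):
        for w in basis[i + 1:]:
            assert dot(v, w) == 0, "basis vectors must be orthogonal"
    coeffs = [Fraction(dot(u, v), dot(v, v)) for v in basis]
    approx = tuple(sum(c * v[k] for c, v in zip(coeffs, basis))
                   for k in range(len(u)))
    error = dot(u, u) - sum(c * c * dot(v, v) for c, v in zip(coeffs, basis))
    return coeffs, approx, error

coeffs, approx, error = project((1, 2, 3), [(1, 1, 1), (-2, 1, 1)])
print(coeffs, approx, error)  # a = 2, b = 1/2, approximation (1, 5/2, 5/2)
```

For the data of Example 2.13 the squared error is 14 − 12 − 3/2 = 1/2, which matches the squared length of u − av − bw = (0, −1/2, 1/2).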

2.14. Exercises.

(1) Find a vector (w1, w2) which has the same length as (1, 2) and is orthogonal to it. Find the choice of a and b such that a(1, 2) + b(w1, w2) approximates (−2, −1) as closely as possible. What is the error |u − av − bw|²? What result from linear algebra would allow you to find |u − av − bw|² without finding a or b?

(2) Find c such that (1, c, 1) is orthogonal to (1, 1, 1). Find the choice of a and b such that a(1, 1, 1) + b(1, c, 1) approximates (1, 2, 3) as closely as possible. Find the error in this case, and in Example 2.13. Why is the error so much greater than in Example 2.13?

These formulas can be adapted to functions on an interval (A,B) as before. As long as ‖g‖ and ‖h‖ are nonzero, and ⟨g, h⟩ = 0, the distance squared E = ‖f − ag − bh‖² between f and ag + bh is minimized when

\[ a = \frac{\langle f,g\rangle}{\|g\|^2}, \qquad b = \frac{\langle f,h\rangle}{\|h\|^2}, \qquad E = \|f\|^2 - a^2\|g\|^2 - b^2\|h\|^2. \]

2.15. Example. Let (A,B) = (−1, 1), f(x) = (x − 1)², g(x) = x, and h(x) = 2. Then we can check that $\langle g,h\rangle = \int_{-1}^{1} 2x = 0$ and compute

\[ a = \frac{\int_{-1}^{1} (x-1)^2 x}{\int_{-1}^{1} x^2} = \frac{\int_{-1}^{1} -2x^2}{\int_{-1}^{1} x^2} = -2, \qquad b = \frac{\int_{-1}^{1} 2(x-1)^2}{\int_{-1}^{1} 4} = \frac23. \]

So the least square approximation to (x − 1)² of the form ax is −2x, the one of the form 2b is 4/3, and the one of the form ax + 2b is −2x + 4/3. The errors are

\[ \|f-ag\|^2 = \|f\|^2 - a^2\|g\|^2 = 32/5 - 2^2(2/3) = 3\tfrac{11}{15}, \]

\[ \|f-bh\|^2 = \|f\|^2 - b^2\|h\|^2 = 2\tfrac{38}{45}, \]

\[ \|f-ag-bh\|^2 = \|f\|^2 - a^2\|g\|^2 - b^2\|h\|^2 = \tfrac{8}{45}. \]

2.16. Exercises.

(1) Use a product-to-sum identity to show that ⟨sin x, sin 2x⟩ = 0 on an arbitrary interval of length 2π. Find the choices of a and b which minimize $\int_{-\pi}^{\pi} |x - a\sin x + b\sin 2x|^2\,dx$.

(2) Let (A,B) = (0, 2), f(x) = x², g(x) = x, and h(x) = 1. In Example 2.7 we computed a = ⟨x², x⟩/⟨x, x⟩ = 3/2. Compute similarly b = ⟨x², 1⟩/⟨1, 1⟩. Graph the functions f, ag, and ag + bh together on (0, 2), and observe that ag is a better approximation to f than ag + bh is. Why is ag + bh failing to give the least square approximation in this case?

Figure 5. Graphs of the functions from Example 2.15: (x − 1)² together with −2x, with 4/3, and with −2x + 4/3.

2.17. Approximating functions with finite Fourier sums. The calculations in equations (17) and (18) generalize directly to approximations using any finite number of orthogonal vectors. Let f be a given function on the interval (A,B), and let g0, ..., gk be given functions such that ‖gj‖ ≠ 0 for every j and which are mutually orthogonal, i.e. ⟨gj, gℓ⟩ = 0 whenever j ≠ ℓ. Let a0, ..., ak be any real constants. The distance squared between f and $\sum a_j g_j$ is

\[ \Big\|f - \sum_{j=0}^{k} a_j g_j\Big\|^2 = \|f\|^2 - 2\sum_{j=0}^{k} a_j\langle f,g_j\rangle + \sum_{j=0}^{k} a_j^2\|g_j\|^2, \]

and completing the square with respect to each of a0, ..., ak gives

\[ \Big\|f - \sum_{j=0}^{k} a_j g_j\Big\|^2 = \|f\|^2 - \sum_{j=0}^{k} \frac{\langle f,g_j\rangle^2}{\|g_j\|^2} + \sum_{j=0}^{k} \Big(\frac{\langle f,g_j\rangle}{\|g_j\|} - a_j\|g_j\|\Big)^2. \tag{19} \]

Hence $\|f - \sum a_j g_j\|^2$ is minimized when

\[ a_j = \frac{\langle f,g_j\rangle}{\|g_j\|^2} \text{ for all } j, \qquad \Big\|f - \sum_{j=0}^{k} a_j g_j\Big\|^2 = \|f\|^2 - \sum_{j=0}^{k} a_j^2\|g_j\|^2. \tag{20} \]

To apply the above to the Fourier sums (6), we must check that the functions 1, cos x, sin x, cos 2x, sin 2x, ... are all mutually orthogonal on any interval of length 2π; it will be convenient to work with (−π, π).

2.18. Exercise. Use differentiation to show that, if f has period 2π, then

\[ \int_{-\pi}^{\pi} f = \int_a^{2\pi+a} f, \]

for any real a.

2.19. Proposition. The functions 1, cos x, sin x, cos 2x, sin 2x, ... are mutually orthogonal on the interval (−π, π).

Proof. This is straightforward to check by doing the relevant integrals using the product-to-sum formulas for sine and cosine, as in Exercise 2.16, but we prefer the following more conceptual proof, which is based on the fundamental differential equation of the sine and cosine functions.

We begin by proving a general result about eigenspaces of symmetric operators. Let T be a symmetric operator on infinitely differentiable 2π-periodic functions. This means that

\[ \langle Tf,g\rangle = \langle f,Tg\rangle, \]

for any such functions f and g. Suppose f and g are eigenfunctions of T, with corresponding distinct eigenvalues λ and µ, i.e.

\[ Tf = \lambda f, \qquad Tg = \mu g, \qquad \lambda \ne \mu. \]

Then

\[ \lambda\langle f,g\rangle = \langle Tf,g\rangle = \langle f,Tg\rangle = \mu\langle f,g\rangle. \]

This implies (λ − µ)⟨f, g⟩ = 0, which implies ⟨f, g⟩ = 0.

Now apply this fact with T = R, where R is defined by [Rf](x) = f(−x). This operator is symmetric because

\[ \langle Rf,g\rangle = \int_{-\pi}^{\pi} [Rf]g = \int_{-\pi}^{\pi} f(-x)g(x)\,dx = \int_{-\pi}^{\pi} f(y)g(-y)\,dy = \int_{-\pi}^{\pi} f[Rg] = \langle f,Rg\rangle, \]

where we used the change of variables x = −y, dx = −dy. The functions 1, cos x, cos 2x, ... are all even, and hence obey Rf = f, and are hence eigenfunctions of R with eigenvalue 1. The functions sin x, sin 2x, ... are all odd, and hence obey Rf = −f, and are hence eigenfunctions of R with eigenvalue −1. This shows that

\[ \langle 1,\sin mx\rangle = \langle \cos jx,\sin mx\rangle = 0 \]

for all j and m.

Now apply the same fact with T = D, where D is defined by [Df](x) = f′′(x). This operator is symmetric because

\[ \langle Df,g\rangle = \int_{-\pi}^{\pi} f''g = f'g\Big|_{-\pi}^{\pi} - \int_{-\pi}^{\pi} f'g' = f'g\Big|_{-\pi}^{\pi} - fg'\Big|_{-\pi}^{\pi} + \int_{-\pi}^{\pi} fg'' = \langle f,Dg\rangle, \]

where in the second and third equalities we integrated by parts, and in the last one we used the fact that f, g, f′, and g′ are 2π-periodic. Applying D to 1, cos x, sin x, cos 2x, sin 2x, ... gives 0, −cos x, −sin x, −4 cos 2x, −4 sin 2x, .... Hence 1, cos x, sin x, cos 2x, sin 2x, ... are eigenfunctions of D with eigenvalues 0, −1, −1, −4, −4, ... (in general cos nx and sin nx have eigenvalue −n²), and this shows that

\[ \langle 1,\cos mx\rangle = \langle \sin jx,\sin mx\rangle = \langle \cos jx,\cos mx\rangle = 0, \]

whenever j ≠ m. □
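Proposition 2.19 can also be observed numerically by computing all pairwise inner products of the first few functions. This is a sketch for illustration only; the names inner and trig_family, the cutoff N = 3, and the use of a midpoint rule with 2000 cells are assumptions made here.

```python
import math

def inner(f, g, steps=2000):
    """Midpoint-rule estimate of the integral of f*g over (-pi, pi)."""
    h = 2 * math.pi / steps
    return h * sum(f(-math.pi + (k + 0.5) * h) * g(-math.pi + (k + 0.5) * h)
                   for k in range(steps))

def trig_family(N):
    """The functions 1, cos x, sin x, ..., cos Nx, sin Nx."""
    family = [lambda x: 1.0]
    for n in range(1, N + 1):
        family.append(lambda x, n=n: math.cos(n * x))
        family.append(lambda x, n=n: math.sin(n * x))
    return family

fam = trig_family(3)
gram = [[inner(f, g) for g in fam] for f in fam]
off_diagonal = max(abs(gram[i][j]) for i in range(7) for j in range(7) if i != j)
print(off_diagonal)                               # close to 0
print([round(gram[i][i], 6) for i in range(7)])   # 2*pi, then pi six times
```

The diagonal entries are the squared norms ‖1‖² = 2π and ‖cos nx‖² = ‖sin nx‖² = π, which are the denominators needed for the Fourier coefficients.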

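2.20. The Fourier coefficients and mean square error. Proposition 2.19 allows us to apply the general formulas (19) and (20) to conclude that SN, given by (6), namely

\[ S_N(x) = c_0 + \sum_{n=1}^{N} (a_n\cos nx + b_n\sin nx), \]

best approximates a given periodic function f in the sense of least squares when the coefficients c0, an, and bn are chosen according to the formulas

\[ c_0 = \frac{\langle f,1\rangle}{\|1\|^2}, \qquad a_n = \frac{\langle f,\cos nx\rangle}{\|\cos nx\|^2}, \qquad b_n = \frac{\langle f,\sin nx\rangle}{\|\sin nx\|^2}, \qquad \text{for } n \ge 1. \]

Calculating the integrals in the denominators

\[ \int_{-\pi}^{\pi} 1 = 2\pi, \qquad \int_{-\pi}^{\pi} \cos^2 nx = \int_{-\pi}^{\pi} \big(\tfrac12 + \tfrac12\cos 2nx\big) = \pi, \qquad \int_{-\pi}^{\pi} \sin^2 nx = \int_{-\pi}^{\pi} \big(\tfrac12 - \tfrac12\cos 2nx\big) = \pi, \]

allows us to write more simply

\[ c_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx, \qquad a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx. \tag{21} \]

The numbers c0, an, and bn defined by (21) are called the Fourier coefficients of f. The error formula in (20) becomes

\[ \|f-S_N\|^2 = \|f\|^2 - c_0^2\|1\|^2 - \sum_{n=1}^{N} \big(a_n^2\|\cos nx\|^2 + b_n^2\|\sin nx\|^2\big), \]

or

\[ \int_{-\pi}^{\pi} |f-S_N|^2 = \int_{-\pi}^{\pi} |f|^2 - 2\pi c_0^2 - \sum_{n=1}^{N} \pi(a_n^2 + b_n^2). \tag{22} \]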

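2.21. Example. Let

\[ f(x) = \begin{cases} 1, & -\pi \le x \le \pi/2, \\ 0, & \pi/2 < x < \pi, \end{cases} \]

and extend f to be 2π-periodic. Then

\[ a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x) = \frac{1}{\pi}\int_{-\pi}^{\pi/2} 1 = \frac32, \]

so that c0 = a0/2 = 3/4 in the notation of (21),

\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx = \frac{1}{\pi}\int_{-\pi}^{\pi/2} \cos nx = \frac{1}{n\pi}\sin\frac{n\pi}{2}, \qquad \text{for } n \ge 1, \]

because sin πn = 0, and

\[ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx = \frac{1}{\pi}\int_{-\pi}^{\pi/2} \sin nx = \frac{1}{n\pi}\Big(-\cos\frac{n\pi}{2} + \cos(-n\pi)\Big). \]

In particular,

\[ a_1 = \frac{1}{\pi}, \quad b_1 = -\frac{1}{\pi}, \quad a_2 = 0, \quad b_2 = \frac{1}{\pi}, \quad a_3 = -\frac{1}{3\pi}, \quad b_3 = -\frac{1}{3\pi}, \]

\[ S_1(x) = \frac34 + \frac{1}{\pi}\cos x - \frac{1}{\pi}\sin x, \qquad S_2(x) = S_1(x) + \frac{1}{\pi}\sin 2x, \qquad S_3(x) = S_2(x) - \frac{1}{3\pi}\sin 3x - \frac{1}{3\pi}\cos 3x. \]

The corresponding errors are

\[ \int |f-S_1|^2 = \int |f|^2 - \frac{\pi}{2}a_0^2 - \pi a_1^2 - \pi b_1^2 = \frac{3\pi}{2} - \frac{9\pi}{8} - \frac{1}{\pi} - \frac{1}{\pi} = 0.54\ldots \]

\[ \int |f-S_2|^2 = \int |f-S_1|^2 - \pi a_2^2 - \pi b_2^2 = \int |f-S_1|^2 - \frac{1}{\pi} = 0.22\ldots \]

\[ \int |f-S_3|^2 = \int |f-S_2|^2 - \pi a_3^2 - \pi b_3^2 = \int |f-S_2|^2 - \frac{2}{9\pi} = 0.15\ldots \]

For comparison note that $\int |f|^2 = 3\pi/2 = 4.71\ldots$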

Figure 6. Graphs of the functions from Example 2.21: f(x) together with S1(x), S2(x), and S3(x) on (−π, π).
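The mean square errors in Example 2.21 can be recomputed from (21) and (22) by numerical integration. This is a sketch under stated assumptions: the midpoint rule, the choice of 4000 cells (a multiple of 4, so that the jump at π/2 falls on a cell boundary), and the function names are all illustrative and not from the notes.

```python
import math

def f(x):
    """The step function of Example 2.21 on [-pi, pi)."""
    return 1.0 if x <= math.pi / 2 else 0.0

def integrate(g, steps=4000):
    """Midpoint rule on (-pi, pi); steps is a multiple of 4 so that the
    jump at pi/2 falls on a cell boundary."""
    h = 2 * math.pi / steps
    return h * sum(g(-math.pi + (k + 0.5) * h) for k in range(steps))

def mean_square_error(N):
    """Right side of (22): the integral of |f - S_N|^2."""
    c0 = integrate(f) / (2 * math.pi)
    err = integrate(lambda x: f(x) ** 2) - 2 * math.pi * c0 ** 2
    for n in range(1, N + 1):
        an = integrate(lambda x: f(x) * math.cos(n * x)) / math.pi
        bn = integrate(lambda x: f(x) * math.sin(n * x)) / math.pi
        err -= math.pi * (an ** 2 + bn ** 2)
    return err

for N in (1, 2, 3):
    print(N, round(mean_square_error(N), 3))  # compare with 0.54, 0.22, 0.15
```

Note that the error depends only on the squares of the coefficients, so it is insensitive to their signs.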


2.22. Exercises.

(1) Let f(x) = x when −π ≤ x < π, and extend f to be 2π-periodic. Find a general formula for SN(x). Use a calculator to determine for which N we have $\int |f - S_N|^2 < \frac{1}{10}\int |f|^2$ and for which N we have $\int |f - S_N|^2 < \frac{1}{100}\int |f|^2$.

(2) (a) Find constants a and b that minimize $\int_{-1}^{1} |e^x - a - bx|^2\,dx$.
(b) Find constants c and d such that the line c + dx is tangent to e^x at x = 0.
(c) Graph e^x, a + bx, and c + dx together for x ∈ (−1, 1).

(3) (a) Find a constant c such that g(x) = 1 + cx is orthogonal to h(x) = sin x on (0, π). Graph g and h together for x ∈ (0, π).
(b) Let f(x) = 1 for x ≥ π/2 and f(x) = 0 otherwise. Find constants a and b that minimize $\int_0^{\pi} |f - ag - bh|^2$. Graph f and ag + bh together for x ∈ (0, π).

(4) (a) Verify that the functions 1, x, and cos x are mutually orthogonal on (−π, π).
(b) Find constants a, b, and c that minimize $\int_{-\pi}^{\pi} |x^{37} - a - bx - c\cos x|^2\,dx$ and find the value of the integral.

3. Mean Square Convergence

3.1. Continuous functions. Let f be a continuous function of period 2π. Let the Fourier coefficients of f be given by

\[ c_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-int}\,dt. \tag{23} \]

Let the symmetric partial sums of the Fourier series of f be given by

\[ S_N(\theta) = \sum_{n=-N}^{N} c_n e^{in\theta}. \tag{24} \]

We ask: in what sense does SN converge to f as N → ∞?

The simplest sense in which |f − SN| is small is given by the central result of Section 2: for any constants bn we have

\[ \int_{-\pi}^{\pi} \Big|f(\theta) - \sum_{n=-N}^{N} b_n e^{in\theta}\Big|^2 d\theta = \int_{-\pi}^{\pi} |f|^2 - 2\pi\sum_{n=-N}^{N} |c_n|^2 + 2\pi\sum_{n=-N}^{N} |b_n - c_n|^2, \tag{25} \]

which is just (19) with g0, ..., gk replaced by e^{−iNt}, ..., e^{iNt}, and can be checked by using $|z-w|^2 = |z|^2 - z\bar w - \bar z w + |w|^2$ on both sides and simplifying the left side with (23). We see the left side of (25) is smallest when bn = cn for all n.

To obtain a convergence result, we supplement the identity (25) with the following statement, proved in Theorem 3-1 of [See]: for any f as above, and for any ε > 0 there is a trigonometric polynomial P such that

\[ \max |f - P| < \varepsilon. \tag{26} \]

By (25), we have $\int |f - S_N|^2 \le \int |f - P|^2$, as long as N is large enough that $P(\theta) = \sum_{n=-N}^{N} b_n e^{in\theta}$ for some constants bn. Hence

\[ \int_{-\pi}^{\pi} |f - S_N|^2 \le \int_{-\pi}^{\pi} |f - P|^2 < 2\pi\varepsilon^2. \]

Since ε was arbitrary, we have $\lim \int |f - S_N|^2 = 0$, and taking bn = cn and N → ∞ in (25) gives the following result:


3.2. Theorem. Let f be a continuous function of period 2π. The symmetric partial sums SN converge to f in mean square, i.e.

\[ \int_{-\pi}^{\pi} |f - S_N|^2 \to 0, \quad \text{as } N \to \infty, \tag{27} \]

the series $\sum |c_n|^2$ converges, and Parseval’s equality [Par] holds:

\[ \sum_{n=-\infty}^{\infty} |c_n|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f|^2. \tag{28} \]

We will extend Theorem 3.2 to the most general possible functions f and sequences cn, n ∈ Z. We will obtain a correspondence, given by (23), between functions f such that the integral in (28) converges, and sequences cn, n ∈ Z such that the series in (28) converges. Moreover (27) and (28) will still hold.

The simplest functions omitted by Theorem 3.2 are the following:

3.3. Exercise. Let a and b be real numbers such that 0 < b − a < 2π, and let c be a complex number. Let f be a function of period 2π such that f(x) = 1 when a < x < b and f(x) = 0 when b < x < a + 2π. Find the Fourier coefficients cn, find a number A such that |ncn| ≤ A|c|, and conclude that $\sum |c_n|^2$ converges. Given λ ∈ (0, π), find a formula for $\sum_{n=1}^{\infty} \operatorname{sinc}^2(\lambda n)$ implied by (28), where sinc x = (sin x)/x.

Our first generalization of Theorem 3.2 includes functions like the ones in Exercise 3.3.

3.4. Theorem. Let f be a piecewise continuous function of period 2π. Let cn and SN be given by (23) and (24). Then SN converges to f in mean square (27), the series $\sum |c_n|^2$ converges, and Parseval’s equality (28) holds.

Proof. As in the proof of Theorem 3.2, we use the completion-of-squares identity (25), and observe that (28) follows from (27) by taking N → ∞ in (25). Hence it is enough to prove (27).

Let ε > 0 be given. It is enough to find a trigonometric polynomial P such that

\[ \int_{-\pi}^{\pi} |f - P|^2 < \varepsilon, \tag{29} \]

because, as in the proof of Theorem 3.2, $\int |f - S_N|^2 \le \int |f - P|^2$ for N large enough.

To find such a P, it is enough to find a continuous function g of period 2π such that

\[ \int_{-\pi}^{\pi} |f - g|^2 < \varepsilon/4, \tag{30} \]

because the existence of P such that

\[ \int_{-\pi}^{\pi} |g - P|^2 < \varepsilon/4, \tag{31} \]

is guaranteed by the convergence in mean square (27) of Theorem 3.2, and then (29) follows from (30) and (31) by |z + w|² ≤ 2|z|² + 2|w|².

To find such a g, we take g = f except in a small neighborhood of each discontinuity, where we make g linear, as in Figure 7; $\int |f - g|^2$ goes to zero as the neighborhoods become smaller. □

Our proof of Theorem 3.4 relies on a process of mean square approximation: given a piecewise continuous function f, we find a continuous function g such that $\int |f - g|^2$ is small, and then a trigonometric polynomial P such that $\int |g - P|^2$ is small. Elaborating this process of approximating wilder functions by tamer ones will give us our ultimate result.
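Parseval’s equality (28) can be watched converging for a piecewise continuous example. The sketch below takes f(x) = x on (−π, π), whose Fourier coefficients satisfy |cn| = 1/|n| for n ≠ 0 and c0 = 0 (the modulus of the coefficient computed in Example 2.11). The function name parseval_gap is a hypothetical choice made here.

```python
import math

def parseval_gap(N):
    """For f(x) = x on (-pi, pi), with |c_n| = 1/|n| for n != 0 and c_0 = 0,
    return (1/(2 pi)) * integral of |f - S_N|^2, which by (25) with b_n = c_n
    equals (1/(2 pi)) * integral of |f|^2 minus the sum of |c_n|^2, |n| <= N."""
    mean_square = math.pi ** 2 / 3          # (1/(2 pi)) * integral of x^2
    partial = 2 * sum(1.0 / n ** 2 for n in range(1, N + 1))
    return mean_square - partial

for N in (10, 100, 1000):
    print(N, parseval_gap(N))   # decreases roughly like 2/N
```

The gap tends to zero, as Theorem 3.4 predicts, but only at the slow rate 2/N, reflecting the jump of the periodic extension of f at x = π.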


Figure 7. An example of f and g.

Here are some examples of the kind of function we will have to include, to obtain a result for an arbitrary sequence cn such that $\sum |c_n|^2$ converges.

3.5. Example. Consider the cosine series

\[ \sum_{n=1}^{\infty} \frac{\cos n\theta}{n}. \]

To find the sum, we consider the geometric series $(1-x)^{-1} = \sum_{n=0}^{\infty} x^n$. After integrating term by term, we get $-\ln(1-x) = \sum_{n=1}^{\infty} x^n/n$, at least for |x| < 1. If we bravely substitute x = e^{iθ}, even though then |x| = 1, and take the real part, we get

\[ \sum_{n=1}^{\infty} \frac{\cos n\theta}{n} = -\operatorname{Re}\ln(1-e^{i\theta}) = -\ln|1-e^{i\theta}| = -\ln 2 - \ln|\sin(\theta/2)|, \]

where we used the fact that if e^z = w then Re z = ln |w|, and also $|1 - e^{i\theta}| = |e^{-i\theta/2} - e^{i\theta/2}| = 2|\sin(\theta/2)|$.

The calculation above can be checked directly with some more careful analysis. To proceed using mean square convergence instead, one can verify that

\[ -\frac{1}{2\pi}\int_0^{2\pi} \ln(\sin(\theta/2))\cos n\theta\,d\theta = \begin{cases} \ln 2, & n = 0, \\ 1/(2n), & n = 1, 2, \ldots, \end{cases} \]

(so that the coefficient of cos nθ, which is twice this value for n ≥ 1, is 1/n) as follows. For n = 0, use sin a = cos(a − π/2) to show $\int_{\pi}^{2\pi} \ln(\sin(\theta/2))\,d\theta = \int_0^{\pi} \ln(\cos(\theta/2))\,d\theta$, and then use 2 sin a cos a = sin 2a. For n ≥ 1, integrate by parts, use the product-to-sum formula 2 sin a cos b = sin(a + b) + sin(a − b), and then use the Dirichlet kernel formula $\frac{\sin((K+\frac12)\theta)}{\sin(\theta/2)} = \sum_{k=-K}^{K} e^{ik\theta}$.
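The formal calculation in Example 3.5 can be tested numerically away from θ = 0, where the series converges (slowly). This is only a sketch; the function names, the cutoff of 20000 terms, and the sample angles are arbitrary choices made here.

```python
import math

def cosine_partial_sum(theta, N):
    """Partial sum of sum_{n>=1} cos(n theta) / n."""
    return sum(math.cos(n * theta) / n for n in range(1, N + 1))

def claimed_sum(theta):
    """The closed form -ln 2 - ln|sin(theta/2)| from the formal calculation."""
    return -math.log(2) - math.log(abs(math.sin(theta / 2)))

for theta in (0.5, 2.0, 3.0):
    print(theta, cosine_partial_sum(theta, 20000), claimed_sum(theta))
```

Agreement is only to a few decimal places, because the remainder after N terms is of size roughly 1/(N |sin(θ/2)|); near θ = 0 the sum is unbounded, which is why this function lies outside the scope of Theorem 3.4.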

3.6. Example. Define a sequence of intervals in (0, 1) as follows. Let I1 be the open interval centered at 1/2 of length 1/4. For each j ≥ 2, let Ij be the union of the 2^{j−1} open intervals which have the following properties: they have equal length, they have the same midpoints as the 2^{j−1} intervals of (0, 1) \ (I1 ∪ ··· ∪ Ij−1), and they have total length 4^{−j}.

Figure 8. I1 is the interval of length 1/4, I2 is the union of the two intervals of length 1/32, and I3 is the union of the four intervals of length 1/256.
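The recursive definition of the sets Ij can be carried out exactly with rational arithmetic. The following sketch is an illustration only: the function name build_levels and the representation of each Ij as a list of endpoint pairs are assumptions made here, not notation from the notes.

```python
from fractions import Fraction

def build_levels(J):
    """Return [I_1, ..., I_J], each a sorted list of open intervals (lo, hi)."""
    gaps = [(Fraction(0), Fraction(1))]   # pieces of (0,1) left after earlier I_k
    levels = []
    for j in range(1, J + 1):
        half = Fraction(1, 4 ** j) / len(gaps) / 2   # half-length of each piece
        level, new_gaps = [], []
        for lo, hi in gaps:
            mid = (lo + hi) / 2
            level.append((mid - half, mid + half))
            new_gaps += [(lo, mid - half), (mid + half, hi)]
        levels.append(level)
        gaps = new_gaps
    return levels

levels = build_levels(3)
print(levels[0])                                  # the interval (3/8, 5/8)
print([hi - lo for lo, hi in levels[1]])          # two lengths 1/32
print(sum(hi - lo for lo, hi in levels[2]))       # total length 1/64
```

At stage j there are 2^{j−1} remaining pieces, and the new intervals share their midpoints, which reproduces the lengths quoted in the caption of Figure 8.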


Let fj(x) = 1 when x ∈ Ij and fj(x) = 0 otherwise. Let

\[ f(x) = \sum_{j=1}^{\infty} f_j(x), \qquad c_{n,j} = \int_0^1 f_j(t)e^{-2\pi i n t}\,dt, \qquad c_n = \sum_{j=1}^{\infty} c_{n,j}. \]

The first infinite sum converges because for each x at most one term is nonzero. The second infinite sum converges because

\[ |c_{n,j}| \le \int_0^1 f_j \le 2^{-2j}. \tag{32} \]

Let us also check that $\sum |c_n|^2$ converges. For this, observe that using $|\int_A^B e^{-2\pi i n t}\,dt| \le \pi^{-1}n^{-1}$ (see Exercise 3.7) gives

\[ |c_{n,j}| \le 2^{j-1}\pi^{-1}n^{-1}. \tag{33} \]

Taking the 3/8 power of (32) and the 5/8 power of (33) and multiplying the two gives

\[ |c_{n,j}| \le 2^{-j/8}(2\pi n)^{-5/8}, \quad\text{and hence}\quad |c_n| \le \sum_{j=1}^{\infty} 2^{-j/8}(2\pi n)^{-5/8} = \frac{2^{-1/8}(2\pi n)^{-5/8}}{1 - 2^{-1/8}}. \]

Thus $\sum |c_n|^2$ converges. We would like to conclude that the cn are the Fourier coefficients of f. The difficulty is that f is not Riemann integrable:

3.7. Exercise. In the notation of Example 3.6, find I1, I2, and inf I3. How many intervals are in Ij for each j? Use the answer to prove (33). Show that any Riemann upper sum of f is 1, and any Riemann lower sum is ≤ 1/3.

Hint: For the upper sums, show that any interval in (0, 1) contains points from some Ij by finding a sequence αj tending to zero such that the length of any interval in (0, 1) \ (I1 ∪ ··· ∪ Ij) is at most αj. Show that any lower sum corresponding to a choice of partition a = x0 < ··· < xj = b is $\le \sum_{k=1}^{j} \int f_k$ and use (32).

Despite the result of Exercise 3.7, the cn in Example 3.6 are the Fourier coefficients of f, the partial sums SN converge to f in mean square, and Parseval’s equality holds. But to see this we must use a more powerful theory of integration, namely that of Lebesgue.

4. The Lebesgue integral

4.1. Measures of intervals and points. Let a and b be real numbers with a < b. The measure of the interval [a, b] is b − a, and this is also the measure of the intervals [a, b), (a, b], and (a, b). Individual points have zero measure.

If E ⊂ R is a union of finitely many or countably many disjoint intervals and points, then its measure is the sum of the measures of the intervals.

For example, the measure of

\[ \{x \in [-2\pi, 2\pi] : \sin x \ge 1/\sqrt{2}\} = [-7\pi/4, -5\pi/4] \cup [\pi/4, 3\pi/4] \]

is π, and so are the measures of

\[ \{x \in (-2\pi, 2\pi) : \sin x > 1/\sqrt{2}\} \quad\text{and}\quad \{x \in [-2\pi, 2\pi) : \cos x > 1/\sqrt{2}\}. \]

4.2. Exercise. Find the measure of the set where

\[ (x-1)^1(x-2)^2(x-3)^3(x-4)^4(x-5)^5(x-6)^6(x-7)^7 < 0. \]


4.3. Measure zero. Lebesgue’s theory of measure [Leb] extends this concept to much more general subsets of R, and also to higher dimensional sets. For now we only need the following:

A set E ⊂ R has measure zero if, for every ε > 0, there is a collection of intervals of total length ≤ ε such that E is contained in the union of those intervals.

It is no loss of generality to demand that the covering intervals be open; one can pass from closed or half-open intervals to open ones by doubling the lengths of the covering intervals and keeping the same centers.

If a set E is contained in the union of a collection of intervals, we say E is covered by those intervals.

A property holds almost everywhere, or for almost every x in a given set, if the set of points where it fails has measure zero. This is abbreviated a.e.

We will see that, for purposes of integration, sets of measure zero can generally be ignored, and that having a property almost everywhere is often just as good as having it everywhere.

To check the consistency of the definition of measure zero, and of measure of an interval, we must show that a closed interval [a, b] does not have measure zero. This follows from the fact that it is impossible to cover [a, b] by a collection of open intervals having total length less than b − a, which is fairly obvious for finite collections of open intervals, but not for infinite collections. We can deduce the result for infinite collections from the result for finite collections using the Heine–Borel theorem, which we will also need later anyway; see Appendix B.

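4.4. Examples. Let $E = \{x_0\}$ where $x_0 \in \mathbb{R}$. Then $E$ has measure zero because $E \subset (x_0 - \varepsilon, x_0 + \varepsilon)$ for any $\varepsilon > 0$. Similarly, if $E' = \{x_1, x_2, \dots, x_n\}$, then $E'$ has measure zero because

\[ E' \subset (x_1 - \varepsilon, x_1 + \varepsilon) \cup (x_2 - \varepsilon, x_2 + \varepsilon) \cup \dots \cup (x_n - \varepsilon, x_n + \varepsilon). \]

Moreover, if $E'' = \{x_1, x_2, x_3, \dots\}$ is countably infinite, then $E''$ also has measure zero because

\[ E'' \subset (x_1 - \varepsilon/2, x_1 + \varepsilon/2) \cup (x_2 - \varepsilon/4, x_2 + \varepsilon/4) \cup (x_3 - \varepsilon/8, x_3 + \varepsilon/8) \cup \cdots. \]

Finally, if $E = E_1 \cup E_2 \cup \cdots$, where each $E_n$ has measure zero, then we can show that $E$ does too by letting $I_n$ be the union of a collection of open intervals of total length $\le \varepsilon 2^{-n}$ such that $E_n \subset I_n$. Then the union $I_1 \cup I_2 \cup \cdots$ covers $E$ and has total length $\le \varepsilon 2^{-1} + \varepsilon 2^{-2} + \cdots = \varepsilon$.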
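The geometric-series covering used in these examples can be sketched concretely. The following is only an illustration under stated assumptions: the enumeration of rationals by increasing denominator and the helper names `rationals_in_unit_interval` and `cover` are choices made here, and only finitely many points are handled, whereas the argument in the text covers the whole countable set.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], listed by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an open interval of length eps * 2**(-n)."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 1)  # half-width, so the length is eps / 2**n
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)
pts = rationals_in_unit_interval(50)
ivs = cover(pts, eps)
total = sum(b - a for a, b in ivs)
print(total < eps)
```

Exact rational arithmetic is used so that the bound on the total length is checked without rounding error; however many points are covered this way, the total length stays below ε.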

A piecewise continuous function is continuous almost everywhere. Dirichlet’s function (also known as the characteristic function of the rational numbers)

\[ D(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q}, \end{cases} \]

is 0 almost everywhere. The function $f(x) = x$ takes irrational values almost everywhere. Any one-to-one function takes irrational values almost everywhere.

4.5. Exercises. Prove the following:

(1) For any $a \in \mathbb{R}$, the set $E_a = \{x \in \mathbb{R} : \cos x = a\}$ has measure zero.
(2) $\cos x$ is irrational almost everywhere.
(3) $0 < |\cos x| < 1$ a.e.

4.6. The Cantor set. Let $I_0 = [0, 1]$, and for each $j \ge 1$ let $I_j$ be obtained from $I_{j-1}$ by deleting the open middle third of each interval of $I_{j-1}$. Thus

\[ I_1 = [0, 1/3] \cup [2/3, 1], \qquad I_2 = [0, 1/9] \cup [2/9, 1/3] \cup [2/3, 7/9] \cup [8/9, 1], \]

and more generally, each $I_j$ consists of $2^j$ intervals having total length $(2/3)^j$. The Cantor set is the set of points common to all of the sets $I_j$. To visualize this set, let

\[ f_j(x) = \begin{cases} 1, & x \in I_j, \\ 0, & \text{otherwise,} \end{cases} \qquad f(x) = \sum_{j=1}^{\infty} f_j(x); \]

see Figure 9. The sum $\sum_{j=1}^{\infty} f_j(x)$ converges when $x$ is not in the Cantor set, because such an $x$ lies in finitely many of the $I_j$. The sum $\sum_{j=1}^{\infty} f_j(x)$ diverges when $x$ is in the Cantor set, because then $x$ lies in all of the $I_j$. The Cantor set has measure zero because it is covered by each $I_j$, which is a collection of intervals having total length $(2/3)^j$, and this is less than any $\varepsilon > 0$ if $j$ is large enough.

Thus the sum defining $f(x)$ diverges on a set of measure zero, and converges almost everywhere, and $f$ is defined almost everywhere.

Figure 9. The partial sum f1 + f2 + f3 of the Cantor set example 4.6.
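The counts and lengths quoted for the sets $I_j$ can be reproduced with a small sketch. The function names `cantor_stage` and `total_length` are hypothetical helpers introduced here for illustration; exact fractions are used so that the equality with $(2/3)^j$ is checked without rounding.

```python
from fractions import Fraction

def cantor_stage(j):
    """Return the closed intervals making up I_j as (left, right) pairs of Fractions."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(j):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt.append((a, a + third))   # keep the left third
            nxt.append((b - third, b))   # keep the right third
        intervals = nxt
    return intervals

def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(b - a for a, b in intervals)

for j in range(1, 5):
    stage = cantor_stage(j)
    print(j, len(stage), total_length(stage))
```

Each line printed shows $j$, the number $2^j$ of intervals in $I_j$, and their total length $(2/3)^j$.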

4.7. Upper integrable functions. Our first extension of the notion of integral is motivated by Examples 3.6 and 4.6.

To state it, we define a step function to be a function $\varphi : \mathbb{R} \to \mathbb{R}$, or $\mathbb{R} \to \mathbb{C}$, which has a constant value $c_k$ on each of a finite set of $n$ intervals $(x_{k-1}, x_k)$, with $x_0 < x_1 < \dots < x_n$, and which is zero outside of $[x_0, x_n]$. The integral of a step function is defined by

\[ \int \varphi = \sum_{k=1}^{n} c_k (x_k - x_{k-1}). \tag{34} \]

The functions $f_j$ in Examples 3.6 and 4.6 are all step functions, and any linear combination of step functions is a step function.


Figure 10. An example of a step function.
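As a concrete reading of formula (34), here is a minimal sketch that stores a step function as a list of breakpoints $x_0 < \dots < x_n$ together with the values $c_1, \dots, c_n$. This representation and the name `step_integral` are assumptions made for illustration only.

```python
from fractions import Fraction

def step_integral(breakpoints, values):
    """Integral (34): the sum of c_k * (x_k - x_{k-1}).

    The step function equals values[k-1] on (breakpoints[k-1], breakpoints[k])
    and is zero outside [breakpoints[0], breakpoints[-1]].
    """
    if len(values) != len(breakpoints) - 1:
        raise ValueError("need exactly one value per interval")
    if any(b <= a for a, b in zip(breakpoints, breakpoints[1:])):
        raise ValueError("breakpoints must be strictly increasing")
    return sum(c * (b - a) for c, a, b in zip(values, breakpoints, breakpoints[1:]))

xs = [Fraction(0), Fraction(1), Fraction(3), Fraction(4), Fraction(6)]
cs = [2, -1, 3, Fraction(1, 2)]
print(step_integral(xs, cs))
```

The values at the breakpoints themselves are not recorded, because they do not affect the integral.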


In (34), and below, we denote by

\[ \int f = \int f(x)\,dx \tag{35} \]

the integral of a function over the real line $\mathbb{R}$. For any $a < b$, we define the integral of a function $g$ over the interval $(a, b)$ in terms of (35) by writing

\[ \int_a^b g = \int f, \qquad f(x) = \begin{cases} g(x), & a < x < b, \\ 0, & \text{otherwise.} \end{cases} \tag{36} \]

We now extend our concept of integral to the set of functions $f : \mathbb{R} \to \mathbb{R}$ such that there is a nondecreasing sequence of step functions, $\varphi_1 \le \varphi_2 \le \cdots$, such that $\sup_n \int \varphi_n$ is finite and $\varphi_n(x) \to f(x)$ for almost every $x \in \mathbb{R}$. We call such functions upper integrable. For upper integrable functions we define

\[ \int f = \lim_{n\to\infty} \int \varphi_n. \tag{37} \]

For example, for the $f$ in Example 4.6, we have

\[ \int f = \sum_{j=1}^{\infty} \int f_j = \sum_{j=1}^{\infty} (2/3)^j = 2. \]

We must check that this definition is consistent, i.e. that

(1) the limit in (37) is independent of the choice of the $\varphi_n$, and
(2) if $f$ is a step function, then the definitions (37) and (34) agree.

Clearly (2) follows from (1), because if $f$ is a step function then we may take $\varphi_n = f$ for all $n$. To prove (1), we will prove the more general statement that if $f$, $g$ are upper integrable functions with $f \le g$, and $\varphi_1 \le \varphi_2 \le \cdots$, $\psi_1 \le \psi_2 \le \cdots$ are corresponding sequences of step functions, then

\[ \lim_{m\to\infty} \int \varphi_m \le \lim_{n\to\infty} \int \psi_n. \tag{38} \]

The proof is difficult, so to warm up we prove that this definition covers all nondecreasing sequences of step functions with bounded integrals.

Lemma. Let $\varphi_1 \le \varphi_2 \le \cdots$ be a nondecreasing sequence of step functions. Then either the sequence of integrals $\int \varphi_1, \int \varphi_2, \dots$ diverges to infinity or else the sequence of functions $\varphi_1(x), \varphi_2(x), \dots$ converges for almost all $x$.

Proof. Suppose the sequence of integrals does not diverge to infinity. Then take a number $A$ such that $\int \varphi_n \le A$ for all $n$. Let $S$ be the set of points where the sequence $\varphi_n(x)$ fails to converge. To show $S$ has measure zero, we cover it in the following way. For every $n \in \mathbb{N}$ and $M \in \mathbb{R}$, let $S_{n,M}$ be the set of points where

\[ \varphi_n(x) \le M < \varphi_{n+1}(x), \]

and let

\[ S_M = S_{1,M} \cup S_{2,M} \cup \cdots. \]

Then $S \subset S_M$, provided $M \ge \max \varphi_1$. Moreover, $S_M$ consists of countably many points and intervals, because each $S_{n,M}$ consists of finitely many points and intervals. It is enough to show that the total length of the intervals in $S_M$ tends to 0 as $M \to \infty$.

Let $\ell_{n,M}$ be the total length of the intervals in $S_{n,M}$. Then

\[ A \ge \int \varphi_{n+1} \ge \int_{S_{1,M} \cup \dots \cup S_{n,M}} \varphi_{n+1} \ge \int_{S_{1,M} \cup \dots \cup S_{n,M}} M = M(\ell_{1,M} + \dots + \ell_{n,M}), \]

where for the equality at the end we used the fact that the $S_{1,M}, \dots, S_{n,M}$ are mutually disjoint. This implies that $\sum_{j=1}^{n} \ell_{j,M} \le A/M$ and hence that $\sum_{j=1}^{\infty} \ell_{j,M} \le A/M$, i.e. the total length of the intervals in $S_M$ is $\le A/M$, which tends to 0 as $M \to \infty$ as desired. □

Proof of (38). We reduce the problem to the analysis of a sequence of step functions tending monotonically to zero almost everywhere. Let $m$ be given, and observe that the sequence $\varphi_m - \psi_1 \ge \varphi_m - \psi_2 \ge \cdots$ tends almost everywhere to the limit $\varphi_m - g$, which is almost everywhere $\le f - g \le 0$. Consequently, the sequence of positive parts³ $\max(\varphi_m - \psi_1, 0) \ge \max(\varphi_m - \psi_2, 0) \ge \cdots$ tends to zero almost everywhere. By Lemma B.1,

\[ \lim_{n\to\infty} \int \max(\varphi_m - \psi_n, 0) = 0. \]

Hence

\[ \lim_{n\to\infty} \int (\varphi_m - \psi_n) \le 0, \quad\text{or}\quad \int \varphi_m \le \lim_{n\to\infty} \int \psi_n. \]

Letting $m \to \infty$ completes the proof. □

Observe that (38) further implies that for any upper integrable f and g we have

\[ f \le g \implies \int f \le \int g. \tag{39} \]

Now that we have shown the consistency of the definition, let us look at some more examples, and compare this new integral with the more familiar Riemann integral. Recall that given a function $f$ on $[a, b]$ and a choice of partition $a = x_0 < x_1 < \dots < x_m = b$, the upper and lower Riemann sums of $f$ with respect to this partition are

\[ \sum_{j=1}^{m} (x_j - x_{j-1}) \sup_{x \in [x_{j-1}, x_j]} f(x) \quad\text{and}\quad \sum_{j=1}^{m} (x_j - x_{j-1}) \inf_{x \in [x_{j-1}, x_j]} f(x). \]

The function is Riemann integrable if, for any ε > 0, there is a partition such that the upper and lower sums are within ε of each other, and the Riemann integral is the common value these sums approach as ε → 0.
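To make the upper and lower sums concrete, the following sketch evaluates them on equally spaced partitions. It is an approximation under a stated assumption: the supremum and infimum on each subinterval are estimated by sampling, which is exact for monotone functions because both endpoints are sampled. The name `riemann_sums` and the parameter `samples` are illustrative choices.

```python
def riemann_sums(f, a, b, m, samples=200):
    """Upper and lower Riemann sums of f on [a, b] with m equal subintervals.

    sup and inf on each subinterval are estimated from `samples + 1` sample
    points including both endpoints (exact when f is monotone on the subinterval).
    """
    upper = lower = 0.0
    width = (b - a) / m
    for j in range(m):
        left = a + j * width
        ys = [f(left + width * i / samples) for i in range(samples + 1)]
        upper += width * max(ys)
        lower += width * min(ys)
    return upper, lower

for m in (10, 100, 1000):
    up, lo = riemann_sums(lambda x: x * x, 0.0, 1.0, m)
    print(m, lo, up, up - lo)
```

For $f(x) = x^2$ on $[0, 1]$ the gap between the two sums is $1/m$, so the sums squeeze the value 1/3 as the partition is refined.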

4.8. Examples.

(1) Let $g : [a, b] \to \mathbb{R}$ be continuous. Define $f$ and $\int_a^b g$ in terms of (35) using (36). Then $f$ is upper integrable and the above definition of $\int_a^b g = \int f$ agrees with the one from Riemann integration. Indeed, for each $n$, let $a = x_0 < x_1 < \dots < x_n = b$ be equally spaced points in $[a, b]$, and define $\varphi_n : [a, b] \to \mathbb{R}$ to have the constant value $c_k = \min_{[x_{k-1}, x_k]} g$ on $(x_{k-1}, x_k)$. The integral $\int \varphi_n$ is the lower Riemann sum: see Figure 11.⁴

More generally, the same holds if $g$ is any Riemann integrable function, as shown in Section 5.1.7 of [SzN], but we will not need this.

(2) Consider Dirichlet’s function

\[ D(x) = \begin{cases} 1, & x \in \mathbb{Q}, \\ 0, & x \notin \mathbb{Q}, \end{cases} \]

³ If $f$ and $g$ are functions, $\max(f, g)$ denotes the function $x \mapsto \max(f(x), g(x))$. Thus $\max(f, 0)$ is $f$ when $f$ is positive and 0 otherwise.

⁴ Adapted from https://en.wikipedia.org/wiki/Riemann_sum.


Figure 11. The shaded area corresponds to $\int_a^b \varphi_8(x)\,dx$.

as in Example 4.4. Then we may take $\varphi_n(x) = 0$ for all $n$ and $x$, so $D$ is an upper integrable function and $\int_a^b D = \int D = 0$ regardless of $a$ and $b$. On the other hand, the Riemann integral of $D$ over $[a, b]$ does not exist, because all the upper Riemann sums are $b - a$ while all the lower Riemann sums are 0.

(3) Dirichlet’s function $D(x)$ is not Riemann integrable, but it is almost everywhere equal to the zero function, which is Riemann integrable. Examples of upper integrable functions $f$ which are not almost everywhere equal to Riemann integrable functions are provided by Examples 3.6 and 4.6.

(4) If $f$ and $g$ are upper integrable, then so are $f + g$, $\max(f, g)$, and $\min(f, g)$. This follows from examining the corresponding sequences of step functions $\varphi_n + \psi_n$, $\max(\varphi_n, \psi_n)$, and $\min(\varphi_n, \psi_n)$.

4.9. Exercise. Show that if $f$ is upper integrable, $\int f = 0$, and $f \ge 0$ almost everywhere, then $f = 0$ almost everywhere.

Hint: Let $\varphi_1 \le \varphi_2 \le \cdots$ be a sequence of step functions tending almost everywhere to $f$, and let $\psi_n = \max(0, \varphi_n)$.

4.10. Integrable functions. We next extend the concept of integral to the set of functions $h = f - g$, where $f$ and $g$ are upper integrable functions. We call such functions integrable, and denote the set of them by $L^1$ or $L^1(\mathbb{R})$. For these functions we define

\[ \int h = \int f - \int g. \tag{40} \]

This definition is consistent because if $\tilde f$ and $\tilde g$ are another such pair of functions, so that $h = \tilde f - \tilde g$, then writing

\[ \int f - \int g = \int \tilde f - \int \tilde g \iff \int f + \int \tilde g = \int g + \int \tilde f, \tag{41} \]

reduces this to showing that $\int (f + g) = \int f + \int g$ for any upper integrable functions, which follows from the definition (37).

From Example 4.8 (4), we see that if $f$ and $g$ are integrable, then so are any linear combinations of them, as are $\max(f, g)$, $\min(f, g)$, and $|f|$; for the last one use $|f| = \max(f, -f)$.


A complex-valued function is integrable if and only if its real and imaginary parts are. The set of complex-valued integrable functions is also denoted $L^1$, and one must determine from context whether real- or complex-valued functions are meant. We write $L^1([a, b])$ for the set of (real- or complex-valued) integrable functions on a bounded interval $[a, b]$, and this is also sometimes abbreviated as $L^1$.

From the definitions (37) and (40) we see that any integrable function can be approximated by a sequence of step functions:

Lemma. For any $h \in L^1$ there is a sequence of step functions $\varphi_1, \varphi_2, \dots$ such that $\varphi_n \to h$ a.e. and $\int |\varphi_n - h| \to 0$ as $n \to \infty$.

4.11. Exercises.

(1) If there are constants $A$ and $B$ such that $A \le h(x) \le B$ for all $x$, show that the step functions $\varphi_n$ may be chosen such that $A \le \varphi_n(x) \le B$.

Hint: Given $\varphi_n$ from the Lemma, define $\tilde\varphi_n$ by

\[ \tilde\varphi_n(x) = \begin{cases} \varphi_n(x), & A \le \varphi_n(x) \le B, \\ B, & \varphi_n(x) \ge B, \\ A, & \varphi_n(x) \le A, \end{cases} \]

and show that the $\tilde\varphi_n$ have the required properties.

(2) If $h$ is complex-valued and there is a constant $C$ such that $|h| \le C$, show that the step functions $\varphi_n$ may be chosen such that $|\varphi_n| \le C\sqrt{2}$ by applying (1) to the real and imaginary parts of $h$.

(3) Show that if $h_1$ and $h_2$ are integrable, and $h_1 \le h_2$, then $\int h_1 \le \int h_2$.

Hint: Argue as in (41) to reduce this to (39).
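The truncation in the hint to part (1) is the familiar clamping operation. The sketch below is only a numerical sanity check, not a proof: it tests on random samples the pointwise inequality $|\tilde\varphi - h| \le |\varphi - h|$ whenever $A \le h \le B$, which is the reason truncation does not spoil the approximation. The names `clamp` and `truncation_never_hurts` are hypothetical.

```python
import random

def clamp(value, lower, upper):
    """Truncate value to the interval [lower, upper]."""
    return max(lower, min(value, upper))

def truncation_never_hurts(h_value, phi_value, A, B):
    """Check |clamp(phi) - h| <= |phi - h| for one sample with A <= h <= B."""
    return abs(clamp(phi_value, A, B) - h_value) <= abs(phi_value - h_value)

random.seed(0)
A, B = -1.0, 2.0
# Each sample pairs a value of h in [A, B] with an arbitrary value of phi.
samples = [(random.uniform(A, B), random.uniform(-10.0, 10.0)) for _ in range(1000)]
print(all(truncation_never_hurts(h, phi, A, B) for h, phi in samples))
```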

4.12. Absolute Convergence Theorem. We now prove several theorems giving senses in which the set of integrable functions is complete, meaning that limits of integrable functions are integrable, and the mapping $f \mapsto \int f$ is continuous, meaning $\lim \int f_n = \int \lim f_n$. The first theorem deals with absolutely convergent series.

Theorem. Let $g_1, g_2, \dots$ be integrable functions. If the series of integrals $\int |g_1| + \int |g_2| + \cdots$ converges, then the series of functions $g_1 + g_2 + \cdots$ converges almost everywhere to an integrable function, and

\[ \sum_{n=1}^{\infty} \int g_n = \int \sum_{n=1}^{\infty} g_n. \tag{42} \]

Proof. We consider separately five cases, reducing each case to the one before. Let

\[ h_n = g_1 + \dots + g_n. \]

1. Suppose the functions $g_n$, and hence $h_n$, are all nonnegative step functions. Then $\sum g_n = \lim h_n$ is an upper integrable function and the result follows from the Lemma of Section 4.7.

2. Suppose the functions $g_n$, and hence $h_n$, are all nonnegative upper integrable functions. Then by definition there are sequences of step functions

\[ \varphi_{1,1} \le \varphi_{1,2} \le \cdots \to h_1 \ \text{a.e.}, \]

\[ \varphi_{2,1} \le \varphi_{2,2} \le \cdots \to h_2 \ \text{a.e.}, \]

etc.

Define a new sequence of step functions $\varphi_n = \max\{\varphi_{j,k} \mid 1 \le j, k \le n\}$. Then $\varphi_1 \le \varphi_2 \le \cdots$ and $\int \varphi_n \le \int h_n$ for all $n$, so by Case 1 above (i.e., by the Lemma of Section 4.7), there is an upper integrable function $h$ such that $\varphi_n \to h$ almost everywhere and $\int \varphi_n \to \int h$. Since $\varphi_{n,m} \le \varphi_m$ for $n \le m$, letting $m \to \infty$ and combining with $\varphi_n \le h_n$ we get

\[ \varphi_n \le h_n \le h, \quad\text{for each } n. \]

Letting $n \to \infty$ we see that $h_n \to h$ almost everywhere. Similarly,

\[ \int \varphi_n \le \int h_n \le \int h, \]

so letting $n \to \infty$ again gives the desired conclusion.

3. Suppose the $g_n$ are nonnegative integrable functions. Write each $g_n$ as a difference of two nonnegative upper integrable functions, $g_n = g_n^{(1)} - g_n^{(2)}$, with $\int g_n^{(2)} \le 2^{-n}$. This is possible because given any upper integrable functions $f_n^{(1)}$ and $f_n^{(2)}$ with $g_n = f_n^{(1)} - f_n^{(2)}$, we may take a step function $\varphi_n \le f_n^{(2)}$ such that $\int f_n^{(2)} - \int \varphi_n \le 2^{-n}$, and then put $g_n^{(1)} = f_n^{(1)} - \varphi_n$ and $g_n^{(2)} = f_n^{(2)} - \varphi_n$; note that $g_n^{(1)} \ge 0$ follows from $g_n^{(1)} = g_n + g_n^{(2)}$.

Thus $\sum \int g_n^{(2)}$ converges, and since $\sum \int g_n$ converges by hypothesis, we see that $\sum \int g_n^{(1)}$ converges. Then, by Case 2 above, $\sum g_n^{(1)}$ and $\sum g_n^{(2)}$ converge almost everywhere to upper integrable functions, with $\sum \int g_n = \int \sum g_n^{(1)} - \int \sum g_n^{(2)}$ as desired.

4. Suppose the $g_n$ are real-valued integrable functions. Then write $g_n = a_n - b_n$, where $a_n = \max(g_n, 0)$ and $-b_n = \min(g_n, 0)$, and apply Case 3 to $a_n$ and $b_n$.

5. Suppose the $g_n$ are complex-valued integrable functions. Then write $g_n = a_n + i b_n$, and apply Case 4 to $a_n$ and $b_n$. □

4.13. Exercise. Let $f$ be an integrable function such that $\int |f| = 0$. Show that $f = 0$ almost everywhere by applying the above theorem with all the $g_n$ replaced by $|f|$.

4.14. Monotone Convergence Theorem. Our next theorem follows immediately from the previous one. It deals with monotone sequences.

Theorem. Let $f_1 \le f_2 \le \cdots$ be a sequence of integrable functions. If the sequence of integrals $\int f_1, \int f_2, \dots$ converges, then the sequence of functions $f_1, f_2, \dots$ converges almost everywhere to an integrable function, and

\[ \int \lim_{n\to\infty} f_n = \lim_{n\to\infty} \int f_n. \]

The same result holds if instead $f_1 \ge f_2 \ge \cdots$.

Proof. If $f_1 \le f_2 \le \cdots$, then apply the above Theorem with $g_1 = f_1$ and $g_n = f_n - f_{n-1}$ for $n \ge 2$. If $f_1 \ge f_2 \ge \cdots$, then apply the previous case to $h_1 \le h_2 \le \cdots$, with $h_n = -f_n$. □

4.15. Dominated Convergence Theorem. Our next theorem deals with sequences which are bounded, or ‘dominated’, by an integrable function.

Theorem. Let $f_1, f_2, \dots$ be a sequence of integrable functions such that $|f_n(x)| \le g(x)$ a.e. for some integrable function $g$, and suppose this sequence converges a.e. to a function $f$. Then $f$ is integrable and

\[ \int \lim_{n\to\infty} f_n = \lim_{n\to\infty} \int f_n. \]


Proof. By considering real and imaginary parts separately we may assume that the $f_n$ are real valued.

Let $g_n = \lim_{m\to\infty} \min(f_n, \dots, f_m)$. By the Monotone Convergence Theorem, each $g_n$ is integrable, and $\int g_n \le \int g$. Also $g_1 \le g_2 \le \cdots \to f$ a.e., by the definition of a limit. So by the Monotone Convergence Theorem, $f$ is integrable and $\int f = \lim \int g_n$.

In the same way let $h_n = \lim_{m\to\infty} \max(f_n, \dots, f_m)$. By the Monotone Convergence Theorem, each $h_n$ is integrable, and $\int h_n \ge -\int g$. Also $h_1 \ge h_2 \ge \cdots \to f$ a.e., by the definition of a limit. So by the Monotone Convergence Theorem, $\int f = \lim \int h_n$.

Finally, using

\[ g_n(x) \le f_n(x) \le h_n(x), \]

integrating and taking $n \to \infty$ gives the conclusion. □

This gives us a powerful method of checking functions for integrability:

Corollary. Let $f_n$ be a sequence of integrable functions tending almost everywhere to a function $f$. If there is an integrable function $g$ such that $|f| \le g$, then $f$ is integrable.

Proof. By considering real and imaginary parts separately we may assume that the $f_n$ are real valued. Then apply the dominated convergence theorem to the sequence $\tilde f_n$ defined by

\[ \tilde f_n(x) = \begin{cases} g(x), & g(x) \le f_n(x), \\ f_n(x), & -g(x) \le f_n(x) \le g(x), \\ -g(x), & f_n(x) \le -g(x). \end{cases} \]

□

For example, let $f = u + iv$ be a complex-valued integrable function, i.e. a complex-valued function such that $u$ and $v$ are integrable. Then we can check that $|f| = \sqrt{u^2 + v^2}$ is also integrable by applying the Corollary with $f_n$ replaced by $\sqrt{\varphi_n^2 + \psi_n^2}$, where $\varphi_n$ and $\psi_n$ are sequences of step functions converging a.e. to $u$ and $v$ respectively, and with $g$ replaced by $|u| + |v|$.

4.16. Exercise. Show that the functions $\tilde f_n$ of the Corollary are integrable by writing them in terms of the max and min of $g$, $f_n$, and $-g$.

4.17. Exercise.

(1) Use $(a - b)^2 \ge 0$ to show that $2ab \le a^2 + b^2$ for any real $a$ and $b$.

(2) Show that $\sqrt{u^2 + v^2} \ge (\cos\alpha)u + (\sin\alpha)v$ for any real $\alpha$, $u$, $v$, by squaring both sides and applying part (1).

(3) Use part (2) to show that if $f$ is a complex-valued integrable function, then

\[ \int |f| = \int |u + iv| \ge \operatorname{Re}\Big[e^{-i\alpha} \int (u + iv)\Big] = \operatorname{Re}\Big[e^{-i\alpha} \int f\Big], \]

for any real $\alpha$.

(4) Deduce that $\int |f| \ge |\int f|$ by choosing $\alpha$ appropriately.

4.18. Square integrable functions. We are now ready to define the space of square integrable functions on a bounded interval $[a, b]$:

\[ L^2 = L^2([a, b]) = \{f \in L^1 \text{ such that } |f|^2 \in L^1\}. \tag{43} \]

If $f \in L^2$ and $g \in L^2$, then the product $fg$ is integrable. This can be checked by observing that if $\varphi_n$ and $\psi_n$ are sequences of step functions tending to $f$ and $g$ respectively, then $\varphi_n \psi_n$ is a sequence of step functions tending to $fg$, so the Corollary of Section 4.15 applies with $f_n$ replaced by $\varphi_n \psi_n$, $f$ replaced by $fg$, and $g$ replaced by $\frac12 |f|^2 + \frac12 |g|^2$ (see part (1) of Exercise 4.17).

In particular, we still have the least square approximation identity (25), and the Cauchy–Schwarz inequality (14):

\[ \Big| \int_{-\pi}^{\pi} fg \Big| \le \Big( \int_{-\pi}^{\pi} |f|^2 \Big)^{1/2} \Big( \int_{-\pi}^{\pi} |g|^2 \Big)^{1/2}. \tag{44} \]

Square integrable functions have Fourier series which converge to them in the mean square sense.

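Theorem. Let $f \in L^2([-\pi, \pi])$. Put

\[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int}\,dt, \qquad S_N(\theta) = \sum_{n=-N}^{N} c_n e^{in\theta}. \]

Then

\[ \lim_{N\to\infty} \int_{-\pi}^{\pi} |f - S_N|^2 = 0, \qquad \int_{-\pi}^{\pi} |f|^2 = 2\pi \sum_{n=-\infty}^{\infty} |c_n|^2. \]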

Proof. In this proof, all integrals are from $-\pi$ to $\pi$. Let $\varepsilon > 0$ be given. Proceeding as in the proof of Theorem 3.4, we see that it is enough to show that $\lim \int |f - S_N|^2 = 0$, and by taking real and imaginary parts separately we may assume that $f$ is real valued. Again proceeding as in the proof of Theorem 3.4, we see that it is enough to find a trigonometric polynomial $P$ such that

\[ \int |f - P|^2 < \varepsilon. \]

To find such a trigonometric polynomial, it is enough if we find a step function $\varphi$ such that

\[ \int |f - \varphi|^2 < \varepsilon/4, \]

because by Theorem 3.4 we know that there exists a trigonometric polynomial $P$ such that

\[ \int |\varphi - P|^2 < \varepsilon/4, \]

and $|f - P|^2 \le 2|f - \varphi|^2 + 2|\varphi - P|^2$. To find such a step function, let $f_k(x) = f(x)$ if $|f(x)| \le k$ and $f_k(x) = 0$ otherwise. Then $\int |f - f_k|^2 \to 0$ by the Monotone Convergence Theorem, so if $k$ is large enough we will have

\[ \int |f - f_k|^2 < \varepsilon/16. \]

Fix such a $k$, and, using Exercise 4.11, let $\varphi_j$ be a sequence of step functions such that $\varphi_j \to f_k$ almost everywhere, $\int |\varphi_j - f_k| \to 0$, and $|\varphi_j| \le k$. Then

\[ \int |f_k - \varphi_j|^2 \le 2k \int |f_k - \varphi_j| < \varepsilon/16, \]

for $j$ large enough, as desired. □

To go in the opposite direction, from the sequence $c_n$ to the function $f$, we use the following:

Lemma (Fatou’s Lemma). Let $f_1, f_2, \dots$ be a sequence of nonnegative integrable functions such that $\int f_n \le A$ for all $n$ and $f_n \to f$ almost everywhere. Then $f$ is integrable, and $\int f \le A$.

Proof. Let $g_n = \lim_{m\to\infty} \min(f_n, \dots, f_m)$. By the Monotone Convergence Theorem, each $g_n$ is integrable, and $\int g_n \le A$. Also $g_1 \le g_2 \le \cdots \to f$ a.e., by the definition of a limit. So by the Monotone Convergence Theorem, $f$ is integrable and $\int f = \lim \int g_n \le A$. □
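The inequality in Fatou’s Lemma can be strict. A standard illustration, which is not taken from the text and is offered here only as a hedged example, is the sequence of step functions equal to $n$ on $(0, 1/n)$ and zero elsewhere: every integral equals 1, yet the pointwise limit is the zero function. The names `f` and `integral_f` below are illustrative.

```python
from fractions import Fraction

def f(n, x):
    """Step function equal to n on the open interval (0, 1/n) and 0 elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Integral of f(n, .) by formula (34): height n times width 1/n."""
    return n * Fraction(1, n)

x = Fraction(1, 10)
# The values at a fixed x > 0 are eventually 0, although every integral is 1.
print([f(n, x) for n in (5, 9, 10, 11, 1000)])
print([integral_f(n) for n in (1, 10, 100)])
```

Here $A = 1$ works as the bound on the integrals, while the limit function has integral 0.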


Theorem. Let $f_1, f_2, \dots$ be a sequence of bounded integrable functions on a bounded interval $[a, b]$. Suppose that, for every $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that

\[ m \ge n \ge N \implies \int \Big| \sum_{j=n}^{m} f_j \Big|^2 < \varepsilon. \tag{45} \]

Then there is a function $f \in L^2([a, b])$ such that

\[ \lim_{n\to\infty} \int \Big| f - \sum_{j=1}^{n} f_j \Big|^2 = 0. \tag{46} \]

Proof. We first use the Absolute Convergence Theorem to define $f$. Let $m_1 < m_2 < \cdots$ be a sequence of positive integers such that

\[ n \ge m_k \implies \int \Big| \sum_{j=m_k}^{n} f_j \Big|^2 < 2^{-k}, \]

and let

\[ g_1 = \sum_{j=1}^{m_1 - 1} f_j, \quad g_2 = \sum_{j=m_1}^{m_2 - 1} f_j, \quad \cdots, \quad\text{so that}\quad \int |g_k|^2 \le 2^{-k+1} \text{ for } k \ge 2. \]

Then, by the Absolute Convergence Theorem, the series $\sum_{k=1}^{\infty} g_k$ converges almost everywhere to an integrable function which we denote $f$, provided we can check that $\sum \int |g_k|$ converges. This convergence follows from the Cauchy–Schwarz inequality, which gives

\[ \int |g_k| \le \Big( \int |g_k|^2 \Big)^{1/2} \sqrt{b - a} \le 2^{(-k+1)/2} \sqrt{b - a}. \]

Let $\varepsilon > 0$ be given. Take $N$ such that (45) holds. Then, for any $k$ such that $m_k > n \ge N$,

\[ \int \Big| \sum_{j=1}^{k} g_j - \sum_{j=1}^{n} f_j \Big|^2 = \int \Big| \sum_{j=n+1}^{m_k - 1} f_j \Big|^2 < \varepsilon. \]

By Fatou’s Lemma, $|f - \sum_{j=1}^{n} f_j|^2$ is integrable, and

\[ \int \Big| f - \sum_{j=1}^{n} f_j \Big|^2 \le \varepsilon. \]

Since $|f|^2$ is the sum of two integrable functions, namely $|f - \sum_{j=1}^{n} f_j|^2$ and $|f|^2 - |f - \sum_{j=1}^{n} f_j|^2$, it is integrable itself and hence $f \in L^2$. Because $\varepsilon$ is arbitrary, (46) follows. □

Applying the above Theorem with the sequence of functions $f_1, f_2, f_3, \dots$ replaced by the sequence of functions $c_0, c_1 e^{i\theta}, c_{-1} e^{-i\theta}, c_2 e^{2i\theta}, c_{-2} e^{-2i\theta}, \dots$ shows that if $c_n$ is any sequence of complex numbers such that $\sum_{n=-\infty}^{\infty} |c_n|^2$ converges, there is $f \in L^2$ such that

\[ \lim_{N\to\infty} \int_{-\pi}^{\pi} \Big| f(\theta) - \sum_{n=-N}^{N} c_n e^{in\theta} \Big|^2 d\theta = 0. \]

4.19. Exercise. Show that if $f \in L^2$ and $g \in L^2$, then $f + g \in L^2$.


4.20. Differentiation under the integral. This is also known as Leibniz’ rule.

Let $I$ and $J$ be two intervals, finite or infinite, and let $f : I \times J \to \mathbb{C}$ be differentiable with respect to the second variable. Suppose that, for every $y \in J$, $f(x, y)$ and $\partial_y f(x, y)$ are in $L^1(I)$, and that there is $g \in L^1$ such that $|\partial_y f(x, y)| \le g(x)$ for almost every $y \in J$. Then, for any $y \in J$, we have

\[ \frac{d}{dy} \int f(x, y)\,dx = \int \partial_y f(x, y)\,dx. \]

To prove this, let $y_1, y_2, \dots$ be a sequence in $J$ tending to $y$, with $y_n \ne y$. Apply the Dominated Convergence Theorem with

\[ f_n(x) = \frac{f(x, y_n) - f(x, y)}{y_n - y}, \qquad |f_n(x)| = |\partial_y f(x, c_n)| \le g(x), \]

where $c_n$ between $y$ and $y_n$ is provided by the Mean Value Theorem.
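A numerical sanity check of Leibniz’ rule can be sketched as follows. Everything specific here is an assumption chosen for illustration: the integrand $f(x, y) = e^{-yx^2}$ on $I = [0, 1]$ with $y > 0$ (dominated by $g(x) = x^2$), the midpoint quadrature rule, and the helper names `midpoint`, `F`, `dF_by_leibniz`, and `dF_by_difference`.

```python
import math

def midpoint(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def F(y):
    """F(y) = integral over [0, 1] of exp(-y x^2) dx."""
    return midpoint(lambda x: math.exp(-y * x * x), 0.0, 1.0)

def dF_by_leibniz(y):
    """Integral over [0, 1] of the y-derivative of the integrand, -x^2 exp(-y x^2)."""
    return midpoint(lambda x: -x * x * math.exp(-y * x * x), 0.0, 1.0)

def dF_by_difference(y, h=1e-4):
    """Central difference quotient approximating F'(y)."""
    return (F(y + h) - F(y - h)) / (2 * h)

print(dF_by_leibniz(1.0), dF_by_difference(1.0))
```

The two printed values should agree to several decimal places, which is consistent with differentiating under the integral sign in this example.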

5. Fourier transform

We denote by $L^1 = L^1(\mathbb{R})$ the Lebesgue integrable functions on $\mathbb{R}$. For $f \in L^1$ we define the Fourier transform of $f$ by

\[ \hat f(\xi) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi}\,dt. \tag{47} \]

Note that the integral defining $\hat f$ converges for all real $\xi$ since $|f(t) e^{-it\xi}| = |f(t)|$ and $f \in L^1$. This definition agrees with the one in Section 4.2 of [See], but we allow more general functions $f$. The normalization is chosen to parallel that for Fourier coefficients,

\[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int}\,dt, \]

because the Fourier transform is the analogue of the Fourier coefficients for non-periodic functions.

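5.1. Fourier transform of an integrable function. For any $f \in L^1$, the Fourier transform $\hat f$ is a bounded continuous function.

Proof. The boundedness of $\hat f$ follows from

\[ |\hat f(\xi)| = \Big| \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi}\,dt \Big| \le \frac{1}{2\pi} \int_{-\infty}^{\infty} |f|. \]

For continuity, we observe that if $\lim_{n\to\infty} \xi_n = \xi$, then

\[ \lim_{n\to\infty} \hat f(\xi_n) = \lim_{n\to\infty} \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi_n}\,dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} \lim_{n\to\infty} f(t) e^{-it\xi_n}\,dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi}\,dt = \hat f(\xi), \]

where the interchange of limit and integral is justified by the dominated convergence theorem, since $|f(t) e^{-it\xi_n}| = |f(t)|$. □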
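The boundedness estimate can be checked numerically on an example. The choice $f(t) = e^{-|t|}$ is an assumption made here, not an example from the text; with the normalization (47) its transform works out to $1/(\pi(1 + \xi^2))$, and $\frac{1}{2\pi}\int |f| = 1/\pi$. The function `fourier_transform` below is a hypothetical helper using a truncated midpoint rule.

```python
import cmath
import math

def fourier_transform(f, xi, T=40.0, n=40000):
    """Midpoint-rule approximation of (1/(2 pi)) * integral over [-T, T] of f(t) e^{-i t xi} dt.

    Truncating to [-T, T] is an approximation; it is harmless here because
    the example f decays exponentially.
    """
    h = 2 * T / n
    total = 0j
    for k in range(n):
        t = -T + (k + 0.5) * h
        total += f(t) * cmath.exp(-1j * t * xi)
    return total * h / (2 * math.pi)

def f(t):
    return math.exp(-abs(t))

for xi in (0.0, 1.0, 3.0):
    approx = fourier_transform(f, xi)
    exact = 1.0 / (math.pi * (1.0 + xi * xi))
    print(xi, abs(approx), exact)
```

In each case the modulus of the computed transform stays below the bound $1/\pi$, up to the small quadrature error, with equality approached at $\xi = 0$.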

We next wish to establish an analogue of the Fourier series formula

\[ f(\theta) = \sum_{n=-\infty}^{\infty} a_n e^{in\theta}. \tag{48} \]

We saw in Chapters 1 and 3 of [See] and in the discussion above that some careful analysis goes into understanding (48) precisely (with various kinds of convergence of the series holding under various assumptions on $f$). The analogue of (48) for Fourier transforms is called the Fourier inversion formula or the Fourier integral formula, and it has similarly varied versions. We begin with a version for the case when both $f$ and $\hat f$ are integrable.


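5.2. Fourier inversion I. Suppose $f \in L^1$, $\hat f \in L^1$, and $f$ is bounded and continuous. Then for all $x \in \mathbb{R}$ we have

\[ f(x) = \int_{-\infty}^{\infty} \hat f(\xi) e^{i\xi x}\,d\xi. \tag{49} \]

Proof. As in Section 4-3 of [See], let

\[ v(x, y) = \int_{-\infty}^{\infty} f(t) P(x - t, y)\,dt, \]

where $P$ is the Poisson kernel for the half plane:

\[ P(x - t, y) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-|\xi| y} e^{i\xi(x - t)}\,d\xi = \frac{1}{\pi} \frac{y}{(x - t)^2 + y^2}. \]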

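1. Part of Theorem 4-2 of [See] says that for every real $x$ we have

\[ f(x) = \lim_{y\to 0^+} v(x, y). \tag{50} \]

Indeed, to see this we can use the change of variables $r = (x - t)/y$ to write

\[ v(x, y) = \int_{-\infty}^{\infty} f(x - ry) \frac{1}{\pi} \frac{1}{r^2 + 1}\,dr. \]

Then, since

\[ \Big| f(x - ry) \frac{1}{\pi} \frac{1}{r^2 + 1} \Big| \le \frac{1}{\pi} \frac{\sup |f|}{r^2 + 1}, \]

we can use the dominated convergence theorem to show that

\[ \lim_{y\to 0^+} v(x, y) = \int_{-\infty}^{\infty} \lim_{y\to 0^+} f(x - ry) \frac{1}{\pi} \frac{1}{r^2 + 1}\,dr = f(x) \int_{-\infty}^{\infty} \frac{1}{\pi} \frac{1}{r^2 + 1}\,dr = f(x). \]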

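2. Another way of writing (50) is

\[ f(x) = \lim_{y\to 0^+} \int_{-\infty}^{\infty} f(t) \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-|\xi| y} e^{i\xi(x - t)}\,d\xi\,dt. \]

To get the desired conclusion we must swap the order of integration and pass the limit through the integral. More specifically, if we can show that

\[ \int_{-\infty}^{\infty} f(t) \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-|\xi| y} e^{i\xi(x - t)}\,d\xi\,dt = \int_{-\infty}^{\infty} e^{-|\xi| y} \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi}\,dt\, e^{i\xi x}\,d\xi, \tag{51} \]

for every $y > 0$ and $x \in \mathbb{R}$, then we will have

\[ f(x) = \lim_{y\to 0^+} \int_{-\infty}^{\infty} e^{-|\xi| y} \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi}\,dt\, e^{i\xi x}\,d\xi. \]

Then we can pass the limit inside by the dominated convergence theorem since

\[ \Big| e^{-|\xi| y} \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi}\,dt\, e^{i\xi x} \Big| = \Big| e^{-|\xi| y} \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi}\,dt \Big| \le \Big| \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi}\,dt \Big| = |\hat f(\xi)|, \]

giving

\[ f(x) = \int_{-\infty}^{\infty} \lim_{y\to 0^+} e^{-|\xi| y} \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi}\,dt\, e^{i\xi x}\,d\xi = \int_{-\infty}^{\infty} \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-it\xi}\,dt\, e^{i\xi x}\,d\xi = \int_{-\infty}^{\infty} \hat f(\xi) e^{i\xi x}\,d\xi. \]

Thus to complete the proof we must show (51). We will deduce this from Fubini’s theorem, which says that the value of an absolutely convergent iterated integral is unchanged when the order of integration is changed. We derive Fubini’s theorem next, after some introductory discussion of double integrals. □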

5.3. Double integrals. The results on Lebesgue integration derived above for functions of one variable can be directly extended to functions of several real variables. We discuss only the case of two variables; the case of three or more variables proceeds along the same lines. In fact, our one-variable discussion goes through with only minor complications, which we now briefly describe.

Let an open two-dimensional interval be a product of one-dimensional intervals

\[ (a, b) \times (c, d) = \{(x, y) \mid a < x < b \text{ and } c < y < d\}. \]

Its area is $(d - c)(b - a)$.

A set $E \subset \mathbb{R}^2$ has measure zero if, for every $\varepsilon > 0$, there is a collection of open two-dimensional intervals of total area $\le \varepsilon$ such that $E$ is contained in the union of those intervals. Examples include points, line segments, and finite and countable unions of these. Since the term ‘measure zero’ can now apply to subsets of either $\mathbb{R}$ or of $\mathbb{R}^2$, for clarity or emphasis we sometimes say linear measure zero for the former and planar measure zero for the latter. Thus a line segment has planar measure zero but not linear measure zero.

The Heine–Borel theorem says that every cover of $[a, b] \times [c, d]$ by open two-dimensional intervals contains a finite subcover. The same proof by contradiction works: if there were no finite subcollection, at least one of the four subintervals

\[ [a, \tfrac{a+b}{2}] \times [c, \tfrac{c+d}{2}], \quad [a, \tfrac{a+b}{2}] \times [\tfrac{c+d}{2}, d], \quad [\tfrac{a+b}{2}, b] \times [c, \tfrac{c+d}{2}], \quad\text{and}\quad [\tfrac{a+b}{2}, b] \times [\tfrac{c+d}{2}, d], \]

would inherit this property, and one can proceed as before, dividing at each stage into four subintervals rather than two.

A two-dimensional step function is a function $\varphi : \mathbb{R}^2 \to \mathbb{R}$ which has the constant value $c_k$ on each of a finite set of mutually disjoint intervals $(x_{k-1}, x_k) \times (y_{k-1}, y_k)$, and is zero otherwise. Its integral is defined by

\[ \int \varphi(x, y)\,dx\,dy = \sum_{k=1}^{n} c_k (x_k - x_{k-1})(y_k - y_{k-1}). \]

Then the functions

\[ x \mapsto \varphi(x, c) \quad\text{and}\quad y \mapsto \varphi(c, y) \]

are one-dimensional step functions for every $c \in \mathbb{R}$, and the functions

\[ x \mapsto \int \varphi(x, y)\,dy \quad\text{and}\quad y \mapsto \int \varphi(x, y)\,dx \]

are also one-dimensional step functions, and

\[ \int \varphi(x, y)\,dx\,dy = \int \Big( \int \varphi(x, y)\,dx \Big) dy = \int \Big( \int \varphi(x, y)\,dy \Big) dx. \tag{52} \]

Fubini’s theorem below will generalize the above results to integrable functions.

Note that a one-dimensional step function is discontinuous on a finite set, while a two-dimensional step function is discontinuous on an infinite set of measure zero. However, both kinds of step function are continuous almost everywhere, and this is what is needed in the arguments we used to define integrable functions and prove the convergence theorems.

We thus extend the notion of integral to the set of upper integrable functions, i.e. functions $f : \mathbb{R}^2 \to \mathbb{R}$ such that there is a nondecreasing sequence of step functions, $\varphi_1 \le \varphi_2 \le \cdots$, such that $\sup_n \int \varphi_n$ is finite and $\varphi_n(x, y) \to f(x, y)$ for almost every $(x, y) \in \mathbb{R}^2$. Then we extend it to the set of functions $h = f_1 - f_2$, where $f_1$ and $f_2$ are upper integrable, and we call these the integrable functions. A complex-valued function is integrable if its real and imaginary parts are. We denote the set of integrable functions $L^1(\mathbb{R}^2)$ or $L^1$ for short. We have the same convergence theorems, namely the absolute convergence theorem, the monotone convergence theorem, etc., with the same proofs.

For $g : [a, b] \times [c, d] \to \mathbb{C}$, we define

\[ \int_{[a,b] \times [c,d]} g(x, y)\,dx\,dy = \int f(x, y)\,dx\,dy, \qquad f(x, y) = \begin{cases} g(x, y), & \text{when } a < x < b \text{ and } c < y < d, \\ 0, & \text{otherwise,} \end{cases} \]

provided that $f$, so defined, is in $L^1$.

Example. A continuous function $[a, b] \times [c, d] \to \mathbb{R}$ is integrable by the same argument as in Example 4.8; it is the limit of a nondecreasing sequence of step functions.

5.4. Sets of planar measure zero. To extend the result on iterated integration for step functions from (52) to more general $L^1$ functions requires an analysis of the relationship between sets of planar measure zero and sets of linear measure zero. The main result is the following:

Let $E \subset \mathbb{R}^2$ have (planar) measure zero. Then the sets

\[ \{x \in \mathbb{R} \mid (x, c) \in E\} \quad\text{and}\quad \{y \in \mathbb{R} \mid (c, y) \in E\} \tag{53} \]

have (linear) measure zero for almost every $c \in \mathbb{R}$.

Proof. For each $j \in \mathbb{N}$, let $J_j$ be a covering of $E$ by open intervals having total area $2^{-j}$, let $I_1, I_2, \dots$ be an enumeration of all the intervals in all the $J_j$, and for each $n$ let $\varphi_n$ be the step function which is 1 on $I_n$ and 0 elsewhere. Then

\[ \sum_{n=1}^{\infty} \int \varphi_n = \sum_{j=1}^{\infty} 2^{-j} = 1, \]

and, since

\[ \sum_{n=1}^{\infty} \int \varphi_n = \sum_{n=1}^{\infty} \int \Big( \int \varphi_n(x, y)\,dy \Big) dx, \]

by the absolute convergence theorem we see that the series $\sum_{n=1}^{\infty} \int \varphi_n(x, y)\,dy$ converges for almost every $x$. Fixing such an $x_0$, and applying the absolute convergence theorem again, shows that $\sum_{n=1}^{\infty} \varphi_n(x_0, y)$ converges for almost every $y$. But if $(x_0, y) \in E$, then $\sum_{n=1}^{\infty} \varphi_n(x_0, y)$ diverges because $\varphi_n(x_0, y) = 1$ for infinitely many values of $n$ by construction. This finishes the proof for the second set of (53), with $c = x_0$, and the proof for the first set is the same. □

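5.5. Fubini’s Theorem. Let $f \in L^1(\mathbb{R}^2)$. Then the functions

\[ x \mapsto f(x, c), \quad y \mapsto f(c, y), \quad x \mapsto \int f(x, y)\,dy, \quad\text{and}\quad y \mapsto \int f(x, y)\,dx \]

are all in $L^1(\mathbb{R})$ for almost every $c \in \mathbb{R}$, and

\[ \int f(x, y)\,dx\,dy = \int \Big( \int f(x, y)\,dy \Big) dx = \int \Big( \int f(x, y)\,dx \Big) dy. \tag{54} \]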


Proof. We present the proof just for the first equality of (54), and that the functions involved in it are integrable. Moreover, it is enough to prove the statement for upper integrable functions, because a general integrable function is a linear combination of upper integrable functions.

Let $\varphi_1 \le \varphi_2 \le \cdots$ be a sequence of step functions tending almost everywhere to $f$, and let $E$ be the set where this sequence does not converge. By the monotone convergence theorem, and using Fubini’s theorem for step functions (52), the sequence

\[ \Phi_n(x) = \int \varphi_n(x, y)\,dy \]

converges almost everywhere, and $\int \Phi_n \to \int f$. Fix $x_0$ such that $\Phi_n(x_0)$ converges and such that the set $\{y \in \mathbb{R} \mid (x_0, y) \in E\}$ has measure zero. Applying the monotone convergence theorem again, we have $\varphi_n(x_0, y) \to f(x_0, y)$ for almost every $y$, so $f(x_0, y) \in L^1$ and

\[ \Phi_n(x_0) \to \int f(x_0, y)\,dy. \]

Since this works for almost every $x_0$, we apply the monotone convergence theorem one more time to conclude that $\int f(x, y)\,dy \in L^1$ and $\int \Phi_n \to \int (\int f(x, y)\,dy)\,dx$. Since we already knew that $\int \Phi_n \to \int f$, that gives the conclusion. □

To check that a function $f$ is in $L^1(\mathbb{R}^2)$, so that Fubini’s theorem is applicable to it, it is often convenient to define something like
\[
f_n(x, y) = \begin{cases} f(x, y), & |x| \le n \text{ and } |y| \le n \text{ and } |f(x, y)| \le n, \\ 0, & \text{otherwise,} \end{cases}
\]
and apply Fubini’s theorem and Corollary 4.15 to the sequence $f_n$; then $f$ is in $L^1$ provided the $f_n$ are and either $\int \big( \int |f(x, y)|\,dy \big)\,dx$ or $\int \big( \int |f(x, y)|\,dx \big)\,dy$ converges.

For example, to do this for the case of (51), put
\[
f_n(t, \xi) = \begin{cases} f(t) e^{-|\xi| y} e^{i\xi(x - t)}, & |t| \le n \text{ and } |\xi| \le n, \\ 0, & \text{otherwise} \end{cases}
\]
(here $x \in \mathbb{R}$ and $y > 0$ are fixed).

5.6. Plancherel’s identities and mean square convergence. To extend the inversion formula (49) beyond the restrictive assumptions of Theorem 5.2, we prove the analogue of the mean square convergence result and Parseval’s identity for Fourier series:
\[
\lim_{N \to \infty} \int_{-\pi}^{\pi} \Big| f(\theta) - \sum_{n=-N}^{N} c_n e^{in\theta} \Big|^2 d\theta = 0, \tag{55}
\]
and
\[
\int_{-\pi}^{\pi} |f|^2 = 2\pi \sum_{n=-\infty}^{\infty} |c_n|^2. \tag{56}
\]

We begin with the analogue of (56):

Theorem. Suppose $f$ and $g$ are both in $L^1$. Then
\[
\int_{-\infty}^{\infty} \hat f g = \int_{-\infty}^{\infty} f \hat g. \tag{57}
\]
If in addition $g$ is bounded and continuous and $\hat g \in L^1$, then
\[
\int_{-\infty}^{\infty} f \bar g = 2\pi \int_{-\infty}^{\infty} \hat f\, \bar{\hat g}. \tag{58}
\]
In the case that $f = g$, we can rewrite Plancherel’s identity (58) as
\[
\int_{-\infty}^{\infty} |f|^2 = 2\pi \int_{-\infty}^{\infty} \big| \hat f \big|^2, \tag{59}
\]
which is the Fourier transform version of (56). (There is also a version of (58) for Fourier series, which can be deduced from (56): see Exercise 3-13 from page 54 of [See].)
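As a numerical sanity check of (59), the following sketch compares both sides for one test function. The Gaussian $f(x) = e^{-x^2/2}$, the cutoff 12, the grid sizes, and the helper names `integrate` and `f_hat` are all choices made for this illustration; the normalization $\hat f(\xi) = \frac{1}{2\pi}\int f(x)e^{-ix\xi}\,dx$ is the one consistent with (49) and (59).

```python
import cmath
import math

def integrate(func, a, b, n):
    """Composite midpoint rule; works for real or complex integrands."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def f(x):
    # Illustrative choice of test function (an assumption of this sketch).
    return math.exp(-x * x / 2)

def f_hat(xi, cutoff=12.0, n=600):
    # Fourier transform with the 1/(2*pi) factor on the forward transform.
    value = integrate(lambda x: f(x) * cmath.exp(-1j * x * xi), -cutoff, cutoff, n)
    return value / (2 * math.pi)

lhs = integrate(lambda x: abs(f(x)) ** 2, -12.0, 12.0, 600)
rhs = 2 * math.pi * integrate(lambda xi: abs(f_hat(xi)) ** 2, -12.0, 12.0, 600)

# For this f both sides should be close to sqrt(pi).
print(lhs, rhs, math.sqrt(math.pi))
```

For this particular $f$ the transform is again a Gaussian, which is why both integrals can be truncated to a bounded interval without visible error; a slowly decaying $f$ would need more care.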

Proof. To prove the first, we observe that
\[
\int f(u) \hat g(u)\,du = \int f(u) \frac{1}{2\pi} \int g(v) e^{-iuv}\,dv\,du = \int g(v) \frac{1}{2\pi} \int f(u) e^{-iuv}\,du\,dv = \int \hat f(v) g(v)\,dv,
\]
where all integrals are from $-\infty$ to $\infty$, and the interchange of order of integration is justified by Fubini’s theorem since $|f(u) g(v) e^{-iuv}| = |f(u) g(v)|$.

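We prove the second by a strategic application of the first. We note that taking the complex conjugate of (49) and inserting a factor of $2\pi/2\pi$ gives
\[
\bar f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} 2\pi\, \bar{\hat f}(\xi)\, e^{-ix\xi}\,d\xi,
\]
which shows that $\bar f$ is the Fourier transform of $2\pi \bar{\hat f}$. A compact but typographically awkward way of writing this is
\[
\bar f = 2\pi\, \widehat{\bar{\hat f}}. \tag{60}
\]
Consequently, applying first (60) and then (57) gives
\[
\int f \bar g = 2\pi \int f\, \widehat{\bar{\hat g}} = 2\pi \int \hat f\, \bar{\hat g}.
\]
□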

We next extend the Fourier transform and the Plancherel identities (58) and (59) to more general $L^2$ functions. Given $f \in L^2$, let $f_1, f_2, \dots$ be a sequence of piecewise differentiable functions, each vanishing outside of some interval, such that $\int |f - f_n|^2 \to 0$. For example, we may take
\[
f_n(x) = \begin{cases} P_n(x), & |x| \le n, \\ A_n |x| + B_n, & |x| \in [n, n + \delta_n], \\ 0, & \text{otherwise,} \end{cases} \tag{61}
\]
where $P_n$ is a trigonometric polynomial on $[-n, n]$ chosen such that $\int_{-n}^{n} |f - f_n|^2 \le 1/n$ and $P_n(n) = P_n(-n)$, and then $\delta_n$ is chosen small enough that $\int_{|x| \in [n, n + \delta_n]} (|f| + \max |P_n|)^2 \le 1/n$, and then $A_n$ and $B_n$ are chosen such that $f_n$ is continuous, so that $\int_{-n-\delta_n}^{n+\delta_n} |f - f_n|^2 \le 2/n$. Since $\int_{|x| \ge n + \delta_n} |f|^2 \to 0$, that shows $\int |f - f_n|^2 \to 0$.

Then the sequence $f_n$ is Cauchy in $L^2$ (i.e. for every $\varepsilon > 0$ there is $N$ such that $\int |f_n - f_m|^2 < \varepsilon$ when $m > n \ge N$). By (59), the sequence $\hat f_n$ is Cauchy in $L^2$ as well, so by the completeness of $L^2$ there is a function in $L^2$, which we call $\hat f$, such that $\int |\hat f - \hat f_n|^2 \to 0$. Moreover (58) and (59) still hold for $f$ and $\hat f$.


We must check that this new definition of $\hat f$ agrees with the old one when $f \in L^1 \cap L^2$. This is so because then $f_n$, defined by (61), also obeys $\int |f_n - f| \to 0$ provided $\delta_n$ is chosen small enough, so that
\[
\int e^{-i\xi t} f_n(t)\,dt \to \int e^{-i\xi t} f(t)\,dt.
\]

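This proves (58) and (59) for general $f$ and $\hat f$ in $L^2$. We can also compute $\hat f$ and $f$ in other ways. For example, given $f \in L^2$ and $A > 0$, we put
\[
f_A(x) = \begin{cases} f(x), & |x| \le A, \\ 0, & \text{otherwise.} \end{cases}
\]
Then $\int |f - f_A|^2 \to 0$ as $A \to \infty$. Hence $\int |\hat f - \hat f_A|^2 \to 0$ as $A \to \infty$, or
\[
\lim_{A \to \infty} \int_{-\infty}^{\infty} \Big| \hat f(\xi) - \frac{1}{2\pi} \int_{-A}^{A} e^{-it\xi} f(t)\,dt \Big|^2 d\xi = 0.
\]
Similarly,
\[
\lim_{A \to \infty} \int_{-\infty}^{\infty} \Big| f(x) - \int_{-A}^{A} e^{ix\xi} \hat f(\xi)\,d\xi \Big|^2 dx = 0.
\]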


6. Distributions

In Chapter 4 of Seeley [See], the Dirichlet problem on the half-plane
\[
\partial_x^2 v(x, y) + \partial_y^2 v(x, y) = 0 \ \text{ for } y > 0, \qquad v(x, 0) = f(x),
\]
for a given function $f$, is solved by
\[
v(x, y) = \int f(s) P_y(s - x)\,ds,
\]
where the Poisson kernel $P_y$ is given by
\[
P_y(x) = \frac{1}{\pi} \frac{y}{x^2 + y^2}.
\]
Similarly the heat equation
\[
\partial_t u(x, t) = \partial_x^2 u(x, t) \ \text{ for } t > 0, \qquad u(x, 0) = g(x),
\]
for a given function $g$ is solved by
\[
u(x, t) = \int g(s) G_t(s - x)\,ds,
\]
where the heat kernel $G_t$ is given by
\[
G_t(x) = \frac{1}{\sqrt{4\pi t}} e^{-x^2/4t}.
\]
The functions $v$ and $u$ are solutions to the respective problems in the sense that, under suitable assumptions on $f$ and $g$, we can differentiate under the integral sign to show that the differential equation is obeyed, and we can prove that $\lim_{y \to 0^+} v = f$ and $\lim_{t \to 0^+} u = g$.
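The heat kernel formula can be tested numerically on a case where the answer is known in closed form. The sketch below assumes the initial temperature $g(s) = e^{-s^2}$ (a choice made here, not in the text), for which convolving two Gaussians gives $u(x, t) = (1 + 4t)^{-1/2} e^{-x^2/(1+4t)}$; the names `integrate`, `u`, and `u_exact` are illustrative.

```python
import math

def integrate(func, a, b, n=4000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def heat_kernel(x, t):
    return math.exp(-x * x / (4 * t)) / math.sqrt(4 * math.pi * t)

def g(s):
    # Assumed initial temperature for this illustration.
    return math.exp(-s * s)

def u(x, t):
    # u(x, t) = integral of g(s) G_t(s - x) ds, truncated to [-10, 10].
    return integrate(lambda s: g(s) * heat_kernel(s - x, t), -10.0, 10.0)

def u_exact(x, t):
    # Closed form for this particular g (convolution of two Gaussians).
    return math.exp(-x * x / (1 + 4 * t)) / math.sqrt(1 + 4 * t)

for t in (1.0, 0.1, 0.001):
    print(t, u(0.5, t), u_exact(0.5, t), g(0.5))
```

The last column is the initial value $g(0.5)$; the first two columns approach it as $t$ decreases, in line with $\lim_{t \to 0^+} u = g$.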

If we take the limit of $P_y$ as $y \to 0^+$ or of $G_t$ as $t \to 0^+$ directly, we get 0 when $x \ne 0$ and $+\infty$ when $x = 0$. To make sense of this limit, we introduce the framework of distribution theory.


6.1. Test functions. Let $\mathcal{S}$ be the set of functions $\varphi : \mathbb{R} \to \mathbb{C}$ such that
\[
\sup_{x \in \mathbb{R}} |x^k \varphi^{(\ell)}(x)| < +\infty
\]
for all $k \ge 0$ and for all $\ell \ge 0$. This is the space of Schwartz test functions. For example, we may take $\varphi = P e^{-Q}$, where $P$ is any polynomial, and $Q$ is any polynomial of even degree with positive leading coefficient; such a function is bounded, and any time we differentiate or multiply by a power of $x$ we get a function having the same form.

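6.2. Distributions. A distribution is a linear functional $u : \mathcal{S} \to \mathbb{C}$. For instance, given an integrable function $f$, we define
\[
f[\varphi] = \int f \varphi. \tag{62}
\]
Then, for any $\varphi$, we have
\[
P_y[\varphi] = \frac{1}{\pi} \int \frac{y}{x^2 + y^2} \varphi(x)\,dx = \frac{1}{\pi} \int \frac{1}{s^2 + 1} \varphi(sy)\,ds \xrightarrow{y \to 0^+} \varphi(0) \frac{1}{\pi} \int \frac{1}{s^2 + 1}\,ds = \varphi(0),
\]
and, similarly,
\[
G_t[\varphi] \xrightarrow{t \to 0^+} \varphi(0).
\]
We define the Dirac delta distribution
\[
\delta[\varphi] = \varphi(0),
\]
and we say that $\lim_{y \to 0^+} P_y = \lim_{t \to 0^+} G_t = \delta$ in the sense of distributions. We further define
\[
\delta_a[\varphi] = \varphi(a),
\]
for any $a \in \mathbb{R}$, which is called the Dirac delta distribution centered at $a$. Motivated by (62) we sometimes write
\[
\int \delta \varphi = \int \delta(x) \varphi(x)\,dx = \varphi(0), \qquad \int \delta_a \varphi = \int \delta(x - a) \varphi(x)\,dx = \varphi(a),
\]
but when using such notation it is important to remember that these are not traditional integrals. See Section 2.4 of [Kan] for more on these and other related examples.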
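The convergence of $P_y[\varphi]$ and $G_t[\varphi]$ to $\varphi(0)$ can be observed numerically. The sketch below is only an illustration: the test function $\varphi(x) = (2 + x)e^{-x^2}$, the truncation to $[-8, 8]$, and the names `integrate` and `pairing` are assumptions of this example.

```python
import math

def integrate(func, a, b, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def poisson_kernel(x, y):
    return y / (math.pi * (x * x + y * y))

def heat_kernel(x, t):
    return math.exp(-x * x / (4 * t)) / math.sqrt(4 * math.pi * t)

def phi(x):
    # A Schwartz function of the form P * exp(-Q), with phi(0) = 2.
    return (2 + x) * math.exp(-x * x)

def pairing(kernel, param):
    # kernel[phi] = integral of kernel(x) * phi(x) dx, as in (62).
    return integrate(lambda x: kernel(x, param) * phi(x), -8.0, 8.0)

for p in (1.0, 0.1, 0.01):
    print(p, pairing(poisson_kernel, p), pairing(heat_kernel, p))
```

Both columns move toward $\varphi(0) = 2$ as the parameter shrinks. The grid spacing must stay well below the width of the kernel, so much smaller parameters would need a finer grid.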

(To avoid pathological examples, one usually imposes a continuity requirement on distributions, namely that $u[\varphi_n] \to u[0]$ for any sequence of test functions $\varphi_n$ such that $\sup_{x \in \mathbb{R}} |x^k \varphi_n^{(\ell)}(x)| \to 0$ for all $k$ and $\ell$. We will not bother with this because we will not develop the theory far enough here to need it.)

6.3. Differentiation. If $f$ is $C^\infty$ with all derivatives bounded, then $f$ and $f'$ define distributions by (62), and, integrating by parts, we have
\[
f'[\varphi] = \int f' \varphi = -\int f \varphi' = f[-\varphi'].
\]
More generally, we define the distributional derivative of any distribution by putting
\[
u'[\varphi] = u[-\varphi'].
\]
For example, putting $A_y(x) = \int_{-\infty}^{x} P_y$, we have
\[
A_y(x) = \frac{1}{\pi} \big( \arctan(x/y) + \pi/2 \big) \xrightarrow{y \to 0^+} H(x),
\]
where the Heaviside function $H(x)$ is 0 when $x < 0$, 1 when $x > 0$, and 1/2 when $x = 0$. We have $A_y' = P_y$, and we can also check $H' = \delta$ by writing
\[
H'[\varphi] = H[-\varphi'] = -\int_0^{\infty} \varphi' = \varphi(0) = \delta[\varphi].
\]
Similar computations show that $\operatorname{sgn}' = 2\delta$ and that if $f(x) = x^2$ when $-1 < x < 2$ and $f(x) = 0$ otherwise, then $f' = \delta_{-1} - 4\delta_2 + g$, where $g(x) = 2x$ when $-1 < x < 2$ and $g(x) = 0$ otherwise. More generally we have the jump formula, which says that if $f : \mathbb{R} \to \mathbb{R}$ is piecewise differentiable, then its distributional derivative is given by
\[
f' = g + \sum_{j=1}^{n} c_j \delta_{x_j},
\]
where $g$ is the function obtained by differentiating $f$ at all points where it is differentiable and $c_j$ is the size of the jump at $x_j$. See Section 2.7 of [Kan] for more on these and other related examples.
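The example $f' = \delta_{-1} - 4\delta_2 + g$ can be checked against the definition $f'[\varphi] = f[-\varphi']$ with a single test function. In the sketch below the choice $\varphi(x) = e^{-x^2/4}$ and the helper names are assumptions made for illustration.

```python
import math

def integrate(func, a, b, n=6000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def phi(x):
    return math.exp(-x * x / 4)

def dphi(x):
    # Derivative of phi, computed by hand.
    return -(x / 2) * math.exp(-x * x / 4)

# f(x) = x^2 on (-1, 2) and 0 elsewhere, so f[-phi'] only sees (-1, 2).
lhs = -integrate(lambda x: x * x * dphi(x), -1.0, 2.0)

# Jump formula: a jump of +1 at x = -1, a jump of -4 at x = 2,
# plus the ordinary derivative g(x) = 2x on (-1, 2).
rhs = phi(-1.0) - 4 * phi(2.0) + integrate(lambda x: 2 * x * phi(x), -1.0, 2.0)

print(lhs, rhs)
```

The two printed numbers agree up to the quadrature error, which is the integration by parts identity on $(-1, 2)$ with the boundary terms accounted for by the two delta distributions.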

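6.4. Fourier transform. We define the Fourier transform of a distribution in a way parallel to how we defined the derivative, but now the integration by parts formula $\int f' \varphi = -\int f \varphi'$ is replaced by $\int \hat f \varphi = \int f \hat\varphi$. Thus we define
\[
\hat u[\varphi] = u[\hat\varphi].
\]
We have
\[
\hat\delta[\varphi] = \delta[\hat\varphi] = \hat\varphi(0) = \frac{1}{2\pi} \int \varphi,
\]
so that $\hat\delta = 1/2\pi$. We also have $\hat P_y = \frac{1}{2\pi} e^{-y|\xi|}$ and $\hat G_t = \frac{1}{2\pi} e^{-t\xi^2}$, so $\lim_{y \to 0^+} \hat P_y = \lim_{t \to 0^+} \hat G_t = 1/2\pi$. Similarly
\[
\hat\delta_a[\varphi] = \hat\varphi(a) = \frac{1}{2\pi} \int e^{-ita} \varphi(t)\,dt,
\]
so that $\hat\delta_a = e^{-i \bullet a}/2\pi$, or we may write $\hat\delta_a(\xi) = e^{-i\xi a}/2\pi$. The Fourier inversion formula says that
\[
P_y(x) = \frac{1}{2\pi} \int e^{ix\xi} e^{-y|\xi|}\,d\xi,
\]
and since the left side converges in the sense of distributions to $\delta$, the right side does as well and one sometimes writes
\[
\delta(x) = \frac{1}{2\pi} \int e^{ix\xi}\,d\xi,
\]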

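where we are again not dealing with a traditional integral and must use an appropriate limiting process to understand this. For example, we may put
\[
u_R(x) = \frac{1}{2\pi} \int_{-R}^{R} e^{ix\xi}\,d\xi,
\]
so that
\[
u_R[\varphi] = \frac{1}{2\pi} \int \int_{-R}^{R} e^{ix\xi}\,d\xi\, \varphi(x)\,dx = \frac{1}{2\pi} \int \int_{-R}^{R} e^{-ix\xi}\,d\xi\, \varphi(x)\,dx = \frac{1}{2\pi} \int_{-R}^{R} \int e^{-ix\xi} \varphi(x)\,dx\,d\xi = \int_{-R}^{R} \hat\varphi.
\]
Thus, as $R \to \infty$, $u_R[\varphi] \to \int \hat\varphi = \varphi(0)$. This shows that
\[
\delta(x) = \lim_{R \to \infty} \frac{1}{2\pi} \int_{-R}^{R} e^{ix\xi}\,d\xi,
\]
with the limit taken in the sense of distributions. Similarly, when we showed that $P_y \to \delta$ and $G_t \to \delta$ in the sense of distributions, we actually showed that
\[
\delta(x) = \frac{1}{2\pi} \lim_{y \to 0^+} \int e^{ix\xi} e^{-y|\xi|}\,d\xi = \frac{1}{2\pi} \lim_{t \to 0^+} \int e^{ix\xi} e^{-t\xi^2}\,d\xi,
\]
with the limits taken in the sense of distributions.

We can similarly take the Fourier transform of the Heaviside function.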

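\[
\hat H[\varphi] = H[\hat\varphi] = \int_0^{\infty} \hat\varphi.
\]
To write an integral formula in terms of $\varphi$ (rather than $\hat\varphi$) we take a limit $H(x) = \lim_{\varepsilon \to 0^+} e^{-\varepsilon x} H(x)$, and compute the Fourier transform of $e^{-\varepsilon x} H(x)$ by writing
\[
\int_0^{\infty} e^{-ix\xi} e^{-\varepsilon x}\,dx = \frac{1}{i\xi + \varepsilon} = \frac{1}{i} \frac{1}{\xi - i\varepsilon}.
\]
Thus
\[
\hat H = \frac{1}{2\pi i} (\xi - i0)^{-1},
\]
where we define
\[
(\xi - i0)^{-1}[\varphi] = \lim_{\varepsilon \to 0^+} \int \frac{\varphi(\xi)\,d\xi}{\xi - i\varepsilon}.
\]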
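The elementary integral used for $e^{-\varepsilon x} H(x)$ can be confirmed numerically for fixed values of $\xi$ and $\varepsilon$. The values $\xi = 2$, $\varepsilon = 0.5$, the cutoff $40/\varepsilon$, and the function names below are assumptions of this sketch.

```python
import cmath

def integrate(func, a, b, n):
    """Composite midpoint rule; works for complex integrands."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def damped_transform(xi, eps):
    """Approximate the integral of e^{-i x xi} e^{-eps x} over x >= 0."""
    cutoff = 40.0 / eps  # beyond this point e^{-eps x} is below e^{-40}
    return integrate(lambda x: cmath.exp(-(1j * xi + eps) * x), 0.0, cutoff, 80000)

xi, eps = 2.0, 0.5
numeric = damped_transform(xi, eps)
closed_form = 1 / (1j * xi + eps)
rewritten = (1 / 1j) / (xi - 1j * eps)

print(numeric, closed_form, rewritten)
```

The three printed values coincide up to quadrature error. The factor $1/2\pi$ in $\hat H$ comes from the normalization of the Fourier transform and is not part of this integral.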

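We can do a similar calculation with another antiderivative of $\delta$. Take the limit $(H - 1)(x) = \lim_{\varepsilon \to 0^+} e^{\varepsilon x} (H - 1)(x)$ and get
\[
-\int_{-\infty}^{0} e^{-ix\xi} e^{\varepsilon x}\,dx = \frac{1}{i} \frac{1}{\xi + i\varepsilon}, \qquad \widehat{(H - 1)} = \frac{1}{2\pi i} (\xi + i0)^{-1}.
\]
Since $\widehat{(H - 1)} = \hat H - \hat 1$, and $\hat 1 = \delta$, we get
\[
\delta(\xi) = \frac{1}{2\pi i} \big( (\xi - i0)^{-1} - (\xi + i0)^{-1} \big).
\]
We can also compute
\[
\int \frac{1}{2\pi i} \big( (\xi - i\varepsilon)^{-1} - (\xi + i\varepsilon)^{-1} \big) \varphi(\xi)\,d\xi = \frac{1}{\pi} \int \frac{\varepsilon}{\xi^2 + \varepsilon^2} \varphi(\xi)\,d\xi = \int P_\varepsilon \varphi \xrightarrow{\varepsilon \to 0^+} \varphi(0),
\]
where we recognized a reappearance of the Poisson kernel in the integrand. See Sections 2.4 and 3.5 of [Kan] for more on these and other related examples.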
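The limiting process $u_R \to \delta$ from this section can also be watched numerically. Evaluating the defining integral gives $u_R(x) = \sin(Rx)/(\pi x)$. The sketch below assumes the test function $\varphi(x) = e^{-x^2/2}$, for which $\int_{-R}^{R} \hat\varphi = \operatorname{erf}(R/\sqrt{2})$; the names `u_R` and `pairing` are illustrative.

```python
import math

def integrate(func, a, b, n):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def u_R(x, R):
    """u_R(x) = (1/2pi) * integral of e^{i x xi} over [-R, R] = sin(R x)/(pi x)."""
    return R / math.pi if x == 0 else math.sin(R * x) / (math.pi * x)

def phi(x):
    # Assumed test function, with phi(0) = 1.
    return math.exp(-x * x / 2)

def pairing(R):
    # u_R[phi], truncated to [-10, 10] where phi is not negligible.
    return integrate(lambda x: u_R(x, R) * phi(x), -10.0, 10.0, 4000)

for R in (1.0, 3.0, 6.0):
    print(R, pairing(R), math.erf(R / math.sqrt(2)))
```

The second and third columns agree, and both approach $\varphi(0) = 1$ as $R$ grows, even though $u_R(x)$ itself has no pointwise limit for $x \ne 0$.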

Appendix A. Complex numbers

We write complex numbers as $z = x + iy$, where $x$ and $y$ are real numbers and $i^2 = -1$. The numbers $x$ and $y$ are called, respectively, the real part and imaginary part of $z$. Addition and multiplication are defined by
\[
z_1 + z_2 = (x_1 + x_2) + i(y_1 + y_2), \quad\text{and}\quad z_1 z_2 = (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + x_2 y_1).
\]
The complex conjugate of $z$ is $\bar z = x - iy$. The absolute value of $z$ is $|z| = \sqrt{z \bar z} = \sqrt{x^2 + y^2}$, and this is also called the length, norm, magnitude, or modulus. Note that $|z_1 z_2| = |z_1||z_2|$.

Geometrically we plot complex numbers in the plane as ordered pairs $(x, y)$. The absolute value gives the distance to the origin. It is also sometimes useful to write $x = |z| \cos\theta$ and $y = |z| \sin\theta$, giving
\[
z = |z|(\cos\theta + i \sin\theta).
\]
Then either $\theta$ is the angle between the vector $(x, y)$ and the positive real axis, or else $\theta$ is this angle plus an integer multiple of $2\pi$. See Figure 12.


Figure 12. The first two plots (taken from [Fis, Figs 1.2-3]) are of a few points in the complex plane; for the second note that $-1 + i = \sqrt{2}\,(\cos\frac{3\pi}{4} + i \sin\frac{3\pi}{4})$. The third (taken from [Kle, Figure 14]) depicts addition and multiplication of complex numbers. Note that the shaded triangles are similar.

We define $e^{i\theta}$ using Euler’s rule
\[
e^{i\theta} = \cos\theta + i \sin\theta. \tag{63}
\]
To motivate (63), recall the Taylor series:
\[
e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,
\]
\[
\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots,
\]
\[
\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots,
\]
and note that replacing $x$ by $i\theta$ in the first series gives the same terms as in the series for $\cos\theta + i \sin\theta$. Actually, nothing stops us from defining the complex exponential (and for that matter, the complex cosine and sine functions) using the above series, with $x$ replaced by a complex number. But for our purposes it is easier to take (63) as a definition.

From (63) and the Pythagorean theorem $\sin^2\theta + \cos^2\theta = 1$ we deduce
\[
|e^{i\theta}| = 1.
\]
So the function $\theta \mapsto e^{i\theta}$ maps $\mathbb{R}$ to the unit circle, and $e^{i\theta}$ moves counterclockwise on the circle as $\theta$ increases: see Figure 13. This function is $2\pi$ periodic because cos and sin are. Some sample values are
\[
e^{-2\pi i} = 1, \quad e^{-\pi i/3} = \tfrac{1}{2} - i\tfrac{\sqrt{3}}{2}, \quad e^{0} = 1, \quad e^{i\pi/2} = i, \quad e^{5i\pi/6} = -\tfrac{\sqrt{3}}{2} + i\tfrac{1}{2}, \quad e^{i\pi} = -1, \quad e^{2i\pi} = 1.
\]
Using the angle addition formulas for cosine and sine gives
\[
e^{i(a+b)} = \cos(a + b) + i \sin(a + b) = \cos a \cos b - \sin a \sin b + i(\sin a \cos b + \cos a \sin b) = e^{ia} e^{ib},
\]
just as for real exponentials we have $e^{a+b} = e^a e^b$. Differentiating (63) gives
\[
\frac{d}{dy} e^{iy} = i e^{iy}.
\]
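The sample values and the two rules above are easy to confirm in floating point arithmetic. The sketch below takes (63) as the definition, through a helper named `euler` (a name chosen here), and compares it with the exponential from Python's standard `cmath` module; the angles and step size in the last lines are arbitrary choices.

```python
import cmath
import math

def euler(theta):
    """Right-hand side of Euler's rule (63): cos(theta) + i sin(theta)."""
    return complex(math.cos(theta), math.sin(theta))

# Sample values listed in the text, keyed by the angle theta.
samples = {
    -2 * math.pi: 1,
    -math.pi / 3: complex(0.5, -math.sqrt(3) / 2),
    0.0: 1,
    math.pi / 2: 1j,
    5 * math.pi / 6: complex(-math.sqrt(3) / 2, 0.5),
    math.pi: -1,
    2 * math.pi: 1,
}
for theta, expected in samples.items():
    z = euler(theta)
    # Agreement with the listed value, with cmath.exp, and with |z| = 1.
    print(round(theta, 4), abs(z - expected) < 1e-12,
          abs(z - cmath.exp(1j * theta)) < 1e-12, abs(abs(z) - 1) < 1e-12)

# Addition rule e^{i(a+b)} = e^{ia} e^{ib}, and the derivative rule
# d/dy e^{iy} = i e^{iy} tested with a central difference quotient.
a, b, y, h = 0.7, -1.9, 0.4, 1e-6
addition_error = abs(euler(a + b) - euler(a) * euler(b))
derivative_error = abs((euler(y + h) - euler(y - h)) / (2 * h) - 1j * euler(y))
print(addition_error, derivative_error)
```

Each row of the first loop should show three `True` entries, and the two errors in the last line should be at the level of rounding.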

Figure 13. A plot of the complex exponential (taken from https://en.wikipedia.org/wiki/Euler's_formula), which shows Euler’s rule $e^{ix} = \cos x + i \sin x$ and its variants $\cos x = \frac{e^{ix} + e^{-ix}}{2}$ and $\sin x = \frac{e^{ix} - e^{-ix}}{2i}$.

Appendix B. The Heine–Borel theorem

In this appendix we prove an important theorem about open covers:

Theorem (Heine–Borel Theorem). If the closed interval $[a, b]$ is covered by an infinite collection of open intervals $I$, in the sense that each point in $[a, b]$ is contained in one of the intervals of $I$, then there is a finite subcollection of $I$ which also covers $[a, b]$.

In other words: every cover of $[a, b]$ by open intervals contains a finite subcover.

As an example, let $I$ be the collection of all open intervals of length 1, and let $[a, b] = [0, 2]$. Then there are many finite subcovers, such as $\{(-.5, .5), (0, 1), (.5, 1.5), (1, 2), (1.5, 2.5)\}$. The point of the Heine–Borel Theorem is that a finite subcover still exists even if the definition of $I$ is much more intricate. It is important that the interval being covered is closed and bounded and that the intervals doing the covering are open: for instance the interval $(0, 1)$ is covered by the collection $I = \{(1/2, 1), (1/3, 1), (1/4, 1), \dots\}$ but not by any finite subcollection of $I$.
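Both examples can be checked mechanically. The sketch below is an illustration and not part of the proof: `covers` is a hypothetical helper that decides whether a finite list of open intervals covers a closed interval by sweeping from left to right.

```python
def covers(intervals, a, b):
    """Return True if the open intervals (c, d) in `intervals` cover [a, b]."""
    point = a
    while True:
        # Right endpoints of the intervals that contain the current point.
        reach = [d for (c, d) in intervals if c < point < d]
        if not reach:
            return False      # `point` lies in none of the intervals
        furthest = max(reach)
        if furthest > b:
            return True       # the rest of [a, b] lies in one interval
        point = furthest      # an open interval misses its own endpoint

subcover = [(-0.5, 0.5), (0.0, 1.0), (0.5, 1.5), (1.0, 2.0), (1.5, 2.5)]
print(covers(subcover, 0.0, 2.0))                      # True
print(covers(subcover[:3] + subcover[4:], 0.0, 2.0))   # False: 1.5 is missed

# For the open interval (0, 1), any finite part of
# {(1/2, 1), (1/3, 1), ...} misses points close to 0.
N = 50
finite_part = [(1.0 / k, 1.0) for k in range(2, N + 1)]
missed = 1.0 / (2 * N)
print(any(c < missed < d for (c, d) in finite_part))   # False
```

The sweep terminates because the current point strictly increases and only takes values among the finitely many right endpoints.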

Proof of the Heine–Borel Theorem. Suppose by way of contradiction that no finite subcollection of intervals from $I$ covers $[a, b]$. Then at least one of the two subintervals
\[
\big[a, \tfrac{a+b}{2}\big] \quad\text{and}\quad \big[\tfrac{a+b}{2}, b\big]
\]
inherits this property (i.e., it is not covered by any finite subcollection of intervals from $I$). Call that subinterval $[a_1, b_1]$. Similarly at least one of the two subintervals
\[
\big[a_1, \tfrac{a_1+b_1}{2}\big] \quad\text{and}\quad \big[\tfrac{a_1+b_1}{2}, b_1\big]
\]
has the same property, and we call it $[a_2, b_2]$. Continuing, we obtain monotonic sequences $a_1 \le a_2 \le \cdots$ and $b_1 \ge b_2 \ge \cdots$ such that
\[
b_k - a_k = (b - a)/2^k.
\]
Consequently, both sequences converge to a common limit, which we call $x = \lim a_k = \lim b_k$.

Now take an open interval $(c, d)$ from the collection $I$ such that $x \in (c, d)$. Observe that, for $k$ sufficiently large, all the intervals $[a_k, b_k]$ are contained in $(c, d)$; it is enough if the length of $[a_k, b_k]$ is less than both $x - c$ and $d - x$. This contradicts the statement that $[a_k, b_k]$ is not covered by any finite subcollection of intervals from $I$, since we have found a subcollection consisting of just a single interval. □

The Heine–Borel theorem has an important consequence for integrals of step functions.

B.1. Lemma. Let $\varphi_1 \ge \varphi_2 \ge \cdots$ be a sequence of step functions which tends to 0 a.e. Then
\[
\lim_{n \to \infty} \int \varphi_n = 0.
\]
Proof. Let $[a, b]$ be an interval outside of which all the $\varphi_n$ vanish a.e. Let $E$ be the set consisting of $a$, $b$, all the points where at least one of the $\varphi_n$ is not continuous, and the set of points where the sequence does not converge. Then $E$ has measure zero. Let $\varepsilon > 0$ be given, and cover $E$ by a collection of open intervals $I$ of total length $< \varepsilon$. For every $x_0$ not in $E$ we have $\lim_{n \to \infty} \varphi_n(x_0) = 0$, so there is $n_0(x_0)$ such that
\[
\varphi_{n_0(x_0)}(x_0) < \varepsilon.
\]
Since the function $\varphi_{n_0(x_0)}(x)$ is constant near $x_0$, the same inequality holds for $x$ in some open interval $J(x_0)$ containing $x_0$. Since the sequence of functions is decreasing, we obtain
\[
\varphi_n(x) < \varepsilon, \quad\text{when } n \ge n_0(x_0) \text{ and } x \in J(x_0).
\]
Then $I$ together with all the $J(x_0)$ form an open cover of $[a, b]$, so by the Heine–Borel theorem there is a finite subcover. Let $N$ be the largest of the $n_0$’s corresponding to this subcover, so that
\[
\varphi_n(x) < \varepsilon, \quad\text{when } n \ge N,
\]
for all $x$ on that part of $[a, b]$ which is covered by intervals $J(x_0)$ from the finite subcover. Thus the integral of $\varphi_n$ on this part is less than
\[
\varepsilon (b - a).
\]
The rest of $[a, b]$ is covered by finitely many intervals coming from $I$, and these intervals have total length $< \varepsilon$. Since $\varphi_n(x) \le \varphi_1(x) \le \max \varphi_1$, the integral of $\varphi_n$ on this part is less than
\[
\varepsilon \max \varphi_1.
\]
Thus, for $n \ge N$,
\[
0 \le \int \varphi_n < \varepsilon (b - a + \max \varphi_1).
\]
Since $\varepsilon > 0$ was arbitrary, this completes the proof. □

Appendix C. Further discussion and references

These notes present the fundamentals of Fourier analysis, adhering as closely as possible to Hume’s principle of avoiding all unnecessary detail, stated at the end of Chapter I of [Hum].

Fourier formulates the heated ring problem of Section 1 in Chapter II, Section I of [Fou], and solves it in Chapter IV, Section I of [Fou].

The derivation of the Fourier coefficients using finite sums and least square approximation, so as to defer the treatment of the more difficult problem of convergence, follows Klein’s treatment in Section II.3.C of Part III of [Kle], where it is attributed to Bessel.

The approach to Lebesgue integration is Riesz’s, as presented by Sz.-Nagy [SzN] (see also [RieSzN]) and Weir [Wei]; the latter is an especially gentle and thorough introduction to the subject. Another good reference is [Apo]. For still more, with an approach closer to Lebesgue’s original [Leb], see [Fol].


For a more detailed presentation of distribution theory with an emphasis on computation and examples, see Kanwal [Kan]. For a more sophisticated theoretical presentation, see Friedlander and Joshi [FriJos].

As mentioned in Appendix A, one can also define the complex exponential using power series. A classic source for this is Appendix A.2 of Whittaker and Watson’s book [WhiWat], and another is Section 3 of Chapter 2 of Ahlfors’ book [Ahl]. A more geometric approach, which better reflects the origin of these functions, is that advocated in Part III of Klein’s book [Kle]. We define the logarithm as the function giving the area under the hyperbola:
\[
\log x = \int_1^x \frac{dt}{t},
\]
and define $x \mapsto e^x$ as the inverse of $x \mapsto \log x$. Next $(e^x)' = e^x$ follows from $(\log x)' = 1/x$ and the rule for differentiating inverse functions, and this implies the power series representation for real numbers, by Taylor’s theorem. The inverse trigonometric functions can be defined similarly by computing areas under circular arcs, and the corresponding series for sine and cosine derived. Extending these series to complex numbers and comparing the coefficients gives Euler’s rule (63). Such an approach is carried out in a leisurely and detailed fashion in Chapters IX and X of Hardy’s book [Har] and (for real numbers) in Part III of Spivak’s book [Spi].

The proof of the Heine–Borel theorem in Appendix B is similar to the one at the beginning of §1.2.4 of [SzN], on page 39. There is another interesting proof immediately after it on page 40 of [SzN]. The latter is more similar to Borel’s original proof from [Bor]. For more, see Chapter 1 of Oxtoby’s book [Oxt], especially Theorem 1.5.

References

[Ahl] Lars V. Ahlfors, Complex Analysis, Third Edition, 1979.
[Apo] Tom M. Apostol, Mathematical Analysis, Second Edition, 1974.
[Bor] Émile Borel, Leçons sur la Théorie des Fonctions, Deuxième édition, 1914.
[Fis] Stephen D. Fisher, Complex Variables, Second Edition, 1990.
[Fol] Gerald B. Folland, Real Analysis, Second Edition, 1999.
[Fou] Joseph Fourier, The Analytical Theory of Heat, 1822. Translated by Alexander Freeman, 1878.
[FriJos] F. G. Friedlander and M. Joshi, Introduction to the Theory of Distributions, Second Edition, 1998.
[Har] G. H. Hardy, A Course of Pure Mathematics, Tenth Edition, 1950.
[Hum] David Hume, An Enquiry Concerning Human Understanding, 1748.
[Kan] Ram P. Kanwal, Generalized Functions: Theory and Applications, Third Edition, 2004.
[Kle] Felix Klein, Elementary Mathematics from an Advanced Standpoint: Arithmetic, Algebra, Analysis, Third Edition, 1924. Translated by E. R. Hedrick and C. A. Noble, 1932.
[Leb] H. Lebesgue, Intégrale, Longueur, Aire, Annali di Matematica Pura ed Applicata, Volume 7, Issue 1, pages 231–359, 1902.
[Oxt] John C. Oxtoby, Measure and Category, Second Edition, 1980.
[Par] Marc-Antoine Parseval, Mémoire sur les séries et sur l’intégration complète d’une équation aux différences partielles linéaires du second ordre, à coefficients constants. Pages 638–648 of Tome 1 of Mémoires présentés à l’institut des sciences, lettres, et arts, et lus dans ses assemblées, Paris, 1805.
[RieSzN] Frigyes Riesz and Béla Sz.-Nagy, Functional Analysis. Translated from the second French edition by Leo F. Boron, Blackie and Son Ltd, 1956.
[SzN] Béla Sz.-Nagy, Introduction to Real Functions and Orthogonal Expansions, 1965.
[See] Robert T. Seeley, An Introduction to Fourier Series and Integrals, 1966.
[Spi] Michael Spivak, Calculus, Fourth Edition, 2008.
[Wei] Alan J. Weir, Lebesgue Integration and Measure, 1973.
[WhiWat] E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, Fourth Edition, 1927.
