
Variational Calculus & Variational Principles In Physics

Michael A. Carchidi

November 30, 2009

1. A Typical Problem In The Calculus

Let $\mathbb{R}$ be the set of real numbers and let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function defined on $\mathbb{R}$. In a typical calculus problem, you are given the function $f$ over an interval $a \le x \le b$, and you are asked to determine the maximum or minimum values of $f$. One method of solution, as developed in the calculus, is to determine all $x$ in $a \le x \le b$ for which $f'(x) = 0$ or $f'(x)$ is not defined (the so-called critical points of $f$), call these $x_1, x_2, x_3, \ldots, x_n$, and then compute the values

$$\{f(a), f(x_1), f(x_2), f(x_3), \ldots, f(x_n), f(b)\}$$

so that

$$f_{\min} = \min\{f(a), f(x_1), f(x_2), f(x_3), \ldots, f(x_n), f(b)\}$$

while

$$f_{\max} = \max\{f(a), f(x_1), f(x_2), f(x_3), \ldots, f(x_n), f(b)\}.$$

Example 1.1 - For Review Only

Determine the maximum and minimum values of the function $f : \mathbb{R} \to \mathbb{R}$ defined by

$$f(x) = 3x^4 - 28x^3 + 84x^2 - 96x + 1$$

over the interval $0 \le x \le 5$. To solve this, we first compute

$$f'(x) = 12x^3 - 84x^2 + 168x - 96$$

and then we solve

$$12x^3 - 84x^2 + 168x - 96 = 12(x - 1)(x - 2)(x - 4) = 0.$$

This leads to $x = \{1, 2, 4\}$ as the critical points, and each of these is in the region $0 \le x \le 5$. Then we form the following table of values.

x    y = f(x)    Comment
0      1         Endpoint
1    -36         Critical Point
2    -31         Critical Point
4    -63         Critical Point
5     -4         Endpoint

This shows that the minimum value of $f$ is $-63$ at $x = 4$, while the maximum value of $f$ is $+1$ at $x = 0$. A plot of $f(x)$ versus $x$ for $0 \le x \le 5$ along with the five points in this table is shown in the following figure.

[Figure: plot of $f(x) = 3x^4 - 28x^3 + 84x^2 - 96x + 1$ versus $x$ for $0 \le x \le 5$]
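The closed-interval method of Example 1.1 can be checked with a few lines of Python. This is only a minimal sketch: the helper name `closed_interval_extrema` is hypothetical, and the critical points are passed in by hand rather than computed.

```python
def f(x):
    # The quartic from Example 1.1.
    return 3 * x**4 - 28 * x**3 + 84 * x**2 - 96 * x + 1


def closed_interval_extrema(func, a, b, critical_points):
    """Compare func at the endpoints and at the critical points inside [a, b]."""
    candidates = [a] + [x for x in critical_points if a <= x <= b] + [b]
    values = {x: func(x) for x in candidates}
    x_min = min(values, key=values.get)
    x_max = max(values, key=values.get)
    return (x_min, values[x_min]), (x_max, values[x_max])


(x_min, f_min), (x_max, f_max) = closed_interval_extrema(f, 0, 5, [1, 2, 4])
print("minimum", f_min, "at x =", x_min)
print("maximum", f_max, "at x =", x_max)
```

The candidate list mirrors the rows of the table: two endpoints and three critical points.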

Example 1.2 - For Review Only

Determine the maximum and minimum values of the function $f : \mathbb{R} \to \mathbb{R}$ defined by

$$f(x) = \max\{2x - x^2 + 3,\; 10x - x^2 - 21\}$$

over the interval $0 \le x \le 6.5$. To solve this, we first note that the two expressions inside the maximum are equal when $2x + 3 = 10x - 21$, that is, at $x = 3$, so that

$$f(x) = \max\{2x - x^2 + 3,\; 10x - x^2 - 21\}$$

is the same as

$$f(x) = \begin{cases} 2x - x^2 + 3, & \text{for } 0 \le x \le 3 \\ 10x - x^2 - 21, & \text{for } 3 \le x \le 6\tfrac{1}{2} \end{cases}$$

so that

$$f'(x) = \begin{cases} 2 - 2x, & \text{for } 0 < x < 3 \\ 10 - 2x, & \text{for } 3 < x < 6\tfrac{1}{2}. \end{cases}$$

Then $f'(x) = 0$ yields $x = 1$ or $x = 5$, while $f'(x)$ not defined yields $x = 3$, since $f'(x)$ has a jump discontinuity at $x = 3$. This leads to $x = \{1, 3, 5\}$ as the critical points, and each of these is in the region $0 \le x \le 6.5$. Then we form the following table of values.

x        y = f(x)    Comment
0        3           Endpoint
1        4           Critical Point
3        0           Critical Point
5        4           Critical Point
6 1/2    1 3/4       Endpoint

This shows that the minimum value of $f$ is $0$ at $x = 3$, while the maximum value of $f$ is $4$ at either $x = 1$ or $x = 5$. Note that the extreme values of $f$ (maximum or minimum) are unique, but optimal points (i.e., where these extreme values occur) need not be unique. A plot of $f(x)$ versus $x$ for $0 \le x \le 6.5$ along with the five points in this table is shown in the following figure.

[Figure: plot of $f(x) = \max\{2x - x^2 + 3,\; 10x - x^2 - 21\}$ versus $x$ for $0 \le x \le 6.5$]
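The table of Example 1.2 can be reproduced directly from the definition of $f$ as a maximum of two parabolas. The snippet below is a small sketch; the variable names are illustrative only, and the candidate points are the ones found by hand above.

```python
def f(x):
    # Pointwise maximum of the two parabolas from Example 1.2.
    return max(2 * x - x**2 + 3, 10 * x - x**2 - 21)


# Endpoints 0 and 6.5, plus the critical points 1, 3 and 5.
candidates = [0, 1, 3, 5, 6.5]
values = [(x, f(x)) for x in candidates]

f_min = min(v for _, v in values)
f_max = max(v for _, v in values)
argmin = [x for x, v in values if v == f_min]
argmax = [x for x, v in values if v == f_max]

print("minimum", f_min, "at", argmin)
print("maximum", f_max, "at", argmax)
```

The maximum value is attained at two different points, which illustrates the remark that optimal points need not be unique.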

2. Some Typical Problems In The Calculus of Variations

We now present a number of typical problems that arise in the calculus of variations. In this section, we shall only state the problems. We shall solve these problems later in the chapter. We begin by stating some problems that contain no constraints except for boundary conditions.

Example 2.1: A Geodesic Problem In The Plane

Given two points $(x_1, y_1)$ and $(x_2, y_2)$ in the $xy$ plane, we want to determine the function $y = y(x)$ that connects these points and has the shortest length. Using the formula for arc length from the calculus, we see that we are searching for a function $y = y(x)$ so that the boundary conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2 \tag{1a}$$

are satisfied and for which

$$L[y] = \int_{x_1}^{x_2} \sqrt{1 + (y')^2}\,dx \quad\text{where}\quad y' = \frac{dy}{dx} \tag{1b}$$

is as small as possible. Of course we know that the answer to this problem is the straight line

$$y(x) = y_1 + \left(\frac{y_2 - y_1}{x_2 - x_1}\right)(x - x_1)$$

and

$$L_{\min} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$

Example 2.2: A Geodesic Problem On A Surface

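Given two points $(u_1, v_1)$ and $(u_2, v_2)$ that lie on a surface described by the vector equation

$$\mathbf{r} = \mathbf{r}(u, v)$$

we want to determine the function $v = v(u)$ that connects these points and has the shortest length. Using the formula for arc length from the calculus, we have

$$ds = \sqrt{d\mathbf{r} \cdot d\mathbf{r}}$$

with

$$d\mathbf{r} = \frac{\partial \mathbf{r}}{\partial u}\,du + \frac{\partial \mathbf{r}}{\partial v}\,dv$$

and so

$$\begin{aligned}
ds &= \sqrt{\left(\frac{\partial \mathbf{r}}{\partial u}\,du + \frac{\partial \mathbf{r}}{\partial v}\,dv\right) \cdot \left(\frac{\partial \mathbf{r}}{\partial u}\,du + \frac{\partial \mathbf{r}}{\partial v}\,dv\right)} \\
&= \sqrt{\left(\frac{\partial \mathbf{r}}{\partial u} \cdot \frac{\partial \mathbf{r}}{\partial u}\right) du^2 + 2\left(\frac{\partial \mathbf{r}}{\partial u} \cdot \frac{\partial \mathbf{r}}{\partial v}\right) du\,dv + \left(\frac{\partial \mathbf{r}}{\partial v} \cdot \frac{\partial \mathbf{r}}{\partial v}\right) dv^2}
\end{aligned}$$

or

$$ds = \sqrt{\frac{\partial \mathbf{r}}{\partial u} \cdot \frac{\partial \mathbf{r}}{\partial u} + 2\left(\frac{\partial \mathbf{r}}{\partial u} \cdot \frac{\partial \mathbf{r}}{\partial v}\right) v' + \left(\frac{\partial \mathbf{r}}{\partial v} \cdot \frac{\partial \mathbf{r}}{\partial v}\right)(v')^2}\;du$$

where $v' = dv/du$. Thus we are searching for a function $v = v(u)$ so that the boundary conditions

$$v(u_1) = v_1 \quad\text{and}\quad v(u_2) = v_2 \tag{1c}$$

are satisfied and for which

$$L[v] = \int_{u_1}^{u_2} \sqrt{\frac{\partial \mathbf{r}}{\partial u} \cdot \frac{\partial \mathbf{r}}{\partial u} + 2\left(\frac{\partial \mathbf{r}}{\partial u} \cdot \frac{\partial \mathbf{r}}{\partial v}\right) v' + \left(\frac{\partial \mathbf{r}}{\partial v} \cdot \frac{\partial \mathbf{r}}{\partial v}\right)(v')^2}\;du \tag{1d}$$

is as small as possible.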

Example 2.3: The Minimum Work Problem

A particle is to move along a path from the point $(x_1, y_1)$ to the point $(x_2, y_2)$ in the $xy$ plane. There is a force

$$\mathbf{F} = \mathbf{F}(x, y) = F_1(x, y)\,\hat{\mathbf{i}} + F_2(x, y)\,\hat{\mathbf{j}}$$

that acts on the particle as it moves. We seek to determine that path for which the work done by the force is as small as possible. Since

$$W = \int_{(x_1, y_1)}^{(x_2, y_2)} \mathbf{F} \cdot d\mathbf{r} = \int_{(x_1, y_1)}^{(x_2, y_2)} F_1(x, y)\,dx + F_2(x, y)\,dy$$

we have

$$W[y] = \int_{x_1}^{x_2} \left(F_1(x, y) + F_2(x, y)\,y'\right) dx \quad\text{where}\quad y' = \frac{dy}{dx}.$$

Thus we seek to find that function $y = y(x)$ such that the boundary conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2 \tag{2a}$$

are satisfied and for which

$$W[y] = \int_{x_1}^{x_2} \left(F_1(x, y) + F_2(x, y)\,y'\right) dx \tag{2b}$$

is as small as possible.

Example 2.4: The Minimum Surface Area of Revolution Problem

Here we start with a function $y = y(x)$ that connects the points $(x_1, y_1)$ and $(x_2, y_2)$ in the first quadrant and we want to revolve the curve represented by this function about the positive $x$ axis to form a surface of revolution. We seek that function which gives the surface that has the smallest possible lateral area. Using the expression from calculus that gives this area as

$$S[y] = \int_{x_1}^{x_2} 2\pi y \sqrt{1 + (y')^2}\,dx \quad\text{where}\quad y' = \frac{dy}{dx},$$

we see that we seek to find that function $y = y(x)$ such that the boundary conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2 \tag{3a}$$

are satisfied and for which

$$S[y] = \int_{x_1}^{x_2} 2\pi y \sqrt{1 + (y')^2}\,dx \tag{3b}$$

is as small as possible.

Example 2.5: The Brachistochrone Problem

Suppose that a bead (of mass $m$) can slide under its weight (without friction) along a wire in the shape of a planar curve from the point $(0, h)$ to the point $(a, 0)$. We seek to find that curve (described by the function $y = y(x)$) for which the time of travel is as small as possible, given that the bead starts from rest. To compute the time of travel, we use conservation of energy, which states that

$$mgh = \frac{1}{2}mv^2 + mgy \quad\text{or}\quad v^2 = 2g(h - y).$$

But the velocity of the bead as it slides along the curve described by $y = y(x)$ is

$$\mathbf{v} = \dot{x}\,\hat{\mathbf{i}} + \dot{y}\,\hat{\mathbf{j}} \quad\text{with}\quad \dot{y} = \frac{dy}{dx}\frac{dx}{dt} = y'(x)\,\dot{x} = y'\dot{x}$$

so that

$$\mathbf{v} = \dot{x}\,\hat{\mathbf{i}} + \dot{x}\,y'\,\hat{\mathbf{j}}.$$

Then

$$v^2 = \dot{x}^2 + \dot{x}^2 (y')^2 = (1 + (y')^2)\,\dot{x}^2$$

and hence

$$v^2 = (1 + (y')^2)\,\dot{x}^2 = 2g(h - y)$$

or

$$\dot{x} = \frac{dx}{dt} = \sqrt{\frac{2g(h - y)}{1 + (y')^2}}.$$

Note that the positive square root is taken since we expect $\dot{x} > 0$ as the bead slides down the wire. Thus we have

$$dt = \sqrt{\frac{1 + (y')^2}{2g(h - y)}}\;dx$$

and hence

$$T[y] = \int_0^a \sqrt{\frac{1 + (y')^2}{2g(h - y)}}\;dx.$$

Therefore we seek a function $y = y(x)$ so that the boundary conditions

$$y(0) = h \quad\text{and}\quad y(a) = 0 \tag{4a}$$

are satisfied and for which

$$T[y] = \int_0^a \sqrt{\frac{1 + (y')^2}{2g(h - y)}}\;dx \tag{4b}$$

is as small as possible. For example, if $y$ is the straight line connecting $(0, h)$ and $(a, 0)$, we get

$$y = y(x) = h(1 - x/a)$$

and

$$T[y] = \int_0^a \sqrt{\frac{1 + (-h/a)^2}{2g(h - h(1 - x/a))}}\;dx = \int_0^a \sqrt{\frac{1 + h^2/a^2}{2ghx/a}}\;dx$$

or

$$T[y] = \sqrt{\frac{1 + h^2/a^2}{2gh/a}} \int_0^a \frac{1}{\sqrt{x}}\,dx = \sqrt{\frac{1 + h^2/a^2}{2gh/a}}\;2\sqrt{a}$$

which finally reduces to

$$T[y] = \sqrt{\frac{2(a^2 + h^2)}{gh}}.$$

We shall see that the minimum-time path is not a straight line and thus the above expression is NOT going to be the minimum time.
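The straight-line travel time can be cross-checked numerically. The sketch below evaluates the functional $T[y]$ of Equation (4b) with a midpoint rule and compares it with the closed form $\sqrt{2(a^2 + h^2)/(gh)}$. The function name `travel_time`, the substitution $x = a s^2$ (used to tame the integrable singularity at the starting point), and the sample values $a = 1$, $h = 2$, $g = 9.8$ are all assumptions made for illustration.

```python
import math


def travel_time(y, dy, a, h, g=9.8, n=10_000):
    """Approximate T[y] = integral_0^a sqrt((1 + y'^2) / (2 g (h - y))) dx.

    Uses x = a*s**2 (so dx = 2*a*s ds) and the midpoint rule in s, which
    never samples the singular start point s = 0.
    """
    ds = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        x = a * s * s
        integrand = math.sqrt((1.0 + dy(x) ** 2) / (2.0 * g * (h - y(x))))
        total += integrand * 2.0 * a * s * ds
    return total


a, h, g = 1.0, 2.0, 9.8


def line(x):
    # Straight line from (0, h) to (a, 0).
    return h * (1.0 - x / a)


def line_slope(x):
    return -h / a


t_numeric = travel_time(line, line_slope, a, h, g)
t_closed = math.sqrt(2.0 * (a * a + h * h) / (g * h))
print(t_numeric, t_closed)
```

The same `travel_time` helper accepts any trial curve with $y(0) = h$ and $y(a) = 0$, so it can be used to compare candidate paths against the straight line.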

We now present two problems that contain extra constraints in addition to boundary conditions.

Example 2.6: The Hanging Chain Problem

A uniform flexible chain (having linear mass density $\rho$) hangs in static equilibrium under its own weight between two supports. The chain has a fixed length $L$ and we seek the shape $y = y(x)$ which minimizes the total potential energy of the chain. Toward this end we assume that the chain hangs between the two points $(-a, 0)$ and $(a, 0)$, where of course $2a < L$, with

$$L[y] = \int_{-a}^{a} \sqrt{1 + (y')^2}\,dx = L = \text{constant}.$$

Since the potential energy of the chain is

$$U = mg\,y_{cm} = mg\,\frac{1}{m}\int_{\text{chain}} y(x)\,dm = g\int_{\text{chain}} y(x)\,\rho\,ds$$

we see that

$$U[y] = \rho g \int_{-a}^{a} y\sqrt{1 + (y')^2}\,dx \quad\text{where}\quad y' = \frac{dy}{dx}.$$

Thus we seek a function $y = y(x)$ so that the boundary conditions

$$y(-a) = y(a) = 0 \tag{5a}$$

are satisfied along with the extra constraint

$$L[y] = \int_{-a}^{a} \sqrt{1 + (y')^2}\,dx = L = \text{constant} \tag{5b}$$

and for which

$$U[y] = \rho g \int_{-a}^{a} y\sqrt{1 + (y')^2}\,dx \tag{5c}$$

is as small as possible.

Example 2.7: The Maximum Area Problem

Given a simple closed curve in the $xy$ plane that has a fixed perimeter $P$, we seek that curve which encloses the maximum area. Note that a simple closed curve is one that DOES NOT intersect itself. For example, a figure "8" is not a simple closed curve since it intersects itself at the center. A simple closed curve can be written in parametric form as

$$\mathbf{r} = \mathbf{r}(t) = x(t)\,\hat{\mathbf{i}} + y(t)\,\hat{\mathbf{j}}$$

and its perimeter can be computed using

$$P = \oint_C |d\mathbf{r}| = \oint_C \sqrt{dx^2 + dy^2} = \int_{t_1}^{t_2} \sqrt{\dot{x}^2 + \dot{y}^2}\,dt$$

where $t_1 \ne t_2$ and $\mathbf{r}(t_1) = \mathbf{r}(t_2)$. The area enclosed is computed using

$$A = \frac{1}{2}\oint_C |\mathbf{r} \times d\mathbf{r}| = \frac{1}{2}\oint_C (x\,dy - y\,dx) = \frac{1}{2}\int_{t_1}^{t_2} (x\dot{y} - y\dot{x})\,dt.$$

Thus we seek two functions $x = x(t)$ and $y = y(t)$ such that the boundary conditions

$$x(t_1) = x(t_2) \quad\text{and}\quad y(t_1) = y(t_2) \tag{6a}$$

are satisfied along with the constraint

$$P[x, y] = \int_{t_1}^{t_2} \sqrt{\dot{x}^2 + \dot{y}^2}\,dt = P = \text{constant} \tag{6b}$$

and for which

$$A[x, y] = \frac{1}{2}\int_{t_1}^{t_2} (x\dot{y} - y\dot{x})\,dt \tag{6c}$$

is as large as possible.

3. The Simplest Problems In The Calculus of Variations

In the first five examples above (2.1 - 2.5), we are given integrands of the form

$$f(y') = \sqrt{1 + (y')^2}$$

and

$$f(u, v, v') = \sqrt{\frac{\partial \mathbf{r}}{\partial u} \cdot \frac{\partial \mathbf{r}}{\partial u} + 2\left(\frac{\partial \mathbf{r}}{\partial u} \cdot \frac{\partial \mathbf{r}}{\partial v}\right) v' + \left(\frac{\partial \mathbf{r}}{\partial v} \cdot \frac{\partial \mathbf{r}}{\partial v}\right)(v')^2}$$

and

$$f(x, y, y') = F_1(x, y) + F_2(x, y)\,y', \qquad f(y, y') = 2\pi y\sqrt{1 + (y')^2}$$

and

$$f(y, y') = \sqrt{\frac{1 + (y')^2}{2g(h - y)}}$$

which we represent collectively as

$$f = f(x, y, y').$$

We are also given that the function $y = y(x)$ must contain the two points $(x_1, y_1)$ and $(x_2, y_2)$, and we seek to find such a function $y = y(x)$ so that the boundary conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2 \tag{7a}$$

are satisfied and for which

$$J[y] = \int_{x_1}^{x_2} f(x, y, y')\,dx \tag{7b}$$

is as large (or as small) as possible. The expression $J[y]$ is called a functional. Unlike a function, which takes a real number as input and returns a real number as output, a functional takes a differentiable function $y = y(x)$ as input (along with fixed real numbers $x_1$ and $x_2$) and returns a real number as output.

4. The Euler-Lagrange Equation

Our goal is to determine that function $y = y(x)$ that satisfies the boundary conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2 \tag{8a}$$

for which the functional

$$J[y] = \int_{x_1}^{x_2} f(x, y, y')\,dx \tag{8b}$$

is an extremum (maximum or minimum).

4.1 The Ordinary Calculus Problem

To see how we might solve this, let us review one method for solving the ordinary calculus problem of finding an extremum of a differentiable function $y = f(x)$. Toward this end, we assume that $f$ has a relative extremum at some point, and we call this point $x_c$. Let us also assume that $f$ has a relative minimum value at this point. The proof for a relative maximum value is very similar. Consider now the two points $x_{\pm\alpha} = x_c \pm \alpha$, where $\alpha$ is very small and positive. These two points are to the right and left of $x_c$. Since $x_c$ is a point where $f$ has a relative minimum, for $\alpha$ small enough, we must have

$$f(x_{+\alpha}) = f(x_c + \alpha) \ge f(x_c) \quad\text{and}\quad f(x_{-\alpha}) = f(x_c - \alpha) \ge f(x_c),$$

or

$$0 \le f(x_c + \alpha) - f(x_c) \quad\text{and}\quad f(x_c) - f(x_c - \alpha) \le 0.$$

Dividing by $\alpha$, we get

$$0 \le \frac{f(x_c + \alpha) - f(x_c)}{\alpha} \quad\text{and}\quad \frac{f(x_c) - f(x_c - \alpha)}{\alpha} \le 0$$

for all positive $\alpha$ near zero. Taking the limit as $\alpha$ approaches zero, we then get

$$0 \le \lim_{\alpha \to 0^+} \frac{f(x_c + \alpha) - f(x_c)}{\alpha} \quad\text{and}\quad \lim_{\alpha \to 0^+} \frac{f(x_c) - f(x_c - \alpha)}{\alpha} \le 0$$

which yield

$$0 \le f'(x_c) \quad\text{and}\quad f'(x_c) \le 0$$

or $f'(x_c) = 0$.

Note that you should show that the same result follows even if we assume $\alpha$ to be negative throughout.

Thus we find that $f'(x_c) = 0$ is a necessary condition for $f(x_c)$ to be a relative minimum of the function $f$. A similar argument leads to $f'(x_c) = 0$ if $f(x_c)$ is to be a relative maximum of the function $f$ as well. Thus we find that if $f$ has a relative extremum at $x = x_c$, then $f'(x_c) = 0$.

4.2 The Variational Calculus Problem

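To solve the variational calculus problem we assume that $y = y(x)$ is the function which satisfies the boundary conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2$$

and for which

$$J[y] = \int_{x_1}^{x_2} f(x, y, y')\,dx$$

is as large (or as small) as possible. We then consider a function

$$y(x, \alpha) = y(x) + \alpha\,\eta(x)$$

where $\alpha$ is small and

$$y(x_1, \alpha) = y_1 \quad\text{and}\quad y(x_2, \alpha) = y_2$$

which requires that

$$\eta(x_1) = \eta(x_2) = 0.$$

We also note that $y(x, 0) = y(x)$, which is the function we seek. Then

$$J[y(x, \alpha)] = J(\alpha) = \int_{x_1}^{x_2} f(x, y(x, \alpha), y'(x, \alpha))\,dx$$

is a function of $\alpha$. To find the extrema, we then want to require that

$$\left.\frac{\partial J(\alpha)}{\partial \alpha}\right|_{\alpha = 0} = 0.$$

Computing $\partial J/\partial \alpha$, we have

$$\begin{aligned}
\frac{\partial J(\alpha)}{\partial \alpha} &= \frac{\partial}{\partial \alpha}\int_{x_1}^{x_2} f(x, y(x, \alpha), y'(x, \alpha))\,dx \\
&= \int_{x_1}^{x_2} \frac{\partial}{\partial \alpha} f(x, y(x, \alpha), y'(x, \alpha))\,dx \\
&= \int_{x_1}^{x_2} \left\{\frac{\partial f}{\partial x}\frac{\partial x}{\partial \alpha} + \frac{\partial f}{\partial y(x, \alpha)}\frac{\partial y(x, \alpha)}{\partial \alpha} + \frac{\partial f}{\partial y'(x, \alpha)}\frac{\partial y'(x, \alpha)}{\partial \alpha}\right\} dx \\
&= \int_{x_1}^{x_2} \left\{\frac{\partial f}{\partial x}(0) + \frac{\partial f}{\partial y(x, \alpha)}\,\eta(x) + \frac{\partial f}{\partial y'(x, \alpha)}\,\eta'(x)\right\} dx
\end{aligned}$$

since

$$y(x, \alpha) = y(x) + \alpha\,\eta(x) \quad\text{and}\quad y'(x, \alpha) = y'(x) + \alpha\,\eta'(x)$$

lead to

$$\frac{\partial y(x, \alpha)}{\partial \alpha} = \eta(x) \quad\text{and}\quad \frac{\partial y'(x, \alpha)}{\partial \alpha} = \eta'(x).$$

Thus we have

$$\begin{aligned}
\frac{\partial J(\alpha)}{\partial \alpha} &= \int_{x_1}^{x_2} \left\{\frac{\partial f}{\partial y(x, \alpha)}\,\eta(x) + \frac{\partial f}{\partial y'(x, \alpha)}\,\eta'(x)\right\} dx \\
&= \int_{x_1}^{x_2} \frac{\partial f}{\partial y(x, \alpha)}\,\eta(x)\,dx + \int_{x_1}^{x_2} \frac{\partial f}{\partial y'(x, \alpha)}\,\eta'(x)\,dx.
\end{aligned} \tag{9}$$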

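Using integration by parts on the second integral, we find that

$$\begin{aligned}
\int_{x_1}^{x_2} \frac{\partial f}{\partial y'(x, \alpha)}\,\eta'(x)\,dx &= \left.\frac{\partial f}{\partial y'(x, \alpha)}\,\eta(x)\right|_{x_1}^{x_2} - \int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial y'(x, \alpha)}\right)\eta(x)\,dx \\
&= \left.\frac{\partial f}{\partial y'(x, \alpha)}\right|_{x = x_2}\eta(x_2) - \left.\frac{\partial f}{\partial y'(x, \alpha)}\right|_{x = x_1}\eta(x_1) - \int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial y'(x, \alpha)}\right)\eta(x)\,dx \\
&= 0 - 0 - \int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial y'(x, \alpha)}\right)\eta(x)\,dx
\end{aligned}$$

since $\eta(x_1) = \eta(x_2) = 0$, and so we have

$$\int_{x_1}^{x_2} \frac{\partial f}{\partial y'(x, \alpha)}\,\eta'(x)\,dx = -\int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial y'(x, \alpha)}\right)\eta(x)\,dx.$$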

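Putting this into Equation (9), we have

$$\frac{\partial J(\alpha)}{\partial \alpha} = \int_{x_1}^{x_2} \frac{\partial f}{\partial y(x, \alpha)}\,\eta(x)\,dx - \int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial y'(x, \alpha)}\right)\eta(x)\,dx$$

or simply

$$\frac{\partial J(\alpha)}{\partial \alpha} = \int_{x_1}^{x_2} \left\{\frac{\partial f}{\partial y(x, \alpha)} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'(x, \alpha)}\right)\right\}\eta(x)\,dx.$$

Then

$$\left.\frac{\partial J(\alpha)}{\partial \alpha}\right|_{\alpha = 0} = \int_{x_1}^{x_2} \left\{\frac{\partial f}{\partial y(x)} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'(x)}\right)\right\}\eta(x)\,dx.$$

Setting this equal to zero and using the fact that $\eta(x)$ is any function that satisfies $\eta(x_1) = \eta(x_2) = 0$, we must conclude that

$$\frac{\partial f}{\partial y(x)} - \frac{d}{dx}\left(\frac{\partial f}{\partial y'(x)}\right) = 0$$

or simply

$$\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) - \frac{\partial f}{\partial y} = 0.$$

This is the Euler-Lagrange equation and it represents a necessary condition for the function $y = y(x)$ which minimizes (or maximizes) $J[y]$.

To summarize, we see that a function $y = y(x)$ which satisfies the boundary conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2 \tag{10a}$$

and for which

$$J[y] = \int_{x_1}^{x_2} f(x, y, y')\,dx \tag{10b}$$

is as large (or as small) as possible must also satisfy the differential equation

$$\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) - \frac{\partial f}{\partial y} = 0. \tag{10c}$$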
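The stationarity condition $\partial J/\partial\alpha|_{\alpha=0} = 0$ can be observed numerically on a simple case. The sketch below is an illustration under stated assumptions, not part of the derivation: it takes the arc-length integrand $f = \sqrt{1 + (y')^2}$ on $0 \le x \le 1$, the straight line $y(x) = x$ as the candidate extremal, and the hypothetical variation $\eta(x) = \sin(\pi x)$, which vanishes at both endpoints.

```python
import math


def J(alpha, n=2000):
    """Midpoint-rule value of J(alpha) = integral_0^1 sqrt(1 + y'(x, alpha)^2) dx
    for the varied curve y(x, alpha) = x + alpha * sin(pi * x)."""
    dx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        # y'(x, alpha) = y'(x) + alpha * eta'(x) with y' = 1, eta' = pi*cos(pi*x)
        slope = 1.0 + alpha * math.pi * math.cos(math.pi * x)
        total += math.sqrt(1.0 + slope * slope) * dx
    return total


# Central-difference estimate of dJ/dalpha at alpha = 0.
delta = 1e-4
dJ = (J(delta) - J(-delta)) / (2.0 * delta)

print("dJ/dalpha at 0 ~", dJ)
print("J(0) =", J(0.0), " J(0.1) =", J(0.1), " J(-0.1) =", J(-0.1))
```

The derivative estimate is zero to within rounding, and $J$ grows for both signs of $\alpha$, which is consistent with the straight line being a minimizer for this particular variation.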

4.3 Another Form For The Euler-Lagrange Equation

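Starting with the function $f = f(x, y, y')$, we note that

$$\frac{df}{dx} = \frac{\partial f}{\partial x}\frac{dx}{dx} + \frac{\partial f}{\partial y}\frac{dy}{dx} + \frac{\partial f}{\partial y'}\frac{dy'}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\,y' + \frac{\partial f}{\partial y'}\,y''.$$

But from Equation (10c) we have

$$\frac{\partial f}{\partial y} = \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)$$

and so we find that

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) y' + \frac{\partial f}{\partial y'}\,y'' = \frac{\partial f}{\partial x} + \frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right)$$

or just

$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'} - f\right) + \frac{\partial f}{\partial x} = 0. \tag{10d}$$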

5. The Euler-Lagrange Equation: Two Special Cases

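We now consider two special cases of the Euler-Lagrange Equation

$$\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) - \frac{\partial f}{\partial y} = 0.$$

5.1 The function f does not explicitly depend on the dependent variable y

If the function $f(x, y, y')$ does not explicitly depend on the dependent variable $y$, then

$$f(x, y, y') = f(x, y') \quad\text{and}\quad \frac{\partial f}{\partial y} = 0 \tag{11a}$$

in which case the Euler-Lagrange Equation (10c) reduces to

$$\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) = 0 \quad\text{or}\quad \frac{\partial f}{\partial y'} = \text{constant}. \tag{11b}$$

5.2 The function f does not explicitly depend on the independent variable x

If the function $f(x, y, y')$ does not explicitly depend on the independent variable $x$, then

$$f(x, y, y') = f(y, y') \quad\text{and}\quad \frac{\partial f}{\partial x} = 0 \tag{12a}$$

in which case the Euler-Lagrange Equation (10d) reduces to

$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'} - f\right) = 0 \quad\text{or}\quad y'\frac{\partial f}{\partial y'} - f = \text{constant}. \tag{12b}$$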

6. Solutions to Some Typical Problems In The Calculus of Variations

We now present the solutions to some of the problems discussed in Section 2 above.

Example 6.1: A Geodesic Problem In The Plane

Given two points $(x_1, y_1)$ and $(x_2, y_2)$ in the $xy$ plane, we want to determine the function $y = y(x)$ that connects these points and has the shortest length. Using the formula for arc length from the calculus, we see that we are searching for a function $y = y(x)$ so that the boundary conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2 \tag{13a}$$

are satisfied and for which

$$L[y] = \int_{x_1}^{x_2} \sqrt{1 + (y')^2}\,dx \tag{13b}$$

is as small as possible. Here we have

$$f(x, y, y') = \sqrt{1 + (y')^2} = f(y')$$

which leads to $\partial f/\partial y = 0$ and hence, by Equation (11b),

$$\frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{1 + (y')^2}} = \text{constant}$$

which leads to

$$\frac{y'}{\sqrt{1 + (y')^2}} = A \quad\text{or}\quad y' = \sqrt{\frac{A^2}{1 - A^2}} = B$$

and hence $y(x) = Bx + C$. To determine $B$ and $C$, we put in the boundary conditions $y(x_1) = y_1$ and $y(x_2) = y_2$, and these result in

$$B = \frac{y_2 - y_1}{x_2 - x_1} \quad\text{and}\quad C = y_1 - \left(\frac{y_2 - y_1}{x_2 - x_1}\right) x_1$$

and hence

$$y(x) = y_1 + \left(\frac{y_2 - y_1}{x_2 - x_1}\right)(x - x_1).$$

We also have

$$L_{\min} = \int_{x_1}^{x_2} \sqrt{1 + (y')^2}\,dx = \int_{x_1}^{x_2} \sqrt{1 + \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2}\,dx = (x_2 - x_1)\sqrt{1 + \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2}$$

or simply

$$L_{\min} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

as expected.

Example 6.2: A Geodesic Problem On A Sphere

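Given two points $(\varphi_1, \theta_1)$ and $(\varphi_2, \theta_2)$ that lie on the sphere described by

$$\mathbf{r} = \mathbf{r}(\varphi, \theta) = r\cos(\varphi)\sin(\theta)\,\hat{\mathbf{i}} + r\sin(\varphi)\sin(\theta)\,\hat{\mathbf{j}} + r\cos(\theta)\,\hat{\mathbf{k}}$$

we want to determine the function $\theta = \theta(\varphi)$ that connects these points and has the shortest length. Using the formula for arc length discussed earlier with $u = \varphi$ and $v = \theta$, we have

$$ds = \sqrt{\frac{\partial \mathbf{r}}{\partial \varphi} \cdot \frac{\partial \mathbf{r}}{\partial \varphi} + 2\left(\frac{\partial \mathbf{r}}{\partial \varphi} \cdot \frac{\partial \mathbf{r}}{\partial \theta}\right)\theta' + \left(\frac{\partial \mathbf{r}}{\partial \theta} \cdot \frac{\partial \mathbf{r}}{\partial \theta}\right)(\theta')^2}\;d\varphi$$

with

$$\frac{\partial \mathbf{r}}{\partial \varphi} = -r\sin(\varphi)\sin(\theta)\,\hat{\mathbf{i}} + r\cos(\varphi)\sin(\theta)\,\hat{\mathbf{j}}$$

and

$$\frac{\partial \mathbf{r}}{\partial \theta} = r\cos(\varphi)\cos(\theta)\,\hat{\mathbf{i}} + r\sin(\varphi)\cos(\theta)\,\hat{\mathbf{j}} - r\sin(\theta)\,\hat{\mathbf{k}},$$

so that

$$\frac{\partial \mathbf{r}}{\partial \varphi} \cdot \frac{\partial \mathbf{r}}{\partial \varphi} = r^2\sin^2(\theta) \quad\text{and}\quad \frac{\partial \mathbf{r}}{\partial \varphi} \cdot \frac{\partial \mathbf{r}}{\partial \theta} = 0$$

and

$$\frac{\partial \mathbf{r}}{\partial \theta} \cdot \frac{\partial \mathbf{r}}{\partial \theta} = r^2.$$

Then

$$ds = \sqrt{r^2\sin^2(\theta) + r^2(\theta')^2}\;d\varphi = r\sqrt{\sin^2(\theta) + (\theta')^2}\;d\varphi$$

where $\theta' = d\theta/d\varphi$. Thus we are searching for a function $\theta = \theta(\varphi)$ so that the boundary conditions

$$\theta(\varphi_1) = \theta_1 \quad\text{and}\quad \theta(\varphi_2) = \theta_2 \tag{14a}$$

are satisfied and for which

$$L[\theta] = r\int_{\varphi_1}^{\varphi_2} \sqrt{\sin^2(\theta) + (\theta')^2}\;d\varphi \tag{14b}$$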

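is as small as possible. Here we effectively have

$$f(\varphi, \theta, \theta') = \sqrt{\sin^2(\theta) + (\theta')^2} \quad\text{and}\quad \frac{\partial f}{\partial \varphi} = 0$$

which means, by Equation (12b), that

$$\theta'\frac{\partial f}{\partial \theta'} - f = \text{constant}$$

or, writing the constant as $1/A$ and putting in the expression for $f$, we have

$$\theta'\,\frac{\theta'}{\sqrt{\sin^2(\theta) + (\theta')^2}} - \sqrt{\sin^2(\theta) + (\theta')^2} = \frac{1}{A}$$

or

$$\frac{\sin^2(\theta)}{\sqrt{\sin^2(\theta) + (\theta')^2}} = -\frac{1}{A}.$$

Solving for $\theta'$ we get

$$\theta' = \sqrt{A^2\sin^4(\theta) - \sin^2(\theta)} = \sin(\theta)\sqrt{A^2\sin^2(\theta) - 1}$$

or

$$d\varphi = \frac{d\theta}{\sin(\theta)\sqrt{A^2\sin^2(\theta) - 1}} = \frac{\csc^2(\theta)\,d\theta}{\sqrt{A^2 - \csc^2(\theta)}} = \frac{\csc^2(\theta)\,d\theta}{\sqrt{A^2 - 1 - \cot^2(\theta)}}.$$

Setting $u = \cot(\theta)$, we have $du = -\csc^2(\theta)\,d\theta$, and then we have

$$d\varphi = \frac{-du}{\sqrt{A^2 - 1 - u^2}}$$

and hence

$$\varphi = -\int \frac{1}{\sqrt{A^2 - 1 - u^2}}\,du = -\sin^{-1}\left(\frac{\sqrt{A^2 - 1 - u^2}}{\sqrt{A^2 - 1}}\right) + B$$

or

$$\varphi = -\sin^{-1}\left(\frac{\sqrt{A^2 - 1 - \cot^2(\theta)}}{\sqrt{A^2 - 1}}\right) + B$$

and so, setting $C = \sqrt{A^2 - 1}$, we find that

$$\varphi = -\sin^{-1}\left(\frac{\sqrt{C^2 - \cot^2(\theta)}}{C}\right) + B$$

or

$$\frac{\sqrt{C^2 - \cot^2(\theta)}}{C} = \sin(B - \varphi)$$

or

$$C^2 - \cot^2(\theta) = C^2\sin^2(B - \varphi).$$

This leads to

$$\cot^2(\theta) = C^2\cos^2(B - \varphi)$$

or simply

$$\cot(\theta) = C\cos(B - \varphi)$$

where $B$ and $C$ are computed from the boundary conditions

$$\theta(\varphi_1) = \theta_1 \quad\text{and}\quad \theta(\varphi_2) = \theta_2.$$

To recognize this curve we write this as

$$\cos(\theta) = C\sin(\theta)\cos(B - \varphi)$$

or

$$\cos(\theta) = C\sin(\theta)\left(\cos(B)\cos(\varphi) + \sin(B)\sin(\varphi)\right)$$

or

$$r\cos(\theta) = Cr\sin(\theta)\cos(\varphi)\cos(B) + Cr\sin(\theta)\sin(\varphi)\sin(B).$$

But $x = r\sin(\theta)\cos(\varphi)$, $y = r\sin(\theta)\sin(\varphi)$ and $z = r\cos(\theta)$, and so we find that

$$z = Cx\cos(B) + Cy\sin(B)$$

which is the equation of a plane passing through the center of the sphere (the origin). The path along the sphere which minimizes (or maximizes) arc length is then the curve that lies on both the sphere and this plane, that is, the circle of intersection between the sphere and this plane. This curve is known as a great circle on the sphere: it has the same center as the sphere and hence the same radius as the sphere, which is the largest radius that a circle drawn on a sphere could have.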
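As a quick sanity check on the last step, one can sample points on the curve $\cot(\theta) = C\cos(B - \varphi)$ and confirm that they satisfy the plane equation $z = Cx\cos(B) + Cy\sin(B)$. The values of $B$, $C$ and $r$ below are arbitrary choices made for illustration, and the helper names are hypothetical.

```python
import math


def great_circle_point(phi, B, C, r=1.0):
    """Point on the sphere of radius r with cot(theta) = C*cos(B - phi)."""
    cot_theta = C * math.cos(B - phi)
    # atan2(1, c) lies in (0, pi) and satisfies cot(theta) = c.
    theta = math.atan2(1.0, cot_theta)
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z


def plane_residual(point, B, C):
    """z - (C x cos B + C y sin B); zero when the point lies on the plane."""
    x, y, z = point
    return z - (C * x * math.cos(B) + C * y * math.sin(B))


B, C = 0.7, 1.3
residuals = [
    abs(plane_residual(great_circle_point(2.0 * math.pi * k / 12, B, C), B, C))
    for k in range(12)
]
print(max(residuals))
```

Every sampled point lies on the plane through the origin to within rounding error, as the algebra predicts.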

Example 6.3: The Minimum Work Problem

A particle is to move along a path from the point $(x_1, y_1)$ to the point $(x_2, y_2)$ in the $xy$ plane. There is a force

$$\mathbf{F} = \mathbf{F}(x, y) = F_1(x, y)\,\hat{\mathbf{i}} + F_2(x, y)\,\hat{\mathbf{j}}$$

that acts on the particle as it moves. We seek to determine that path for which the work done by the force is as small as possible. As seen earlier, this problem reduces to finding a function $y = y(x)$ such that the boundary conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2$$

are satisfied and for which

$$W[y] = \int_{x_1}^{x_2} \left(F_1(x, y) + F_2(x, y)\,y'\right) dx$$

is as small as possible. Here we have

$$f(x, y, y') = F_1(x, y) + F_2(x, y)\,y'$$

and placing this into the Euler-Lagrange equation, we get

$$\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) - \frac{\partial f}{\partial y} = 0$$

or

$$\frac{dF_2(x, y)}{dx} - \frac{\partial F_1(x, y)}{\partial y} - y'\frac{\partial F_2(x, y)}{\partial y} = 0.$$

But

$$\frac{dF_2(x, y)}{dx} = \frac{\partial F_2(x, y)}{\partial x} + \frac{\partial F_2(x, y)}{\partial y}\,y'$$

and so we have

$$\frac{\partial F_2(x, y)}{\partial x} + \frac{\partial F_2(x, y)}{\partial y}\,y' - \frac{\partial F_1(x, y)}{\partial y} - y'\frac{\partial F_2(x, y)}{\partial y} = 0$$

or simply

$$\frac{\partial F_2(x, y)}{\partial x} - \frac{\partial F_1(x, y)}{\partial y} = 0$$

which is an algebraic equation relating $x$ and $y$, not a differential equation. We see then that this may not have a solution which also satisfies the boundary conditions $y(x_1) = y_1$ and $y(x_2) = y_2$, since we have no arbitrary constants to adjust. Note also that if the force $\mathbf{F}$ is conservative, which means that

$$\nabla \times \mathbf{F} = \left(\frac{\partial F_2(x, y)}{\partial x} - \frac{\partial F_1(x, y)}{\partial y}\right)\hat{\mathbf{k}} = \mathbf{0}$$

then any function connecting the points $(x_1, y_1)$ and $(x_2, y_2)$ will do and the work is independent of the path. Of course, this is as expected.

Example 6.4: The Minimum Surface Area of Revolution Problem

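As seen earlier, for this problem, we seek to find that function $y = y(x)$ such that the boundary conditions

$$y(x_1) = y_1 \quad\text{and}\quad y(x_2) = y_2$$

are satisfied and for which

$$S[y] = \int_{x_1}^{x_2} 2\pi y\sqrt{1 + (y')^2}\,dx$$

is as small as possible. Thus $f(x, y, y') = 2\pi y\sqrt{1 + (y')^2}$, and using Equation (12b)

$$y'\frac{\partial f}{\partial y'} - f = \text{constant}$$

we have

$$y'\,\frac{y\,y'}{\sqrt{1 + (y')^2}} - y\sqrt{1 + (y')^2} = \frac{A}{2\pi}$$

or

$$\frac{y(y')^2 - y - y(y')^2}{\sqrt{1 + (y')^2}} = \frac{A}{2\pi}$$

or

$$\frac{-y}{\sqrt{1 + (y')^2}} = \frac{A}{2\pi} = B.$$

Solving for $y'$, we get

$$y' = \frac{\sqrt{y^2 - B^2}}{B} \quad\text{and so}\quad dx = \frac{B\,dy}{\sqrt{y^2 - B^2}}$$

which leads to

$$x + C = \int \frac{B\,dy}{\sqrt{y^2 - B^2}} = B\cosh^{-1}\left(\frac{y}{B}\right).$$

Thus we find that

$$y = B\cosh\left(\frac{x + C}{B}\right)$$

with $B$ and $C$ obtained by requiring the boundary conditions

$$y_1 = B\cosh\left(\frac{x_1 + C}{B}\right) \quad\text{and}\quad y_2 = B\cosh\left(\frac{x_2 + C}{B}\right).$$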
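The two boundary conditions generally have to be solved numerically for $B$ and $C$. As a minimal sketch, assume the symmetric case $x_1 = -1$, $x_2 = 1$, $y_1 = y_2 = 2$ (values chosen only for illustration), so that $C = 0$ by symmetry and a single equation $B\cosh(1/B) = 2$ remains. The helper `solve_catenary_scale` is hypothetical and uses plain bisection on a bracket supplied by the caller.

```python
import math


def solve_catenary_scale(x_end, y_end, lo, hi, tol=1e-12):
    """Bisection for B in B*cosh(x_end / B) = y_end on the bracket [lo, hi]."""

    def g(B):
        return B * math.cosh(x_end / B) - y_end

    if g(lo) * g(hi) > 0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Symmetric endpoints (-1, 2) and (1, 2): C = 0 and B*cosh(1/B) = 2.
B = solve_catenary_scale(1.0, 2.0, lo=1.0, hi=3.0)
print("B =", B, " check:", B * math.cosh(1.0 / B))
```

Note that $B\cosh(1/B) = 2$ has a second, smaller root below the bracket used here, so the choice of bracket selects which catenary is returned; for endpoints that are too low relative to their separation, no root exists at all.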

7. The Brachistochrone Problem: A Detailed Analysis

A bead of mass $m$ is to slide without friction in a constant gravitational field $g$. The bead starts from rest at the point $(0, h)$ in an $xy$ plane and slides along some path to the point $(a, 0)$, where $x$ is the horizontal direction and $y$ is the vertical direction. It is assumed that $a > 0$ and $h > 0$, and the problem is to determine the path that allows the bead to accomplish the transit from $(0, h)$ to $(a, 0)$ in the least possible time.

It should be noted that there is no physical ground at $y = 0$, so that motion in which $y < 0$ is allowed and, as we shall see, is possible.

We had seen that the transit time $T$ is given as

$$T[y] = \int_0^a \sqrt{\frac{1 + (y')^2}{2g(h - y)}}\;dx, \tag{15}$$

or simply

$$T[y] = \int_0^a f(x, y, y')\,dx,$$

with

$$f(x, y, y') = \sqrt{\frac{1 + (y')^2}{2g(h - y)}} \quad\text{so that}\quad \frac{\partial f}{\partial x} = 0. \tag{16}$$

Using Equation (12b) we find that

$$f - y'\frac{\partial f}{\partial y'} = c_1. \tag{17}$$

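Putting Equation (16) into Equation (17) leads to

$$\sqrt{\frac{1 + (y')^2}{2g(h - y)}} - y'\,\frac{y'}{\sqrt{1 + (y')^2}\sqrt{2g(h - y)}} = c_1$$

or

$$\frac{1}{\sqrt{2g(h - y)(1 + (y')^2)}} = c_1,$$

which reduces to

$$1 + (y')^2 = \frac{1}{2gc_1^2(h - y)} = \frac{A^2}{h - y} \tag{18a}$$

with

$$A^2 = \frac{1}{2gc_1^2} = \text{constant}. \tag{18b}$$

Solving Equation (18a) for $y'$ leads to

$$y' = \pm\sqrt{\frac{A^2}{h - y} - 1} = \pm\sqrt{\frac{A^2 - (h - y)}{h - y}},$$

or

$$\sqrt{\frac{h - y}{A^2 - (h - y)}}\;dy = \pm dx$$

which gives

$$\int \sqrt{\frac{h - y}{A^2 - (h - y)}}\;dy = \pm\int dx = \pm x + c_2.$$

If we let $u^2 = h - y$, then $2u\,du = -dy$, so that

$$-\int \sqrt{\frac{u^2}{A^2 - u^2}}\;2u\,du = \pm x + c_2,$$

or

$$\mp x - c_2 = \int \frac{2u^2}{\sqrt{A^2 - u^2}}\,du = -u\sqrt{A^2 - u^2} + A^2\sin^{-1}\left(\frac{u}{A}\right)$$

and so (after putting in the fact that $u^2 = h - y$), we have

$$\mp x - c_2 = -\sqrt{h - y}\,\sqrt{A^2 - (h - y)} + A^2\sin^{-1}\left(\frac{\sqrt{h - y}}{A}\right)$$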

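or

$$\mp x - c_2 = -\sqrt{(h - y)(A^2 - h + y)} + A^2\sin^{-1}\left(\frac{\sqrt{h - y}}{A}\right).$$

Putting in the condition $y = h$ when $x = 0$, we get

$$\mp 0 - c_2 = -\sqrt{(h - h)(A^2 - h + h)} + A^2\sin^{-1}\left(\frac{\sqrt{h - h}}{A}\right)$$

which leads to $c_2 = 0$, and so

$$\mp x = -\sqrt{(h - y)(A^2 - h + y)} + A^2\sin^{-1}\left(\frac{\sqrt{h - y}}{A}\right). \tag{19}$$

If we now let

$$\frac{\sqrt{h - y}}{A} = \sin(\theta) \quad\text{or}\quad y = h - A^2\sin^2(\theta),$$

then

$$\begin{aligned}
\mp x &= -\sqrt{A^2\sin^2(\theta)\left(A^2 - A^2\sin^2(\theta)\right)} + A^2\sin^{-1}(\sin(\theta)) \\
&= -\sqrt{A^2\sin^2(\theta)\,A^2\cos^2(\theta)} + A^2\theta \\
&= -A^2\sin(\theta)\cos(\theta) + A^2\theta
\end{aligned}$$

or

$$x = \pm A^2\left(\theta - \frac{1}{2}\sin(2\theta)\right) = \pm\frac{A^2}{2}\left(2\theta - \sin(2\theta)\right).$$

Therefore we have

$$x = \pm\frac{A^2}{2}\left(2\theta - \sin(2\theta)\right)$$

and

$$y = h - A^2\sin^2(\theta) = h - A^2\left(\frac{1 - \cos(2\theta)}{2}\right) = h - \frac{A^2}{2}\left(1 - \cos(2\theta)\right).$$

If we let $\varphi = 2\theta$, then we have for $x$ and $y$,

$$x = \pm\frac{A^2}{2}\left(\varphi - \sin(\varphi)\right) \tag{20a}$$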

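and

$$y = h - \frac{A^2}{2}\left(1 - \cos(\varphi)\right). \tag{20b}$$

Now by the construction of the problem, we see that $x \ge 0$ and so the positive sign is to be used in Equation (20a). If we call the positive constant $A^2/2$ by the name $B$, we now have

$$x = B(\varphi - \sin(\varphi)) \tag{20c}$$

and

$$y = h - B(1 - \cos(\varphi)) \tag{20d}$$

for $0 \le \varphi \le \beta$, where $\varphi = \beta$ when the bead is at the point $x = a$, $y = 0$. This says that $\beta$ is determined using the equations

$$a = B(\beta - \sin(\beta)) \quad\text{and}\quad 0 = h - B(1 - \cos(\beta))$$

so that

$$\frac{B(\beta - \sin(\beta))}{B(1 - \cos(\beta))} = \frac{a}{h}$$

resulting in

$$\frac{1 - \cos(\beta)}{\beta - \sin(\beta)} = \frac{h}{a} \tag{21a}$$

which gives the value of $\beta$. Once $\beta$ is known then either

$$B = \frac{h}{1 - \cos(\beta)} \quad\text{or}\quad B = \frac{a}{\beta - \sin(\beta)} \tag{21b}$$

gives the value of $B$. Thus the solution to the Brachistochrone problem is given by

$$x = a\left(\frac{\varphi - \sin(\varphi)}{\beta - \sin(\beta)}\right) \tag{22a}$$

and

$$y = h - h\left(\frac{1 - \cos(\varphi)}{1 - \cos(\beta)}\right) = h\left(\frac{\cos(\varphi) - \cos(\beta)}{1 - \cos(\beta)}\right) \tag{22b}$$

for $0 \le \varphi \le \beta$, where $\beta$ is the solution to

$$F_1(\beta) \equiv \frac{1 - \cos(\beta)}{\beta - \sin(\beta)} = \frac{h}{a}. \tag{22c}$$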

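A plot of $F_1(\beta)$ versus $\beta$ is shown in the figure below.

[Figure: plot of $F_1(\beta)$ versus $\beta$ for $0 < \beta < 20$]

Note that

$$\lim_{\beta \to 0^+} F_1(\beta) = \lim_{\beta \to 0^+}\left(\frac{1 - \cos(\beta)}{\beta - \sin(\beta)}\right) = \lim_{\beta \to 0^+}\left(\frac{\beta^2/2}{\beta^3/6}\right) = \lim_{\beta \to 0^+}\left(\frac{3}{\beta}\right) \to +\infty.$$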

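We may now solve for the minimum transit time in terms of $\beta$ by first placing Equation (18a) into Equation (15) and getting

$$T_{\min} = \int_0^a \sqrt{\frac{1 + (y')^2}{2g(h - y)}}\;dx = \int_0^a \sqrt{\frac{A^2/(h - y)}{2g(h - y)}}\;dx = \frac{A}{\sqrt{2g}}\int_0^a \frac{dx}{h - y}.$$

Then putting in Equations (22a,b) we get

$$T_{\min} = \frac{A}{\sqrt{2g}}\int_0^a \frac{dx}{h - y} = \frac{A}{\sqrt{2g}}\int_0^\beta \frac{a\left(\dfrac{1 - \cos(\varphi)}{\beta - \sin(\beta)}\right) d\varphi}{h\left(\dfrac{1 - \cos(\varphi)}{1 - \cos(\beta)}\right)} = \frac{Aa}{h\sqrt{2g}}\left(\frac{1 - \cos(\beta)}{\beta - \sin(\beta)}\right)\int_0^\beta d\varphi = \frac{Aa\beta}{h\sqrt{2g}}\left(\frac{1 - \cos(\beta)}{\beta - \sin(\beta)}\right).$$

But, from $B = A^2/2$ and Equation (21b),

$$A = \sqrt{2B} = \sqrt{\frac{2h}{1 - \cos(\beta)}}$$

and so

$$T_{\min} = \frac{a\beta}{h\sqrt{2g}}\left(\frac{1 - \cos(\beta)}{\beta - \sin(\beta)}\right)\sqrt{\frac{2h}{1 - \cos(\beta)}} = a\left(\frac{\beta}{\beta - \sin(\beta)}\right)\sqrt{\frac{1 - \cos(\beta)}{gh}}$$

or

$$T_{\min} = \left(\frac{a\beta}{\beta - \sin(\beta)}\right)\sqrt{\frac{1 - \cos(\beta)}{gh}}.$$

But from Equation (21a) we have

$$\frac{1 - \cos(\beta)}{\beta - \sin(\beta)} = \frac{h}{a} \quad\text{and so}\quad \frac{1 - \cos(\beta)}{h} = \frac{\beta - \sin(\beta)}{a}$$

and so

$$T_{\min} = \left(\frac{a\beta}{\beta - \sin(\beta)}\right)\sqrt{\frac{\beta - \sin(\beta)}{ag}}$$

or

$$T_{\min} = \sqrt{\frac{a\beta^2}{g(\beta - \sin(\beta))}} = F_2(\beta)\sqrt{\frac{a}{g}} \tag{23}$$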

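with

$$F_2(\beta) = \frac{\beta}{\sqrt{\beta - \sin(\beta)}}.$$

A plot of $F_2(\beta)$ versus $\beta$ is shown below.

[Figure: plot of $F_2(\beta)$ versus $\beta$ for $0 < \beta < 20$, along with the curve $\sqrt{\beta}$]

Note that

$$\lim_{\beta \to 0^+} F_2(\beta) = \lim_{\beta \to 0^+}\left(\frac{\beta}{\sqrt{\beta - \sin(\beta)}}\right) = \lim_{\beta \to 0^+}\left(\frac{\beta}{\sqrt{\beta^3/6}}\right) = \lim_{\beta \to 0^+}\left(\sqrt{\frac{6}{\beta}}\right) \to +\infty.$$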

7.1 Some Examples of The Brachistochrone Problem

Let us now consider some examples or plots for the particle’s path. Suppose that $h = 2$ m and $a = 1$ m. Then Equation (22c) becomes

$$\frac{1 - \cos\beta}{\beta - \sin\beta} = 2 \quad\text{which gives}\quad \beta \simeq 1.401379455902$$

as the smallest solution. Note that we choose the smallest solution since it will lead to the least time, as seen by the plot of $F_2(\beta)$ versus $\beta$. Using Equations (22a,b) we get

$$x = \frac{\varphi - \sin\varphi}{\beta - \sin\beta} \quad\text{and}\quad y = 2\left(\frac{\cos\varphi - \cos\beta}{1 - \cos\beta}\right)$$

and a plot of $y$ versus $x$ yields the figure below.

[Figure: the particle’s path using $h = 2$ m, $a = 1$ m]

Using Equation (23), we get the minimum time of $T_{\min} \simeq 0.6943$ seconds. The straight-line time is

$$T_{\text{line}} = \sqrt{\frac{2(a^2 + h^2)}{gh}} = \sqrt{\frac{2(1^2 + 2^2)}{(9.8)(2)}} = 0.7143 \text{ seconds}$$

which is larger than 0.6943 seconds.
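The numbers in this example can be reproduced with a short script. The sketch below solves Equation (22c) for the smallest root by bisection on $0 < \beta < 2\pi$, where $F_1$ decreases monotonically, and then evaluates Equation (23) and the straight-line time. The function names, the bracket, and the tolerance are assumptions made for illustration; $g = 9.8$ m/s² matches the value used in the text.

```python
import math


def F1(beta):
    return (1.0 - math.cos(beta)) / (beta - math.sin(beta))


def solve_beta(h, a, lo=1e-3, hi=2.0 * math.pi, tol=1e-13):
    """Smallest root of F1(beta) = h/a, by bisection.

    F1 falls from large values near 0 to 0 at 2*pi, so the root is unique
    on this bracket (this sketch assumes h/a < F1(lo), roughly 3000).
    """
    target = h / a
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F1(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def min_time(a, beta, g=9.8):
    # Equation (23).
    return math.sqrt(a * beta**2 / (g * (beta - math.sin(beta))))


def line_time(a, h, g=9.8):
    return math.sqrt(2.0 * (a * a + h * h) / (g * h))


h, a = 2.0, 1.0
beta = solve_beta(h, a)
t_min = min_time(a, beta)
t_line = line_time(a, h)
print(f"beta = {beta:.12f}, T_min = {t_min:.4f} s, T_line = {t_line:.4f} s")
```

Calling `solve_beta` with other values of $h$ and $a$ gives the corresponding smallest root $\beta$ for any endpoint with $h/a$ inside the stated range.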

Suppose that $h = 1$ m and $a = 2$ m. Then Equation (22c) becomes

$$\frac{1 - \cos\beta}{\beta - \sin\beta} = \frac{1}{2} \quad\text{which gives}\quad \beta \simeq 3.508368768524$$

as the smallest solution. Using Equations (22a,b) we get

$$x = 2\left(\frac{\varphi - \sin\varphi}{\beta - \sin\beta}\right) \quad\text{and}\quad y = \frac{\cos\varphi - \cos\beta}{1 - \cos\beta}$$

and so a plot of $y$ versus $x$ yields the figure below.

[Figure: the particle’s path using $h = 1$ m, $a = 2$ m]

Using Equation (23), we get the minimum time of $T_{\min} \simeq 0.8060$ seconds. Note here that the particle’s path dips below the horizontal level at $y = 0$. The straight-line time is

$$T_{\text{line}} = \sqrt{\frac{2(a^2 + h^2)}{gh}} = \sqrt{\frac{2(2^2 + 1^2)}{(9.8)(1)}} = 1.0102 \text{ seconds}$$

which is larger than 0.8060 seconds.

Suppose that $h = a$. Then Equation (22c) becomes

$$\frac{1 - \cos\beta}{\beta - \sin\beta} = 1 \quad\text{which gives}\quad \beta \simeq 2.412011143914$$

as the smallest solution. Using Equations (22a,b) we get

$$x = h\left(\frac{\varphi - \sin\varphi}{\beta - \sin\beta}\right) \quad\text{and}\quad y = h\left(\frac{\cos\varphi - \cos\beta}{1 - \cos\beta}\right)$$

and so a plot of $y$ versus $x$ (in units of $h$) yields the figure below.

[Figure: the particle’s path using $h = a$]

Using Equation (23), we get the minimum time of $T_{\min} \simeq 0.5832$ seconds when $h = 1$ meter. The straight-line time is

$$T_{\text{line}} = \sqrt{\frac{2(a^2 + h^2)}{gh}} = \sqrt{\frac{2(1^2 + 1^2)}{(9.8)(1)}} = 0.6389 \text{ seconds}$$

which is larger than 0.5832 seconds.

Suppose that $h = 1$ m and $a = 5$ m. Then Equation (22c) becomes

$$\frac{1 - \cos\beta}{\beta - \sin\beta} = \frac{1}{5} \quad\text{which gives}\quad \beta \simeq 4.594585712433$$

as the smallest solution. Using Equations (22a,b) we get

$$x = 5\left(\frac{\varphi - \sin\varphi}{\beta - \sin\beta}\right) \quad\text{and}\quad y = \frac{\cos\varphi - \cos\beta}{1 - \cos\beta}$$

and so a plot of $y$ versus $x$ yields the figure below.

[Figure: the particle’s path using $h = 1$ m and $a = 5$ m]

Using Equation (23), we get the minimum time of $T_{\min} \simeq 1.3884$ seconds. Note that once again, the particle’s path dips well below the horizontal level at $y = 0$. The straight-line time is

$$T_{\text{line}} = \sqrt{\frac{2(a^2 + h^2)}{gh}} = \sqrt{\frac{2(5^2 + 1^2)}{(9.8)(1)}} = 2.3035 \text{ seconds}$$

which is larger than 1.3884 seconds.

Suppose that $h = 1$ m and $a = 10$ m. Then Equation (22c) becomes

$$\frac{1 - \cos\beta}{\beta - \sin\beta} = \frac{1}{10} \quad\text{which gives}\quad \beta \simeq 5.119770812559$$

as the smallest solution. Using Equations (22a,b) we get

$$x = 10\left(\frac{\varphi - \sin\varphi}{\beta - \sin\beta}\right) \quad\text{and}\quad y = \frac{\cos\varphi - \cos\beta}{1 - \cos\beta}$$

and so a plot of $y$ versus $x$ yields the figure below.

[Figure: the particle’s path using $h = 1$ m and $a = 10$ m]

Using Equation (23), we get the minimum time of $T_{\min} \simeq 2.1047$ seconds, and once again the particle’s path dips well below the horizontal level at $y = 0$. The straight-line time is

$$T_{\text{line}} = \sqrt{\frac{2(a^2 + h^2)}{gh}} = \sqrt{\frac{2(10^2 + 1^2)}{(9.8)(1)}} = 4.5401 \text{ seconds}$$

which is larger than 2.1047 seconds.

Note that another solution to Equation (22c),

$$\frac{1 - \cos\beta}{\beta - \sin\beta} = \frac{1}{10} \quad\text{is}\quad \beta \simeq 7.5032475203025.$$

Using Equations (22a,b) we get

$$x = 10\left(\frac{\varphi - \sin\varphi}{\beta - \sin\beta}\right) \quad\text{and}\quad y = \frac{\cos\varphi - \cos\beta}{1 - \cos\beta}$$

and so a plot of $y$ versus $x$ yields the figure below.

[Figure: the particle’s path using the larger value of $\beta$]

Note how the particle goes back up to its original height of 1 meter before coming back down. It is clear that the larger value of $\beta$ in the solution of Equation (22c) should not be used. Note also that using Equation (23), we get the time of this motion to be $T \simeq 2.9583$ seconds, which is larger than the value of 2.1047 seconds.

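It should be clear that the particle's path will dip below the horizontal level at y = 0 when dy/dϕ > 0 somewhere along the path, which occurs when

$$\frac{dy}{d\varphi} = \frac{d}{d\varphi}\left(h - h\left(\frac{1 - \cos\varphi}{1 - \cos\beta}\right)\right) = -h\left(\frac{\sin\varphi}{1 - \cos\beta}\right) > 0$$

or when sin ϕ < 0, which occurs when ϕ > π. Since 0 ≤ ϕ ≤ β, this requires β > π. Therefore, going back to the plot of

$$F_1(\beta) = \frac{1 - \cos\beta}{\beta - \sin\beta}$$

and evaluating it at β = π, we get

$$F_1(\pi) = \frac{1 - \cos\pi}{\pi - \sin\pi} = \frac{2}{\pi}.$$

Since $F_1(\beta)$ decreases as β increases from 0 to 2π, the smallest solution of Equation (22c) satisfies β < π when

$$\frac{h - b}{a} > \frac{2}{\pi}$$

and then the particle's path will not dip below the horizontal level at y = 0, while when

$$\frac{h - b}{a} < \frac{2}{\pi}$$

we have β > π and the path does dip below the horizontal level at y = 0 (as in the two examples above, where h/a = 1/5 and h/a = 1/10). In fact when

$$\frac{h - b}{a} = \frac{2}{\pi}$$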

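the path should be tangent to the horizontal level at y = 0. The following example shows this to be true. Suppose that h = 2 meters, a = π meters, and b = 0. Then Equation (22c) becomes

$$\frac{1 - \cos\beta}{\beta - \sin\beta} = \frac{2}{\pi} \quad\text{which gives}\quad \beta = \pi$$

as the smallest solution. Using Equations (22a,b) we get

$$x = \pi\left(\frac{\varphi - \sin\varphi}{\beta - \sin\beta}\right) = \pi\left(\frac{\varphi - \sin\varphi}{\pi - \sin\pi}\right) = \varphi - \sin\varphi$$

and

$$y = 2 - 2\left(\frac{1 - \cos\varphi}{1 - \cos\beta}\right) = 2 - 2\left(\frac{1 - \cos\varphi}{1 - \cos\pi}\right) = 1 + \cos\varphi$$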

and so a plot of y versus x yields the figure below.

[Figure: The Particle's Path Using h = 2 m and a = π m. Horizontal axis: x from 0 to 3; vertical axis: y from 0 to 2.]

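Using Equation (23), we get the minimum time

$$T_{\min} = \sqrt{\frac{a\pi^2}{g(\pi - \sin\pi)}} = \sqrt{\frac{a\pi}{g}} \simeq 1.003 \text{ seconds.}$$

The straight-line time is

$$T_{\text{line}} = \sqrt{\frac{2(a^2 + h^2)}{gh}} = \sqrt{\frac{2(\pi^2 + 2^2)}{(9.8)(2)}} = 1.1896 \text{ seconds}$$

which is larger than 1.003 seconds.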
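The numbers quoted in these examples can be reproduced with a short script. The sketch below is not part of the original notes. It assumes that Equation (22c) has the form (1 − cos β)/(β − sin β) = (h − b)/a and that Equation (23) reads T_min = sqrt(aβ²/(g(β − sin β))), both inferred from the way they are used above, and it takes g = 9.8 m/s². The function names are our own.

```python
import math

G = 9.8  # gravitational acceleration in m/s^2, the value used in the examples


def f1(beta):
    """Left-hand side of Equation (22c): (1 - cos(beta)) / (beta - sin(beta))."""
    return (1.0 - math.cos(beta)) / (beta - math.sin(beta))


def smallest_beta(h, a, b=0.0, tol=1e-12):
    """Bisection for the root of f1(beta) = (h - b)/a on the interval (0, 2*pi)."""
    target = (h - b) / a
    lo, hi = 1e-3, 2.0 * math.pi  # f1(lo) is large and f1(2*pi) = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f1(mid) > target:  # f1 is decreasing, so the root lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def t_min(a, beta, g=G):
    """Assumed form of Equation (23) for the minimum travel time."""
    return math.sqrt(a * beta**2 / (g * (beta - math.sin(beta))))


def t_line(a, h, g=G):
    """Travel time along the straight line from (0, h) to (a, 0)."""
    return math.sqrt(2.0 * (a**2 + h**2) / (g * h))


for h, a in [(1.0, 5.0), (1.0, 10.0), (2.0, math.pi)]:
    beta = smallest_beta(h, a)
    print(f"h={h}, a={a:.4f}: beta={beta:.6f}, "
          f"T_min={t_min(a, beta):.4f} s, T_line={t_line(a, h):.4f} s")
```

Bisection is used here only because it needs no external library; any root finder restricted to 0 < β < 2π should give the same smallest solution.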

8. Functional Dependent On Several Dependent Variables

Suppose a functional is dependent on several dependent variables such as

$$J[y_1, y_2, y_3, \ldots, y_n] = \int_{x_1}^{x_2} f(x, y_1, y_2, y_3, \ldots, y_n, y_1', y_2', y_3', \ldots, y_n')\,dx$$

which we abbreviate by

$$J[y_k] = \int_{x_1}^{x_2} f(x, y_k, y_k')\,dx. \tag{24}$$

If

$$y_1 = y_1(x), \quad y_2 = y_2(x), \quad \ldots, \quad y_n = y_n(x)$$

are the functions which satisfy the boundary conditions

$$y_k(x_1) = y_{k1} \quad\text{and}\quad y_k(x_2) = y_{k2}$$

for k = 1, 2, 3, ..., n, and which maximize or minimize J, then we assume

$$y_k(x, \alpha) = y_k(x) + \alpha\eta_k(x)$$

are small deviations from these, with

$$\eta_k(x_1) = 0 \quad\text{and}\quad \eta_k(x_2) = 0$$

for k = 1, 2, 3, ..., n. This leads to

$$J(\alpha) = \int_{x_1}^{x_2} f(x, y_k(x, \alpha), y_k'(x, \alpha))\,dx$$

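and we have

$$\begin{aligned}
\frac{\partial J(\alpha)}{\partial\alpha} &= \frac{\partial}{\partial\alpha}\int_{x_1}^{x_2} f(x, y_k(x,\alpha), y_k'(x,\alpha))\,dx \\
&= \int_{x_1}^{x_2} \frac{\partial}{\partial\alpha} f(x, y_k(x,\alpha), y_k'(x,\alpha))\,dx \\
&= \int_{x_1}^{x_2} \left\{ \frac{\partial f}{\partial x}\frac{\partial x}{\partial\alpha} + \sum_{k=1}^{n}\left( \frac{\partial f}{\partial y_k(x,\alpha)}\frac{\partial y_k(x,\alpha)}{\partial\alpha} + \frac{\partial f}{\partial y_k'(x,\alpha)}\frac{\partial y_k'(x,\alpha)}{\partial\alpha} \right) \right\} dx \\
&= \int_{x_1}^{x_2} \sum_{k=1}^{n}\left( \frac{\partial f}{\partial y_k(x,\alpha)}\eta_k(x) + \frac{\partial f}{\partial y_k'(x,\alpha)}\eta_k'(x) \right) dx \\
&= \sum_{k=1}^{n}\int_{x_1}^{x_2}\left( \frac{\partial f}{\partial y_k(x,\alpha)}\eta_k(x) + \frac{\partial f}{\partial y_k'(x,\alpha)}\eta_k'(x) \right) dx,
\end{aligned}$$

where we used $\partial x/\partial\alpha = 0$, $\partial y_k/\partial\alpha = \eta_k$ and $\partial y_k'/\partial\alpha = \eta_k'$.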

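Using integration by parts on the second integral, we have

$$\begin{aligned}
\int_{x_1}^{x_2} \frac{\partial f}{\partial y_k'(x,\alpha)}\eta_k'(x)\,dx &= \left.\frac{\partial f}{\partial y_k'(x,\alpha)}\eta_k(x)\right|_{x_1}^{x_2} - \int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial y_k'(x,\alpha)}\right)\eta_k(x)\,dx \\
&= -\int_{x_1}^{x_2} \frac{d}{dx}\left(\frac{\partial f}{\partial y_k'(x,\alpha)}\right)\eta_k(x)\,dx
\end{aligned}$$

since $\eta_k(x_1) = 0$ and $\eta_k(x_2) = 0$ for k = 1, 2, 3, ..., n. Thus we have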

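$$\frac{\partial J(\alpha)}{\partial\alpha} = \sum_{k=1}^{n}\int_{x_1}^{x_2}\left( \frac{\partial f}{\partial y_k(x,\alpha)} - \frac{d}{dx}\left(\frac{\partial f}{\partial y_k'(x,\alpha)}\right) \right)\eta_k(x)\,dx.$$

Then

$$\left.\frac{\partial J(\alpha)}{\partial\alpha}\right|_{\alpha=0} = \sum_{k=1}^{n}\int_{x_1}^{x_2}\left( \frac{\partial f}{\partial y_k(x)} - \frac{d}{dx}\left(\frac{\partial f}{\partial y_k'(x)}\right) \right)\eta_k(x)\,dx = 0$$

for all choices of $\eta_k(x)$, leads only to

$$\frac{\partial f}{\partial y_k(x)} - \frac{d}{dx}\left(\frac{\partial f}{\partial y_k'(x)}\right) = 0$$

or

$$\frac{d}{dx}\left(\frac{\partial f}{\partial y_k'}\right) - \frac{\partial f}{\partial y_k} = 0 \tag{25}$$

for k = 1, 2, 3, ..., n. These are the Euler-Lagrange Equations.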
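As a quick illustration of Equation (25), consider a hypothetical integrand with n = 2 (chosen here only for illustration; it does not appear in the original notes):

$$f(x, y_1, y_2, y_1', y_2') = (y_1')^2 + (y_2')^2 + 2y_1y_2.$$

Here $\partial f/\partial y_1' = 2y_1'$, $\partial f/\partial y_1 = 2y_2$, $\partial f/\partial y_2' = 2y_2'$ and $\partial f/\partial y_2 = 2y_1$, so the two Euler-Lagrange equations are

$$\frac{d}{dx}(2y_1') - 2y_2 = 0 \quad\text{and}\quad \frac{d}{dx}(2y_2') - 2y_1 = 0,$$

that is, the coupled system $y_1'' = y_2$ and $y_2'' = y_1$. Each dependent variable contributes its own equation, and the coupling enters only through the partial derivatives of f.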

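It is important to point out that although the equations

$$\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) - \frac{\partial f}{\partial y} = 0$$

and

$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'} - f\right) + \frac{\partial f}{\partial x} = 0$$

are equivalent when f is a function of only one dependent variable, y, i.e., f = f(x, y, y'), IT IS NOT TRUE that

$$\frac{d}{dx}\left(\frac{\partial f}{\partial y_k'}\right) - \frac{\partial f}{\partial y_k} = 0$$

and

$$\frac{d}{dx}\left(y_k'\frac{\partial f}{\partial y_k'} - f\right) + \frac{\partial f}{\partial x} = 0$$

are equivalent when f is a function of more than one dependent variable, i.e., $f = f(x, y_k, y_k')$.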

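What we can do, however, is derive an additional equation that follows from the n equations given in Equation (25). To do this we compute

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \sum_{k=1}^{n}\left( \frac{\partial f}{\partial y_k}\frac{dy_k}{dx} + \frac{\partial f}{\partial y_k'}\frac{dy_k'}{dx} \right).$$

Replacing each of $\dfrac{\partial f}{\partial y_k}$ by $\dfrac{d}{dx}\left(\dfrac{\partial f}{\partial y_k'}\right)$ via Equation (25), we then have

$$\begin{aligned}
\frac{df}{dx} &= \frac{\partial f}{\partial x} + \sum_{k=1}^{n}\left( \frac{d}{dx}\left(\frac{\partial f}{\partial y_k'}\right)\frac{dy_k}{dx} + \frac{\partial f}{\partial y_k'}\frac{dy_k'}{dx} \right) \\
&= \frac{\partial f}{\partial x} + \sum_{k=1}^{n}\frac{d}{dx}\left( y_k'\frac{\partial f}{\partial y_k'} \right)
\end{aligned}$$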

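and so we find that

$$\frac{d}{dx}\left( \sum_{k=1}^{n} y_k'\frac{\partial f}{\partial y_k'} - f \right) + \frac{\partial f}{\partial x} = 0. \tag{26}$$

If ∂f/∂x = 0, then we may say that

$$\sum_{k=1}^{n} y_k'\frac{\partial f}{\partial y_k'} - f = \text{constant}. \tag{27}$$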
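Equation (27) can be checked numerically on a small example. The sketch below is an illustration under our own assumptions, not part of the original notes. It uses the hypothetical integrand f = (y1')² + (y2')² + 2·y1·y2, which has no explicit x dependence and whose Euler-Lagrange equations (25) are y1'' = y2 and y2'' = y1. One exact solution of that system is y1 = cosh x + sin x, y2 = cosh x − sin x, and the script evaluates the left-hand side of (27) along it.

```python
import math


def first_integral(x):
    """Left-hand side of Equation (27) for f = y1'^2 + y2'^2 + 2*y1*y2,
    evaluated on the exact solution y1 = cosh x + sin x, y2 = cosh x - sin x."""
    y1 = math.cosh(x) + math.sin(x)
    y2 = math.cosh(x) - math.sin(x)
    dy1 = math.sinh(x) + math.cos(x)   # y1'
    dy2 = math.sinh(x) - math.cos(x)   # y2'
    f = dy1**2 + dy2**2 + 2.0 * y1 * y2
    # sum over k of y_k' * (df/dy_k') minus f, with df/dy_k' = 2*y_k'
    return dy1 * (2.0 * dy1) + dy2 * (2.0 * dy2) - f


# Sample the quantity at x = 0, 0.25, ..., 2; it should not change with x.
values = [first_integral(0.25 * i) for i in range(9)]
print(values)
```

For this particular solution the constant happens to be zero; a different solution of the same system would in general give a different constant, but still one that does not vary with x.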

9. Extra Constraints and Lagrange Multipliers In Ordinary Calculus

In this section, we investigate the Lagrange Multiplier method for solving the extrema problems from ordinary calculus involving a function of n variables with m < n equality constraints. That is, we want to

$$\begin{array}{lll}
\text{max or min} & z = f(x_1, x_2, x_3, \ldots, x_n) & \text{(Objective Function)} \\
\text{subject to} & g_1(x_1, x_2, x_3, \ldots, x_n) = b_1 & \text{(Constraint \#1)} \\
 & g_2(x_1, x_2, x_3, \ldots, x_n) = b_2 & \text{(Constraint \#2)} \\
 & g_3(x_1, x_2, x_3, \ldots, x_n) = b_3 & \text{(Constraint \#3)} \\
 & \qquad\vdots & \\
 & g_m(x_1, x_2, x_3, \ldots, x_n) = b_m & \text{(Constraint \#}m\text{)}
\end{array}$$

where all functions f and the $g_i$'s are known and the $b_i$'s are known constants. Note that in general, we have m < n. To introduce the Method of Lagrange Multipliers, we consider the following example, which we solve two different ways.

Example 9.1

Consider the following problem.

$$\begin{array}{lll}
\text{minimize} & z = f(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 & \text{(Objective Function)} \\
\text{subject to} & x_1 + 2x_2 + x_3 = 30 & \text{(Constraint \#1)} \\
 & 3x_1 + 7x_2 + x_3 = 50 & \text{(Constraint \#2)}
\end{array}$$

One method of solution requires that we solve the two constraint equations for (say) $x_1$ and $x_2$ in terms of $x_3$, and then place this into the objective function, thereby yielding a function of a single variable. This leads to

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 110 - 5x_3 \\ -40 + 2x_3 \end{bmatrix}$$

and

$$z = (110 - 5x_3)^2 + (-40 + 2x_3)^2 + x_3^2$$

which reduces to

$$z = 13700 - 1260x_3 + 30x_3^2 = f(x_3).$$

Then

$$f'(x_3) = -1260 + 60x_3 = 0 \quad\text{yields}\quad x_3 = 21$$

and, since $f''(x_3) = 60 > 0$ for all $x_3$, we know that this is a local minimum. In fact, it is a global minimum as well. Then

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 110 - 5x_3 \\ -40 + 2x_3 \end{bmatrix} = \begin{bmatrix} 110 - 5(21) \\ -40 + 2(21) \end{bmatrix} = \begin{bmatrix} 5 \\ 2 \end{bmatrix}$$

so that

$$(x_1, x_2, x_3) = (5, 2, 21)$$

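is the point yielding $z_{\min} = 5^2 + 2^2 + 21^2 = 470$. Note that the above critical point cannot be obtained by simply solving

$$\frac{\partial f(x_1, x_2, x_3)}{\partial x_1} = \frac{\partial f(x_1, x_2, x_3)}{\partial x_2} = \frac{\partial f(x_1, x_2, x_3)}{\partial x_3} = 0.$$

This is because not all of $x_1$, $x_2$ and $x_3$ are independent, since they are related via the equality constraints, and hence

$$df = \frac{\partial f(x_1, x_2, x_3)}{\partial x_1}dx_1 + \frac{\partial f(x_1, x_2, x_3)}{\partial x_2}dx_2 + \frac{\partial f(x_1, x_2, x_3)}{\partial x_3}dx_3 = 0$$

does not imply that

$$\frac{\partial f(x_1, x_2, x_3)}{\partial x_1} = \frac{\partial f(x_1, x_2, x_3)}{\partial x_2} = \frac{\partial f(x_1, x_2, x_3)}{\partial x_3} = 0.$$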

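The method of Lagrange Multipliers introduces for each constraint a multiplier and defines a new objective function

$$L = x_1^2 + x_2^2 + x_3^2 + \lambda_1(30 - (x_1 + 2x_2 + x_3)) + \lambda_2(50 - (3x_1 + 7x_2 + x_3)).$$

In this way $x_1$, $x_2$, $x_3$ along with $\lambda_1$ and $\lambda_2$ can be treated as independent and the critical point(s) can be obtained by solving the set of equations

$$\frac{\partial L}{\partial x_1} = \frac{\partial L}{\partial x_2} = \frac{\partial L}{\partial x_3} = \frac{\partial L}{\partial\lambda_1} = \frac{\partial L}{\partial\lambda_2} = 0.$$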

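To solve this unconstrained problem, we thus write

$$\begin{aligned}
\frac{\partial L}{\partial x_1} &= 2x_1 - \lambda_1 - 3\lambda_2 = 0 \\
\frac{\partial L}{\partial x_2} &= 2x_2 - 2\lambda_1 - 7\lambda_2 = 0 \\
\frac{\partial L}{\partial x_3} &= 2x_3 - \lambda_1 - \lambda_2 = 0 \\
\frac{\partial L}{\partial\lambda_1} &= 30 - (x_1 + 2x_2 + x_3) = 0 \\
\frac{\partial L}{\partial\lambda_2} &= 50 - (3x_1 + 7x_2 + x_3) = 0
\end{aligned}$$

with

$$L = x_1^2 + x_2^2 + x_3^2 + \lambda_1(30 - (x_1 + 2x_2 + x_3)) + \lambda_2(50 - (3x_1 + 7x_2 + x_3)).$$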

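This leads to the following system of 5 equations and 5 unknowns

$$\begin{aligned}
2x_1 &= \lambda_1 + 3\lambda_2 \\
2x_2 &= 2\lambda_1 + 7\lambda_2 \\
2x_3 &= \lambda_1 + \lambda_2 \\
x_1 + 2x_2 + x_3 &= 30 \\
3x_1 + 7x_2 + x_3 &= 50.
\end{aligned}$$

To solve this system of 5 equations and 5 unknowns, we use the first three equations to solve for $x_1$, $x_2$ and $x_3$ in terms of $\lambda_1$ and $\lambda_2$. This leads to

$$x_1 = \frac{\lambda_1 + 3\lambda_2}{2}, \quad x_2 = \frac{2\lambda_1 + 7\lambda_2}{2} \quad\text{and}\quad x_3 = \frac{\lambda_1 + \lambda_2}{2}.$$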

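These are then placed into the last two equations to yield two equations for the two unknowns $\lambda_1$ and $\lambda_2$. This leads to

$$30 - \left(\frac{\lambda_1 + 3\lambda_2}{2}\right) - 2\left(\frac{2\lambda_1 + 7\lambda_2}{2}\right) - \left(\frac{\lambda_1 + \lambda_2}{2}\right) = 0$$

and

$$50 - 3\left(\frac{\lambda_1 + 3\lambda_2}{2}\right) - 7\left(\frac{2\lambda_1 + 7\lambda_2}{2}\right) - \left(\frac{\lambda_1 + \lambda_2}{2}\right) = 0$$

or simply

$$\lambda_1 + 3\lambda_2 = 10$$

and

$$18\lambda_1 + 59\lambda_2 = 100.$$

The solution to these then gives $\lambda_1 = 58$ and $\lambda_2 = -16$. Putting these into the equations for $x_1$, $x_2$ and $x_3$ leads to $(x_1, x_2, x_3) = (5, 2, 21)$, which agrees with our earlier results.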
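The five stationarity equations of Example 9.1 are linear, so they can also be solved in one step. The sketch below is our own illustration, not the method used in the notes. It writes the system in matrix form, with the unknowns ordered as (x1, x2, x3, λ1, λ2) and the constraint values 30 and 50 taken from the problem statement, and solves it by Gauss-Jordan elimination in exact rational arithmetic. The helper `solve_linear` is a hypothetical name introduced here.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination using exact fractions."""
    n = len(A)
    # Build the augmented matrix [A | b].
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot equals 1.
        M[col] = [v / M[col][col] for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]


# Rows: dL/dx1 = 0, dL/dx2 = 0, dL/dx3 = 0, constraint #1, constraint #2.
A = [
    [2, 0, 0, -1, -3],
    [0, 2, 0, -2, -7],
    [0, 0, 2, -1, -1],
    [1, 2, 1, 0, 0],
    [3, 7, 1, 0, 0],
]
b = [0, 0, 0, 30, 50]

x1, x2, x3, lam1, lam2 = solve_linear(A, b)
print(x1, x2, x3, lam1, lam2)  # prints: 5 2 21 58 -16
```

Exact fractions are used so that the result can be compared with the hand calculation without rounding.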

Consider now the extrema problem involving n decision variables and m equality constraints (with m < n),

$$\begin{array}{lll}
\text{max or min} & z = f(x_1, x_2, x_3, \ldots, x_n) & \text{(Objective Function)} \\
\text{subject to} & g_1(x_1, x_2, x_3, \ldots, x_n) = b_1 & \text{(Constraint \#1)} \\
 & g_2(x_1, x_2, x_3, \ldots, x_n) = b_2 & \text{(Constraint \#2)} \\
 & g_3(x_1, x_2, x_3, \ldots, x_n) = b_3 & \text{(Constraint \#3)} \\
 & \qquad\vdots & \\
 & g_m(x_1, x_2, x_3, \ldots, x_n) = b_m & \text{(Constraint \#}m\text{)}
\end{array}$$

where all functions f and the $g_i$'s are known and the $b_i$'s are known constants. The m equality constraints indicate that not all the n variables are independent. In fact, if all the equality constraints are independent, then only n − m of the n variables are independent. This means that we cannot determine the critical points by simply solving

$$\frac{\partial f(x_1, x_2, x_3, \ldots, x_n)}{\partial x_j} = 0$$

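for j = 1, 2, 3, ..., n. To get around this problem, we introduce a Lagrange Multiplier $\lambda_i$ for each constraint, and define the function

$$L(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) + \sum_{i=1}^{m}\lambda_i(b_i - g_i(\mathbf{x})) \tag{28a}$$

where $\mathbf{x} \equiv (x_1, x_2, x_3, \ldots, x_n)$ and $\boldsymbol{\lambda} \equiv (\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_m)$. Then, to determine the critical points of the objective function $f(\mathbf{x})$, we solve the system of n + m equations

$$\frac{\partial L}{\partial x_j} = 0 \quad\text{and}\quad \frac{\partial L}{\partial\lambda_i} = 0$$

for j = 1, 2, 3, ..., n and i = 1, 2, 3, ..., m. This leads to the n + m equations

$$\frac{\partial f(\mathbf{x})}{\partial x_j} - \sum_{i=1}^{m}\lambda_i\frac{\partial g_i(\mathbf{x})}{\partial x_j} = 0 \tag{28b}$$

for j = 1, 2, 3, ..., n and

$$b_i - g_i(\mathbf{x}) = 0 \tag{28c}$$

for i = 1, 2, 3, ..., m. Note that the last set of m equations are just the m constraint equations.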
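As a check of Equation (28b), we can return to Example 9.1, where $f = x_1^2 + x_2^2 + x_3^2$, $g_1 = x_1 + 2x_2 + x_3$ and $g_2 = 3x_1 + 7x_2 + x_3$. Writing the n = 3 equations of (28b) in vector form and evaluating at $(x_1, x_2, x_3) = (5, 2, 21)$ with $\lambda_1 = 58$ and $\lambda_2 = -16$ gives

$$\nabla f = (2x_1, 2x_2, 2x_3) = (10, 4, 42)$$

and

$$\lambda_1\nabla g_1 + \lambda_2\nabla g_2 = 58\,(1, 2, 1) - 16\,(3, 7, 1) = (10, 4, 42),$$

so that $\nabla f - \lambda_1\nabla g_1 - \lambda_2\nabla g_2 = \mathbf{0}$, as (28b) requires. In words, at the constrained critical point the gradient of the objective function is a linear combination of the gradients of the constraints.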

10. Extra Constraints and Lagrange Multipliers In Variational Calculus

The Lagrange Multiplier method discussed above for ordinary calculus problems can be extended to problems in variational calculus, and perhaps the simplest way to show this is by solving the Hanging Chain Problem introduced above.

Example 10.1: The Hanging Chain Problem - A Solution

A uniform flexible chain (having linear mass density ρ) hangs under its own weight between two supports. The chain has a fixed length of L and we seek that shape y = y(x) which minimizes the total potential energy of the chain. Toward this end we assume that the chain hangs between the two points (−a, 0) and (a, 0). We had seen that to solve this problem, we seek a function y = y(x) so that the boundary conditions

$$y(-a) = y(a) = 0 \tag{29a}$$

are satisfied along with the extra constraint,

$$L[y] = \int_{-a}^{a}\sqrt{1 + (y')^2}\,dx = L = \text{constant} \tag{29b}$$

for which

$$U[y] = \rho g\int_{-a}^{a} y\sqrt{1 + (y')^2}\,dx \tag{29c}$$

is as small as possible. The Lagrange Multiplier method considers the functional

$$J[y] = \int_{-a}^{a}\rho g y\sqrt{1 + (y')^2}\,dx + \lambda\left( \int_{-a}^{a}\sqrt{1 + (y')^2}\,dx - L \right)$$

or

$$J[y] = \int_{-a}^{a}(\rho g y + \lambda)\sqrt{1 + (y')^2}\,dx - \lambda L.$$

Now λ and L are constants, but in some cases, it may be necessary to allow λ to be a function of x. For here, since λL is a constant, let us consider just

$$J[y] = \int_{-a}^{a}(\rho g y + \lambda)\sqrt{1 + (y')^2}\,dx$$

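and using

$$f(x, y, y') = (\rho g y + \lambda)\sqrt{1 + (y')^2} \quad\text{with}\quad \frac{\partial f}{\partial x} = 0$$

in the Euler-Lagrange Equation (12b), we have

$$y'\frac{\partial f}{\partial y'} - f = \text{constant}$$

yielding

$$y'\,\frac{(\rho g y + \lambda)\,y'}{\sqrt{1 + (y')^2}} - (\rho g y + \lambda)\sqrt{1 + (y')^2} = A$$

or

$$\frac{\rho g y + \lambda}{\sqrt{1 + (y')^2}} = -A.$$

Solving for y', we get

$$y' = \frac{\sqrt{(\rho g y + \lambda)^2 - A^2}}{A}$$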

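which leads to

$$dx = \frac{A\,dy}{\sqrt{(\rho g y + \lambda)^2 - A^2}}$$

which integrates to

$$x + B = \frac{A}{\rho g}\cosh^{-1}\left(\frac{\rho g y + \lambda}{A}\right).$$

Solving for y, we get

$$y(x) = \frac{A}{\rho g}\cosh\left(\frac{\rho g}{A}(x + B)\right) - \frac{\lambda}{\rho g}.$$

Setting y(−a) = y(a) = 0 requires that B = 0 so that

$$y(x) = \frac{A}{\rho g}\cosh\left(\frac{\rho g x}{A}\right) - \frac{\lambda}{\rho g}$$

and

$$y(a) = \frac{A}{\rho g}\cosh\left(\frac{\rho g a}{A}\right) - \frac{\lambda}{\rho g} = 0$$

leads to

$$\lambda = A\cosh\left(\frac{\rho g a}{A}\right).$$

Thus we have

$$y(x) = \frac{A}{\rho g}\cosh\left(\frac{\rho g x}{A}\right) - \frac{A}{\rho g}\cosh\left(\frac{\rho g a}{A}\right)$$

or simply

$$y(x) = \frac{\cosh(Cx) - \cosh(Ca)}{C} \quad\text{where}\quad C = \frac{\rho g}{A}.$$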

To determine C, we still have to require that

$$L = \int_{-a}^{a}\sqrt{1 + (y')^2}\,dx$$

which, since y' = sinh(Cx), leads to

$$L = \int_{-a}^{a}\sqrt{1 + \sinh^2(Cx)}\,dx = \int_{-a}^{a}\cosh(Cx)\,dx = \frac{2\sinh(Ca)}{C}.$$

Therefore we find that the solution to the hanging chain problem is

$$y(x) = \frac{\cosh(Cx) - \cosh(Ca)}{C}$$

where C is the solution to the equation

$$\frac{\sinh(Ca)}{Ca} = \frac{L}{2a}.$$

A plot of the function sinh(z)/z is shown in the figure below.

[Figure: A Plot of the function sinh(z)/z. Horizontal axis: z from -3 to 3; vertical axis from 0 to 4.]

Recall that 2a < L, so that L/2a > 1. Since sinh(z)/z equals 1 at z = 0 and grows without bound as z increases, we see that there will always be a solution to the equation

$$\frac{\sinh(Ca)}{Ca} = \frac{L}{2a} > 1.$$
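The equation for C has no closed-form solution, but it is easy to solve numerically. The sketch below is an illustration under our own assumptions, not part of the original notes. It takes the hypothetical values a = 1 and L = 3, finds the positive root by bisection (using the fact that sinh(z)/z increases for z > 0), and then checks the length constraint and the boundary condition. The function names are introduced here for illustration.

```python
import math


def solve_C(a, L, tol=1e-12):
    """Bisection for the positive root C of sinh(C a)/(C a) = L/(2 a)."""
    if L <= 2.0 * a:
        raise ValueError("the chain must be longer than the span: L > 2a")
    target = L / (2.0 * a)

    def g(z):
        return math.sinh(z) / z  # increasing for z > 0, tends to 1 as z -> 0

    lo, hi = 1e-9, 1.0
    while g(hi) < target:  # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / a  # the root found is z = C a


def chain_shape(x, C, a):
    """y(x) = (cosh(C x) - cosh(C a)) / C, the shape derived above."""
    return (math.cosh(C * x) - math.cosh(C * a)) / C


a, L = 1.0, 3.0  # hypothetical half-span and chain length
C = solve_C(a, L)
length = 2.0 * math.sinh(C * a) / C  # arc length of the resulting curve
print(f"C = {C:.6f}, length = {length:.6f}, sag y(0) = {chain_shape(0.0, C, a):.6f}")
```

The sag y(0) comes out negative, as expected for a chain hanging below its two supports at y = 0.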

Example 10.2: The Maximum Area Problem

Given a simple closed curve in the xy plane that has a fixed perimeter P, we seek that curve which encloses the maximum area. Note that a simple closed curve is one that DOES NOT intersect itself. For example, a figure "8" is not a simple closed curve since it intersects itself at the center. A simple closed curve can be written in parametric form as

$$\mathbf{r} = \mathbf{r}(t) = x(t)\,\hat{\mathbf{i}} + y(t)\,\hat{\mathbf{j}}$$

and its perimeter can be computed using

$$P = \oint_C |d\mathbf{r}| = \oint_C \sqrt{dx^2 + dy^2} = \int_{t_1}^{t_2}\sqrt{\dot{x}^2 + \dot{y}^2}\,dt$$

where $t_1 \neq t_2$ and $\mathbf{r}(t_1) = \mathbf{r}(t_2)$. The area enclosed is computed using

$$A = \frac{1}{2}\oint_C |\mathbf{r}\times d\mathbf{r}| = \frac{1}{2}\oint_C (x\,dy - y\,dx) = \frac{1}{2}\int_{t_1}^{t_2}(x\dot{y} - y\dot{x})\,dt.$$

Thus we seek two functions x = x(t) and y = y(t) such that the boundary conditions

$$x(t_1) = x(t_2) \quad\text{and}\quad y(t_1) = y(t_2)$$

are satisfied along with the constraint,

$$P[x, y] = \int_{t_1}^{t_2}\sqrt{\dot{x}^2 + \dot{y}^2}\,dt = P = \text{constant}$$

and for which

$$A[x, y] = \frac{1}{2}\int_{t_1}^{t_2}(x\dot{y} - y\dot{x})\,dt$$

is as large as possible. To solve this we introduce the Lagrange multiplier λ and define

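$$\begin{aligned}
L[x, y] &= \frac{1}{2}\int_{t_1}^{t_2}(x\dot{y} - y\dot{x})\,dt - \lambda\int_{t_1}^{t_2}\sqrt{\dot{x}^2 + \dot{y}^2}\,dt \\
&= \int_{t_1}^{t_2}\left\{ \frac{1}{2}(x\dot{y} - y\dot{x}) - \lambda\sqrt{\dot{x}^2 + \dot{y}^2} \right\} dt
\end{aligned}$$

or

$$L[x, y] = \int_{t_1}^{t_2} f(x, y, \dot{x}, \dot{y})\,dt$$

where

$$f = \frac{1}{2}(x\dot{y} - y\dot{x}) - \lambda\sqrt{\dot{x}^2 + \dot{y}^2}.$$

Then

$$\frac{d}{dt}\left(\frac{\partial f}{\partial\dot{x}}\right) - \frac{\partial f}{\partial x} = 0 \quad\text{and}\quad \frac{d}{dt}\left(\frac{\partial f}{\partial\dot{y}}\right) - \frac{\partial f}{\partial y} = 0$$

yields

$$\frac{d}{dt}\left( -\frac{1}{2}y - \lambda\frac{\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}} \right) - \frac{1}{2}\dot{y} = 0$$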

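and

$$\frac{d}{dt}\left( \frac{1}{2}x - \lambda\frac{\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} \right) - \left( -\frac{1}{2}\dot{x} \right) = 0.$$

These reduce to

$$\dot{y} + \frac{d}{dt}\left( \frac{\lambda\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}} \right) = 0 \quad\text{and}\quad \dot{x} - \frac{d}{dt}\left( \frac{\lambda\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} \right) = 0$$

or, after integrating with respect to the parameter t,

$$y + \frac{\lambda\dot{x}}{\sqrt{\dot{x}^2 + \dot{y}^2}} = c_2 \quad\text{and}\quad x - \frac{\lambda\dot{y}}{\sqrt{\dot{x}^2 + \dot{y}^2}} = c_1$$

which says that

$$(y - c_2)^2 + (x - c_1)^2 = \frac{\lambda^2\dot{x}^2}{\dot{x}^2 + \dot{y}^2} + \frac{\lambda^2\dot{y}^2}{\dot{x}^2 + \dot{y}^2} = \lambda^2$$

or

$$(x - c_1)^2 + (y - c_2)^2 = \lambda^2$$

which is the equation of a circle of radius |λ| and center $(c_1, c_2)$. The constants λ, $c_1$ and $c_2$ can be obtained using the boundary conditions

$$x(t_1) = x(t_2) \quad\text{and}\quad y(t_1) = y(t_2)$$

along with the constraint,

$$P[x, y] = \int_{t_1}^{t_2}\sqrt{\dot{x}^2 + \dot{y}^2}\,dt = P = \text{constant}.$$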
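The perimeter and area integrals of this example can be evaluated numerically for a circle and compared with other shapes of the same perimeter. The sketch below is our own illustration, with hypothetical values for the radius and center. It parametrizes the circle as x = c1 + λ cos t, y = c2 + λ sin t for 0 ≤ t ≤ 2π, approximates both integrals with the midpoint rule, and compares the enclosed area with that of regular polygons having the same perimeter (using the standard formula P²/(4n tan(π/n)) for a regular n-gon).

```python
import math


def perimeter_and_area(x, y, dx, dy, t1, t2, n=20000):
    """Midpoint-rule approximations of
    P = integral of sqrt(x'^2 + y'^2) dt and A = (1/2) integral of (x y' - y x') dt."""
    h = (t2 - t1) / n
    P = 0.0
    A = 0.0
    for i in range(n):
        t = t1 + (i + 0.5) * h
        P += math.hypot(dx(t), dy(t)) * h
        A += 0.5 * (x(t) * dy(t) - y(t) * dx(t)) * h
    return P, A


def regular_polygon_area(P, sides):
    """Area of a regular polygon with the given perimeter and number of sides."""
    return P**2 / (4.0 * sides * math.tan(math.pi / sides))


lam, c1, c2 = 2.0, 1.0, -3.0  # hypothetical radius and center
P, A = perimeter_and_area(
    lambda t: c1 + lam * math.cos(t),
    lambda t: c2 + lam * math.sin(t),
    lambda t: -lam * math.sin(t),
    lambda t: lam * math.cos(t),
    0.0,
    2.0 * math.pi,
)
print(f"circle: P = {P:.6f}, A = {A:.6f}, P^2/(4 pi) = {P**2 / (4.0 * math.pi):.6f}")
for sides in (3, 4, 6, 12, 100):
    print(f"regular {sides}-gon with the same perimeter: A = {regular_polygon_area(P, sides):.6f}")
```

The polygon areas increase with the number of sides but stay below the area of the circle, which is consistent with the circle being the maximizer. The script also shows that the perimeter constraint fixes the radius through P = 2π|λ|.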

11. Variational Principles in Physics - Least Time In Optics

Many fundamental principles in physics are based on variational principles. For example, Fermat's principle in optics states that when light travels between two points A and B in space, the path taken by the light will be that path which minimizes the time of travel. If light is traveling through a medium having index of refraction n(x, y, z), then the speed of light through this medium is given by

$$v = \frac{c}{n(x, y, z)}$$

where c is the speed of light in vacuum. Therefore, if r = r(t) is the path followed by the light from point A to point B, then

$$v = \frac{c}{n(x, y, z)} = \left|\frac{d\mathbf{r}}{dt}\right| = \frac{|d\mathbf{r}|}{dt}$$

resulting in

$$dt = \frac{n(x, y, z)}{c}|d\mathbf{r}| = \frac{n(x, y, z)}{c}\,ds$$

where ds = |dr| is arc length. Thus we have

$$T = \frac{1}{c}\int_A^B n(x, y, z)\,ds \tag{30a}$$

as the time of travel which is to be minimized. For planar motion, this is simplified to

$$T[y] = \frac{1}{c}\int_{x_A}^{x_B} n(x, y)\sqrt{1 + (y')^2}\,dx \tag{30b}$$

and can be minimized using the methods discussed above. In fact, using the Euler-Lagrange equations, we find that

$$\frac{d}{dx}\left( \frac{n(x, y)\,y'}{\sqrt{1 + (y')^2}} \right) - \frac{\partial n(x, y)}{\partial y}\sqrt{1 + (y')^2} = 0$$

along with the boundary conditions gives the path followed by the light. After a little algebra, this equation reduces to the non-linear differential equation

$$n(x, y)\,y'' + \left( \frac{\partial n(x, y)}{\partial x}y' - \frac{\partial n(x, y)}{\partial y} \right)(1 + (y')^2) = 0. \tag{31}$$

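Of course if n is not an explicit function of y, then we have

$$\frac{\partial}{\partial y'}\left( n(x)\sqrt{1 + (y')^2} \right) = \text{constant}$$

which reduces to

$$\frac{y'}{\sqrt{1 + (y')^2}} = \frac{1}{A\,n(x)}$$

or

$$dy = \frac{dx}{\sqrt{A^2n^2(x) - 1}} \tag{32a}$$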

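for a constant A. On the other hand, if n is not an explicit function of x, then we have

$$y'\frac{\partial}{\partial y'}\left( n(y)\sqrt{1 + (y')^2} \right) - n(y)\sqrt{1 + (y')^2} = \text{constant}$$

or

$$\frac{-n(y)}{\sqrt{1 + (y')^2}} = \frac{1}{A}$$

which reduces to

$$dx = \frac{dy}{\sqrt{A^2n^2(y) - 1}} \tag{32b}$$

for a constant A.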
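The case n = n(y) contains Snell's law, which makes a useful check. As a brief sketch (the angle θ is notation introduced here, not in the original notes), let θ be the angle between the light ray and the y axis, the direction along which n varies. The slope of the ray is y', so the angle it makes with the x axis has cosine $1/\sqrt{1 + (y')^2}$, and therefore

$$\sin\theta = \frac{1}{\sqrt{1 + (y')^2}}.$$

The first integral above, $n(y)/\sqrt{1 + (y')^2} = \text{constant}$, then reads

$$n(y)\sin\theta = \text{constant},$$

so that between two layers with indices $n_1$ and $n_2$ we recover $n_1\sin\theta_1 = n_2\sin\theta_2$.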

12. Variational Principles in Physics - Least Action In Mechanics

Hamilton's principle in mechanics involving a system of N particles with n degrees of freedom described by the generalized coordinates $q_1, q_2, q_3, \ldots, q_n$ and corresponding generalized speeds $\dot{q}_1, \dot{q}_2, \dot{q}_3, \ldots, \dot{q}_n$ can be stated as follows. Suppose that such a system can move from an initial configuration α at time $t_\alpha$ to another configuration β at time $t_\beta$. The action of the system between these two times is defined by

$$A = \int_{t_\alpha}^{t_\beta} L(q_k, \dot{q}_k, t)\,dt \tag{33a}$$

where L = T − V is the Lagrangian of the system. Hamilton's principle of least action simply states that a mechanical system evolves in time so that A is as small as possible. Of course, using the ideas discussed above, we may now say that this leads to the set of Lagrange equations

$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_k}\right) - \frac{\partial L}{\partial q_k} = 0 \tag{33b}$$

for k = 1, 2, 3, ..., n.
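A simple numerical experiment illustrates the principle. The sketch below is our own example, not part of the original notes. It takes a single particle of mass m moving vertically in uniform gravity, with Lagrangian L = ½m·ẏ² − m·g·y, for which the Lagrange equation gives ÿ = −g. It compares the action along the true path with the action along nearby paths that share the same endpoints. The endpoint times, the launch speed and the sinusoidal perturbation are hypothetical choices.

```python
import math


def action(path, velocity, t_a, t_b, m=1.0, g=9.8, n=20000):
    """Midpoint-rule approximation of A = integral of (T - V) dt,
    with T = m v^2 / 2 and V = m g y."""
    h = (t_b - t_a) / n
    total = 0.0
    for i in range(n):
        t = t_a + (i + 0.5) * h
        total += (0.5 * m * velocity(t)**2 - m * g * path(t)) * h
    return total


g = 9.8
T_end = 1.0
v0 = 0.5 * g * T_end  # launch speed chosen so that y(0) = y(T_end) = 0


def true_path(t):
    return v0 * t - 0.5 * g * t * t  # solves y'' = -g


def true_velocity(t):
    return v0 - g * t


def perturbed(eps):
    """Path with the same endpoints, shifted by eps * sin(pi t / T_end)."""
    w = math.pi / T_end
    return (lambda t: true_path(t) + eps * math.sin(w * t),
            lambda t: true_velocity(t) + eps * w * math.cos(w * t))


A_true = action(true_path, true_velocity, 0.0, T_end)
differences = {}
for eps in (-0.2, -0.05, 0.05, 0.2):
    path, velocity = perturbed(eps)
    differences[eps] = action(path, velocity, 0.0, T_end) - A_true
    print(f"eps = {eps:+.2f}: A_perturbed - A_true = {differences[eps]:.6f}")
```

Every perturbed path has a larger action than the true one, and the excess grows like the square of the perturbation size, as expected at a minimum where the first-order variation vanishes.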
