The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain...

22
The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012 Calculus III (James Madison University) Math 237 October 31, 2012 1/6

Transcript of The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain...

Page 1: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Chain Rule and The Gradient

Department of Mathematics and Statistics

October 31, 2012

Calculus III (James Madison University) Math 237 October 31, 2012 1 / 6

Page 2: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Chain Rule

Their are various versions of the chain rule for multivariable functions.For example,

Theorem (Version I)

Given functions z = f (x , y), x = u(t) and y = v(t), for all values of t atwhich u and v are differentiable and f is differentiable at (u(t), v(t)), wehave

dzdt = ∂z

∂xdxdt + ∂z

∂ydydt .

Theorem (Version II)

Given functions z = f (x , y), x = u(s, t) and y = v(s, t), for all values of sand t at which u and v are differentiable and f is differentiable at(u(s, t), v(s, t)), we have

∂z∂s = ∂z

∂x∂x∂s + ∂z

∂y∂y∂s and ∂z

∂t = ∂z∂x

∂x∂t + ∂z

∂y∂y∂t .

Calculus III (James Madison University) Math 237 October 31, 2012 2 / 6

Page 3: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Chain Rule

Their are various versions of the chain rule for multivariable functions.For example,

Theorem (Version I)

Given functions z = f (x , y), x = u(t) and y = v(t), for all values of t atwhich u and v are differentiable and f is differentiable at (u(t), v(t)), wehave

dzdt = ∂z

∂xdxdt + ∂z

∂ydydt .

Theorem (Version II)

Given functions z = f (x , y), x = u(s, t) and y = v(s, t), for all values of sand t at which u and v are differentiable and f is differentiable at(u(s, t), v(s, t)), we have

∂z∂s = ∂z

∂x∂x∂s + ∂z

∂y∂y∂s and ∂z

∂t = ∂z∂x

∂x∂t + ∂z

∂y∂y∂t .

Calculus III (James Madison University) Math 237 October 31, 2012 2 / 6

Page 4: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Chain Rule

Their are various versions of the chain rule for multivariable functions.For example,

Theorem (Version I)

Given functions z = f (x , y), x = u(t) and y = v(t), for all values of t atwhich u and v are differentiable and f is differentiable at (u(t), v(t)), wehave

dzdt = ∂z

∂xdxdt + ∂z

∂ydydt .

Theorem (Version II)

Given functions z = f (x , y), x = u(s, t) and y = v(s, t), for all values of sand t at which u and v are differentiable and f is differentiable at(u(s, t), v(s, t)), we have

∂z∂s = ∂z

∂x∂x∂s + ∂z

∂y∂y∂s and ∂z

∂t = ∂z∂x

∂x∂t + ∂z

∂y∂y∂t .

Calculus III (James Madison University) Math 237 October 31, 2012 2 / 6

Page 5: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Chain Rule

Their are various versions of the chain rule for multivariable functions.For example,

Theorem (Version I)

Given functions z = f (x , y), x = u(t) and y = v(t), for all values of t atwhich u and v are differentiable and f is differentiable at (u(t), v(t)), wehave

dzdt = ∂z

∂xdxdt + ∂z

∂ydydt .

Theorem (Version II)

Given functions z = f (x , y), x = u(s, t) and y = v(s, t), for all values of sand t at which u and v are differentiable and f is differentiable at(u(s, t), v(s, t)), we have

∂z∂s = ∂z

∂x∂x∂s + ∂z

∂y∂y∂s and ∂z

∂t = ∂z∂x

∂x∂t + ∂z

∂y∂y∂t .

Calculus III (James Madison University) Math 237 October 31, 2012 2 / 6

Page 6: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Chain Rule

Their are various versions of the chain rule for multivariable functions.For example,

Theorem (Version I)

Given functions z = f (x , y), x = u(t) and y = v(t), for all values of t atwhich u and v are differentiable and f is differentiable at (u(t), v(t)), wehave

dzdt = ∂z

∂xdxdt + ∂z

∂ydydt .

Theorem (Version II)

Given functions z = f (x , y), x = u(s, t) and y = v(s, t), for all values of sand t at which u and v are differentiable and f is differentiable at(u(s, t), v(s, t)), we have

∂z∂s = ∂z

∂x∂x∂s + ∂z

∂y∂y∂s and ∂z

∂t = ∂z∂x

∂x∂t + ∂z

∂y∂y∂t .

Calculus III (James Madison University) Math 237 October 31, 2012 2 / 6

Page 7: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Gradient

Definition

Let z = f (x , y) be a function of two variables, the gradient of f is thevector function defined by

∇f (x , y) = ∂f∂x i + ∂f

∂y j = 〈fx(x , y), fy (x , y)〉.

Similarly, if w = f (x , y , z) is a function of three variables the gradient off is the vector function defined by

∇f (x , y , z) = ∂f∂x i + ∂f

∂y j + ∂f∂z k = 〈fx(x , y , z), fy (x , y , z), fz(x , y , z)〉.

The domain of the gradient is the set of all points in the domain of f atwhich the partial derivatives exist.

The symbol ∇f is read “the gradient of f ”, “grad f ” or “del f ”.

Calculus III (James Madison University) Math 237 October 31, 2012 3 / 6

Page 8: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Gradient

Definition

Let z = f (x , y) be a function of two variables, the gradient of f is thevector function defined by

∇f (x , y) = ∂f∂x i + ∂f

∂y j = 〈fx(x , y), fy (x , y)〉.

Similarly, if w = f (x , y , z) is a function of three variables the gradient off is the vector function defined by

∇f (x , y , z) = ∂f∂x i + ∂f

∂y j + ∂f∂z k = 〈fx(x , y , z), fy (x , y , z), fz(x , y , z)〉.

The domain of the gradient is the set of all points in the domain of f atwhich the partial derivatives exist.

The symbol ∇f is read “the gradient of f ”, “grad f ” or “del f ”.

Calculus III (James Madison University) Math 237 October 31, 2012 3 / 6

Page 9: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Gradient

Definition

Let z = f (x , y) be a function of two variables, the gradient of f is thevector function defined by

∇f (x , y) = ∂f∂x i + ∂f

∂y j = 〈fx(x , y), fy (x , y)〉.

Similarly, if w = f (x , y , z) is a function of three variables the gradient off is the vector function defined by

∇f (x , y , z) = ∂f∂x i + ∂f

∂y j + ∂f∂z k = 〈fx(x , y , z), fy (x , y , z), fz(x , y , z)〉.

The domain of the gradient is the set of all points in the domain of f atwhich the partial derivatives exist.

The symbol ∇f is read “the gradient of f ”, “grad f ” or “del f ”.

Calculus III (James Madison University) Math 237 October 31, 2012 3 / 6

Page 10: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Gradient

Definition

Let z = f (x , y) be a function of two variables, the gradient of f is thevector function defined by

∇f (x , y) = ∂f∂x i + ∂f

∂y j = 〈fx(x , y), fy (x , y)〉.

Similarly, if w = f (x , y , z) is a function of three variables the gradient off is the vector function defined by

∇f (x , y , z) = ∂f∂x i + ∂f

∂y j + ∂f∂z k = 〈fx(x , y , z), fy (x , y , z), fz(x , y , z)〉.

The domain of the gradient is the set of all points in the domain of f atwhich the partial derivatives exist.

The symbol ∇f is read “the gradient of f ”, “grad f ” or “del f ”.

Calculus III (James Madison University) Math 237 October 31, 2012 3 / 6

Page 11: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

Uses of the Gradient

Our primary uses for the gradient will be:

Locating extrema – places where ∇f (x , y) = 0 or where ∇f (x , y)doesn’t exist are candidates for the location of extrema.

Providing a shortcut for computing the directional derivative.

Finding the direction of most rapid increase and decrease of thefunction.

Calculus III (James Madison University) Math 237 October 31, 2012 4 / 6

Page 12: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

Uses of the Gradient

Our primary uses for the gradient will be:

Locating extrema – places where ∇f (x , y) = 0 or where ∇f (x , y)doesn’t exist are candidates for the location of extrema.

Providing a shortcut for computing the directional derivative.

Finding the direction of most rapid increase and decrease of thefunction.

Calculus III (James Madison University) Math 237 October 31, 2012 4 / 6

Page 13: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

Uses of the Gradient

Our primary uses for the gradient will be:

Locating extrema – places where ∇f (x , y) = 0 or where ∇f (x , y)doesn’t exist are candidates for the location of extrema.

Providing a shortcut for computing the directional derivative.

Finding the direction of most rapid increase and decrease of thefunction.

Calculus III (James Madison University) Math 237 October 31, 2012 4 / 6

Page 14: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

Uses of the Gradient

Our primary uses for the gradient will be:

Locating extrema – places where ∇f (x , y) = 0 or where ∇f (x , y)doesn’t exist are candidates for the location of extrema.

Providing a shortcut for computing the directional derivative.

Finding the direction of most rapid increase and decrease of thefunction.

Calculus III (James Madison University) Math 237 October 31, 2012 4 / 6

Page 15: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

Uses of the Gradient

Our primary uses for the gradient will be:

Locating extrema – places where ∇f (x , y) = 0 or where ∇f (x , y)doesn’t exist are candidates for the location of extrema.

Providing a shortcut for computing the directional derivative.

Finding the direction of most rapid increase and decrease of thefunction.

Calculus III (James Madison University) Math 237 October 31, 2012 4 / 6

Page 16: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

Uses of the Gradient

Our primary uses for the gradient will be:

Locating extrema – places where ∇f (x , y) = 0 or where ∇f (x , y)doesn’t exist are candidates for the location of extrema.

Providing a shortcut for computing the directional derivative.

Finding the direction of most rapid increase and decrease of thefunction.

Calculus III (James Madison University) Math 237 October 31, 2012 4 / 6

Page 17: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

Computing the Directional Derivative

Theorem

Let f (x , y) be a function of two variables and (x0, y0) be a point in thedomain of f at which the first-order partial derivatives of f exist. If u ∈ R2

is a unit vector for which the directional derivative Duf (x0, y0) also exists,then

Duf (x0, y0) = ∇f (x0, y0) · u.

Similarly, if f (x , y , z) is a function of three variables and (x0, y0, z0) is apoint in the domain of f at which the first-order partial derivatives of fexist and u ∈ R3 is a unit vector for which the directional derivativeDuf (x0, y0, z0) also exists, then

Duf (x0, y0, z0) = ∇f (x0, y0, z0) · u.

Calculus III (James Madison University) Math 237 October 31, 2012 5 / 6

Page 18: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

Computing the Directional Derivative

Theorem

Let f (x , y) be a function of two variables and (x0, y0) be a point in thedomain of f at which the first-order partial derivatives of f exist. If u ∈ R2

is a unit vector for which the directional derivative Duf (x0, y0) also exists,then

Duf (x0, y0) = ∇f (x0, y0) · u.

Similarly, if f (x , y , z) is a function of three variables and (x0, y0, z0) is apoint in the domain of f at which the first-order partial derivatives of fexist and u ∈ R3 is a unit vector for which the directional derivativeDuf (x0, y0, z0) also exists, then

Duf (x0, y0, z0) = ∇f (x0, y0, z0) · u.

Calculus III (James Madison University) Math 237 October 31, 2012 5 / 6

Page 19: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

Computing the Directional Derivative

Theorem

Let f (x , y) be a function of two variables and (x0, y0) be a point in thedomain of f at which the first-order partial derivatives of f exist. If u ∈ R2

is a unit vector for which the directional derivative Duf (x0, y0) also exists,then

Duf (x0, y0) = ∇f (x0, y0) · u.

Similarly, if f (x , y , z) is a function of three variables and (x0, y0, z0) is apoint in the domain of f at which the first-order partial derivatives of fexist and u ∈ R3 is a unit vector for which the directional derivativeDuf (x0, y0, z0) also exists, then

Duf (x0, y0, z0) = ∇f (x0, y0, z0) · u.

Calculus III (James Madison University) Math 237 October 31, 2012 5 / 6

Page 20: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Geometry of the Gradient

Theorem (The Gradient Points in the Direction of Greatest Increase)

Let f be a function of two or three variables and let P be a point in thedomain of f at which f is differentiable. The gradient of f at P points inthe direction in which f increases most rapidly.

Theorem (Gradient Vectors are Orthogonal to Level Curves)

Let f be a function of two variables and let (x0, y0) be a point in thedomain of f at which f is differentiable. If C is the level curve containingthe point c0 = f (x0, y0), then ∇f (x0, y0) and C are orthogonal at f (x0, y0).

Calculus III (James Madison University) Math 237 October 31, 2012 6 / 6

Page 21: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Geometry of the Gradient

Theorem (The Gradient Points in the Direction of Greatest Increase)

Let f be a function of two or three variables and let P be a point in thedomain of f at which f is differentiable. The gradient of f at P points inthe direction in which f increases most rapidly.

Theorem (Gradient Vectors are Orthogonal to Level Curves)

Let f be a function of two variables and let (x0, y0) be a point in thedomain of f at which f is differentiable. If C is the level curve containingthe point c0 = f (x0, y0), then ∇f (x0, y0) and C are orthogonal at f (x0, y0).

Calculus III (James Madison University) Math 237 October 31, 2012 6 / 6

Page 22: The Chain Rule and The Gradient - educ.jmu.edueduc.jmu.edu/~kohnpd/237/125gradient.pdf · The Chain Rule and The Gradient Department of Mathematics and Statistics October 31, 2012

The Geometry of the Gradient

Theorem (The Gradient Points in the Direction of Greatest Increase)

Let f be a function of two or three variables and let P be a point in thedomain of f at which f is differentiable. The gradient of f at P points inthe direction in which f increases most rapidly.

Theorem (Gradient Vectors are Orthogonal to Level Curves)

Let f be a function of two variables and let (x0, y0) be a point in thedomain of f at which f is differentiable. If C is the level curve containingthe point c0 = f (x0, y0), then ∇f (x0, y0) and C are orthogonal at f (x0, y0).

Calculus III (James Madison University) Math 237 October 31, 2012 6 / 6