© Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox
Engineering Systems Division and Dept. of Aeronautics and Astronautics
Multidisciplinary System Design Optimization (MSDO)
Lecture 9: Gradient Calculation and Sensitivity Analysis
Olivier de Weck
Karen Willcox
Today's Topics
• Gradient calculation methods
  – Analytical and symbolic
  – Finite difference
  – Complex step
  – Adjoint method
  – Automatic differentiation
• Post-processing sensitivity analysis
  – effect of changing design variables
  – effect of changing parameters
  – effect of changing constraints
Definition of the Gradient
The gradient of J with respect to the design vector x is

  \nabla J = [ \partial J/\partial x_1, \partial J/\partial x_2, \ldots, \partial J/\partial x_n ]^T

"How does the function value J change locally as we change elements of the design vector x?"

Compute the partial derivatives of J with respect to each x_i. The gradient vector points normal to the tangent hyperplane of J(x).

(Figure: gradient vector \nabla J drawn normal to a surface in (x_1, x_2, x_3) space.)
Geometry of Gradient Vector (2D)

Example function:

  J(x_1, x_2) = x_1 + x_2 + 1/(x_1 x_2)

  \nabla J = [ 1 - 1/(x_1^2 x_2),  1 - 1/(x_1 x_2^2) ]^T

The gradient is normal to the contours of J.

(Figure: contour plot of J over 0 \le x_1, x_2 \le 2, with contour levels 3.1, 3.25, 3.5, 4, 5.)
Geometry of Gradient Vector (3D)

Example:

  J = x_1^2 + x_2^2 + x_3^2

  \nabla J = [ 2 x_1, 2 x_2, 2 x_3 ]^T

At x_o = [1, 1, 1]^T: J = 3 and \nabla J = [2, 2, 2]^T.

Tangent plane at x_o:  2 x_1 + 2 x_2 + 2 x_3 - 6 = 0

The gradient vector points toward larger values of J.

(Figure: sphere J = 3 in (x_1, x_2, x_3) space, with the tangent plane at x_o and the gradient arrow pointing toward increasing values of J.)
Other Gradient-Related Quantities
• Jacobian: Matrix of derivatives of multiple functions
w.r.t. vector of variables
• Hessian: Matrix of second-order derivatives
For z objective functions J = [J_1, J_2, \ldots, J_z]^T of n variables, the Jacobian is the z x n matrix

  \nabla J = [ \partial J_i/\partial x_j ],  i = 1, \ldots, z,  j = 1, \ldots, n

i.e. row i holds \partial J_i/\partial x_1, \ldots, \partial J_i/\partial x_n.

The Hessian is the n x n matrix of second derivatives of a single objective:

  H = \nabla^2 J = [ \partial^2 J/\partial x_i \partial x_j ],  i, j = 1, \ldots, n
Why Calculate Gradients
• Required by gradient-based optimization
algorithms
– Normally need gradient of objective function and
each constraint w.r.t. design variables at each
iteration
– Newton methods require Hessians as well
• Isoperformance/goal programming
• Robust design
• Post-processing sensitivity analysis
– determine if result is optimal
– sensitivity to parameters, constraint values
Analytical Sensitivities
If the objective function is known in closed form, we can often
compute the gradient vector(s) in closed form (analytically):
Example:

  J(x_1, x_2) = x_1 + x_2 + 1/(x_1 x_2)

Analytical gradient:

  \nabla J = [ 1 - 1/(x_1^2 x_2),  1 - 1/(x_1 x_2^2) ]^T

At x_1 = x_2 = 1: J(1, 1) = 3 and \nabla J(1, 1) = [0, 0]^T, so this point is a minimum.

For complex systems analytical gradients are rarely available.
Symbolic Differentiation
• Use symbolic mathematics programs
• e.g. MATLAB®, Maple®, Mathematica®

  » syms x1 x2              % construct symbolic objects
  » J = x1 + x2 + 1/(x1*x2);
  » dJdx1 = diff(J, x1)     % differentiation operator
  dJdx1 = 1 - 1/x1^2/x2
  » dJdx2 = diff(J, x2)
  dJdx2 = 1 - 1/x1/x2^2
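The same derivatives can be reproduced with an open-source tool. A minimal sketch using Python's SymPy (my choice of tool; the slide uses MATLAB's Symbolic Math Toolbox):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
J = x1 + x2 + 1 / (x1 * x2)

dJdx1 = sp.diff(J, x1)   # 1 - 1/(x1**2 * x2)
dJdx2 = sp.diff(J, x2)   # 1 - 1/(x1 * x2**2)

# The gradient vanishes at x1 = x2 = 1, the minimum noted on the earlier slide
grad_at_min = [dJdx1.subs({x1: 1, x2: 1}), dJdx2.subs({x1: 1, x2: 1})]
print(dJdx1, dJdx2, grad_at_min)
```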
Finite Differences (I)
(Figure: function f(x) of a single variable, evaluated at x_o - \Delta x, x_o, and x_o + \Delta x.)

• First-order finite difference approximation of the gradient (forward difference):

  f'(x_o) = [ f(x_o + \Delta x) - f(x_o) ] / \Delta x + O(\Delta x)

  Truncation error: O(\Delta x)

• Second-order finite difference approximation of the gradient (central difference):

  f'(x_o) = [ f(x_o + \Delta x) - f(x_o - \Delta x) ] / (2 \Delta x) + O(\Delta x^2)

  Truncation error: O(\Delta x^2)
Finite Differences (II)
The approximations are derived from the Taylor series expansion:

  f(x_o + \Delta x) = f(x_o) + \Delta x f'(x_o) + (\Delta x^2 / 2) f''(x_o) + O(\Delta x^3)

Neglect second and higher order terms and solve for the gradient:

  f'(x_o) = [ f(x_o + \Delta x) - f(x_o) ] / \Delta x + O(\Delta x)    (Forward Difference)

Truncation error:

  O(\Delta x) = (\Delta x / 2) f''(\bar{x}),   \bar{x} \in [x_o, x_o + \Delta x]
Finite Differences (III)
Take the Taylor expansion forward and backward from x_o:

  f(x_o + \Delta x) = f(x_o) + \Delta x f'(x_o) + (\Delta x^2 / 2) f''(x_o) + O(\Delta x^3)   (1)
  f(x_o - \Delta x) = f(x_o) - \Delta x f'(x_o) + (\Delta x^2 / 2) f''(x_o) + O(\Delta x^3)   (2)

Subtract (2) from (1) and solve again for the derivative:

  f'(x_o) = [ f(x_o + \Delta x) - f(x_o - \Delta x) ] / (2 \Delta x) + O(\Delta x^2)   (Central Difference)

Truncation error:

  O(\Delta x^2) = (\Delta x^2 / 6) f'''(\bar{x}),   \bar{x} \in [x_o - \Delta x, x_o + \Delta x]
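The two truncation-error orders can be checked numerically. A minimal sketch (the test function exp and the step sizes are my own choices, not from the slides):

```python
import math

def forward_diff(f, x, dx):
    # First-order accurate: error ~ (dx/2) * f''(x)
    return (f(x + dx) - f(x)) / dx

def central_diff(f, x, dx):
    # Second-order accurate: error ~ (dx**2 / 6) * f'''(x)
    return (f(x + dx) - f(x - dx)) / (2 * dx)

f, x0 = math.exp, 1.0
exact = math.exp(1.0)          # d/dx exp(x) = exp(x)

for dx in (1e-1, 1e-2, 1e-3):
    e_fwd = abs(forward_diff(f, x0, dx) - exact)
    e_cen = abs(central_diff(f, x0, dx) - exact)
    print(f"dx={dx:.0e}  forward err={e_fwd:.2e}  central err={e_cen:.2e}")
```

Shrinking dx by 10 cuts the forward error by about 10 and the central error by about 100, as the two truncation-error terms predict.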
Finite Differences (IV)
For a design vector, perturb one component at a time. For the first component:

  \partial J/\partial x_1 \approx \Delta J/\Delta x_1 = [ J(x_{o,1} + \Delta x_1) - J(x_{o,1}) ] / \Delta x_1

(Figure: J(x) versus x_1, comparing the true analytical sensitivity, the tangent at x_{o,1}, with the finite difference approximation, the secant through x_{o,1} and x_{o,1} + \Delta x_1.)
Finite Differences (V)
(Figure: f(x) evaluated at x_o - \Delta x, x_o, and x_o + \Delta x.)

• Second-order finite difference approximation of the second derivative:

  f''(x_o) \approx [ f(x_o + \Delta x) - 2 f(x_o) + f(x_o - \Delta x) ] / \Delta x^2
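A quick numerical check of this formula (the test function sin is my own choice):

```python
import math

def second_diff(f, x, dx):
    # Central three-point approximation to f''(x), second-order accurate
    return (f(x + dx) - 2 * f(x) + f(x - dx)) / dx**2

# For f = sin, f''(x) = -sin(x)
approx = second_diff(math.sin, 0.7, 1e-4)
print(approx, -math.sin(0.7))
```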
Errors of Finite Differencing
Caution:
- Finite differencing always has errors
- The errors are very dependent on the perturbation size

Example:  J(x_1, x_2) = x_1 + x_2 + 1/(x_1 x_2),  x_1 = x_2 = 1,  J(1, 1) = 3,  \nabla J(1, 1) = [0, 0]^T

(Figure: gradient error versus perturbation step size \Delta x_1 on a log-log scale, for step sizes roughly 10^{-9} to 10^{-5}. Truncation errors grow as \Delta x; rounding errors grow as 1/\Delta x, so the total error has a minimum at an intermediate step size.)

The choice of \Delta x is critical.
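The truncation/rounding trade-off can be reproduced with the slide's example J at (1, 1), where the exact gradient component is zero. A sketch (the step-size sweep is my own choice):

```python
def J(x1, x2):
    return x1 + x2 + 1.0 / (x1 * x2)

# Exact gradient component dJ/dx1 at the minimum (1, 1) is 0
x1, x2 = 1.0, 1.0
errors = {}
for k in range(1, 16):
    dx = 10.0 ** (-k)
    fd = (J(x1 + dx, x2) - J(x1, x2)) / dx   # forward difference
    errors[dx] = abs(fd - 0.0)

best_dx = min(errors, key=errors.get)
print(best_dx)
# A moderate step wins: too large -> truncation error dominates,
# too small -> rounding (cancellation) error dominates.
```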
Perturbation Size \Delta x Choice
• Error analysis (Gill et al. 1981):

  \Delta x \sim \varepsilon_A^{1/2}   (forward difference)
  \Delta x \sim \varepsilon_A^{1/3}   (central difference)

  where \varepsilon_A is the absolute accuracy of the computed function values (the scatter of the computed values about the theoretical function near x_o, of order \varepsilon_A).

• Machine precision: with q digits of machine precision for real numbers, the step size at the k-th iterate can be chosen as

  \Delta x_k = 10^{-q/2} x_k

• Trial and error: typical value ~ 0.1-1%
Computational Expense of FD
  F(J_i)      cost of a single evaluation of objective function J_i
  n F(J_i)    cost of a one-sided finite difference gradient approximation for J_i, for a design vector of length n
  z n F(J_i)  cost of a finite difference Jacobian approximation with z objective functions

Example: 6 objectives, 30 design variables, 1 sec per function evaluation → about 3 min of CPU time for a single Jacobian estimate. Expensive!
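The bookkeeping can be sketched with a hypothetical `fd_jacobian` helper that counts analysis calls. Note one nuance: when a single analysis call returns all z objectives at once, the cost is n + 1 calls; the slide's z·n·F(J) estimate applies when each objective requires its own evaluation.

```python
def fd_jacobian(f, x, dx=1e-7):
    # One-sided finite-difference Jacobian of f: R^n -> R^z,
    # using one baseline call plus one perturbed call per variable.
    n = len(x)
    f0 = f(x)                        # baseline evaluation
    z = len(f0)
    Jac = [[0.0] * n for _ in range(z)]
    for j in range(n):
        xp = list(x)
        xp[j] += dx
        fp = f(xp)                   # one extra evaluation per variable
        for i in range(z):
            Jac[i][j] = (fp[i] - f0[i]) / dx
    return Jac

calls = 0
def model(x):
    global calls
    calls += 1
    return [x[0] * x[1], x[0] + x[1] ** 2]   # z = 2 toy objectives

Jac = fd_jacobian(model, [1.0, 2.0])
print(calls, Jac)   # n + 1 = 3 analysis calls for the full 2 x 2 Jacobian
```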
Complex Step Derivative
• Similar to finite differences, but uses an imaginary step:

  f'(x_o) \approx Im[ f(x_o + i \Delta x) ] / \Delta x

• Second-order accurate
• Can use very small step sizes, e.g. \Delta x \approx 10^{-20}
  – Doesn't have rounding error, since it doesn't perform subtraction
• Limited application areas
  – Code must be able to handle complex values
J.R.R.A. Martins, I.M. Kroo and J.J. Alonso, An automated method for sensitivity analysis using complex variables, AIAA Paper 2000-0689, Jan 2000
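A minimal sketch of the complex-step formula (the test function is my own choice):

```python
import math
import cmath

def complex_step(f, x, h=1e-20):
    # f'(x) ~ Im[f(x + i*h)] / h : no subtraction of nearly equal
    # numbers, so h can be tiny without rounding error
    return f(complex(x, h)).imag / h

# Test on f(x) = exp(x) * sin(x), with f'(x) = exp(x) * (sin(x) + cos(x))
f = lambda x: cmath.exp(x) * cmath.sin(x)
x0 = 0.5
exact = math.exp(x0) * (math.sin(x0) + math.cos(x0))
print(complex_step(f, x0), exact)   # agree to machine precision
```

The function must be written so it accepts complex arguments, which is exactly the "limited application areas" caveat above.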
Automatic Differentiation
• Mathematical formulae are built from a finite set of basic
functions, e.g. additions, sin x, exp x, etc.
• Using chain rule, differentiate analysis code: add
statements that generate derivatives of the basic
functions
• Tracks numerical values of derivatives, does not track
symbolically as discussed before
• Outputs modified program = original + derivative
capability
• e.g., ADIFOR (FORTRAN), TAPENADE (C, FORTRAN),
TOMLAB (MATLAB), many more…
• Resources at http://www.autodiff.org/
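The core idea, propagating derivative values alongside function values by the chain rule, can be sketched with a tiny forward-mode dual-number class. This is a toy illustration of the principle, not how ADIFOR or TAPENADE are implemented (they transform source code):

```python
class Dual:
    """Minimal forward-mode AD: carries a value and its derivative together."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val,
                    self.dot * o.val + self.val * o.dot)   # product rule
    __rmul__ = __mul__
    def __truediv__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val / o.val,
                    (self.dot * o.val - self.val * o.dot) / o.val**2)  # quotient rule
    def __rtruediv__(self, o):
        return Dual(o) / self

def J(x1, x2):
    return x1 + x2 + 1 / (x1 * x2)

# Seed x1 with derivative 1 to get dJ/dx1; analytic value is 1 - 1/(x1^2 x2)
d1 = J(Dual(2.0, 1.0), Dual(1.0, 0.0))
print(d1.val, d1.dot)   # J(2,1) = 3.5 and dJ/dx1 = 1 - 1/4 = 0.75
```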
Adjoint Methods
Consider the following problem:

  minimize  J(x, u)
  s.t.      R(x, u) = 0

where x are the design variables and u are the state variables. The constraints represent the state equation, e.g. in wing design: x are shape variables, u are flow variables, and R(x, u) = 0 represents the Navier-Stokes equations.

We need to compute the gradients of J with respect to x:

  dJ/dx = \partial J/\partial x + (\partial J/\partial u)(du/dx)

Typically the dimension of u is very high (thousands to millions).
Adjoint Methods
• To compute du/dx, differentiate the state equation:

  dR/dx = \partial R/\partial x + (\partial R/\partial u)(du/dx) = 0

  \Rightarrow  (\partial R/\partial u)(du/dx) = -\partial R/\partial x

  \Rightarrow  du/dx = -(\partial R/\partial u)^{-1} (\partial R/\partial x)
Adjoint Methods
• We have

  dJ/dx = \partial J/\partial x - (\partial J/\partial u)(\partial R/\partial u)^{-1}(\partial R/\partial x)

• Now define the adjoint vector \lambda by

  \lambda^T = (\partial J/\partial u)(\partial R/\partial u)^{-1}

• Then to determine the gradient:

  First solve the adjoint equation:   (\partial R/\partial u)^T \lambda = (\partial J/\partial u)^T

  Then compute:   dJ/dx = \partial J/\partial x - \lambda^T (\partial R/\partial x)
Adjoint Methods
• Solving the adjoint equation

  (\partial R/\partial u)^T \lambda = (\partial J/\partial u)^T

  costs about the same as solving the forward problem (one function evaluation)
• Adjoints are widely used in aerodynamic shape optimization, optimal flow control, geophysics applications, etc.
• Some automatic differentiation tools have a "reverse mode" for computing adjoints
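The three-step recipe (solve the state equation, solve the adjoint equation, assemble the gradient) can be sketched on a toy 2x2 linear state equation. The model, matrix, and numbers are my own, chosen only so the adjoint result can be checked against finite differences:

```python
import numpy as np

def solve_state(x):
    # Toy state equation R(x,u) = K(x) u - f = 0, a stand-in "analysis"
    K = np.array([[2.0 + x, 1.0], [1.0, 2.0]])
    f = np.array([1.0, 1.0])
    return K, np.linalg.solve(K, f)

def J_of(u):
    return float(u @ u)            # objective depends on the state only

x0 = 0.5
K, u = solve_state(x0)

# Adjoint equation: (dR/du)^T lam = (dJ/du)^T, with dR/du = K, dJ/du = 2u
lam = np.linalg.solve(K.T, 2.0 * u)

# dR/dx = (dK/dx) u, and here dK/dx = [[1,0],[0,0]], so dR/dx = [u[0], 0]
dRdx = np.array([u[0], 0.0])
dJdx_adjoint = -lam @ dRdx         # dJ/dx = (explicit term, 0 here) - lam^T dR/dx

# Finite-difference check: re-solve the state at a perturbed x
h = 1e-6
_, u_p = solve_state(x0 + h)
dJdx_fd = (J_of(u_p) - J_of(u)) / h
print(dJdx_adjoint, dJdx_fd)
```

One linear solve with K.T replaces the per-variable re-solves a finite-difference gradient would need, which is the whole point when x is long and u is huge.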
Post-Processing Sensitivity Analysis
• A sensitivity analysis is an important component of post-processing
• Key to understanding which design variables, constraints, and parameters are important drivers for the optimum solution
• How sensitive is the “optimal” solution J* to changes or perturbations of the design variables x*?
• How sensitive is the “optimal” solution x* to changes in the constraints g(x), h(x) and fixed parameters p ?
Sensitivity Analysis: Aircraft
Questions for aircraft design:
How does my solution change if I
• change the cruise altitude?
• change the cruise speed?
• change the range?
• change material properties?
• relax the constraint on payload?
• ...
Sensitivity Analysis: Spacecraft
Questions for spacecraft design:
How does my solution change if I
• change the orbital altitude?
• change the transmission frequency?
• change the specific impulse of the propellant?
• change launch vehicle?
• change desired mission lifetime?
• ...
Sensitivity Analysis
Want to answer this question without having to solve the
optimization problem again.
“How does the optimal solution change as we change
the problem parameters?”
effect on design variables
effect on objective function
effect on constraints
Normalization
To compare sensitivities of different design variables in terms of their relative importance, it may be necessary to normalize:

  \partial J/\partial x_i |_{x_o}                       "raw" (unnormalized) sensitivity = partial derivative evaluated at the point x_o

  (x_{i,o} / J(x_o)) \partial J/\partial x_i |_{x_o}    normalized sensitivity

The normalized sensitivity captures relative sensitivity: roughly, the % change in objective per % change in design variable. This is important for comparing the effect of different design variables.
Example: Dairy Farm Problem

With respect to which design variable is the objective most sensitive?

"Dairy Farm" sample problem: a fenced pasture of length L with semicircular ends of radius R, holding N cows.

Design point x_o:
  L - length = 100 [m]
  N - number of cows = 10
  R - radius = 50 [m]

Model:
  A = 2 L R + \pi R^2        (pasture area)
  F = 2 L + 2 \pi R          (fence length)
  M = 100 (A/N)^{1/2}        (milk production per cow [liters])
  C = f F + n N              (cost)
  I = N M m                  (income)
  P = I - C                  (profit, the objective)

Parameters:
  f = 100 $/m
  n = 2000 $/cow
  m = 2 $/liter
Dairy Farm Sensitivity
• Compute the objective at x_o:  J(x_o) = P(x_o) = 13092
• Then compute the raw sensitivities:

  \partial P/\partial L = 36.6
  \partial P/\partial N = 2225.4
  \partial P/\partial R = 588.4

• Normalize by x_o / J(x_o):

  (100/13092)(36.6) = 0.28
  (10/13092)(2225.4) = 1.7
  (50/13092)(588.4) = 2.25

• Show graphically with a tornado chart
(Figure: tornado chart of the normalized sensitivities for design variables L, N, and R, on a scale from 0 to 2.5.)
Realistic Example: Spacecraft

What are the design variables that are "drivers" of system performance?

NASA Nexus spacecraft concept: a spacecraft CAD model (scale ~2 meters) and a finite element model feed a simulation that maps the "x"-domain (design variables) to the "J"-domain (performance).

(Figure: simulated centroid jitter on the focal plane over T = 5 sec; J_2 = centroid jitter [RSS LOS]. The centroid X and Y wander within roughly \pm 60 \mu m, about 14.97 \mu m versus the 1-pixel requirement J_{2,req} = 5 \mu m.)

Image by MIT OpenCourseWare.
Graphical Representation

Graphical representation of the Jacobian evaluated at design x_o, normalized for comparison:

  (x_o / J_{1,o}) \partial J_1/\partial x  and  (x_o / J_{2,o}) \partial J_2/\partial x,

i.e. the matrix of entries from \partial J_1/\partial R_u, \partial J_2/\partial R_u through \partial J_1/\partial K_{cf}, \partial J_2/\partial K_{cf}, evaluated at x_o.

(Figure: two bar charts of normalized sensitivities, analytical vs. finite difference, over the design variables Kcf, Kc, fca, Mgs, QE, Ro, lambda, zeta, I_propt, I_ss, t_sp, K_zpet, m_bus, K_rISO, K_yPM, m_SM, Tgs, Sst, Srg, Tst, Qc, fc, Ud, Us, Ru, grouped into disturbance, structural, control, and optics variables; left: J1, normalized sensitivities of RMMS WFE; right: J2, normalized sensitivities of RSS LOS, each on a scale from -0.5 to 1.5.)

J1: RMMS WFE most sensitive to:
  Ru - upper wheel speed limit [RPM]
  Sst - star tracker noise 1 [asec]
  K_rISO - isolator joint stiffness [Nm/rad]
  K_zpet - deploy petal stiffness [N/m]

J2: RSS LOS most sensitive to:
  Ud - dynamic wheel imbalance [gcm2]
  K_rISO - isolator joint stiffness [Nm/rad]
  zeta - proportional damping ratio [-]
  Mgs - guide star magnitude [mag]
  Kcf - FSM controller gain [-]
Parameters

Parameters p are the fixed assumptions. How sensitive is the optimal solution x* with respect to the fixed parameters?

Example: "Dairy Farm" sample problem (maximize profit).

Optimal solution:
  x* = [ R = 106.1 m, L = 0 m, N = 17 cows ]^T

Fixed parameters:
  f = 100 $/m - cost of fence
  n = 2000 $/cow - cost of a single cow
  m = 2 $/liter - market price of milk

How does x* change as the parameters change?
Sensitivity Analysis

KKT conditions (with \hat{M} the set of active constraints):

  \nabla J(x^*) + \sum_{j \in \hat{M}} \lambda_j \nabla g_j(x^*) = 0
  g_j(x^*) = 0,       j \in \hat{M}
  \lambda_j \ge 0,    j \in \hat{M}

For a small change in a parameter p, we require that the KKT conditions remain valid:

  d(KKT conditions)/dp = 0

Rewrite the first equation componentwise:

  \partial J/\partial x_i + \sum_{j \in \hat{M}} \lambda_j \partial g_j/\partial x_i = 0,   i = 1, \ldots, n
Sensitivity Analysis

Recall the chain rule. If Y = Y(x(p), p), then

  dY/dp = \partial Y/\partial p + \sum_{k=1}^{n} (\partial Y/\partial x_k)(\partial x_k/\partial p)

Applying this to the first equation of the KKT conditions, d/dp [ \partial J/\partial x_i + \sum_{j \in \hat{M}} \lambda_j \partial g_j/\partial x_i ] = 0, gives for each i = 1, \ldots, n:

  \sum_{k=1}^{n} A_{ik} (\partial x_k/\partial p) + \sum_{j \in \hat{M}} B_{ij} (\partial \lambda_j/\partial p) + c_i = 0

where

  A_{ik} = \partial^2 J/\partial x_i \partial x_k + \sum_{j \in \hat{M}} \lambda_j \partial^2 g_j/\partial x_i \partial x_k
  B_{ij} = \partial g_j/\partial x_i
  c_i = \partial^2 J/\partial x_i \partial p + \sum_{j \in \hat{M}} \lambda_j \partial^2 g_j/\partial x_i \partial p
Sensitivity Analysis

Perform the same procedure on the equation g_j(x^*, p) = 0:

  \partial g_j/\partial p + \sum_{k=1}^{n} (\partial g_j/\partial x_k)(\partial x_k/\partial p) = 0

i.e.

  \sum_{k=1}^{n} B_{kj} (\partial x_k/\partial p) + d_j = 0,   d_j = \partial g_j/\partial p
Sensitivity Analysis

In matrix form we can write:

  [ A    B ] [ \partial x/\partial p       ]   [ c ]   [ 0 ]
  [ B^T  0 ] [ \partial \lambda/\partial p ] + [ d ] = [ 0 ]

where A is n x n and B is n x M, with

  A_{ik} = \partial^2 J/\partial x_i \partial x_k + \sum_{j \in \hat{M}} \lambda_j \partial^2 g_j/\partial x_i \partial x_k
  B_{ij} = \partial g_j/\partial x_i
  c_i = \partial^2 J/\partial x_i \partial p + \sum_{j \in \hat{M}} \lambda_j \partial^2 g_j/\partial x_i \partial p
  d_j = \partial g_j/\partial p

and the unknowns are

  \partial x/\partial p = [ \partial x_1/\partial p, \ldots, \partial x_n/\partial p ]^T
  \partial \lambda/\partial p = [ \partial \lambda_1/\partial p, \ldots, \partial \lambda_M/\partial p ]^T
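This system can be exercised on a toy problem with one active constraint. The problem and numbers are my own choices; the analytic optimum x* = ((1+p)/2, (1-p)/2), \lambda* = p - 1 makes the answer checkable by hand:

```python
import numpy as np

# Toy problem: minimize J = (x1 - p)^2 + x2^2  s.t.  g = x1 + x2 - 1 = 0
# Analytic optimum: x* = ((1+p)/2, (1-p)/2), lambda* = p - 1, so
# dx*/dp = (1/2, -1/2) and dlambda/dp = 1.
p = 0.3

# Blocks of the sensitivity system [A B; B^T 0][dx/dp; dlam/dp] = -[c; d]
A = np.array([[2.0, 0.0], [0.0, 2.0]])   # d2J/dx2 + lam * d2g/dx2 (g is linear)
B = np.array([[1.0], [1.0]])             # dg/dx
c = np.array([-2.0, 0.0])                # d2J/(dx dp); the d2g/(dx dp) term is 0
d = np.array([0.0])                      # dg/dp

KKT = np.block([[A, B], [B.T, np.zeros((1, 1))]])
rhs = -np.concatenate([c, d])
sol = np.linalg.solve(KKT, rhs)
dxdp, dlamdp = sol[:2], sol[2]
print(dxdp, dlamdp)   # matches the analytic dx*/dp = (0.5, -0.5), dlam/dp = 1
```

As the slide that follows notes, changing which parameter p is studied only changes c and d, so the factored KKT matrix can be reused.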
Sensitivity Analysis

We solve the system to find \partial x/\partial p and \partial \lambda/\partial p; then the sensitivity of the objective function with respect to p can be found:

  dJ/dp = \partial J/\partial p + (\partial J/\partial x)^T (\partial x/\partial p)

so that, to first order,

  \delta J \approx (dJ/dp) \delta p   (first-order approximation)

To assess the effect of changing a different parameter, we only need to calculate a new right-hand side in the matrix system.
Sensitivity Analysis - Constraints
• We also need to assess when an active constraint will become inactive and vice versa
• An active constraint will become inactive when its Lagrange multiplier goes to zero:

  \lambda_j(p + \delta p) \approx \lambda_j(p) + (\partial \lambda_j/\partial p) \delta p

Find the \delta p that makes \lambda_j zero:

  \lambda_j + (\partial \lambda_j/\partial p) \delta p = 0  \Rightarrow  \delta p = -\lambda_j / (\partial \lambda_j/\partial p),   j \in \hat{M}

This is the amount by which we can change p before the j-th constraint becomes inactive (to a first-order approximation).
Sensitivity Analysis - Constraints

An inactive constraint will become active when g_j(x) goes to zero:

  g_j(x) \approx g_j(x^*) + \nabla g_j(x^*)^T (\partial x/\partial p) \delta p = 0

Find the \delta p that makes g_j zero:

  \delta p = -g_j(x^*) / [ \nabla g_j(x^*)^T (\partial x/\partial p) ]   for all j not active at x^*

• This is the amount by which we can change p before the j-th constraint becomes active (to a first-order approximation)
• If we want to change p by a larger amount, then the problem must be solved again including the new constraint
• Only valid close to the optimum
Lagrange Multiplier Interpretation
• Consider the problem:
  minimize J(x) s.t. h(x) = 0
  with optimal solution x*
• What happens if we change constraint k by a small amount b?

  h_k(x^*) = b,   h_j(x^*) = 0,  j \ne k

• Differentiating with respect to b:

  \nabla h_k(x^*)^T (dx^*/db) = 1,   \nabla h_j(x^*)^T (dx^*/db) = 0,  j \ne k
Lagrange Multiplier Interpretation
• How does the objective function change?

  dJ/db = \nabla J(x^*)^T (dx^*/db)

• Using the KKT conditions, \nabla J(x^*) = -\sum_j \lambda_j \nabla h_j(x^*), so

  dJ/db = -\sum_j \lambda_j \nabla h_j(x^*)^T (dx^*/db) = -\lambda_k

• The Lagrange multiplier is the negative of the sensitivity of the cost function to the constraint value. It is also called the shadow price.
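A toy numerical check of the shadow-price interpretation (the problem is my own choosing):

```python
# Shadow-price check: minimize J = x^2  s.t.  h(x) = x - b = 0
# Solution: x* = b, so J*(b) = b^2 and dJ*/db = 2b.
# KKT stationarity: dJ/dx + lam * dh/dx = 0  ->  2b + lam = 0  ->  lam = -2b
b = 0.8
lam = -2.0 * b

def J_star(b):
    return b ** 2        # optimal objective as a function of the constraint value

# Central difference of J* with respect to b
db = 1e-6
dJdb_fd = (J_star(b + db) - J_star(b - db)) / (2 * db)
print(dJdb_fd, -lam)     # both 1.6: the multiplier is minus the sensitivity
```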
Lecture Summary
• Gradient calculation approaches
– Analytical and Symbolic
– Finite difference
– Automatic Differentiation
– Adjoint methods
• Sensitivity analysis
– Yields important information about the design space,
both as the optimization is proceeding and once the
“optimal” solution has been reached.
Reading
Papalambros – Section 8.2 Computing Derivatives
MIT OpenCourseWare
http://ocw.mit.edu
ESD.77 / 16.888 Multidisciplinary System Design Optimization
Spring 2010
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.