© Massachusetts Institute of Technology - Prof. de Weck and Prof. Willcox
Engineering Systems Division and Dept. of Aeronautics and Astronautics
Multidisciplinary System Design Optimization (MSDO)
Lecture 9: Gradient Calculation and Sensitivity Analysis
Olivier de Weck
Karen Willcox
Today's Topics
• Gradient calculation methods
  – Analytical and symbolic
  – Finite difference
  – Complex step
  – Adjoint method
  – Automatic differentiation
• Post-processing sensitivity analysis
  – effect of changing design variables
  – effect of changing parameters
  – effect of changing constraints
Definition of the Gradient
The gradient of J with respect to the design vector x is

  \nabla J = [ \partial J/\partial x_1, \partial J/\partial x_2, \ldots, \partial J/\partial x_n ]^T

"How does the function value J change locally as we change elements of the design vector x?"

Compute the partial derivatives of J with respect to each x_i. The gradient vector points normal to the tangent hyperplane of J(x).

(Figure: gradient vector \nabla J drawn normal to a surface in (x_1, x_2, x_3) space.)
Geometry of Gradient Vector (2D)

Example function:

  J(x_1, x_2) = x_1 + x_2 + 1/(x_1 x_2)

  \nabla J = [ 1 - 1/(x_1^2 x_2),  1 - 1/(x_1 x_2^2) ]^T

The gradient is normal to the contours of J.

(Figure: contour plot of J over 0 \le x_1, x_2 \le 2, with contour levels 3.1, 3.25, 3.5, 4, 5.)
Geometry of Gradient Vector (3D)

Example:

  J = x_1^2 + x_2^2 + x_3^2

  \nabla J = [ 2 x_1, 2 x_2, 2 x_3 ]^T

At x_o = [1, 1, 1]^T: J = 3 and \nabla J = [2, 2, 2]^T.

Tangent plane at x_o:  2 x_1 + 2 x_2 + 2 x_3 - 6 = 0

The gradient vector points toward larger values of J.

(Figure: sphere J = 3 in (x_1, x_2, x_3) space, with the tangent plane at x_o and the gradient arrow pointing toward increasing values of J.)
Other Gradient-Related Quantities
• Jacobian: Matrix of derivatives of multiple functions
w.r.t. vector of variables
• Hessian: Matrix of second-order derivatives
For z objective functions J = [J_1, J_2, \ldots, J_z]^T of n variables, the Jacobian is the z x n matrix

  \nabla J = [ \partial J_i/\partial x_j ],  i = 1, \ldots, z,  j = 1, \ldots, n

i.e. row i holds \partial J_i/\partial x_1, \ldots, \partial J_i/\partial x_n.

The Hessian is the n x n matrix of second derivatives of a single objective:

  H = \nabla^2 J = [ \partial^2 J/\partial x_i \partial x_j ],  i, j = 1, \ldots, n
Why Calculate Gradients
• Required by gradient-based optimization
algorithms
– Normally need gradient of objective function and
each constraint w.r.t. design variables at each
iteration
– Newton methods require Hessians as well
• Isoperformance/goal programming
• Robust design
• Post-processing sensitivity analysis
– determine if result is optimal
– sensitivity to parameters, constraint values
Analytical Sensitivities
If the objective function is known in closed form, we can often
compute the gradient vector(s) in closed form (analytically):
Example:

  J(x_1, x_2) = x_1 + x_2 + 1/(x_1 x_2)

Analytical gradient:

  \nabla J = [ 1 - 1/(x_1^2 x_2),  1 - 1/(x_1 x_2^2) ]^T

At x_1 = x_2 = 1: J(1, 1) = 3 and \nabla J(1, 1) = [0, 0]^T, so this point is a minimum.

For complex systems analytical gradients are rarely available.
Symbolic Differentiation
• Use symbolic mathematics programs
• e.g. MATLAB®, Maple®, Mathematica®

  » syms x1 x2              % construct symbolic objects
  » J = x1 + x2 + 1/(x1*x2);
  » dJdx1 = diff(J, x1)     % differentiation operator
  dJdx1 = 1 - 1/x1^2/x2
  » dJdx2 = diff(J, x2)
  dJdx2 = 1 - 1/x1/x2^2
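The same derivatives can be reproduced with an open-source tool. A minimal sketch using Python's SymPy (my choice of tool; the slide uses MATLAB's Symbolic Math Toolbox):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
J = x1 + x2 + 1 / (x1 * x2)

dJdx1 = sp.diff(J, x1)   # 1 - 1/(x1**2 * x2)
dJdx2 = sp.diff(J, x2)   # 1 - 1/(x1 * x2**2)

# The gradient vanishes at x1 = x2 = 1, the minimum noted on the earlier slide
grad_at_min = [dJdx1.subs({x1: 1, x2: 1}), dJdx2.subs({x1: 1, x2: 1})]
print(dJdx1, dJdx2, grad_at_min)
```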
Finite Differences (I)
(Figure: function f(x) of a single variable, evaluated at x_o - \Delta x, x_o, and x_o + \Delta x.)

• First-order finite difference approximation of the gradient (forward difference):

  f'(x_o) = [ f(x_o + \Delta x) - f(x_o) ] / \Delta x + O(\Delta x)

  Truncation error: O(\Delta x)

• Second-order finite difference approximation of the gradient (central difference):

  f'(x_o) = [ f(x_o + \Delta x) - f(x_o - \Delta x) ] / (2 \Delta x) + O(\Delta x^2)

  Truncation error: O(\Delta x^2)
Finite Differences (II)
The approximations are derived from the Taylor series expansion:

  f(x_o + \Delta x) = f(x_o) + \Delta x f'(x_o) + (\Delta x^2 / 2) f''(x_o) + O(\Delta x^3)

Neglect second and higher order terms and solve for the gradient:

  f'(x_o) = [ f(x_o + \Delta x) - f(x_o) ] / \Delta x + O(\Delta x)    (Forward Difference)

Truncation error:

  O(\Delta x) = (\Delta x / 2) f''(\bar{x}),   \bar{x} \in [x_o, x_o + \Delta x]
Finite Differences (III)
Take the Taylor expansion forward and backward from x_o:

  f(x_o + \Delta x) = f(x_o) + \Delta x f'(x_o) + (\Delta x^2 / 2) f''(x_o) + O(\Delta x^3)   (1)
  f(x_o - \Delta x) = f(x_o) - \Delta x f'(x_o) + (\Delta x^2 / 2) f''(x_o) + O(\Delta x^3)   (2)

Subtract (2) from (1) and solve again for the derivative:

  f'(x_o) = [ f(x_o + \Delta x) - f(x_o - \Delta x) ] / (2 \Delta x) + O(\Delta x^2)   (Central Difference)

Truncation error:

  O(\Delta x^2) = (\Delta x^2 / 6) f'''(\bar{x}),   \bar{x} \in [x_o - \Delta x, x_o + \Delta x]
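The two truncation-error orders can be checked numerically. A minimal sketch (the test function exp and the step sizes are my own choices, not from the slides):

```python
import math

def forward_diff(f, x, dx):
    # First-order accurate: error ~ (dx/2) * f''(x)
    return (f(x + dx) - f(x)) / dx

def central_diff(f, x, dx):
    # Second-order accurate: error ~ (dx**2 / 6) * f'''(x)
    return (f(x + dx) - f(x - dx)) / (2 * dx)

f, x0 = math.exp, 1.0
exact = math.exp(1.0)          # d/dx exp(x) = exp(x)

for dx in (1e-1, 1e-2, 1e-3):
    e_fwd = abs(forward_diff(f, x0, dx) - exact)
    e_cen = abs(central_diff(f, x0, dx) - exact)
    print(f"dx={dx:.0e}  forward err={e_fwd:.2e}  central err={e_cen:.2e}")
```

Shrinking dx by 10 cuts the forward error by about 10 and the central error by about 100, as the two truncation-error terms predict.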
Finite Differences (IV)
For a design vector, perturb one component at a time. For the first component:

  \partial J/\partial x_1 \approx \Delta J/\Delta x_1 = [ J(x_{o,1} + \Delta x_1) - J(x_{o,1}) ] / \Delta x_1

(Figure: J(x) versus x_1, comparing the true analytical sensitivity, the tangent at x_{o,1}, with the finite difference approximation, the secant through x_{o,1} and x_{o,1} + \Delta x_1.)
Finite Differences (V)
(Figure: f(x) evaluated at x_o - \Delta x, x_o, and x_o + \Delta x.)

• Second-order finite difference approximation of the second derivative:

  f''(x_o) \approx [ f(x_o + \Delta x) - 2 f(x_o) + f(x_o - \Delta x) ] / \Delta x^2
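A quick numerical check of this formula (the test function sin is my own choice):

```python
import math

def second_diff(f, x, dx):
    # Central three-point approximation to f''(x), second-order accurate
    return (f(x + dx) - 2 * f(x) + f(x - dx)) / dx**2

# For f = sin, f''(x) = -sin(x)
approx = second_diff(math.sin, 0.7, 1e-4)
print(approx, -math.sin(0.7))
```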
Errors of Finite Differencing
Caution:
- Finite differencing always has errors
- The errors are very dependent on the perturbation size

Example:  J(x_1, x_2) = x_1 + x_2 + 1/(x_1 x_2),  x_1 = x_2 = 1,  J(1, 1) = 3,  \nabla J(1, 1) = [0, 0]^T

(Figure: gradient error versus perturbation step size \Delta x_1 on a log-log scale, for step sizes roughly 10^{-9} to 10^{-5}. Truncation errors grow as \Delta x; rounding errors grow as 1/\Delta x, so the total error has a minimum at an intermediate step size.)

The choice of \Delta x is critical.
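The truncation/rounding trade-off can be reproduced with the slide's example J at (1, 1), where the exact gradient component is zero. A sketch (the step-size sweep is my own choice):

```python
def J(x1, x2):
    return x1 + x2 + 1.0 / (x1 * x2)

# Exact gradient component dJ/dx1 at the minimum (1, 1) is 0
x1, x2 = 1.0, 1.0
errors = {}
for k in range(1, 16):
    dx = 10.0 ** (-k)
    fd = (J(x1 + dx, x2) - J(x1, x2)) / dx   # forward difference
    errors[dx] = abs(fd - 0.0)

best_dx = min(errors, key=errors.get)
print(best_dx)
# A moderate step wins: too large -> truncation error dominates,
# too small -> rounding (cancellation) error dominates.
```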
Perturbation Size \Delta x Choice
• Error analysis (Gill et al. 1981):

  \Delta x \sim \varepsilon_A^{1/2}   (forward difference)
  \Delta x \sim \varepsilon_A^{1/3}   (central difference)

  where \varepsilon_A is the absolute accuracy of the computed function values (the scatter of the computed values about the theoretical function near x_o, of order \varepsilon_A).

• Machine precision: with q digits of machine precision for real numbers, the step size at the k-th iterate can be chosen as

  \Delta x_k = 10^{-q/2} x_k

• Trial and error: typical value ~ 0.1-1%
Computational Expense of FD
  F(J_i)      cost of a single evaluation of objective function J_i
  n F(J_i)    cost of a one-sided finite difference gradient approximation for J_i, for a design vector of length n
  z n F(J_i)  cost of a finite difference Jacobian approximation with z objective functions

Example: 6 objectives, 30 design variables, 1 sec per function evaluation → about 3 min of CPU time for a single Jacobian estimate. Expensive!
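The bookkeeping can be sketched with a hypothetical `fd_jacobian` helper that counts analysis calls. Note one nuance: when a single analysis call returns all z objectives at once, the cost is n + 1 calls; the slide's z·n·F(J) estimate applies when each objective requires its own evaluation.

```python
def fd_jacobian(f, x, dx=1e-7):
    # One-sided finite-difference Jacobian of f: R^n -> R^z,
    # using one baseline call plus one perturbed call per variable.
    n = len(x)
    f0 = f(x)                        # baseline evaluation
    z = len(f0)
    Jac = [[0.0] * n for _ in range(z)]
    for j in range(n):
        xp = list(x)
        xp[j] += dx
        fp = f(xp)                   # one extra evaluation per variable
        for i in range(z):
            Jac[i][j] = (fp[i] - f0[i]) / dx
    return Jac

calls = 0
def model(x):
    global calls
    calls += 1
    return [x[0] * x[1], x[0] + x[1] ** 2]   # z = 2 toy objectives

Jac = fd_jacobian(model, [1.0, 2.0])
print(calls, Jac)   # n + 1 = 3 analysis calls for the full 2 x 2 Jacobian
```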
Complex Step Derivative
• Similar to finite differences, but uses an imaginary step:

  f'(x_o) \approx Im[ f(x_o + i \Delta x) ] / \Delta x

• Second-order accurate
• Can use very small step sizes, e.g. \Delta x \approx 10^{-20}
  – Doesn't have rounding error, since it doesn't perform subtraction
• Limited application areas
  – Code must be able to handle complex values
J.R.R.A. Martins, I.M. Kroo and J.J. Alonso, An automated method for sensitivity analysis using complex variables, AIAA Paper 2000-0689, Jan 2000
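A minimal sketch of the complex-step formula (the test function is my own choice):

```python
import math
import cmath

def complex_step(f, x, h=1e-20):
    # f'(x) ~ Im[f(x + i*h)] / h : no subtraction of nearly equal
    # numbers, so h can be tiny without rounding error
    return f(complex(x, h)).imag / h

# Test on f(x) = exp(x) * sin(x), with f'(x) = exp(x) * (sin(x) + cos(x))
f = lambda x: cmath.exp(x) * cmath.sin(x)
x0 = 0.5
exact = math.exp(x0) * (math.sin(x0) + math.cos(x0))
print(complex_step(f, x0), exact)   # agree to machine precision
```

The function must be written so it accepts complex arguments, which is exactly the "limited application areas" caveat above.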
Automatic Differentiation
• Mathematical formulae are built from a finite set of basic
functions, e.g. additions, sin x, exp x, etc.
• Using chain rule, differentiate analysis code: add
statements that generate derivatives of the basic
functions
• Tracks numerical values of derivatives, does not track
symbolically as discussed before
• Outputs modified program = original + derivative
capability
• e.g., ADIFOR (FORTRAN), TAPENADE (C, FORTRAN),
TOMLAB (MATLAB), many more…
• Resources at http://www.autodiff.org/
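The core idea, propagating derivative values alongside function values by the chain rule, can be sketched with a tiny forward-mode dual-number class. This is a toy illustration of the principle, not how ADIFOR or TAPENADE are implemented (they transform source code):

```python
class Dual:
    """Minimal forward-mode AD: carries a value and its derivative together."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val,
                    self.dot * o.val + self.val * o.dot)   # product rule
    __rmul__ = __mul__
    def __truediv__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val / o.val,
                    (self.dot * o.val - self.val * o.dot) / o.val**2)  # quotient rule
    def __rtruediv__(self, o):
        return Dual(o) / self

def J(x1, x2):
    return x1 + x2 + 1 / (x1 * x2)

# Seed x1 with derivative 1 to get dJ/dx1; analytic value is 1 - 1/(x1^2 x2)
d1 = J(Dual(2.0, 1.0), Dual(1.0, 0.0))
print(d1.val, d1.dot)   # J(2,1) = 3.5 and dJ/dx1 = 1 - 1/4 = 0.75
```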
Adjoint Methods
Consider the following problem:

  minimize  J(x, u)
  s.t.      R(x, u) = 0

where x are the design variables and u are the state variables. The constraints represent the state equation, e.g. in wing design: x are shape variables, u are flow variables, and R(x, u) = 0 represents the Navier-Stokes equations.

We need to compute the gradients of J with respect to x:

  dJ/dx = \partial J/\partial x + (\partial J/\partial u)(du/dx)

Typically the dimension of u is very high (thousands to millions).
Adjoint Methods
• To compute du/dx, differentiate the state equation:

  dR/dx = \partial R/\partial x + (\partial R/\partial u)(du/dx) = 0

  \Rightarrow  (\partial R/\partial u)(du/dx) = -\partial R/\partial x

  \Rightarrow  du/dx = -(\partial R/\partial u)^{-1} (\partial R/\partial x)
Adjoint Methods
• We have

  dJ/dx = \partial J/\partial x - (\partial J/\partial u)(\partial R/\partial u)^{-1}(\partial R/\partial x)

• Now define the adjoint vector \lambda by

  \lambda^T = (\partial J/\partial u)(\partial R/\partial u)^{-1}

• Then to determine the gradient:

  First solve the adjoint equation:   (\partial R/\partial u)^T \lambda = (\partial J/\partial u)^T

  Then compute:   dJ/dx = \partial J/\partial x - \lambda^T (\partial R/\partial x)
Adjoint Methods
• Solving the adjoint equation

  (\partial R/\partial u)^T \lambda = (\partial J/\partial u)^T

  costs about the same as solving the forward problem (one function evaluation)
• Adjoints are widely used in aerodynamic shape optimization, optimal flow control, geophysics applications, etc.
• Some automatic differentiation tools have a "reverse mode" for computing adjoints
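The three-step recipe (solve the state equation, solve the adjoint equation, assemble the gradient) can be sketched on a toy 2x2 linear state equation. The model, matrix, and numbers are my own, chosen only so the adjoint result can be checked against finite differences:

```python
import numpy as np

def solve_state(x):
    # Toy state equation R(x,u) = K(x) u - f = 0, a stand-in "analysis"
    K = np.array([[2.0 + x, 1.0], [1.0, 2.0]])
    f = np.array([1.0, 1.0])
    return K, np.linalg.solve(K, f)

def J_of(u):
    return float(u @ u)            # objective depends on the state only

x0 = 0.5
K, u = solve_state(x0)

# Adjoint equation: (dR/du)^T lam = (dJ/du)^T, with dR/du = K, dJ/du = 2u
lam = np.linalg.solve(K.T, 2.0 * u)

# dR/dx = (dK/dx) u, and here dK/dx = [[1,0],[0,0]], so dR/dx = [u[0], 0]
dRdx = np.array([u[0], 0.0])
dJdx_adjoint = -lam @ dRdx         # dJ/dx = (explicit term, 0 here) - lam^T dR/dx

# Finite-difference check: re-solve the state at a perturbed x
h = 1e-6
_, u_p = solve_state(x0 + h)
dJdx_fd = (J_of(u_p) - J_of(u)) / h
print(dJdx_adjoint, dJdx_fd)
```

One linear solve with K.T replaces the per-variable re-solves a finite-difference gradient would need, which is the whole point when x is long and u is huge.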
Post-Processing Sensitivity Analysis
• A sensitivity analysis is an important component of post-processing
• Key to understanding which design variables, constraints, and parameters are important drivers for the optimum solution
• How sensitive is the “optimal” solution J* to changes or perturbations of the design variables x*?
• How sensitive is the “optimal” solution x* to changes in the constraints g(x), h(x) and fixed parameters p ?
Sensitivity Analysis: Aircraft
Questions for aircraft design:
How does my solution change if I
• change the cruise altitude?
• change the cruise speed?
• change the range?
• change material properties?
• relax the constraint on payload?
• ...
Sensitivity Analysis: Spacecraft
Questions for spacecraft design:
How does my solution change if I
• change the orbital altitude?
• change the transmission frequency?
• change the specific impulse of the propellant?
• change launch vehicle?
• change desired mission lifetime?
• ...
Sensitivity Analysis
Want to answer this question without having to solve the
optimization problem again.
“How does the optimal solution change as we change
the problem parameters?”
effect on design variables
effect on objective function
effect on constraints
Normalization
To compare sensitivities of different design variables in terms of their relative importance, it may be necessary to normalize:

  \partial J/\partial x_i |_{x_o}                       "raw" (unnormalized) sensitivity = partial derivative evaluated at the point x_o

  (x_{i,o} / J(x_o)) \partial J/\partial x_i |_{x_o}    normalized sensitivity

The normalized sensitivity captures relative sensitivity: roughly, the % change in objective per % change in design variable. This is important for comparing the effect of different design variables.
Example: Dairy Farm Problem

With respect to which design variable is the objective most sensitive?

"Dairy Farm" sample problem: a fenced pasture of length L with semicircular ends of radius R, holding N cows.

Design point x_o:
  L - length = 100 [m]
  N - number of cows = 10
  R - radius = 50 [m]

Model:
  A = 2 L R + \pi R^2        (pasture area)
  F = 2 L + 2 \pi R          (fence length)
  M = 100 (A/N)^{1/2}        (milk production per cow [liters])
  C = f F + n N              (cost)
  I = N M m                  (income)
  P = I - C                  (profit, the objective)

Parameters:
  f = 100 $/m
  n = 2000 $/cow
  m = 2 $/liter
Dairy Farm Sensitivity
• Compute the objective at x_o:  J(x_o) = P(x_o) = 13092
• Then compute the raw sensitivities:

  \partial P/\partial L = 36.6
  \partial P/\partial N = 2225.4
  \partial P/\partial R = 588.4

• Normalize by x_o / J(x_o):

  (100/13092)(36.6) = 0.28
  (10/13092)(2225.4) = 1.7
  (50/13092)(588.4) = 2.25

• Show graphically with a tornado chart
(Figure: tornado chart of the normalized sensitivities for design variables L, N, and R, on a scale from 0 to 2.5.)
Realistic Example: Spacecraft

What are the design variables that are "drivers" of system performance?

NASA Nexus spacecraft concept: a spacecraft CAD model (scale ~2 meters) and a finite element model feed a simulation that maps the "x"-domain (design variables) to the "J"-domain (performance).

(Figure: simulated centroid jitter on the focal plane over T = 5 sec; J_2 = centroid jitter [RSS LOS]. The centroid X and Y wander within roughly \pm 60 \mu m, about 14.97 \mu m versus the 1-pixel requirement J_{2,req} = 5 \mu m.)

Image by MIT OpenCourseWare.
Graphical Representation

Graphical representation of the Jacobian evaluated at design x_o, normalized for comparison:

  (x_o / J_{1,o}) \partial J_1/\partial x  and  (x_o / J_{2,o}) \partial J_2/\partial x,

i.e. the matrix of entries from \partial J_1/\partial R_u, \partial J_2/\partial R_u through \partial J_1/\partial K_{cf}, \partial J_2/\partial K_{cf}, evaluated at x_o.

(Figure: two bar charts of normalized sensitivities, analytical vs. finite difference, over the design variables Kcf, Kc, fca, Mgs, QE, Ro, lambda, zeta, I_propt, I_ss, t_sp, K_zpet, m_bus, K_rISO, K_yPM, m_SM, Tgs, Sst, Srg, Tst, Qc, fc, Ud, Us, Ru, grouped into disturbance, structural, control, and optics variables; left: J1, normalized sensitivities of RMMS WFE; right: J2, normalized sensitivities of RSS LOS, each on a scale from -0.5 to 1.5.)

J1: RMMS WFE most sensitive to:
  Ru - upper wheel speed limit [RPM]
  Sst - star tracker noise 1 [asec]
  K_rISO - isolator joint stiffness [Nm/rad]
  K_zpet - deploy petal stiffness [N/m]

J2: RSS LOS most sensitive to:
  Ud - dynamic wheel imbalance [gcm2]
  K_rISO - isolator joint stiffness [Nm/rad]
  zeta - proportional damping ratio [-]
  Mgs - guide star magnitude [mag]
  Kcf - FSM controller gain [-]
Parameters

Parameters p are the fixed assumptions. How sensitive is the optimal solution x* with respect to the fixed parameters?

Example: "Dairy Farm" sample problem (maximize profit).

Optimal solution:
  x* = [ R = 106.1 m, L = 0 m, N = 17 cows ]^T

Fixed parameters:
  f = 100 $/m - cost of fence
  n = 2000 $/cow - cost of a single cow
  m = 2 $/liter - market price of milk

How does x* change as the parameters change?
Sensitivity Analysis

KKT conditions (with \hat{M} the set of active constraints):

  \nabla J(x^*) + \sum_{j \in \hat{M}} \lambda_j \nabla g_j(x^*) = 0
  g_j(x^*) = 0,       j \in \hat{M}
  \lambda_j \ge 0,    j \in \hat{M}

For a small change in a parameter p, we require that the KKT conditions remain valid:

  d(KKT conditions)/dp = 0

Rewrite the first equation componentwise:

  \partial J/\partial x_i + \sum_{j \in \hat{M}} \lambda_j \partial g_j/\partial x_i = 0,   i = 1, \ldots, n
Sensitivity Analysis

Recall the chain rule. If Y = Y(x(p), p), then

  dY/dp = \partial Y/\partial p + \sum_{k=1}^{n} (\partial Y/\partial x_k)(\partial x_k/\partial p)

Applying this to the first equation of the KKT conditions, d/dp [ \partial J/\partial x_i + \sum_{j \in \hat{M}} \lambda_j \partial g_j/\partial x_i ] = 0, gives for each i = 1, \ldots, n:

  \sum_{k=1}^{n} A_{ik} (\partial x_k/\partial p) + \sum_{j \in \hat{M}} B_{ij} (\partial \lambda_j/\partial p) + c_i = 0

where

  A_{ik} = \partial^2 J/\partial x_i \partial x_k + \sum_{j \in \hat{M}} \lambda_j \partial^2 g_j/\partial x_i \partial x_k
  B_{ij} = \partial g_j/\partial x_i
  c_i = \partial^2 J/\partial x_i \partial p + \sum_{j \in \hat{M}} \lambda_j \partial^2 g_j/\partial x_i \partial p
Sensitivity Analysis

Perform the same procedure on the equation g_j(x^*, p) = 0:

  \partial g_j/\partial p + \sum_{k=1}^{n} (\partial g_j/\partial x_k)(\partial x_k/\partial p) = 0

i.e.

  \sum_{k=1}^{n} B_{kj} (\partial x_k/\partial p) + d_j = 0,   d_j = \partial g_j/\partial p
Sensitivity Analysis

In matrix form we can write:

  [ A    B ] [ \partial x/\partial p       ]   [ c ]   [ 0 ]
  [ B^T  0 ] [ \partial \lambda/\partial p ] + [ d ] = [ 0 ]

where A is n x n and B is n x M, with

  A_{ik} = \partial^2 J/\partial x_i \partial x_k + \sum_{j \in \hat{M}} \lambda_j \partial^2 g_j/\partial x_i \partial x_k
  B_{ij} = \partial g_j/\partial x_i
  c_i = \partial^2 J/\partial x_i \partial p + \sum_{j \in \hat{M}} \lambda_j \partial^2 g_j/\partial x_i \partial p
  d_j = \partial g_j/\partial p

and the unknowns are

  \partial x/\partial p = [ \partial x_1/\partial p, \ldots, \partial x_n/\partial p ]^T
  \partial \lambda/\partial p = [ \partial \lambda_1/\partial p, \ldots, \partial \lambda_M/\partial p ]^T
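This system can be exercised on a toy problem with one active constraint. The problem and numbers are my own choices; the analytic optimum x* = ((1+p)/2, (1-p)/2), \lambda* = p - 1 makes the answer checkable by hand:

```python
import numpy as np

# Toy problem: minimize J = (x1 - p)^2 + x2^2  s.t.  g = x1 + x2 - 1 = 0
# Analytic optimum: x* = ((1+p)/2, (1-p)/2), lambda* = p - 1, so
# dx*/dp = (1/2, -1/2) and dlambda/dp = 1.
p = 0.3

# Blocks of the sensitivity system [A B; B^T 0][dx/dp; dlam/dp] = -[c; d]
A = np.array([[2.0, 0.0], [0.0, 2.0]])   # d2J/dx2 + lam * d2g/dx2 (g is linear)
B = np.array([[1.0], [1.0]])             # dg/dx
c = np.array([-2.0, 0.0])                # d2J/(dx dp); the d2g/(dx dp) term is 0
d = np.array([0.0])                      # dg/dp

KKT = np.block([[A, B], [B.T, np.zeros((1, 1))]])
rhs = -np.concatenate([c, d])
sol = np.linalg.solve(KKT, rhs)
dxdp, dlamdp = sol[:2], sol[2]
print(dxdp, dlamdp)   # matches the analytic dx*/dp = (0.5, -0.5), dlam/dp = 1
```

As the slide that follows notes, changing which parameter p is studied only changes c and d, so the factored KKT matrix can be reused.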
Sensitivity Analysis

We solve the system to find \partial x/\partial p and \partial \lambda/\partial p; then the sensitivity of the objective function with respect to p can be found:

  dJ/dp = \partial J/\partial p + (\partial J/\partial x)^T (\partial x/\partial p)

so that, to first order,

  \delta J \approx (dJ/dp) \delta p   (first-order approximation)

To assess the effect of changing a different parameter, we only need to calculate a new right-hand side in the matrix system.
Sensitivity Analysis - Constraints
• We also need to assess when an active constraint will become inactive and vice versa
• An active constraint will become inactive when its Lagrange multiplier goes to zero:

  \lambda_j(p + \delta p) \approx \lambda_j(p) + (\partial \lambda_j/\partial p) \delta p

Find the \delta p that makes \lambda_j zero:

  \lambda_j + (\partial \lambda_j/\partial p) \delta p = 0  \Rightarrow  \delta p = -\lambda_j / (\partial \lambda_j/\partial p),   j \in \hat{M}

This is the amount by which we can change p before the j-th constraint becomes inactive (to a first-order approximation).
Sensitivity Analysis - Constraints

An inactive constraint will become active when g_j(x) goes to zero:

  g_j(x) \approx g_j(x^*) + \nabla g_j(x^*)^T (\partial x/\partial p) \delta p = 0

Find the \delta p that makes g_j zero:

  \delta p = -g_j(x^*) / [ \nabla g_j(x^*)^T (\partial x/\partial p) ]   for all j not active at x^*

• This is the amount by which we can change p before the j-th constraint becomes active (to a first-order approximation)
• If we want to change p by a larger amount, then the problem must be solved again including the new constraint
• Only valid close to the optimum
Lagrange Multiplier Interpretation
• Consider the problem:
  minimize J(x) s.t. h(x) = 0
  with optimal solution x*
• What happens if we change constraint k by a small amount b?

  h_k(x^*) = b,   h_j(x^*) = 0,  j \ne k

• Differentiating with respect to b:

  \nabla h_k(x^*)^T (dx^*/db) = 1,   \nabla h_j(x^*)^T (dx^*/db) = 0,  j \ne k
Lagrange Multiplier Interpretation
• How does the objective function change?

  dJ/db = \nabla J(x^*)^T (dx^*/db)

• Using the KKT conditions, \nabla J(x^*) = -\sum_j \lambda_j \nabla h_j(x^*), so

  dJ/db = -\sum_j \lambda_j \nabla h_j(x^*)^T (dx^*/db) = -\lambda_k

• The Lagrange multiplier is the negative of the sensitivity of the cost function to the constraint value. It is also called the shadow price.
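A toy numerical check of the shadow-price interpretation (the problem is my own choosing):

```python
# Shadow-price check: minimize J = x^2  s.t.  h(x) = x - b = 0
# Solution: x* = b, so J*(b) = b^2 and dJ*/db = 2b.
# KKT stationarity: dJ/dx + lam * dh/dx = 0  ->  2b + lam = 0  ->  lam = -2b
b = 0.8
lam = -2.0 * b

def J_star(b):
    return b ** 2        # optimal objective as a function of the constraint value

# Central difference of J* with respect to b
db = 1e-6
dJdb_fd = (J_star(b + db) - J_star(b - db)) / (2 * db)
print(dJdb_fd, -lam)     # both 1.6: the multiplier is minus the sensitivity
```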
Lecture Summary
• Gradient calculation approaches
– Analytical and Symbolic
– Finite difference
– Automatic Differentiation
– Adjoint methods
• Sensitivity analysis
– Yields important information about the design space,
both as the optimization is proceeding and once the
“optimal” solution has been reached.
Reading
Papalambros – Section 8.2 Computing Derivatives
MIT OpenCourseWare
http://ocw.mit.edu
ESD.77 / 16.888 Multidisciplinary System Design Optimization
Spring 2010
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.