A Matrix-Free Algorithm for Multidisciplinary Design Optimization
by
Andrew Borean Lambe
A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy
Graduate Department of Aerospace Science and Engineering
University of Toronto
Copyright © 2015 by Andrew Borean Lambe
Abstract
A Matrix-Free Algorithm for Multidisciplinary Design Optimization
Andrew Borean Lambe
Doctor of Philosophy
Graduate Department of Aerospace Science and Engineering
University of Toronto
2015
Multidisciplinary design optimization (MDO) is an approach to engineering design that exploits the coupling between components or knowledge disciplines in a complex system to improve the final product. In aircraft design, MDO methods can be used to simultaneously design the outer shape of the aircraft and the internal structure, taking into account the complex interaction between the aerodynamic forces and the structural flexibility. Efficient strategies are needed to solve such design optimization problems and guarantee convergence to an optimal design.

This work begins with a comprehensive review of MDO problem formulations and solution algorithms. First, a fundamental MDO problem formulation is defined from which other formulations may be obtained through simple transformations. Using these fundamental problem formulations, decomposition methods from the literature are reviewed and classified. All MDO methods are presented in a unified mathematical notation to facilitate greater understanding. In addition, a novel set of diagrams, called extended design structure matrices, is used to simultaneously visualize both data communication and process flow between the many software components of each method.

For aerostructural design optimization, modern decomposition-based MDO methods cannot efficiently handle the tight coupling between the aerodynamic and structural states. This fact motivates the exploration of methods that can reduce the computational cost. A particular structure in the direct and adjoint methods for gradient computation motivates the idea of a matrix-free optimization method. A simple matrix-free optimizer is developed based on the augmented Lagrangian algorithm. This new matrix-free optimizer is tested on two structural optimization problems and one aerostructural optimization problem. The results indicate that the matrix-free optimizer is able to efficiently solve structural and multidisciplinary design problems with thousands of variables and constraints. On the aerostructural test problem formulated with thousands of constraints, the matrix-free optimizer is estimated to reduce the total computational time by up to 90% compared to conventional optimizers.
Acknowledgements
First and foremost, I would like to thank my supervisor, Dr. Joaquim Martins. I first
met Dr. Martins as an undergraduate looking for summer research project ideas. His
interests in aircraft design and optimization left a strong impression on me and led me,
ultimately, to his research lab and this thesis. Even after his departure for Michigan, he
continued to provide extraordinary support and encouragement to me and the other
students from MDO Lab Toronto. Dr. Martins also deserves credit for introducing me
to this branch of mathematics called “optimization,” which has evolved into a passion
of mine over the last couple of years.
I would like to thank the other members of my UTIAS committee, Dr. David Zingg
and Dr. Tim Barfoot. Their questions and insights in the annual committee meetings
greatly improved both the cohesion of this thesis and my presentation skills.
I would like to thank my many colleagues in both MDO Lab Toronto and MDO Lab
Michigan for providing a welcoming and stimulating research environment. In particular,
I would like to thank Dr. Graeme Kennedy for his insights into structural analysis and
his early and eager support of the matrix-free method contained in this thesis, and Dr.
Gaetan Kenway for his help on the software engineering side. Without their efforts, the
main structural and aerostructural results in this thesis would not be possible.
I would also like to thank Sylvain Arreckx and Dr. Dominique Orban for their
assistance and mathematical insights into the optimization algorithm developed in this
thesis and for hosting me at École Polytechnique de Montréal on several occasions.
Finally, I would like to thank my parents and my brother Geoff for their constant
encouragement and for reminding me that “if it were easy, someone would have done it
already.”
Contents
1 Introduction
2 MDO Problem Formulations
  2.1 Notation and Terminology
  2.2 All-at-Once (AAO) Formulation
  2.3 Simultaneous Analysis and Design (SAND)
  2.4 Individual Discipline Feasible (IDF)
  2.5 Multidisciplinary Feasible (MDF)
  2.6 Computing Gradients for IDF and MDF
  2.7 Conclusion
3 Visualizing MDO Architectures
  3.1 Diagram Motivation
  3.2 The Extended Design Structure Matrix (XDSM)
  3.3 Monolithic MDO Architectures
  3.4 Conclusion
4 Current MDO Architectures
  4.1 Motivation for Decomposition
  4.2 Distributed Architecture Classification
  4.3 Distributed MDF Architectures
  4.4 Distributed IDF Architectures
  4.5 Architecture Benchmarking
  4.6 Conclusion
5 Monolithic MDO Problem Structures
  5.1 Sparsity in MDO Problem Formulations
  5.2 Exploiting Sparsity
  5.3 Exploiting Analytic Gradient Structures
  5.4 Constraint Aggregation
  5.5 Conclusion
6 Development of a Matrix-Free Optimizer
  6.1 Algorithm Selection
  6.2 The Augmented Lagrangian Method
  6.3 Estimating Second Derivatives
  6.4 The Split Quasi-Newton Strategy
  6.5 The Approximate Jacobian Strategy
  6.6 Implementation Notes
  6.7 Conclusion
7 Structural and Aerostructural Wing Design
  7.1 Analysis and Optimization Software
  7.2 Benchmark Problem: Plate Design
  7.3 Structural Optimization of an Aircraft Wing
  7.4 Aerostructural Optimization of an Aircraft Wing
  7.5 Conclusion
8 Conclusions and Recommendations
Bibliography
List of Tables
7.1 Average run times for specific computations in the plate optimization problem
7.2 Average run times for specific computations in the wing structure optimization problem
7.3 Aircraft specifications from [3, 101]
7.4 Average run times for specific computations in the aerostructural optimization problem
List of Figures
1.1 Historical trends in processor clock speed, core count, and number of transistors [71]. Recent gains in computational rates come from greater parallel processing, not higher clock speed.
2.1 Groups of variables and constraints that are eliminated from AAO to obtain SAND, IDF, and MDF.
3.1 Example design structure matrix for an automobile engine from Browning [33]
3.2 Generic, three-discipline, fully-coupled, multidisciplinary system. Each discipline analysis i shares its state yi with other disciplines and requires the states of other disciplines in its own analysis.
3.3 Gauss–Seidel MDA procedure. Each discipline analysis is evaluated in sequence using the most recent state information from other disciplines and a fixed choice of design variables. The MDA block measures convergence of the discipline states.
3.4 Jacobi MDA procedure with parallel execution of discipline analyses. The system being analyzed is identical to that in Figure 3.3.
3.5 Jacobi MDA procedure with parallel execution of discipline analyses using our convention for parallel diagram structure. The MDA process shown here is identical to that shown in Fig. 3.4.
3.6 Optimization algorithm where the optimizer requires gradients of both the objective and the constraints. The gradients are calculated by a separate component.
3.7 XDSM for the SAND architecture. The locations in which the functions of Problem (2.3) are evaluated are noted in the diagram.
3.8 XDSM for the IDF architecture. The locations in which the functions of Problem (2.4) are evaluated are noted in the diagram.
3.9 XDSM for the MDF architecture. The locations in which the functions of Problem (2.5) are evaluated are noted in the diagram.
4.1 Classification and summary of the MDO architectures.
4.2 Diagram of the ASO architecture.
4.3 Diagram of the CO architecture.
4.4 Diagram of the ATC architecture.
5.1 Jacobian sparsity of the SAND problem formulation with three disciplines. The sparsity structure with respect to the problem disciplines is clearly visible.
5.2 Jacobian sparsity of the IDF problem formulation with three disciplines.
5.3 Jacobian sparsity of the alternative IDF problem formulation (5.1). Note the similarity with Figure 5.1.
5.4 Jacobian sparsity of the MDF problem formulation with three disciplines. This formulation has no obvious sparsity structure to exploit.
5.5 SAND Jacobian sparsity of Figure 5.1 reordered to group disciplinary variables together.
5.6 IDF Jacobian sparsity of Figure 5.3 reordered to group disciplinary variables together.
5.7 Computational times needed to solve problem (5.6).
5.8 Function calls needed to solve problem (5.6).
5.9 KS aggregation of two bound constraints for various values of ρKS. As ρKS increases, the accuracy of the boundary to the feasible region improves, and the optimal objective value decreases.
5.10 For ρKS = 2, the gradient of the KS function changes gradually near the constraint intersection.
5.11 For ρKS = 30, the gradient of the KS function changes abruptly near the constraint intersection.
6.1 Performance profile comparing our matrix-free optimizer to LANCELOT on a collection of test problems. Among the optimizers tested, our matrix-free optimizer (AUGLAG-SBMIN) was able to solve 90% of the problems in the test set and is competitive with LANCELOT when the LSR1 quasi-Newton method is used to estimate second derivatives.
7.1 Geometry and load condition of plate mass minimization problem
7.2 Final thickness distributions for the 400-, 1600-, and 3600-element plate problems. These solutions were all obtained by the approximate Jacobian version of the matrix-free optimizer.
7.3 Stress distributions as a fraction of the local yield stress for the 400-, 1600-, and 3600-element plate problems. These solutions were all obtained by the approximate Jacobian version of the matrix-free optimizer. SNOPT solutions are similar.
7.4 Number of forward and adjoint linear solves required to solve the plate design optimization problem and the corresponding run time for each problem size. By construction, the number of variables is equal to the number of constraints in all instances of the problem. Across a range of problem sizes, both versions of AUGLAG are more efficient than SNOPT in terms of number of linear solves, but are not competitive in terms of run time, even with parallel processing taken into account.
7.5 Comparison of wall time fraction spent in optimizer to solve the plate problem. As a fraction of total computational time, the split-quasi-Newton AUGLAG software requires little computation for large problems compared to SNOPT. This result suggests that the implementation language of the optimizer does not play a significant role in the run time result shown in Figure 7.4.
7.6 Outer geometry and layout of the baseline wing structure.
7.7 Illustration of patches on the wing to which individual design variables and failure constraints are assigned.
7.8 Overestimate of the optimal mass of the test wing for various aggregation schemes and ρKS values using SNOPT. The mass is normalized with respect to the case of 2832 constraints and ρKS = 100. A 10% spread in optimum mass is observed at ρKS = 50 while a 4% spread is observed at ρKS = 100.
7.9 Top skin thicknesses for ρKS = 50 and ρKS = 100 when different aggregation schemes are employed. Using more failure constraints in the optimization problem allows certain parts of the wing to be designed with a thinner skin.
7.10 Stress distributions on the top surface of the wing for the 2.5g load case using optimal solutions for ρKS = 50 and ρKS = 100 and two different aggregation schemes. The ‘lambda’ value indicates the ratio of stress to yield stress of the material. The wing design obtained using ρKS = 100 and a large number of failure constraints is more fully stressed, indicating a more efficient structure.
7.11 Number of linear solve operations and run time to optimize test wing using SNOPT for several aggregation schemes and ρKS values. Constraint aggregation clearly reduces the computational effort to solve the design problem, at a cost of a higher optimum mass shown in Figure 7.8. However, the relationship between run time and amount of aggregation is not simple.
7.12 Trade-offs between mass overestimate and computational effort for optimal wing structure design. The lowest-cost estimates of optimum mass come from aggregating failure constraints as much as possible and using large values of ρKS.
7.13 Overestimate of the optimal mass of the test wing comparing two aggregation schemes in SNOPT with AUGLAG. Despite using relaxed convergence tolerances, both versions of AUGLAG find optimum masses within 1-3% of the best estimate from SNOPT.
7.14 Comparison of the number of linear solves and run time to optimize the test wing for SNOPT and AUGLAG. AUGLAG can solve the design problem with 2832 constraints using more than 80% fewer linear solves than SNOPT for a range of ρKS values. However, only the split-quasi-Newton version of AUGLAG reduces the run time to solve the problem compared to SNOPT.
7.15 Trade-offs between mass overestimate and computational effort for the optimal wing structure design using AUGLAG. For a small increase in the number of linear solves, AUGLAG with the minimum amount of constraint aggregation can provide a lower optimum mass estimate than SNOPT with constraint aggregation.
7.16 Comparison of the run time to optimize the test wing for SNOPT and AUGLAG Split QN using a selection of random starting points. On average, the run time is independent of ρKS in this range of values and random fluctuations are common.
7.17 Overestimate of TOGW of the test aircraft comparing SNOPT with constraint aggregation to AUGLAG Split QN without constraint aggregation. The TOGW is normalized with respect to the case of 4251 constraints and ρKS = 100. For ρKS values higher than 50, the increase in TOGW caused by constraint aggregation is less than 1%.
7.18 Comparison of the wing deflection under 2.5g load for ρKS = 50 for the cases of 18 constraints (gray) and 4251 constraints (blue). Because of the small difference in computed TOGW, the difference in tip deflection is within the thickness of the tip airfoil.
7.19 Overestimate of the wing structure mass for the minimum-TOGW aircraft. The difference in structural mass caused by aggregation of the failure constraints is similar to that found in Figure 7.8.
7.20 Comparison of the number of linear solves and run time to optimize the test wing for the split-quasi-Newton version of AUGLAG with the estimated equivalent cost in SNOPT. Even with parallel processing, solving the problem with 4251 constraints is expected to take SNOPT more than two weeks. AUGLAG is, therefore, much more cost effective than SNOPT if a high priority is placed on avoiding constraint aggregation.
List of Symbols and Abbreviations
Abbreviations
AAO      All-at-Once
ASO      Asymmetric Subspace Optimization
ATC      Analytical Target Cascading
AUGLAG   A matrix-free augmented Lagrangian optimizer
BFGS     Broyden–Fletcher–Goldfarb–Shanno (quasi-Newton method)
BLISS    Bi-Level Integrated System Synthesis
CO       Collaborative Optimization
CRM      Common Research Model
CSSO     Concurrent Subspace Optimization
DFP      Davidon–Fletcher–Powell (quasi-Newton method)
DSM      Design Structure Matrix
ECO      Enhanced Collaborative Optimization
EPD      Exact Penalty Decomposition
GMRES    Generalized Minimum Residual
IDF      Individual Discipline Feasible
IPD      Inexact Penalty Decomposition
KKT      Karush–Kuhn–Tucker (optimality conditions)
KS       Kreisselmeier–Steinhauser (constraint aggregation)
LBFGS    Limited-memory BFGS
LSR1     Limited-memory SR1
MDA      Multidisciplinary Analysis
MDF      Multidisciplinary Feasible
MDO      Multidisciplinary Design Optimization
MDOIS    Multidisciplinary Design Optimization based on Independent Subspaces
NIP      Nonlinear Interior Point
QSD      Quasi-Separable Decomposition
SAND     Simultaneous Analysis and Design
SBMIN    A matrix-free optimizer for bound-constrained nonlinear problems
SNOPT    Sparse Nonlinear Optimizer (software)
SQP      Sequential Quadratic Programming
SR1      Symmetric Rank-One (quasi-Newton method)
TACS     Toolkit for the Analysis of Composite Structures
TOGW     Takeoff Gross Weight
TriPan   A three-dimensional panel code
XDSM     Extended Design Structure Matrix
Symbols, Chapters 2-5
c        Vector of design constraint values
C        Vector of design constraint functions
CC       Vector of consistency constraint functions
CJ       Inconsistency objectives or constraints in CO architecture
d/dx     Total derivative
∂/∂x     Partial derivative
f        Objective function value
F        Design objective function (scalar)
φ        Penalty function value for ATC architecture constraints
Φ        Penalty function for ATC architecture constraints
I        Identity matrix
KS       Kreisselmeier–Steinhauser (KS) aggregation function
m        Number of constraints
M        Number of state variables
n        Number of design variables
N        Number of disciplines
P        p-norm aggregation function
r        Vector of governing equation residual values
R        Governing equations of a discipline analysis
ρKS      KS parameter
w        Penalty weight vector
x        Design variable vector
x̂0i      Discipline i copy of system design variables (CO and ATC architectures)
y        State variable vector
ŷ        Coupling state variable vector copy
Y        Implicit function that computes state variables y
Common Superscripts and Subscripts, Chapters 2-5
Superscript (0)   Initial data
Subscript 0       Vector or function is shared over the entire system
xiii
-
Subscript i       Discipline index (MDO problems) or constraint index (gradient computation)
Subscript j       Discipline index (MDO problems) or design variable index (gradient computation)
Subscript k       State variable index
Superscript ∗     Optimal value
Symbols, Chapter 6
A        Approximate Jacobian matrix
B        Approximate Hessian matrix
C        Constraint function vector
∆        Trust region radius
F        Objective function
Φ        Augmented Lagrangian function
g        Augmented Lagrangian gradient
H        Exact Hessian matrix
I        Identity matrix
I        Infeasibility function
J        Exact Jacobian matrix
L        Lagrangian function
λ        Lagrange multiplier vector
p        Search direction vector
P        Projection operator
Q        Quadratic model of augmented Lagrangian
ρ        Penalty parameter
s        Search direction for quasi-Newton methods
σ        Adjoint search direction for quasi-Newton methods
t        Slack variable vector
x        Decision variable vector
xL       Decision variable lower bounds
xU       Decision variable upper bounds
y        Change in gradient for quasi-Newton methods
∇        Gradient operator
∇2       Hessian operator
η        Feasibility tolerance
ω        Optimality tolerance
Ω        Bound-constrained region
Chapter 1
Introduction
Multidisciplinary design optimization (MDO) is a field of engineering that is concerned with the application of numerical optimization methods to the design of engineering systems. Here, “engineering systems” means any device with multiple interacting components, where multiple knowledge bases, or disciplines, are needed to create design solutions. In aeronautics, example disciplines include aerodynamics, structural mechanics, combustion, heat transfer, vehicle dynamics, and control analysis. Knowledge of each area is typically provided through computational models. Combining these models in a suitable environment provides information about how decisions made by one discipline affect other disciplines. By mating these models with mathematical optimization techniques, we are able to tailor our designs to account for interdisciplinary interactions while maximizing or minimizing a specific objective and satisfying design constraints. While this thesis focuses on the application of MDO to aircraft design [13, 15, 88, 112, 128], we emphasize that MDO methods can be applied to a wide variety of engineering systems, and the methods developed herein are generally applicable. Example applications from the literature include automobiles [108, 109], spacecraft [31, 36, 90], ships [91, 138, 145], bridges [19], buildings [42, 70], wind turbines [17, 66, 67, 98], railway cars [64], engines [133, 172], robots [86, 173], batteries [189, 190], and even microscopes [147].
MDO exhibits three principal advantages over traditional approaches to engineering design. First, MDO makes extensive use of computational tools for analyzing and refining design solutions. Using computational tools in place of physical experiments and prototype hardware greatly reduces the time and cost of the design process because the impact of small changes to the design can be evaluated more quickly. Second, by combining the computational tools, MDO allows designers to consider interdisciplinary interactions concurrently with the disciplines themselves. Designers thus have access to more information about how local changes to a design can influence the behaviour of the whole system, and integration issues can be addressed earlier in the design process. Finally, by employing numerical optimization tools, MDO allows designers to rapidly explore the design space. Baseline designs can be automatically refined to exhibit better performance, and solutions may be found that were not intuitive to the human designers. This last advantage is particularly useful when the design being considered is a novel concept in which design experience is lacking and empirical rules-of-thumb have yet to be developed.
In aeronautics, MDO can trace its origins back to the early work on structural optimization by Schmit [158, 183]. Initially, only simple truss structures were considered, but the idea of applying optimization methods to structural design quickly spread. One of the first truly multidisciplinary design problems solved by optimization was that of an aircraft wing subject to constraints on the amount of lift generated and the allowable stress on the structure [159]. Variations on this design problem are still solved to this day, albeit with much more sophisticated wing models and optimization software. The algorithms developed in this thesis are applied to one such version of this problem.
An aircraft wing provides an ideal example of the importance of considering multiple disciplines of a system early in the design process. In conducting a purely aerodynamic analysis of a wing, we implicitly assume that the shape of the wing is known and that the wing is completely rigid. In other words, the computed aerodynamic load does not change the wing shape. Likewise, in conducting a purely structural analysis of the same wing, we assume that the load on the structure does not vary as the structure is deflected. (This assumption is present even if we only consider the case of a linear structure.) In practice, of course, neither of these assumptions holds true. A real wing bends under aerodynamic loading, and the change in shape itself changes the loading on the structure. The more flexible the wing, the more strongly the aerodynamic and structural analyses are coupled. It is only by considering the aerodynamic and structural disciplines simultaneously, and transferring information between them, that we are able to accurately model and understand the behaviour of the wing system.
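The back-and-forth transfer of information described above can be sketched as a block Gauss–Seidel multidisciplinary analysis (MDA), the procedure later depicted in Figure 3.3: each discipline is re-evaluated with the most recent state of the other until the coupled states stop changing. The two scalar "disciplines" below are invented placeholders for illustration, not the aerodynamic and structural solvers used in this thesis.

```python
def discipline_aero(y_struct):
    # "Aerodynamics": the load depends on the structural deflection.
    # Illustrative coupling function, not a real aerodynamic model.
    return 1.0 + 0.3 * y_struct

def discipline_struct(y_aero):
    # "Structures": the deflection depends on the aerodynamic load.
    return 0.5 * y_aero

def gauss_seidel_mda(tol=1e-10, max_iter=100):
    """Block Gauss-Seidel fixed-point iteration on the coupled states."""
    y_aero, y_struct = 0.0, 0.0
    for _ in range(max_iter):
        y_aero_new = discipline_aero(y_struct)        # uses latest structural state
        y_struct_new = discipline_struct(y_aero_new)  # uses the state just computed
        if abs(y_aero_new - y_aero) < tol and abs(y_struct_new - y_struct) < tol:
            return y_aero_new, y_struct_new
        y_aero, y_struct = y_aero_new, y_struct_new
    raise RuntimeError("MDA did not converge")
```

Because the coupling chosen here is a contraction, the loop converges quickly; for the strongly coupled, flexible wings discussed above, plain Gauss–Seidel may converge slowly or diverge, motivating relaxation or Newton-type coupling strategies.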
The use of MDO in aircraft design has taken on increasing importance in recent years. In addition to reducing operating costs for the airlines, the aviation community is increasingly concerned about the environmental impacts of aviation. Human industrial activity, especially the burning of fossil fuels, is the widely accepted cause of a rising level of atmospheric carbon dioxide and subsequent warming of the planet [2]. While aviation activities contribute only 2% of the carbon emissions worldwide [1], these emissions necessarily occur at high altitude, and the altitude of the emissions has been shown to increase the effect on the warming process [1]. Furthermore, projected increases in the demand for air travel mean that the number of aircraft in service is expected to double in the next 20 years [4, 5]. These new aircraft must be even more efficient than the previous generation to keep aviation’s relative contribution to global emissions constant.
Keeping the relative emissions constant or decreasing them can only be achieved through developing new technologies to reduce drag, weight, and fuel consumption. Equally important is the development of new design procedures for appropriately integrating these technologies into new aircraft for maximum efficiency [96]. Additional research suggests that more efficient aircraft can be developed by moving away from the traditional “tube-and-wing” configuration to more unconventional designs. For example, Liebeck [122] proposed a concept called the blended wing body, in which the wing and fuselage merge seamlessly to create a smoother aerodynamic shape [113, 124, 150]. Gallman et al. [68] studied a joined-wing aircraft, in which a second wing connects the tips of the main wing to the top of a vertical tail. Gur et al. [81] studied strut-braced and truss-braced wings, in which additional external structure is added near the wing root to develop wings with higher aspect ratios. While all these concepts offer great promise in improving efficiency, none of them has been built as a full-sized aircraft. When these concepts are developed into full-sized aircraft, they will have to be competitive with the tube-and-wing configuration from the start, without all the accumulated years of empirical design knowledge. It is in this environment of radical changes to design objectives, aircraft configuration, and technology that MDO has become a promising approach to the design of new aircraft.
A major consideration in MDO, and other computational methods in engineering, is the computer hardware used to solve the design problems. Since the 1960s, the number of transistors that can be etched onto a given chip has doubled every 24 months according to Moore’s Law [136]. For many years, this meant that the clock speed of processors doubled at the same rate. However, to keep heat dissipation manageable on silicon chips, the speed of sequential processing has stagnated in recent years [171]. Rather than coming from higher clock speeds, the continued increase in the rate of computation is expected to come from the increased use of specialized computer architectures, such as multicore processors, that make greater use of parallel processing [71, 171]. Figure 1.1 displays the historical trends in CPU technology from 1970 to the present day. Special attention must be paid to developing those MDO methods that can exploit parallel computing facilities to a high degree.
Outline and Contributions
This thesis is primarily concerned with the “how” of MDO: the formulation of the optimization problem, the computation of design sensitivities, i.e., gradient information, and the solution algorithm itself. In particular, we are interested in how the structure of the optimization problem influences the selection of the optimization software and the larger MDO architecture. We use the term “MDO architecture” to mean the combination of the optimization problem formulation and the algorithm used to solve it. We focus our discussion on MDO problems that contain no discrete or integer variables and whose objective and constraint functions are smooth and have little noise.

Figure 1.1: Historical trends in processor clock speed, core count, and number of transistors [71]. Recent gains in computational rates come from greater parallel processing, not higher clock speed.
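The class of problems just described can be summarized as a generic smooth nonlinear program; the precise multidisciplinary formulations (AAO, SAND, IDF, and MDF) are the subject of Chapter 2. As a sketch, using the design variables x, design objective F, and design constraints C from the List of Symbols (the inequality sign convention here is illustrative):

```latex
\min_{x} \; F(x) \quad \text{subject to} \quad C(x) \geq 0,
```

where F and C are assumed to be smooth, nearly noise-free functions of continuous variables only.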
The overarching goal of this thesis is to advance the state-of-the-art in MDO methods,
not just for aircraft design applications, but the design of complex engineering systems
in general. We start by surveying the range of approaches available to solving MDO
problems and develop a framework in which to describe them. This framework helps
to tie together key concepts in the literature and places individual architectures in the
proper context. We also introduce some new notation and diagrams to help newcomers
to the MDO field become acquainted with the various techniques. Both the survey itself
and a description of the diagrams are available in the literature [115, 131]. These papers
complement previous surveys of the MDO field by Sobieszczanski-Sobieski and Haftka
[165], and Agte et al. [8].
After developing this framework, we focus on a particular area of application: the aerostructural design of aircraft wings. Using our framework, we show the typical architectures used to solve this problem and the current challenges of using these architectures.
By observing a particular matrix structure in the gradient computation that is common to
many MDO problem formulations, we motivate and develop a numerical optimizer that is
“matrix-free” in that it does not require the computation of complete derivative matrices.
While matrix-free optimization algorithms have been considered by the optimization community for many years [49, 73, 87], robust implementations of these algorithms that solve
general optimization problems are essentially nonexistent. We conjecture that the lack of
a suitable application has hindered matrix-free algorithm development to this point. To our knowledge, this work is the first to consider such an optimizer for MDO applications. We
apply our matrix-free optimizer to several test problems related to aerostructural design
and show how this matrix-free approach to MDO offers great promise for reducing the
computational effort required to solve large-scale MDO problems. These results are in
the process of being published in the peer-reviewed literature [16, 116].
This thesis is structured as follows. Chapter 2 outlines the three main problem formulations in MDO, including a common formulation from which they are all derived.
Chapter 2 also discusses issues surrounding the computation of gradient information in
MDO problems for the main problem formulations. Chapter 3 outlines a novel way of
visualizing MDO architectures, known as the extended design structure matrix (XDSM).
Chapter 4 surveys existing MDO architectures using the tools provided in the earlier
chapters, including a novel classification of MDO architectures that points out key similarities between the architectures and areas for future research. Chapter 5 returns to the
three main problem formulations to identify specific ways of exploiting problem structure.
Chapter 6 motivates the use of matrix-free optimization methods for our design problem
of interest and outlines the main features of our matrix-free algorithm. Chapter 7 shows
the results of applying the matrix-free optimizer to structural and aerostructural design
problems, including trends in the computational effort needed to solve the problems. Finally, Chapter 8 summarizes our findings, the key contributions of this work, and future research directions.
Chapter 2
MDO Problem Formulations
Many MDO problem formulations exhibit common structural patterns. In this chapter,
we show how these patterns can be unified into a single problem statement and how to
make specific transformations to adapt this problem statement to suit the information
available to the optimizer. In particular, the form in which each discipline analysis is conducted, and whether or not the governing equations can be made available to the optimizer, dictates the form of the MDO problem to solve. We also discuss how to compute
derivatives for gradient-based optimization for cases in which the governing equations of
each discipline are solved separately from the optimizer. This chapter builds on the
seminal work of Cramer et al. [48] and synthesizes fundamental ideas in MDO problem
formulation and gradient computation [131]. Further details on derivative computation
methods are given by Martins and Hwang [130].
2.1 Notation and Terminology
We now define some basic terminology in MDO so that we can extend a basic nonlinear optimization problem to apply to a typical engineering design problem. A design
variable is a variable that is always under the control of the optimizer regardless of the
problem formulation. These variables correspond to decisions or specifications made by
the designers of the system. Examples of these variables include component dimensions
and geometry specifications. In a multidisciplinary context, these variables may be local,
i.e., pertain to a specific discipline, or may be shared between multiple disciplines. A variable like the sweep angle of a wing affects the aerodynamic, structural, and stability disciplines, so it is a good example of a shared design variable. We use the letter x to denote design variables: x0 denotes the vector of shared design variables and xi denotes the vector of variables local to discipline i. We denote the complete set of all design variables in the problem by x.
A discipline analysis is a simulation or computation that models the behaviour of one
aspect of a multidisciplinary system. Discipline analyses can range in complexity from
empirical rules-of-thumb or curve-fit data, to physics-based models that directly solve
a set of governing differential or integral equations. These analyses can be classified as
low-fidelity or high-fidelity, depending on how accurately and robustly they model the
real-world behaviour of a system. The output of a discipline analysis is a set of state
variables, also known as response variables. Examples include fluid density and velocity
at specific points in the flow field of an aerodynamic analysis; deformation, strain, and
stress in a structural analysis; and particle positions, velocities, and vibration frequencies
in a dynamics analysis. While state variables are often computed through the discipline
analysis process itself, some MDO problem formulations treat the state variables as
independent variables and the governing equations of the analysis as a set of constraints
in the optimization problem. (Sections 2.2 and 2.3 discuss those formulations.) We
denote the set of equations governing discipline analysis i by Ri and corresponding set
of state variables by yi. We denote the set of state variables computed by all disciplines
by y.
Like design variables, design objectives and constraints may also be treated as local
or shared. Local objectives and constraints may only depend on information available to
that discipline. For discipline i, the objective and constraint functions may only depend
on x0, xi, and yi. Shared objectives and constraints may only depend on shared design
variables x0 and the state information produced by all disciplines, y. Respectively, we
denote shared and local design objectives by F0 and Fi and the sets of shared and local
design constraints by C0 and Ci. We denote the complete set of both local and shared
constraints by C. For clarity, all functions are denoted by capital letters and all variables
are denoted by lower-case letters.
The key feature of MDO problems, as compared with single-discipline optimization
problems, is that the disciplines exchange state variables. The state variables of discipline
i become inputs to another discipline j and vice versa. In general, only a subset of all
state variables needs to be exchanged and we refer to this subset as the coupling variables.
To simplify the notation, we denote the coupling variables by the same symbol as the
state variables, yi. If individual discipline analyses are solved separately from each other,
the output of discipline i will not be the same as the coupling variable information input
to discipline j. We must, therefore, specify a copy of the discipline i coupling variables to
be used as input to the other disciplines. We denote this copy by ŷi. These variables are
sometimes referred to as target variables in the literature. Both the coupling variables
and their copies must converge to the same value at an optimal design, so we must specify
an additional set of consistency constraints in the problem formulation to enforce this
condition. These consistency constraints are denoted by C_i^c and take the form ŷi − yi = 0 for each discipline.
2.2 All-at-Once (AAO) Formulation
Having defined the terminology, we can now show the MDO problem in its most general form. We refer to this problem as the all-at-once (AAO) problem because all the mathematical relations needed to define the problem are present. We define the AAO problem
with N disciplines as
$$
\begin{aligned}
\text{minimize} \quad & F_0(x_0, y) + \sum_{i=1}^{N} F_i(x_0, x_i, y_i) \\
\text{with respect to} \quad & x, \hat{y}, y \\
\text{subject to} \quad & C_0(x_0, y) \ge 0 \\
& C_i(x_0, x_i, y_i) \ge 0 \quad \text{for } i = 1, \dots, N \\
& C_i^c(\hat{y}_i, y_i) = \hat{y}_i - y_i = 0 \quad \text{for } i = 1, \dots, N \\
& R_i(x_0, x_i, \hat{y}_{j \ne i}, y_i) = 0 \quad \text{for } i = 1, \dots, N.
\end{aligned}
\tag{2.1}
$$
To give a concrete example of an MDO problem, we refer back to the aircraft wing example discussed in the introduction. A simple wing design problem contains two disciplines: 1) aerodynamics, and 2) structures. The governing equations of the aerodynamic and structural analyses are contained in R1 and R2. The aerodynamic state y1 defines the properties of the airflow over the wing. The structural state y2 defines the deflection of the structure under the aerodynamic loads. The shared design objective F0 can be an aircraft performance objective, such as range, endurance, or take-off gross weight (TOGW) for a design mission, that requires both the aerodynamic and structural states to evaluate. The problem objective could also be a linear combination of an aerodynamic objective F1, such as minimum drag, and a structural objective F2, such as minimum mass. The shared design variables of the problem, x0, define the wing geometry, such as sweep, twist, taper, and span. Aerodynamic design variables x1 include the angle of attack at each flight condition analyzed. Structural design variables x2 include the thicknesses of the ribs, spars, and skin. An example aerodynamic design constraint C1 is that the wing must generate a prescribed lift at a prescribed flight condition. Structural design constraints C2 include limits on the stress and deflection in different parts of the wing structure. An example shared design constraint C0 is that the total aircraft weight, including the structure, and the lift computed by the aerodynamics analysis must match at a prescribed flight condition.
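To make the bookkeeping in Problem (2.1) concrete, the following sketch encodes a hypothetical two-discipline problem with scalar states. The linear "governing equations", the coupling coefficients (0.5 and -0.2), and the quadratic objective are all invented for illustration; in a real AAO problem the optimizer would vary x0, y1, y2, ŷ1, and ŷ2 simultaneously and drive every equality residual to zero.

```python
# Toy AAO setup: 2 disciplines, scalar shared variable and scalar states.
# All quantities (x0, y1, y2, yhat1, yhat2) are independent optimizer variables.

def R1(x0, y1, yhat2):
    # Hypothetical "aerodynamics" residual: y1 depends on the structural copy.
    return y1 - (x0 + 0.5 * yhat2)

def R2(x0, y2, yhat1):
    # Hypothetical "structures" residual: y2 depends on the aerodynamic copy.
    return y2 - (x0 - 0.2 * yhat1)

def consistency(y, yhat):
    # C^c_i = yhat_i - y_i, driven to zero by the optimizer.
    return yhat - y

def F0(x0, y1, y2):
    # Invented shared objective that uses both states.
    return x0**2 + y1**2 + y2**2

# At a multidisciplinary-consistent point, all equality constraints vanish.
x0 = 1.0
y1 = 1.5 * x0 / 1.1           # closed-form solution of the coupled 2x2 system
y2 = x0 - 0.2 * y1
yhat1, yhat2 = y1, y2          # consistent coupling copies

residuals = [R1(x0, y1, yhat2), R2(x0, y2, yhat1),
             consistency(y1, yhat1), consistency(y2, yhat2)]
```

At any other choice of the copies ŷi, the consistency and analysis residuals would be nonzero, and the optimizer would have to reduce them alongside the objective.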
Note that Problem (2.1) is stated in a canonical form. Design constraints of a different
form, such as bounds on individual variables, general equality constraints, and general
range constraints, can be lumped into the appropriate vector of local constraints using
simple transformations. In addition, the discipline-specific objectives Fi are not always
necessary and may be set to zero. Similar general problem statements have appeared in
the literature before; Cramer et al. [48] designate one MDO problem formulation as AAO,
while a slightly different formulation is simply stated as "the most general form" [48]. While we have simplified the issue of coupling variable exchange discussed by Cramer et al., Problem (2.1) most closely resembles the problem statement of "the most general form."
We emphasize that our nomenclature for Problem (2.1) is not the current standard in
the literature. Some literature attributes names like “all-at-once” or “all-in-one” to what
we refer to as the multidisciplinary feasible (MDF) formulation. This nomenclature comes
from the viewpoint that the discipline analyses and design optimization are handled
separately, which is not always the case. In other literature, Problem (2.1) is referred
to as simultaneous analysis and design (SAND) [20, 82]. While SAND and AAO are
identical for the case of N = 1, we have chosen to separate them to more naturally derive
the other fundamental problem formulations.
Because of the complexity of the minimization problems that arise in MDO, they are solved numerically with specialized mathematical optimization algorithms and software. This software requires the user to provide function values for all objective and constraint functions given a choice of input variables. In the case of Problem (2.1), the user would have to provide the output values of F0, Fi, C0, Ci, C_i^c, and Ri for all i given a choice of x, y, and ŷ. Depending on the optimization software, first and second derivative information may also be required. Algorithms that use derivative information are called gradient-based algorithms, while algorithms that do not are called gradient-free algorithms. Section 2.6 discusses how to obtain derivatives for MDO problems when gradient-based algorithms are applicable.
Figure 2.1: Groups of variables and constraints that are eliminated from AAO to obtain SAND, IDF, and MDF. Eliminating Cc and ŷ from AAO yields SAND; eliminating R and y yields IDF; eliminating both groups yields MDF.
In the broader optimization literature, Problem (2.1) may be categorized as a problem with complicating variables [43] or a quasiseparable problem [83]. Problems of this form may be solved by decomposition methods. If we substitute ŷ for y in F0 within Problem (2.1) and fix the values of the shared variables x0 and ŷ, we would obtain a minimization problem of the form
$$
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{N} F_i(x_i, y_i) \\
\text{with respect to} \quad & x, y \\
\text{subject to} \quad & C_i(x_i, y_i) \ge 0 \quad \text{for } i = 1, \dots, N \\
& R_i(x_i, y_i) = 0 \quad \text{for } i = 1, \dots, N.
\end{aligned}
\tag{2.2}
$$
Decomposing Problem (2.2) into N independent minimization problems is trivial. All the objective and constraint functions depend on disjoint sets of local variables xi and yi, so the N problems can be solved independently, possibly in parallel. Solving Problem (2.1) with the shared variables included while exploiting the problem structure requires specialized decomposition methods. Chapter 4 discusses decomposition strategies in more detail.
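As a minimal illustration of this separability, the sketch below solves N invented quadratic subproblems, each with a closed-form minimizer. The subproblem definitions are assumptions made purely for the example; the point is that the solves share no data and could run concurrently.

```python
# With shared quantities fixed, Problem (2.2) separates: each discipline's
# subproblem touches only its own (x_i, y_i). Invented quadratic subproblems
# with closed-form minimizers stand in for real discipline optimizations.

def solve_subproblem(i):
    # minimize F_i(x_i) = (x_i - i)^2 + i, an illustrative local objective;
    # the minimizer x_i = i and minimum value i are known in closed form.
    x_opt = float(i)
    f_opt = float(i)
    return x_opt, f_opt

N = 4
# The N solves share no data, so this loop could be replaced by parallel workers.
solutions = [solve_subproblem(i) for i in range(1, N + 1)]
total_objective = sum(f for _, f in solutions)
```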
The power of the AAO problem statement as we have described it is that the SAND, individual discipline feasible (IDF), and MDF formulations can be derived from it by simply eliminating certain groups of constraints and corresponding sets of variables. Figure 2.1 sketches the transformations between AAO, SAND, IDF, and MDF. The eliminated constraints are said to be closed within the formulation [10]. Satisfaction of these
closed constraints is not dependent upon the action of the optimizer but on some other
aspect of the problem. For example, the constraints Ri(x0, xi, ŷj, yi) = 0 can be closed
by solving discipline analysis i directly within the optimization process. In the following sections, we detail the process of closing certain groups of constraints, how the problem formulation changes in each case, and the advantages and disadvantages of each formulation.
2.3 Simultaneous Analysis and Design (SAND)
Problem (2.1) is never explicitly solved in practice. Due to the simple structure of the consistency constraints in Problem (2.1), we can eliminate them by retaining only a single set of state variables. The resulting formulation is the Simultaneous Analysis and Design (SAND) problem,
$$
\begin{aligned}
\text{minimize} \quad & F_0(x_0, y) \\
\text{with respect to} \quad & x, y \\
\text{subject to} \quad & C_0(x_0, y) \ge 0 \\
& C_i(x_0, x_i, y_i) \ge 0 \quad \text{for } i = 1, \dots, N \\
& R_i(x_0, x_i, y) = 0 \quad \text{for } i = 1, \dots, N.
\end{aligned}
\tag{2.3}
$$
Because the optimizer maintains control over satisfaction of the discipline analyses directly, it has responsibility for simultaneously analyzing and designing the system. As stated in Section 2.2, the SAND and AAO problems are identical if N = 1. If the discipline analyses are discretized partial differential equations (PDEs), the SAND problem is a general form of PDE-constrained optimization. (The texts by Biegler et al. [23] and Borzì and Schultz [26] provide comprehensive overviews of that field.)
The SAND formulation may be regarded as an "intrusive" problem formulation because it requires access to a great deal of information from the disciplinary computational models
and control of many variables that would normally be handled by individual disciplines.
Rather than having each discipline complete its own analysis outside of the optimizer
given coupling and design variable inputs, the SAND formulation forces the optimizer to
choose the set of state variables that solve each discipline analysis. If specific software
was developed to solve the discipline analyses directly, this software may not be useful
in a SAND formulation without extensive modification. Furthermore, if the discipline
analyses themselves consist of millions of equations and state variables — a common occurrence when performing the highest-fidelity analyses, such as a three-dimensional CFD
simulation — then the optimization problem will be enormous.
If the discipline analysis software can accommodate the SAND formulation and the
computational environment can handle the size of the MDO problem under consideration,
the SAND formulation may result in the fastest times to obtain an optimal design. This is
because the optimizer is not restricted to searching only those regions of the design space in which the design is feasible with respect to certain sets of constraints. In other words,
at each new point chosen by the optimizer, we do not have to choose combinations of x0,
xi, and yi that solve Ri(x0, xi, y) = 0 in order to guarantee convergence to an optimal
design. The optimizer itself ensures that the governing equations are solved at the final
design solution. If the discipline analyses are nonlinear or require iterative methods
in their solution, this feature is particularly useful. Nearly all modern optimization
algorithms, especially the sequential quadratic programming (SQP) and nonlinear interior
point (NIP) algorithms, permit the exploration of infeasible regions of the design space
to converge to a solution more rapidly.
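The SAND structure can be sketched on an invented one-discipline toy problem: the optimizer controls both x and the state y, and the "analysis" appears only as the equality constraint R(x, y) = 0. The functions F and R below are assumptions for illustration, chosen so that the first-order (KKT) conditions can be checked by hand.

```python
# SAND sketch for one discipline: the optimizer controls x AND the state y,
# treating the invented "analysis" residual R(x, y) = y - x^2 = 0 as an
# equality constraint instead of solving it at every design point.

def F(x, y):
    return (x - 1.0)**2 + y      # invented design objective

def R(x, y):
    return y - x**2              # invented governing equation (residual form)

def kkt_residuals(x, y, lam):
    # First-order conditions for min F subject to R = 0 with multiplier lam.
    dL_dx = 2.0 * (x - 1.0) + lam * (-2.0 * x)   # dF/dx + lam * dR/dx
    dL_dy = 1.0 + lam * 1.0                       # dF/dy + lam * dR/dy
    return dL_dx, dL_dy, R(x, y)

# The solution can be worked out by hand: lam = -1, x = 0.5, y = 0.25.
residuals = kkt_residuals(0.5, 0.25, -1.0)
```

Intermediate iterates of a SQP or interior-point method applied to this problem need not satisfy R(x, y) = 0; only the converged point does, which is exactly the flexibility SAND offers.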
2.4 Individual Discipline Feasible (IDF)
Instead of eliminating the consistency constraints from Problem (2.1), we now choose
to eliminate the discipline analysis constraints. This nonlinear constraint elimination
results in the individual discipline feasible (IDF) formulation [48], given by
$$
\begin{aligned}
\text{minimize} \quad & F_0\left(x_0, Y(x, \hat{y})\right) \\
\text{with respect to} \quad & x, \hat{y} \\
\text{subject to} \quad & C_0\left(x_0, Y(x, \hat{y})\right) \ge 0 \\
& C_i\left(x_0, x_i, Y_i(x_0, x_i, \hat{y}_{j \ne i})\right) \ge 0 \quad \text{for } i = 1, \dots, N \\
& C_i^c(x_0, x_i, \hat{y}) = \hat{y}_i - Y_i(x_0, x_i, \hat{y}_{j \ne i}) = 0 \quad \text{for } i = 1, \dots, N.
\end{aligned}
\tag{2.4}
$$
The notation Yi is used to highlight the fact that the corresponding coupling variables yi
are no longer independent variables but are functions of other variables. (The distinction
between the output variable value yi and the functional mapping Yi becomes important
when discussing gradient computation.) Because the governing equations Ri are nonlinear in general, we use the Implicit Function Theorem to argue that the elimination is valid in the vicinity of a local minimum. More precisely, if Ri(x0, xi, yi, ŷj) = 0 and ∂Ri/∂yi is nonsingular, then Yi is implicitly defined such that yi = Yi(x0, xi, ŷj) for j ≠ i.
If N = 1, IDF is sometimes known as nested analysis and design (NAND) [20] to draw
a direct comparison with SAND.
Unlike the SAND formulation, IDF does permit separate discipline analysis software
to compute the state variables independent from the optimizer. IDF is therefore less
intrusive than SAND and is easier to use with existing analysis software. Another positive
feature of IDF is that the problem size may be much smaller than that of SAND. Not
only are large sets of state variables and constraints not present in the optimization
problem, but, of the state variables that remain, only the coupling variables are needed
in the problem formulation to enforce interdisciplinary consistency of the optimal design.
In other words, the optimizer can ignore state variables that are not exchanged between
disciplines or used to evaluate any design objectives or constraints. If the number of
coupling variables is small, the optimization problem is also small.
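A sketch of what the optimizer sees under IDF, using invented closed-form "analyses": each discipline is solved separately given the other discipline's coupling copy, and the consistency constraints report the mismatch. All function forms here are assumptions for illustration.

```python
# IDF sketch: the optimizer varies x0 and the coupling copies (yhat1, yhat2);
# each discipline analysis is solved on its own, given the other's copy.
# The closed-form "analyses" below are invented for illustration.

def Y1(x0, yhat2):
    return x0 + 0.5 * yhat2      # solves the discipline-1 equations exactly

def Y2(x0, yhat1):
    return x0 - 0.2 * yhat1      # solves the discipline-2 equations exactly

def idf_functions(x0, yhat1, yhat2):
    y1 = Y1(x0, yhat2)           # independent solves: no MDA iteration needed
    y2 = Y2(x0, yhat1)
    f0 = x0**2 + y1**2 + y2**2   # invented shared objective
    cc = (yhat1 - y1, yhat2 - y2)  # consistency constraints C^c
    return f0, cc

# At an inconsistent guess the consistency constraints are nonzero ...
_, cc_bad = idf_functions(1.0, 0.0, 0.0)
# ... and at the coupled solution they vanish.
y1_star = 1.5 / 1.1
y2_star = 1.0 - 0.2 * y1_star
_, cc_good = idf_functions(1.0, y1_star, y2_star)
```

Only the coupling copies enter the optimization problem; any internal state of each "analysis" stays hidden from the optimizer, which is the size advantage of IDF over SAND.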
The drawback with using IDF is that, unlike with SAND, the discipline analyses
must be computed precisely to resolve the implicit constraints Ri(x0, xi, yi, ŷj) = 0 at
the optimal design. In principle, we could perform the discipline analyses inexactly
in the early stages of the optimization and gradually converge the analysis in concert
with the optimization. Some methods in the optimization literature do account for inexact function evaluation [37, 93], but we are not aware of any MDO applications of these methods. The naïve approach of solving each discipline analysis precisely is still the recommended approach for IDF.
When gradient-based optimization is employed with IDF, the method used to compute the gradients is an important factor in the total computational work to find an optimal design. In SAND, only partial derivatives with respect to all variables are required because all variables are independent. In IDF, eliminating the governing equation constraints results in some variables becoming functions of others. Therefore, the derivatives of each function with respect to the independent variables must account for the change in the dependent variables as well. These gradients are, in fact, the total derivatives of the function output with respect to the independent variables. Because of the importance of this subject, we will delve into it in more detail in Section 2.6.
2.5 Multidisciplinary Feasible (MDF)
If we eliminate both the discipline analysis and consistency constraints from the AAO problem, we obtain the multidisciplinary feasible (MDF) formulation [48]. The MDF problem
statement is given by
$$
\begin{aligned}
\text{minimize} \quad & F_0\left(x, Y(x)\right) \\
\text{with respect to} \quad & x \\
\text{subject to} \quad & C_0\left(x, Y(x)\right) \ge 0 \\
& C_i\left(x_0, x_i, Y_i(x)\right) \ge 0 \quad \text{for } i = 1, \dots, N.
\end{aligned}
\tag{2.5}
$$
Like IDF, the nonlinear elimination of the constraint sets Ri(x0, xi, yi, ŷj) = 0 is based on the governing equations satisfying the Implicit Function Theorem in some neighbourhood of the optimal solution. In this case, Ri(x0, xi, yi, ŷj) = 0 is used to eliminate yi simultaneously for all disciplines.
One way of interpreting the MDF problem statement is as a single-discipline optimization problem in which the single discipline analysis is replaced by a multidisciplinary analysis (MDA). Because of this simple structure, MDF has great intuitive appeal. If an MDA procedure is already in place, the MDF formulation provides a simple approach to adding optimization to the design process. Another advantage of MDF is that the optimization problem is the smallest of the three fundamental formulations. The optimizer is responsible for changing only design variables and satisfying only design constraints. State variables and coupling variables can be computed with specialized software. Furthermore, if the optimization procedure needs to be terminated early, the consistency constraints are already satisfied so that, even if the design constraints are not all satisfied, the system behaviour is known at a new design point.
Similar to IDF, the main disadvantage lies in the need to compute the MDA accurately
at the optimal design. Once again, it should be possible, in principle, to solve the MDA
less accurately while the optimization is in progress and increase accuracy as optimality
is approached. At present, we do not know of any rules which can be used to specify the
accuracy of the MDA adaptively, so our general recommendation is to resolve the MDA
accurately at each iteration of the optimization process.
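A minimal MDA sketch under the MDF viewpoint, assuming two invented linear "disciplines": a block Gauss-Seidel loop resolves the coupling at each design point, so the optimizer sees only functions of the design variables.

```python
# MDF sketch: a block Gauss-Seidel MDA resolves the coupling at every design
# point, so the optimizer sees functions of x alone. The two linear
# "analyses" are invented; real disciplines would be full simulations.

def mda(x0, tol=1e-12, max_iter=100):
    y1, y2 = 0.0, 0.0
    for _ in range(max_iter):
        y1_new = x0 + 0.5 * y2        # discipline 1 given the latest y2
        y2_new = x0 - 0.2 * y1_new    # discipline 2 given the updated y1
        if abs(y1_new - y1) < tol and abs(y2_new - y2) < tol:
            return y1_new, y2_new
        y1, y2 = y1_new, y2_new
    raise RuntimeError("MDA did not converge")

def F0(x0):
    # Objective as the optimizer sees it under MDF: states are internal.
    y1, y2 = mda(x0)
    return x0**2 + y1**2 + y2**2

y1, y2 = mda(1.0)
```

Every objective or constraint evaluation triggers a full MDA, which is why the accuracy and cost of the MDA loop dominate the expense of MDF in practice.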
Like IDF, the computation of gradient information for MDF becomes more difficult due to the presence of total derivatives rather than partial derivatives. However, in MDF, the problem is compounded by the fact that we are now dealing with multidisciplinary systems in which a change in any variable propagates through the entire system. Fortunately, efficient gradient computation methods for the multiple-discipline case can be constructed with knowledge of efficient methods from the single-discipline case. The next section outlines how derivatives can be computed for both IDF and MDF problem statements.
2.6 Computing Gradients for IDF and MDF
While the focus of this thesis is on MDO problems that can be solved using gradient-based optimization methods, many interesting design problems contain discrete or integer design variables or have objective or constraint functions that are very noisy. For these problems, gradient-free optimization methods are the only option. Often, the gradient-free algorithms chosen use heuristic approaches to optimization like simulated annealing [105, 175], genetic algorithms [55], particle swarm optimization [97], ant colony optimization [63], and more obscure methods like biogeography-based optimization [7, 163] and grey wolf optimization [135]. More mathematically rigorous gradient-free methods include mesh-adaptive direct search [119] and model-based interpolation methods [75, 156]. The textbook by Conn et al. [47] provides an introduction to the state-of-the-art in derivative-free methods.
In some cases, researchers opt to use gradient-free optimizers to solve problems for which gradients are available and can be computed reliably. Common reasons cited for using a gradient-free algorithm are that gradients are expensive to compute, particularly when discipline analyses are expensive, and that gradient-free algorithms can avoid getting stuck in a local minimum. To address the former concern, we show later in this section how gradients can be computed for a cost similar to that of evaluating the objective and constraint functions. Regarding the latter concern, Sigmund [162] points out, using a topology optimization example, that gradient-free algorithms can also get stuck in local minima on hard problems while still being far more computationally expensive than gradient-based algorithms. Furthermore, sampling techniques can be used to choose a range of starting points and efficiently search for multiple local minima [40, 126]. If the optimization problems of interest have a large number of variables and gradients are available, gradient-based optimizers are the most efficient way to solve them [125, 195]. We therefore prefer to use gradient-based optimization methods where possible.
We now present a review of the options for gradient computation, or local sensitivity
analysis, applied to single-discipline and IDF problems. The most straightforward way
to compute total derivatives for a single discipline analysis is using some type of finite-
differencing procedure. In this scheme, a design variable is perturbed by a small value
and the appropriate discipline analyses are evaluated at the new point to measure the
change in the discipline state. For n design variables, each discipline analysis would need
to be evaluated an additional n times to compute all the changes in state variables. If the
discipline analyses are expensive, or the problem contains many independent variables,
the finite-difference approach is inefficient. Furthermore, subtractive cancellation errors
can cause derivative estimates to be inaccurate if the design variable perturbation is
chosen to be too small. Nevertheless, the ease of implementation of finite-differencing
means that it is still a common approach to computing derivatives.
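The forward-difference scheme can be sketched with an invented one-discipline example in which the "analysis" has a closed-form solution, so the finite-difference estimate can be compared against the analytic total derivative; the O(h) truncation error is visible in the result.

```python
# Forward-difference total derivative through an invented one-discipline
# analysis: the "analysis" returns the state y solving R(x, y) = y - x^2 = 0
# (in closed form here), and the constraint is C = x + y, so dC/dx = 1 + 2x.

def analysis(x):
    return x * x                 # stand-in for an expensive iterative solve

def C(x):
    return x + analysis(x)       # constraint evaluated through the analysis

def forward_difference(f, x, h):
    # One extra analysis per design variable; the error is O(h) plus
    # subtractive cancellation when h is chosen too small.
    return (f(x + h) - f(x)) / h

x = 1.0
exact = 1.0 + 2.0 * x            # analytic total derivative at x = 1
fd = forward_difference(C, x, 1e-6)
```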
Several other approaches, requiring more implementation time, exist to improve both the accuracy and efficiency of the derivative estimates. The complex-step approach [132, 168] provides a twist on traditional finite-differencing by using an imaginary-valued variable perturbation rather than a real-valued one. If the discipline analysis software can perform complex-number arithmetic, the resulting derivative approximation is accurate to machine precision with a sufficiently small perturbation. Nevertheless, the complex-step approximation still requires n discipline analyses to be executed. Algorithmic differentiation [78, 79] also achieves machine-precision accuracy, but by differentiating the discipline analysis code line-by-line. While this approach has a high overhead, the total cost of computing all the derivatives of the problem is no more than a small multiple of the cost of solving the discipline analyses once [139, Chapter 8]. Of course, algorithmic differentiation cannot be applied without direct access to the source code. One more alternative, if the discipline analyses are of a simple enough structure, is to calculate derivatives symbolically and hard-code them for use by the optimizer.
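A sketch of the complex-step approach on an invented test function: because the derivative is read from the imaginary part, no subtraction occurs and hence no cancellation error, so the step size can be made extremely small.

```python
import cmath
import math

# Complex-step derivative sketch: perturb along the imaginary axis and take
# the imaginary part. The test function is invented for illustration; the
# only requirement is that it accepts complex arguments.

def f(x):
    return cmath.exp(x) * cmath.sin(x)

def complex_step(func, x, h=1e-30):
    # f(x + i h) ~ f(x) + i h f'(x), so Im(f(x + i h)) / h ~ f'(x),
    # with no subtractive cancellation even for tiny h.
    return func(x + 1j * h).imag / h

x = 1.0
exact = math.exp(x) * (math.sin(x) + math.cos(x))  # analytic derivative
cs = complex_step(f, x)
```

Compare this with the forward-difference estimate, whose error cannot be driven below roughly the square root of machine precision regardless of step size.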
For expensive discipline analyses, the best approach to computing total derivatives in general is to compute them analytically by exploiting the structure of the optimization problem. We do so by assembling matrices of partial derivatives obtained by any of the methods described above. The following derivation is similar to those found in Sobieszczanski-Sobieski [164] and Martins and Hwang [130].
Consider the matrix of total derivatives of a group of design constraints C with
respect to a group of design variables x. (The derivatives of the objective function
F0 are computed in an identical manner, replacing C with F0.) To be mathematically
precise, total derivatives only define relationships between variables, so let us define the
set of variables c = C(x, Y (x)) to represent the output of the constraint functions. By
convention, we will always use the upper-case letter for the function itself, and the lower-
case letter for the variable representing the output value. For the moment, we only focus
on a single discipline, as each analysis is uncoupled in Problem (2.4). We can compute
the total derivative of a single function Ci with respect to a single variable xj as
$$
\frac{\mathrm{d}c_i}{\mathrm{d}x_j} = \frac{\partial C_i}{\partial x_j} + \sum_{k=1}^{M} \frac{\partial C_i}{\partial y_k}\,\frac{\mathrm{d}y_k}{\mathrm{d}x_j} = \frac{\partial C_i}{\partial x_j} + \frac{\partial C_i}{\partial y}\,\frac{\mathrm{d}y}{\mathrm{d}x_j},
\tag{2.6}
$$
where M is the total number of state variables. We adopt the shorthand notation
$$
\frac{\partial C}{\partial x} = \frac{\partial (C_1, \dots, C_m)}{\partial (x_1, \dots, x_n)} \in \mathbb{R}^{m \times n}
$$
to describe the Jacobian of a set of functions with respect to a set of variables. For example, the ∂Ci/∂y term in Equation (2.6) is a row vector of length M containing all ∂Ci/∂yk values. In a finite-differencing scheme, the dy/dxj vector would be computed by perturbing xj and running the discipline analysis again to obtain a new state vector y. Now, however, we recognize that the nonlinear system of equations r = R(x, y) = 0 has been solved for y, and no change in xj alters this fact. Linearizing this system of equations at the solution, we can state that
$$
\frac{\mathrm{d}r}{\mathrm{d}x_j} = \frac{\partial R}{\partial x_j} + \frac{\partial R}{\partial y}\,\frac{\mathrm{d}y}{\mathrm{d}x_j} = 0.
\tag{2.7}
$$
Rearranging the terms in Equation (2.7) yields
$$
\frac{\mathrm{d}y}{\mathrm{d}x_j} = -\left[\frac{\partial R}{\partial y}\right]^{-1}\frac{\partial R}{\partial x_j}.
\tag{2.8}
$$
By direct substitution of Equation (2.8) into Equation (2.6), we obtain
$$
\frac{\mathrm{d}c_i}{\mathrm{d}x_j} = \frac{\partial C_i}{\partial x_j} - \left[\frac{\partial C_i}{\partial y}\right]\left[\frac{\partial R}{\partial y}\right]^{-1}\frac{\partial R}{\partial x_j}.
\tag{2.9}
$$
Finally, we drop the subscripts on c and x to obtain an expression for the full derivative
matrix.
$$
\frac{\mathrm{d}c}{\mathrm{d}x} = \frac{\partial C}{\partial x} - \left[\frac{\partial C}{\partial y}\right]\left[\frac{\partial R}{\partial y}\right]^{-1}\frac{\partial R}{\partial x}
\tag{2.10}
$$
We emphasize again that dc/dx is an m × n matrix, where m is the number of functions (design constraints, in this case) and n is the number of design variables.
The only remaining issue with equation (2.10) is how to compute the action of the
inverse matrix $[\partial R/\partial y]^{-1}$ on its neighbours in the formula. Here, the analytic approach
splits into two variations known as the direct and adjoint sensitivity methods [130]. In
the direct method, a sequence of linear systems of the form
\[
  \left[ \frac{\partial R}{\partial y} \right] \frac{d y}{d x_j} = -\frac{\partial R}{\partial x_j} \tag{2.11}
\]
is solved to yield column vectors $dy/dx_j$, which populate the $dy/dx$ matrix in Equation (2.6). Another way to interpret the direct method is that the matrix $dc/dx$ is
assembled one column at a time, since the linear system (2.11) computes the change in
the entire state with respect to one design variable. In the adjoint method, an alternative
sequence of linear systems of the form
\[
  \left[ \frac{\partial R}{\partial y} \right]^{T} \left[ \frac{d c_i}{d r} \right]^{T} = -\left[ \frac{\partial C_i}{\partial y} \right]^{T} \tag{2.12}
\]
is solved to yield row vectors $dc_i/dr$. These vectors populate the matrix $dc/dr$ in the
expression
\[
  \frac{d c}{d x} = \frac{\partial C}{\partial x} + \frac{d c}{d r} \frac{\partial R}{\partial x}. \tag{2.13}
\]
In contrast to the direct method, the adjoint method assembles the matrix $dc/dx$ one row
at a time, since each adjoint solution is associated with a function rather than a variable.
In both the direct and adjoint methods, the most costly operation is the solution of the
linear systems (2.11) and (2.12). We select the direct or adjoint method for a particular
problem based on which method requires fewer linear systems to be solved to compute
the entire matrix. If the problem has a large number of design variables but only a small
number of design constraints, fewer systems of the form (2.12) need to be assembled and
solved, so the adjoint method is the natural choice.
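To make the two options concrete, the following NumPy sketch (our own illustration, not from the thesis) builds random partial-derivative matrices for a hypothetical converged analysis and recovers the same total derivative matrix $dc/dx$ from both the direct solve of Equation (2.11) and the adjoint solve of Equation (2.12):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, M = 2, 5, 4   # numbers of constraints, design variables, state variables

# Hypothetical partial derivatives of R(x, y) and C(x, y), held fixed at the
# converged solution; the diagonal shift keeps dR/dy nonsingular.
dRdy = rng.standard_normal((M, M)) + 4.0 * np.eye(M)
dRdx = rng.standard_normal((M, n))
dCdy = rng.standard_normal((m, M))
dCdx = rng.standard_normal((m, n))

# Direct method, Eq. (2.11): one linear solve per design variable; dc/dx is
# assembled one column at a time (all n right-hand sides handled in one call).
dydx = np.linalg.solve(dRdy, -dRdx)
dcdx_direct = dCdx + dCdy @ dydx          # Eq. (2.6)

# Adjoint method, Eq. (2.12): one transposed solve per function; dc/dx is
# assembled one row at a time via Eq. (2.13).
dcdr = np.linalg.solve(dRdy.T, -dCdy.T).T
dcdx_adjoint = dCdx + dcdr @ dRdx

assert np.allclose(dcdx_direct, dcdx_adjoint)
```

For $n \gg m$, the adjoint route performs $m$ solves instead of $n$, which is exactly the cost argument made above.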
To summarize the discussion so far, numerous approaches exist for computing derivative information for both single-discipline optimization and IDF-type MDO problems.
Higher accuracy and efficiency in the derivative computation can be achieved with extra
upfront development time. We recommend combining the direct or adjoint method with an appropriate technique for computing the partial derivatives so that all gradient information is obtained both accurately and efficiently.
To compute total derivatives for the MDF problem statement, we could use any of the
techniques of finite-differencing, complex-step, symbolic differentiation, or algorithmic
differentiation on their own. However, as with IDF, specialized analytic methods are
preferred. We now generalize the direct and adjoint methods described above to the case
of multidisciplinary systems.
Unlike for single-discipline systems, there are two versions each of the direct and adjoint methods for computing derivatives of multidisciplinary systems. Sobieszczanski-Sobieski [164] referred to the two versions of the direct method as GSE1 and GSE2. We refer to them as the residual form and the functional form of the direct and adjoint methods [130]. The derivations
of these methods are omitted here, but we refer the interested reader to the survey paper
by Martins and Hwang [130] for a more complete treatment. For a system with three
disciplines, the residual direct method is given by
\[
  \begin{bmatrix}
    \frac{\partial R_1}{\partial y_1} & \frac{\partial R_1}{\partial y_2} & \frac{\partial R_1}{\partial y_3} \\
    \frac{\partial R_2}{\partial y_1} & \frac{\partial R_2}{\partial y_2} & \frac{\partial R_2}{\partial y_3} \\
    \frac{\partial R_3}{\partial y_1} & \frac{\partial R_3}{\partial y_2} & \frac{\partial R_3}{\partial y_3}
  \end{bmatrix}
  \begin{bmatrix}
    \frac{d y_1}{d x_j} \\ \frac{d y_2}{d x_j} \\ \frac{d y_3}{d x_j}
  \end{bmatrix}
  =
  \begin{bmatrix}
    -\frac{\partial R_1}{\partial x_j} \\ -\frac{\partial R_2}{\partial x_j} \\ -\frac{\partial R_3}{\partial x_j}
  \end{bmatrix}
  \tag{2.14}
\]
while the functional direct method is given by
\[
  \begin{bmatrix}
    I & -\frac{\partial Y_1}{\partial y_2} & -\frac{\partial Y_1}{\partial y_3} \\
    -\frac{\partial Y_2}{\partial y_1} & I & -\frac{\partial Y_2}{\partial y_3} \\
    -\frac{\partial Y_3}{\partial y_1} & -\frac{\partial Y_3}{\partial y_2} & I
  \end{bmatrix}
  \begin{bmatrix}
    \frac{d y_1}{d x_j} \\ \frac{d y_2}{d x_j} \\ \frac{d y_3}{d x_j}
  \end{bmatrix}
  =
  \begin{bmatrix}
    \frac{\partial Y_1}{\partial x_j} \\ \frac{\partial Y_2}{\partial x_j} \\ \frac{\partial Y_3}{\partial x_j}
  \end{bmatrix}
  \tag{2.15}
\]
for a particular variable $x_j$. The functional form is most useful when partial derivatives of
the governing equations themselves are not available. In that case, we revert to computing
the appropriate partial derivatives in a “black-box” fashion, e.g., by a finite-differencing
procedure. The residual adjoint method is given by
\[
  \begin{bmatrix}
    \left[\frac{\partial R_1}{\partial y_1}\right]^{T} & \left[\frac{\partial R_2}{\partial y_1}\right]^{T} & \left[\frac{\partial R_3}{\partial y_1}\right]^{T} \\
    \left[\frac{\partial R_1}{\partial y_2}\right]^{T} & \left[\frac{\partial R_2}{\partial y_2}\right]^{T} & \left[\frac{\partial R_3}{\partial y_2}\right]^{T} \\
    \left[\frac{\partial R_1}{\partial y_3}\right]^{T} & \left[\frac{\partial R_2}{\partial y_3}\right]^{T} & \left[\frac{\partial R_3}{\partial y_3}\right]^{T}
  \end{bmatrix}
  \begin{bmatrix}
    \left[\frac{d c_i}{d r_1}\right]^{T} \\
    \left[\frac{d c_i}{d r_2}\right]^{T} \\
    \left[\frac{d c_i}{d r_3}\right]^{T}
  \end{bmatrix}
  =
  \begin{bmatrix}
    -\left[\frac{\partial C_i}{\partial y_1}\right]^{T} \\
    -\left[\frac{\partial C_i}{\partial y_2}\right]^{T} \\
    -\left[\frac{\partial C_i}{\partial y_3}\right]^{T}
  \end{bmatrix}
  \tag{2.16}
\]
while the functional adjoint method is given by
\[
  \begin{bmatrix}
    I & -\left[\frac{\partial Y_2}{\partial y_1}\right]^{T} & -\left[\frac{\partial Y_3}{\partial y_1}\right]^{T} \\
    -\left[\frac{\partial Y_1}{\partial y_2}\right]^{T} & I & -\left[\frac{\partial Y_3}{\partial y_2}\right]^{T} \\
    -\left[\frac{\partial Y_1}{\partial y_3}\right]^{T} & -\left[\frac{\partial Y_2}{\partial y_3}\right]^{T} & I
  \end{bmatrix}
  \begin{bmatrix}
    \left[\frac{d c_i}{d r_1}\right]^{T} \\
    \left[\frac{d c_i}{d r_2}\right]^{T} \\
    \left[\frac{d c_i}{d r_3}\right]^{T}
  \end{bmatrix}
  =
  \begin{bmatrix}
    \left[\frac{\partial C_i}{\partial y_1}\right]^{T} \\
    \left[\frac{\partial C_i}{\partial y_2}\right]^{T} \\
    \left[\frac{\partial C_i}{\partial y_3}\right]^{T}
  \end{bmatrix}
  \tag{2.17}
\]
for a particular function $c_i$. We note that the functional and residual forms of both
the direct and adjoint methods can be combined as the availability of partial derivative
information dictates by substituting the appropriate rows into the linear system. As
with the single-discipline direct and adjoint methods, to obtain the complete set of first
derivative information we must solve either Equation (2.14) or (2.15) once for every design
variable or solve Equation (2.16) or (2.17) once for every function.
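To make the functional forms concrete, here is a small self-contained sketch (our own, with made-up coupling Jacobians for three disciplines) that solves the functional direct system, Equation (2.15), and a functional adjoint system in the spirit of Equation (2.17), and verifies that both routes give the same total derivative of a single function:

```python
import numpy as np

rng = np.random.default_rng(1)
sizes = [3, 2, 4]            # hypothetical state-vector sizes of three disciplines
M, n = sum(sizes), 5         # total number of states and design variables

# Made-up coupling blocks dY_i/dy_k (scaled small so I - dY/dy is well
# conditioned); a discipline does not depend on its own outputs, so the
# diagonal blocks are zeroed out.
dYdy = 0.1 * rng.standard_normal((M, M))
offsets = np.cumsum([0] + sizes[:-1])
for s, off in zip(sizes, offsets):
    dYdy[off:off + s, off:off + s] = 0.0
dYdx = rng.standard_normal((M, n))
dCdy = rng.standard_normal((1, M))       # partials of one function C_i
dCdx = rng.standard_normal((1, n))

A = np.eye(M) - dYdy                     # block matrix of Eq. (2.15)

# Functional direct method, Eq. (2.15): one solve per design variable.
dydx = np.linalg.solve(A, dYdx)
dcdx_direct = dCdx + dCdy @ dydx

# Functional adjoint method: one transposed solve per function; w plays the
# role of the stacked adjoint vectors of Eq. (2.17).
w = np.linalg.solve(A.T, dCdy.T)
dcdx_adjoint = dCdx + w.T @ dYdx

assert np.allclose(dcdx_direct, dcdx_adjoint)
```

As in the single-discipline case, the choice between the two comes down to whether the number of design variables or the number of functions is smaller.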
2.7 Conclusion
In this chapter we presented a unified mathematical notation to describe a general MDO
problem and showed three forms of this fundamental problem statement. We also briefly
discussed the primary ways in which we obtain accurate first-derivative information even
for cases where the discipline analyses are solved outside of the optimization process.
Obviously, solving MDO problems requires many different software components to exchange information at appropriate times in the solution process. While it may be easy to
understand the solution process for the problem formulations described in this chapter,
it becomes much more difficult when that solution process involves decomposition of the
MDO problem and coordination among independent parallel processes. Before discussing
more complicated architectures for MDO, in the next chapter we present a novel diagram
to help us visualize these architectures and unify architecture presentation.
Chapter 3
Visualizing MDO Architectures
In researching the existing MDO architectures, we found that each author presented the
architectures in slightly different ways. While differences in mathematical notation are
common, they are straightforward to overcome. What is more difficult to communicate
is the sequence of operations in the solution algorithm itself, and the exchange of information between distinct software modules in the computational framework. As with any
computational method, the implementation itself is critical to the performance of the
method. However, prior to this work, there had been no standardized way to describe
the solution algorithm and flow of data within the MDO architecture. Such a standard
would be of great use as both a research and educational tool to compare the approaches
of different architectures in a common framework. A common visualization approach
for MDO also has potential as a basis for the graphical interface of MDO integration
software, such as NASA’s OpenMDAO project [76].
This chapter details our visualization approach to MDO architectures. We call our
diagrams extended design structure matrices or XDSMs. We motivate the XDSM format
and compare it with other diagrams used in the literature. We then give examples of how
to apply the XDSM format to a variety of analysis and optimization processes, including
simple MDO architectures. We refer the interested reader to Lambe and Martins [115] for
a more detailed discussion of our design process and other applications of the diagram.
3.1 Diagram Motivation
To describe an MDO architecture, a candidate diagram should be able to track both
the flow of information and the flow of the algorithm at the same time. Therefore, following the basic philosophy of Tufte [181], we require a diagram that is reasonably easy
to understand, yet has a high density of information. We would also like the ability to
incorporate mathematical notation directly into the diagram so that the viewer can immediately connect parts of the diagram with corresponding elements in the optimization
problem formulation.
Architecture diagrams in the MDO literature usually consist of some version of a
flowchart or unstructured block diagram. These diagrams are usually sufficient to display
either data flow or process flow, but not both. Adapting and standardizing these diagrams
to MDO architectures was considered, but the number of blocks and connections required
to display complex architectures could quickly turn these diagrams into “spaghetti.” The
Unified Modeling Language (UML) [25] has the advantages of a standard notation and a
close connection with software development. However, depicting data and process flow
would require multiple diagrams and we want to avoid using more than one diagram for
one architecture.
Instead, we chose to develop a standard diagram based on the design structure matrix
(DSM) [33, 170] or N2 diagram [117]. Figure 3.1 shows an example DSM from the literature. The main attraction of the DSM is that the diagram is structured. In particular,
the relative position of each element of the diagram contributes to the interpretation of
the whole diagram. The components of a system are arranged along the diagonal of a
square matrix and the interfaces with any component are defined in the same row or
column as that component. Therefore, the components and their connections with each
Figure 3.1: Example design structure matrix for an automobile engine, from Browning [33].
other can be rapidly identified. Furthermore, the DSM defines the direction of the interface. The interface from component A to component B is on the opposite side of the
diagonal from the interface from component B to component A. If we treat the components of the DSM as components of an MDO architecture, the DSM provides a natural
way of specifying the data communication between components. The XDSM extends
this notation to include a mechanism for specifying the solution process.
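The core data structure here is simple: a DSM is essentially an adjacency matrix over the system components. A toy sketch (our own, with hypothetical components) shows how the direction convention encodes feedback:

```python
# Toy DSM for three hypothetical components A, B, C, where a nonzero entry
# dsm[i][j] marks an interface from component i to component j.
components = ["A", "B", "C"]
dsm = [[0, 1, 0],   # A sends data to B
       [0, 0, 1],   # B sends data to C
       [1, 0, 0]]   # C sends data back to A (a feedback loop)

# The interface from A to B (dsm[0][1]) sits on the opposite side of the
# diagonal from a hypothetical interface from B to A (dsm[1][0]).  With the
# components ordered along the diagonal, below-diagonal entries are feedbacks.
feedback = [(components[i], components[j])
            for i in range(3) for j in range(3)
            if dsm[i][j] and j < i]
# feedback contains only the C-to-A connection
```

This directionality is what the XDSM inherits when it adds process information on top of the data connections.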
3.2 The Extended Design Structure Matrix (XDSM)
We now explain the XDSM notation using a series of increasingly complex examples.
Before modeling a more complex analysis or optimization process, let us present a generic
system with which to work. Figure 3.2 shows a generic, fully-coupled, three-discipline
system. Essentially, Figure 3.2 is just a DSM of our example system. The convention in
this diagram, and all subsequent diagrams, is that the outputs of the discipline analysis
are placed in the same row, while the inputs are placed in the same column. As an
added visual cue, thick, gray lines are used to denote data flow connections. The shapes
of each block were selected based on standard flowchart notation: rectangles for generic
Figure 3.2: Generic, three-discipline, fully-coupled, multidisciplinary system. Each discipline analysis $i$ shares its state $y_i$ with the other disciplines and requires the states of the other disciplines in its own analysis.
processes, and parallelograms for input and output processes. Note, however, that the
choice of shapes and colors is redundant because the structure of the DSM dictates
whether the contents of a block in the diagram are a component or an interface.
To perform a multidisciplinary analysis (MDA) of the system in Figure 3.2, we add
to the XDSM an additional component that defines the iterative process, known as a
driver, and some notation to denote the order of execution. In a Gauss–Seidel-type
MDA process, each discipline is analyzed in sequence using the most up-to-date state
information from the other disciplines. Figure 3.3 depicts a Gauss–Seidel-type MDA
process for our generic system. The external inputs to the system — the design variables
and an initial estimate of the state variables — are placed in the top row and the system
outputs — the final, consistent, system state variables — are placed in the left-most
column. In Figure 3.3, the order of execution is denoted by the step numbers in each component block, starting from zero. Loops within the sequence are defined by the notation $p \rightarrow q$, where $p$ and $q$ are nonnegative integers with $q < p$. This notation means
that the sequence returns to step q until some completion criterion is satisfied, at which
point the algorithm proceeds to step p + 1. Step numbers are also introduced in the
data blocks to specify when the data is input to a particular component. In some MDO
architectures, it is possible for the same data to be taken from multiple sources, e.g., a
Figure 3.3: Gauss–Seidel MDA procedure. Each discipline analysis is evaluated in sequence using the most recent state information from other disciplines and a fixed choice of design variables. The MDA block measures convergence of the discipline states.
discipline analysis and a surrogate model of that analysis, and this notation accounts for
that case. Note, however, that these numbers are not present in the external inputs and
outputs of the diagram because these data are fixed, respectively, at the beginning and
end of the process. Finally, for clarity, we use a different block shape to denote drivers
and thin black lines to connect consecutive components in the algorithm.
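The Gauss–Seidel MDA loop that Figure 3.3 depicts can be sketched in a few lines of Python (our own illustration; the two toy disciplines and their contractive coupling are made up for the example):

```python
import numpy as np

def gauss_seidel_mda(analyses, x, y0, tol=1e-10, max_iter=100):
    """Gauss-Seidel MDA driver: evaluate each discipline in sequence using the
    most recent states of the others, looping until the states stop changing."""
    y = [yi.copy() for yi in y0]
    for _ in range(max_iter):
        y_prev = [yi.copy() for yi in y]
        for i, analysis in enumerate(analyses):
            y[i] = analysis(x, y)                  # sees the newest y so far
        if max(np.max(np.abs(a - b)) for a, b in zip(y, y_prev)) < tol:
            break                                  # the MDA convergence check
    return y

# Two toy "disciplines" whose coupling is contractive (assumed, so that the
# fixed-point iteration converges).
analysis_1 = lambda x, y: 0.5 * y[1] + x
analysis_2 = lambda x, y: 0.3 * y[0] + 1.0
y = gauss_seidel_mda([analysis_1, analysis_2], x=2.0,
                     y0=[np.zeros(1), np.zeros(1)])
```

At exit, the states satisfy both discipline equations simultaneously, which is exactly the "final, consistent, system state" output of the diagram.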
We can also depict parallel processes using the XDSM notation. As a simple example, we will use a Jacobi MDA process. Unlike the Gauss–Seidel process, the Jacobi MDA process forces each discipline to conduct its analysis using state information taken from the other disciplines at the previous iteration. While the state information used by each discipline may be less up-to-date, this choice allows all discipline analyses to be executed in
parallel. Figure 3.4 shows the XDSM of the Jacobi MDA process. Note that processes
executed in parallel are all assigned the same step number. As a further simplification
to the diagram, we can “stack” similar parallel components to save space in a large system. Figure 3.5 shows an example of this simplification. We adopt the convention that a
reference to component i implies a repeated structure across all disciplines. The stacked
Figure 3.4: Jacobi MDA procedure with parallel execution of discipline analyses. The system being analyzed is identical to that in Figure 3.3.
analysis blocks in Figure 3.5 provide an added visual cue.
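A corresponding Jacobi driver (again a made-up illustration with toy disciplines) differs only in that every analysis in a sweep reads the previous iteration's states, which is what makes the sweep parallelizable:

```python
import numpy as np

def jacobi_mda(analyses, x, y0, tol=1e-10, max_iter=200):
    """Jacobi MDA driver: every discipline is evaluated against the previous
    iteration's states, so all the calls in one sweep are independent and
    could be dispatched in parallel (e.g., to a process pool)."""
    y = [yi.copy() for yi in y0]
    for _ in range(max_iter):
        y_prev = y
        y = [analysis(x, y_prev) for analysis in analyses]  # all read y_prev
        if max(np.max(np.abs(a - b)) for a, b in zip(y, y_prev)) < tol:
            break
    return y

# The same two toy "disciplines" as in the Gauss-Seidel case converge to the
# same consistent state, typically in more sweeps.
analysis_1 = lambda x, y: 0.5 * y[1] + x
analysis_2 = lambda x, y: 0.3 * y[0] + 1.0
y = jacobi_mda([analysis_1, analysis_2], x=2.0,
               y0=[np.zeros(1), np.zeros(1)])
```

The trade-off in the text is visible here: each sweep uses staler information, but nothing in a sweep depends on anything else in that sweep.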
Displaying gradient-based optimization processes in an XDSM is straightforward with the tools described above. We need only define components for the optimizer itself and for the function and gradient evaluation processes. Figure 3.6 shows this optimization process. In this case, we have assumed that the gradients can be computed analytically given
only the current design point. If the gradients were computed by finite differencing, that
could also be depicted using a driver that would repeatedly call the objective and constraint functions to compute the derivatives. Finally, our choice of splitting the objective
and constraint components is arbitrary. Using a single “functions” component or even
a separate component for each function individually would be equally valid, provided
the optimizer is supplied with the same information. In practice, the grouping of components that compute function and gradient information depends on the computational
environment.
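As a small illustration of such a grouping, SciPy's SLSQP optimizer can play the role of the Optimization block of Figure 3.6, with separate callables supplying the objective, constraint, and gradient information (the toy problem below is our own invention):

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem: minimize f(x) = x1^2 + x2^2 subject to c(x) = x1 + x2 - 1 >= 0.
# Each component of the diagram maps to a separate callable.
objective = lambda x: x[0] ** 2 + x[1] ** 2
obj_grad = lambda x: np.array([2.0 * x[0], 2.0 * x[1]])   # Gradients component
constraint = {"type": "ineq",
              "fun": lambda x: x[0] + x[1] - 1.0,          # Constraints component
              "jac": lambda x: np.array([1.0, 1.0])}

result = minimize(objective, x0=np.array([3.0, -1.0]), jac=obj_grad,
                  constraints=[constraint], method="SLSQP")
# The constrained optimum is x = (0.5, 0.5).
```

Whether the objective, constraints, and their gradients live in one component or several is invisible to the optimizer, which is the point made above.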
As a final thought, we expect the XDSM to be applicable to a much wider range of
procedures than the examples presented in this thesis. Each component in the XDSM is
simply a computational element that produces an output based on some input. Drivers
Figure 3.5: Jacobi MDA procedure with parallel execution of discipline analyses using our convention for parallel diagram structure. The MDA process shown here is identical to that shown in Figure 3.4.
Figure 3.6: Optimization algorithm where the optimizer requires gradients of both the objective and the constraints. The gradients are calculated by a separate component.
Figure 3.7: XDSM for the SAND architecture. The locations in which the functions of Problem (2.3) are evaluated are noted in the diagram.
are just components that perform additional computations to check a looping condition.
These components are then assembled into a matrix and connected together through
paths defined by the algorithm. We expect the nature of each process to determine what
components need to be defined and how they can be related back to the mathematical
statements of the problem and algorithm.
3.3 Monolithic MDO Architectures
Using the figures of Section 3.2 as a guide, we are now ready to depict full MDO architectures using the XDSM notation. Recall that an MDO architecture is defined by the problem formulation together with the solution algorithm. This section details monolithic
problem formulation together with the solution algorithm. This section details monolithic
architectures, those architectures that are based around a single optimization problem
statement. In particular, the three fundamental MDO problem formulations discussed
in Chapter 2 form the basis for three fundamental MDO architectures.
Figure 3.8: XDSM for the IDF architecture. The locations in which the functions of Problem (2.4) are evaluated are noted in the diagram.

Figure 3.7 presents the SAND architecture. This architecture corresponds to the SAND problem formulation in Equation (2.3). The SAND architecture contains components that, rather than computing the state variables, evaluate the governing equation
residual values. Since these computations can be made independently from evaluating
design objectives and constraints, Figure 3.7 shows them being evaluated in parallel. If
the optimize