
Systematic evaluation of ill-posed problems in model-based parameter estimation and experimental design

submitted by

M.Sc. Diana Carolina Lopez Cardenas

from Bucaramanga, Colombia

Dissertation approved by Fakultät III - Prozesswissenschaften

of the Technische Universität Berlin

for the award of the academic degree

Doktor der Ingenieurwissenschaften (Dr.-Ing.)

Doctoral committee:

Chair: Prof. Dr. Peter Neubauer

1st reviewer: Prof. Dr.-Ing. habil. Prof. h.c. Dr. h.c. Günter Wozny

2nd reviewer: Dr.-Ing. Tilman Barz

3rd reviewer: Prof. Dr. Flavio Manenti

Date of the scientific defense: 16.08.2016

Berlin 2016

Acknowledgements

To my dear adviser Prof. Günter Wozny, who gave me the incredible opportunity to work for him and to experience work and life in Germany. This decision changed my life. I am very grateful for the trust and freedom he gave me from the beginning of my research to develop my own ideas. Thank you, Prof. Wozny, for always having kind words for me and for your strong support when I needed it.

To my little baby girl Juliana Lucía, the most wonderful motivation to reach for the sky, and to my Oliver, the most loving daddy ever, for supporting, teaching, encouraging and loving me throughout this project.

To my lovely Mom Ana in Colombia, for always sending me her best wishes and prayers and for taking care of all my affairs in Colombia while I have been abroad. To my lovely aunts Ceci and Fala, because your prayers and best energy have always been with me. To my dear aunts, uncles and cousins in Bucaramanga and Enciso: all of you are very close to my heart. I hope my effort can be an example for the new generation and that all of you can reach your dreams. Mis logros siempre están dedicados a ustedes.

To my dear Geli and Kalle, for becoming my German Mom and Dad. Their continuous and loving support gave me the last bit of energy to finish this writing. To my sweet Omi Gitti and Opi Manny, for including me in their beautiful family and thinking of each small but wonderful detail to make my life nicer. Dankeschön für die viele Liebe und Unterstützung!

To my Latin American Girls (Aglys, Caro, Dani and my Janet) who were the best

friends and company during my adaptation to the German culture.

To Tilman who has always guided and wisely advised me. Nobody could have had

a better scientific mentor and friend...thanks for your always helpful, constructive and

honest feedback on my work.

To Victor Zavala, who gave me the opportunity to know the other side of the force during my internship at Argonne National Laboratory in Chicago, USA. The many technical discussions and sharp ideas that we shared during this period were fundamental to finishing my research.

To my dear Sandris, for being the best friend at the office, especially on the cold and dark winter days. To Alejo, for being the best Colombian support at the DBTA. To my kind colleagues of the old and new buildings, for making it a great place to work and for being so kind to me since we met.

To all my “team”, which is really numerous, counting family, friends and colleagues on two continents. You have played an important role during this phase of my life and throughout my life in general. And to those who gave me the best of their lives when I was still in Colombia: your mark is indelible.

Finally, but no less important, to my dear Lord. Thank you for giving me my sweet Juliana Lucía, and for letting me dream and reach my earthly goals. You are and always will be my company, strength and hope!!!

Berlin, August 2016


Abstract

The lack of informative experimental data, the complexity of first-principles models, over-parameterization and parameter correlations, among other factors, complicate the recovery of kinetic, transport, and thermodynamic parameters. These issues are sources of non-identifiability and, consequently, of ill-posedness. This research investigates the features, sources, effects and treatments of ill-posedness in nonlinear parameter estimation (PE) and optimal experimental design (OED). The connection between identifiability problems and ill-posed problems is established. There are two main focuses: the detection of ill-conditioning to diagnose identifiability issues, and the application and discussion of regularization techniques. This thesis develops and tests the idea that a detailed analysis of the singular value spectrum of the sensitivity matrix enables the diagnosis of non-identifiability. Using this approach, the effects of regularization in PE and OED can be predicted. Monte Carlo studies are carried out to support the intermediate conclusions obtained from the singular value analysis. With all these components in mind, this thesis proposes a computational framework to systematically evaluate ill-posed problems in PE and OED.
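As a minimal illustration of the singular value idea, the following Python sketch computes the singular value spectrum, the condition number and a numerical rank for a small synthetic sensitivity matrix. The function name `singular_value_diagnosis`, the relative tolerance `rel_tol` and the toy matrix are assumptions made for this illustration only; they are not the thresholds or the implementation used in this thesis.

```python
import numpy as np

def singular_value_diagnosis(S, rel_tol=1e-6):
    """Return singular values, condition number and numerical rank of S.

    rel_tol is a hypothetical relative threshold: singular values below
    rel_tol * sigma_max are treated as numerically zero.
    """
    sv = np.linalg.svd(S, compute_uv=False)  # sorted in descending order
    kappa = sv[0] / sv[-1] if sv[-1] > 0 else np.inf
    num_rank = int(np.sum(sv > rel_tol * sv[0]))
    return sv, kappa, num_rank

# Toy sensitivity matrix (50 measurements, 3 parameters): the third column
# is almost exactly the sum of the first two, i.e. the parameters are
# strongly correlated and only two directions are informed by the data.
rng = np.random.default_rng(0)
S = rng.normal(size=(50, 2))
S = np.column_stack([S, S[:, 0] + S[:, 1] + 1e-9 * rng.normal(size=50)])

sv, kappa, num_rank = singular_value_diagnosis(S)
print("singular values:", sv)
print("condition number:", kappa)
print("numerical rank:", num_rank, "of", S.shape[1], "parameters")
```

A large gap in the spectrum, together with a numerical rank below the number of parameters, indicates that some parameter combinations cannot be recovered from the data in this toy example.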

Multiple methods are employed to assess estimator performance, to determine ill-conditioning and non-identifiability, and to regularize parameter estimations. Techniques such as singular value analysis, parameter variance-decomposition, orthogonal decompositions and dynamic sensitivity profiles, among others, are used to investigate ill-conditioning and non-identifiability. Three regularization techniques are applied: two based on orthogonal decompositions (Subset Selection, SsS, and Truncated Singular Value Decomposition, TSVD) and Tikhonov regularization. In parameter estimation, two paradigms to analyze an estimator are described. The first paradigm uses parameter-output sensitivity information, whereas the second is conducted via Monte Carlo studies. To illustrate the significance of the computational framework, offline applications in lithium-ion batteries, bio-ethanol production and bioreactors are examined for several purposes. One case in chromatographic separation is studied in the context of online parameter estimation and redesign of experiments.
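To make two of the regularization techniques named above concrete, the following Python sketch applies TSVD and Tikhonov regularization to a single linear least-squares step with a sensitivity matrix S and a residual vector r. The function names, the truncation index `k` and the regularization parameter `lam` are hypothetical choices for this illustration; the formulations actually used in this thesis are given in Chapter 3.

```python
import numpy as np

def tsvd_solve(S, r, k):
    """Truncated SVD solution: keep only the k largest singular values."""
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    coeffs = (U[:, :k].T @ r) / sv[:k]
    return Vt[:k].T @ coeffs

def tikhonov_solve(S, r, lam):
    """Minimizer of ||S x - r||^2 + lam^2 ||x||^2, written via the SVD."""
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    filt = sv / (sv**2 + lam**2)  # damped inverse singular values
    return Vt.T @ (filt * (U.T @ r))

# Synthetic example: 20 measurements, 3 parameters.
rng = np.random.default_rng(1)
S = rng.normal(size=(20, 3))
r = rng.normal(size=20)

x_ls = np.linalg.lstsq(S, r, rcond=None)[0]   # unregularized reference
x_tsvd = tsvd_solve(S, r, k=2)                # drop the weakest direction
x_tikh = tikhonov_solve(S, r, lam=1.0)        # damp all directions

print(np.linalg.norm(x_ls), np.linalg.norm(x_tsvd), np.linalg.norm(x_tikh))
```

Both regularized steps have a norm no larger than the unregularized one: TSVD discards the weakly determined directions entirely, whereas Tikhonov regularization damps them smoothly. Subset selection, which fixes individual parameters instead of directions, is not shown here.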

The application of the framework exposes various deficiencies in the studied cases and demonstrates how to handle them. In the lithium-ion battery case it is demonstrated that using voltage discharge curves alone enables the identification of only a small parameter subset, regardless of the number of experiments considered. In the bio-ethanol production case it is shown that parameter estimation for over-parameterized and parameter-correlated models can be treated successfully by restricting the estimation to the identifiable parameters. For the bioreactor systems, optimal design solutions are shown to be ineffective and/or meaningless because they are obtained from unidentifiable models. Finally, in the online liquid chromatography case, the instabilities and poor robustness of the parameter estimation algorithm caused by scarce experimental data at the beginning of the experiment are treated by means of regularization techniques.


Zusammenfassung

Insufficient experimental data, the complexity of physical models, over-parameterization and correlations among the parameters, to name only a few, complicate the estimation of kinetic, thermodynamic and transport parameters. These problems are causes of non-identifiability and consequently of ill-posedness. This work investigates the effects, causes and remedies of ill-posed nonlinear parameter estimation (PE) and optimal experimental design (OED). The connection between identifiability problems and ill-posed problems is discussed. The main focus lies on two points: the detection of ill-conditioning in order to diagnose identifiability problems, and the application and discussion of regularization techniques. This work develops the idea that a thorough analysis of the singular value spectrum of the sensitivity matrix enables the diagnosis of non-identifiability. Using this approach, the effects of regularization in PE and OED can be predicted. Monte Carlo tests are carried out to check the predictions of the singular value analysis. With these methods, this work creates a framework for the systematic evaluation of ill-posed problems in PE and OED.

Various methods are used to determine the quality of the parameter estimation, to detect ill-conditioning and non-identifiability, and for regularization. Techniques such as singular value analysis, parameter variance-decomposition, orthogonal decomposition, the evaluation of dynamic sensitivity profiles and others are applied to investigate ill-conditioning and non-identifiability. Three regularization techniques are used: two techniques based on orthogonal decomposition (Subset Selection, SsS, and Truncated Singular Value Decomposition, TSVD) as well as Tikhonov regularization. In parameter estimation, two paradigms for its analysis are described. The first paradigm uses sensitivity information, whereas the second paradigm is based on Monte Carlo studies. To illustrate the applicability and significance of the methods discussed above, offline applications in lithium-ion batteries, in bio-ethanol production and in bioreactors are examined. Furthermore, an application in chromatographic separation is examined in the context of optimal experimental design and redesign.

The application of the proposed methodological framework reveals a multitude of deficiencies in the cases studied and discusses strategies for remedying them. In the case of lithium-ion batteries it is shown that the sole use of discharge data allows the reliable identification of only a small part of the required model parameters, independent of the number of experiments performed. In bio-ethanol production it is shown that problems in the parameter estimation of over-parameterized models with correlated parameters can be remedied by determining fewer but essential parameters. The optimal experimental design solutions for bioreactors turn out to be ineffective because they were computed for non-identifiable models. Finally, in the chromatography case, instabilities in the parameter estimation caused by the sparse data at the beginning of the experiment are remedied by regularization techniques.


Contents

Abstract

Zusammenfassung

List of Figures

List of Tables

Nomenclature

Co-authorship

List of publications used for this thesis

1. Introduction

1.1. Research motivation

1.2. Research scope

1.3. Research outline

2. Theoretical background I: Model-based parameter estimation and experimental design

2.1. Model development

2.2. Model formulation

2.3. Parameter estimation

2.3.1. Mathematical formulation

2.3.2. Parameter-output sensitivity matrix

Sensitivity measure

2.4. Singular value decomposition

2.4.1. SVD of the sensitivity matrix

2.4.2. SVD of the sensitivity matrix vs the eigensystem of Fisher-information matrix

2.5. Parameter estimator analysis

2.5.1. Estimator precision

Covariance matrix

Confidence interval

2.5.2. Estimator accuracy

2.5.3. Reliability tests

Hypothesis test

Confidence interval test

2.6. Identifiability

2.6.1. Qualitative identifiability

2.6.2. Quantitative identifiability

2.7. Optimal experimental design

2.7.1. OED design criteria

2.7.2. Graphical interpretation of OED design criteria

2.7.3. Observation about singular matrices in OED

2.7.4. Sequential OED

2.7.5. Online OED

Finite time horizon schemes

Online mathematical formulation of PE and OED

2.8. Initial guess sampling

2.8.1. Minimum bias Latin hypercube design (MBLHD)

3. Theoretical background II: Ill-posed problems and numerical regularization

3.1. Direct and inverse problems

3.2. Ill-posed problems

3.2.1. Ill-conditioned problems

Ill-conditioning and collinearity measures

Classification of ill-conditioned problems

Relationship between identifiability problems and ill-conditioning

Effect of ill-conditioning on parameter estimation

Effect of ill-conditioning on optimal experimental design

3.2.2. Parameter variance-decomposition

3.3. Numerical regularization for parameter estimation

3.3.1. Parameter subset selection (Reg=SsS)

3.3.2. Truncated singular value decomposition (Reg=TSVD)

3.3.3. Tikhonov regularization (Reg=Tikh)

3.3.4. Regularized parameter covariance matrix

4. Computational framework

4.1. Algorithm

4.2. Analysis paradigms

4.2.1. Sensitivity method

4.2.2. Monte Carlo method

4.3. Estimator performance assessment

4.3.1. Estimator precision

Covariance based on the Sensitivity Matrix

Covariance based on Monte Carlo

Confidence intervals

4.3.2. Estimator accuracy

4.4. Structural Analysis

4.4.1. Ill-Conditioning Analysis

Sensitivity method

Monte Carlo method

4.4.2. Identifiability diagnosis

Variance Method

SVD Method

QR Method

4.5. Regularization

4.5.1. Regularization in parameter estimation

4.5.2. Regularization in optimal experimental design

4.5.3. Selection of the regularization parameter

4.6. Other analysis

4.6.1. Parameter sensitivity analysis

4.6.2. Selection of a parameter initial guess

5. Lithium-ion battery: Finding adequate experimental data

5.1. Abstract

5.2. Li-Ion Battery Modeling

5.3. Results and Discussion

5.3.1. Case 1: Single Discharge Curve

Sensitivity Method

Monte Carlo Method

5.3.2. Case 2: Multiple Discharge Curves

Sensitivity Method

Monte Carlo Method

5.3.3. Case 3: Discharge curves and electrolyte concentration profile

5.3.4. Computational Issues

5.4. Conclusions and Future Work

6. Bioethanol: Identifying an over-parameterized model with large parameter correlations

6.1. Abstract

6.2. Bio-ethanol from cane bagasse by SSF process

6.2.1. Experiments

6.2.2. Modeling

6.3. Results and discussion

6.3.1. Model selection

6.3.2. Parameter initial guess selection

6.3.3. Iterative parameter estimation with structural analysis

6.3.4. Estimator performance assessment

6.3.5. Validation of the identified model

Cross-Validation

Assessment of the identifiable parameter subset

6.3.6. Discussion of the results

6.4. Summary and Conclusions

7. More cases from bioprocessing: the effect of ill-posed parameter estimation on optimal experimental design

7.1. Abstract

7.2. Applications

7.3. Results and Discussion

7.3.2. Ill-conditioning and identifiability diagnosis

E1 - Fed Batch Fermentation

E2 - Biochemical network

E3 - ASM3

7.3.3. Optimal design without regularization

E1 - Fed Batch Fermentation

E2 - Biochemical network

E3 - ASM3

7.3.4. Optimal design with regularization

Subset Selection (Reg=SsS)

Truncated Singular Value Decomposition (Reg=TSVD)

Tikhonov Regularization (Reg=Tikh)

7.3.5. Influence of the available measurement information on ill-posedness (New)

7.3.6. Monte Carlo study (New)

Study of the initial design

Study of the optimal design without regularization

Study of the optimal design with regularization

7.3.7. Summary and conclusions

8. Chromatography system: Scarce experimental data in online estimation

8.1. Abstract

8.2. HPLC chromatography process

8.2.1. Experiments

Experimental set-up

Operation

8.2.2. Modeling

Manager and pump

Chromatography column

UV sensor

8.3. Results

8.3.1. Assignment of variables

8.3.2. Parameters of the time horizon schemes

8.3.4. Online Base Case: PE without regularization (New)

8.3.5. Online Regularized Case: PE with regularization (New)

Regularization parameter selection

8.3.6. Online Redesign of Experiments

8.3.7. Validation of the parameter estimates by Frontal Analysis

8.3.8. Optimal input designs vs standard input designs

8.4. Conclusions

9. Summary and Outlook

9.1. Summary

9.2. Outlook

A. Appendix

A.1. Own publications and presentations

A.1.1. Articles in Journals

A.1.2. Oral Presentations and Posters

Oral Presentations

Poster

A.1.3. Proceedings

A.2. Own publications used for the cumulative thesis

A.3. Bio-processes: Parameter variance and variance-decomposition

A.4. Implication of structural properties of the sensitivity matrix on Fisher-information matrix

A.5. Matrices notions

A.5.1. Matrices and vectors

A.5.2. Matrix operations

A.5.3. Some special matrices

A.5.4. Matrix rank

A.5.5. Eigenvalues and eigenvectors

A.5.6. Inverse of a matrix

A.5.7. Matrix decompositions

A.5.8. Eigendecomposition

A.5.9. Singular value decomposition (SVD)

A.5.10. QR-factorization with column pivoting (QRP)

A.5.11. Numerical rank

A.5.12. Condition number

A.5.13. Collinearity index

A.5.14. Covariance Matrix

Properties

A.5.15. Fisher-information Matrix

Bibliography


List of Figures

1.1. Structure of the Thesis

2.1. Iterative work cycle of model-based experimentation for model development

2.2. Influence of alphabetic experimental design criteria on the singular value spectrum (SVs) of the sensitivity matrix S. (Figure from publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)

2.3. Discretization grids and time horizons used in the online algorithm. (Figure taken from Barz et al. (2013) [8] - reprinted with permission from AIChE Journal)

3.1. Graphical representation of a) the direct problem and b) the inverse problem in nonlinear modeling.

3.2. Singular value spectrum (SVs) of a rank-deficient problem. (Figure taken from publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)

3.3. Singular value spectrum (SVs) of an ill-determined rank problem. (Figure taken from publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)

4.1. Consolidated framework for development and experimental validation of process models with possible ill-posed problems

4.2. Model parameters estimated and analyzed based on the sensitivity method

4.3. Model parameters estimated and analyzed based on the Monte Carlo method

4.4. Ill-conditioning analysis based on the sensitivity method.

4.5. Identifiability diagnosis techniques based on the sensitivity method.

5.1. Li-Ion cell during discharge process. Cell consists of a LixC6 negative electrode, a LiyMn2O4 positive electrode, and a separator with LiPF6 salt-based electrolyte. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

5.2. Discharge curves for Case 1 (base rate I1 = 1C) and Case 2 simultaneously considering fast (I2 = 2C, I3 = 3C, I4 = 4C), and slow rates (I5 = 0.5C and I6 = 0.1C). Markers are experimental data and solid lines are model predictions after parameter estimation at estimator θ. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

5.3. Singular value spectra. Left panel is Case 1 (single discharge curve) and right panel is Case 2 (multiple discharge curves). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

5.4. Case 1: Variance decomposition for SVD identifiability method of Section 4.4.2. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

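5.5. Sensitivity time profiles of cell voltage with respect to parameters dVcell(t)/dθ at nominal discharge rate I1 for Case 1. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

5.6. Case 1: Marginal pdfs obtained from Monte Carlo. Solid-black lines and filled regions represent the normal and the non-parametric distributions of each estimator, respectively. Parameters with a star are nominated as identifiable. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

5.7. Sensitivity time profiles of cell voltage with respect to parameters dVcell(t)/dθ at slow I6, nominal I1, and fast I4 discharge rates for scenario Case 2-SC6. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

5.8. Case 2: Marginal pdfs for parameters obtained with Monte Carlo analysis for scenarios Case 2-SC4 (left) and Case 2-SC6 (right). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).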




5.9. Case 3: Voltage and electrolyte concentration profile at separator for scenario Case 3-SC1. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

5.10. Case 3: Spectrum of singular values under different scenarios. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).



5.11. Sensitivity time profiles of cell voltage with respect to parameters dVcell(t)/dθ at nominal I1 and fast I4 discharge rates for scenario Case 3-SC4. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

5.12. Sensitivity time profiles of electrolyte concentration in the separator with respect to parameters dce(ℓa+ℓs/2, t)/dθ at nominal I1 and fast I4 discharge rates for scenario Case 3-SC4. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

5.13. Case 3: Marginal pdfs for scenarios Case 3-SC1 (left) and Case 3-SC4 (right). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).



6.1. Simplified reaction mechanisms in SSF processes [36]. (Figure taken from

publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from

Biotechnology Progress with permission from American Institute of Chem-

ical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

6.2. Model fitting to experimental data E1 by using (left panel) the original

model proposed in Ref. [36] and (right panel) using the finally selected

model after the model selection step. (Figure taken from publication II

- Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology

Progress with permission from American Institute of Chemical Engineers). . 99

6.3. Fitting of experimental data using the model selected in step 1: (left panel)

results for E2 and (right panel) results for E3. Different initial guesses,

IGDrissen[35] and IGPhilippidis [97], were used for the solution of the pa-

rameter estimation problems where measured data from E2 and E3 was

considered simultaneously. (Figure taken from publication II - Lopez et

al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with

permission from American Institute of Chemical Engineers). . . . . . . . . . 101

6.4. Cost function (CF) of parameter estimation and numerical rank (rϵ) of

the sensitivity matrix obtained for 30 different initial guesses generated

by MBLHD considering data from E2 and E3. The maximum cost function

value for accepting an initial guess is denoted “CF Bound“.

(Figure taken from publication II - Lopez et al. (2013) in Appendix A.2

- reprinted from Biotechnology Progress with permission from American

Institute of Chemical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . 102


6.5. Results of parameter estimation with identifiability analysis. (Figure taken

from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from


ical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

6.6. Experimental vs. predicted concentrations using the parameter vector θ

calculated in iteration k = 1, 4, 6 using experimental data of E2 and E3.

Results for E2 (left panel) and results for E3 (right panel). (Figure taken



ical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

6.7. Ranking of parameters according to sensitivity (the most sensitive parame-

ter above in position 1). The identifiable parameter subset in every iteration

k is marked by shaded cells. (Figure taken from publication II - Lopez et



6.8. Estimator performance assessment: parameter statistical significance (tj-value)

and 95% confidence intervals (L ≤ θ(rk) ≤ U). Shaded cells indicate

parameters with a statistical significance of 95%. (Figure taken from publication

II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology


6.9. Cross-validation using parameter vectors obtained from parameter estima-

tion with E2&E3: experimental vs. predicted concentrations for E1 (left

panel), for E4 (middle panel), and for E5 (right panel). (Figure taken



ical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

6.10. Validation of the identifiable parameter subset using parameter vectors

obtained after solving parameter estimation problems with Nθ = 4 and

Nθ = 14: experimental vs. predicted concentrations for E1 (left panel), for

E4 (middle panel), and for E5 (right panel). (Figure taken from publication

II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology


7.1. Main procedure and nomenclature of Section 7.3. (Figure taken from publi-

cation III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers

& Chemical Engineering with permission from Elsevier). . . . . . . . . . . . 120


7.2. Singular value spectrum (SVs) of the sensitivity matrix evaluated at the

initial design S(uIG) for problem (a) E1, (b) E2 and (c) E3. Each singular

value less than ϵ-threshold is considered ill-conditioned. (Figure taken from

publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from

Computers & Chemical Engineering with permission from Elsevier). . . . . 123

7.3. Change in the singular value spectrum (SVs) of the sensitivity matrix for the

OED without regularization. Results are shown for the sensitivity matrix at

initial design S(uIG) and at optimal A-design S(uA) and E-design S(uE)

for problem E1. (Figure taken from publication III - Lopez et al. (2015)

in Appendix A.2 - reprinted from Computers & Chemical Engineering with

permission from Elsevier). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

7.4. Change in the singular value spectrum (SVs) of the sensitivity matrix for the

OED without regularization. Results are shown for the sensitivity matrix

at initial design S(uIG) and at optimal design for problem E2 with: (a)

A-criterion S(uA), and (b) E-criterion S(uE). Note that the shown lower

bound ϵ is computed for S(ucrit). (Figure taken from publication III - Lopez

et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical

Engineering with permission from Elsevier). . . . . . . . . . . . . . . . . . . 127

7.5. Singular value spectrum (SVs) of the original and the regularized sensitivity

matrix after applying Reg=SsS, S and SSsS, respectively. Results are shown

for the initial uIG and optimal ucrit experimental designs for problem: (a)

E2 where ucrit = uA, and (b) E3 where ucrit = uE. The solid-black curve

shows the SVs of the original sensitivity matrix without regularization at

initial design S(uIG). The black-cross and gray-cross markers show the SVs

of the regularized (reduced) matrix at initial and optimal designs SSsS(uIG)

and SSsS(ucrit), respectively. Note that the lower bound ϵκ is computed for

SSsS(ucrit). (Figure taken from publication III - Lopez et al. (2015) in

Appendix A.2 - reprinted from Computers & Chemical Engineering with


7.6. Singular value spectrum (SVs) of the original and the regularized sensitivity

matrix after applying Reg=TSVD, S and STSVD, respectively. Results

are shown at initial uIG and optimal ucrit experimental designs, respectively,

for problem: (a) E2 where ucrit = uA, and (b) E3 where ucrit = uE. The

solid-black curve shows the SVs of the original sensitivity matrix without

regularization at initial design S(uIG). The black-cross and gray-cross mark-

ers show the SVs of the regularized (approximated) matrix at the initial and

optimized designs STSVD(uIG) and STSVD(ucrit), respectively. Note that

the lower bound ϵκ is computed for STSVD(ucrit). (Figure taken from publication

III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers

& Chemical Engineering with permission from Elsevier). . . . . . . . . . . . 131


7.7. Singular value spectrum (SVs) of the original and the regularized sensitivity

matrix after applying Reg=Tikh, S and STikh, respectively, for problem

E3 with: (a) λ1 = 0.001 (weak regularization), and (b) λ2 = 0.1 (strong

regularization). The solid-black curve shows the SVs of the original sensitivity

matrix without regularization at the initial design S(uIG). The

black-cross markers show the SVs of the regularized matrix at the initial

design STikh(uIG). (Figure taken from publication III - Lopez et al. (2015)

in Appendix A.2 - reprinted from Computers & Chemical Engineering with


7.8. Singular value spectrum (SVs) of the original and the regularized sensitivity

matrix after applying Reg=Tikh, S and STikh, respectively. Results are

shown at initial and optimal experimental design, uIG and ucrit, respectively,

for problem E2 with: (a) ucrit = uA and (b) ucrit = uE. The solid-black

curve shows the SVs of the original sensitivity matrix without regularization

at the initial design S(uIG). The black-cross and gray-cross markers show

the SVs of the regularized matrix at initial design STikh(uIG) and at optimal

design STikh(ucrit), respectively. Note that the lower bound ϵκ is computed

for the SVs of STikh(ucrit). (Figure taken from publication III - Lopez



7.9. Comparison between the singular value spectrum (SVs) of the sensitivity

matrix S at the initial design uIG for example E1 with different experimental

data sets. E1a-c are adaptations of E1 including x1 and x2 as measurable

variables and increasing the number of experimental points to 20, 80 and

160, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

7.10. Comparison between the singular value spectrum (SVs) of the sensitivity

matrix S(ucrit) at the optimal design ucrit with crit=A,D,E for the well-

posed case E1c (no regularization). Spectrum labeled S is evaluated at the

initial design uIG. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

7.11. Monte Carlo problem E2: Box plots of normalized parameter estimates and

corresponding cost function norm obtained at (a) initial design uIG and (b)

E-optimal design uE without regularization. . . . . . . . . . . . . . . . . . 137

7.12. Monte Carlo problem E2: Box plots of normalized parameter estimates and

corresponding cost function norm obtained at regularized A-optimal design

uA with Reg=SsS by solving parameter estimations with (a) Reg=None

(Study I) and (b) Reg=SsS (Study II). . . . . . . . . . . . . . . . . . . . . . 139

8.1. Experimental set up of the chromatography system. (Figure taken from

publication IV- Barz et al. (2016) in Appendix A.2 - reprinted from Com-

puters & Chemical Engineering with permission from Elsevier) . . . . . . . 146


8.2. Schematic flow sheet of the chromatography system in Fig. 8.1. (Figure

taken from publication IV- Barz et al. (2016) in Appendix A.2 - reprinted

from Computers & Chemical Engineering with permission from Elsevier) . 146

8.3. Units of the process model, input/ output variables and unknown model pa-

rameters. (Figure taken from publication IV- Barz et al. (2016) in Appendix

A.2 - reprinted from Computers & Chemical Engineering with permission

from Elsevier) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

8.4. Responses of the manager and pump outlet concentrations (equal to the

column inlet concentrations cini) for arbitrary steps in the feed concentration

cfeedi . Flow is kept constant at 1.5 ml/min. (Figure taken from publication

IV- Barz et al. (2016) in Appendix A.2 - reprinted from Computers &

Chemical Engineering with permission from Elsevier) . . . . . . . . . . . . . 149

8.5. Online Base Case (Reg=None): Model fitting using the parameter estimate

θ at tk = 30 (shown in Table 8.2) for the online parameter estimation without

regularization. The markers show the measured sum outlet concentrations,

whereas the solid line shows the corresponding simulated concentrations. . . . 152

8.6. Online Base Case (Reg=None): Relative Bias (%) of each estimated param-

eter (w.r.t. its corresponding true parameter value in Table 8.2) computed

at each sampling time after solving parameter estimations without regular-

ization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

8.7. Online Base Case (Reg=None): Singular value spectrum (SVsk) of each sen-

sitivity matrix S−k computed at each sampling time tk after solving param-

eter estimations without regularization. The horizontal solid plane labeled

ϵ = 6.7 (see Eq. 3.3 with γmax = 15 and κmax = 1000) is the typical threshold

used to select the ill-conditioned singular values in the sensitivity method

ill-conditioning analysis of Section 4.4.1. . . . . . . . . . . . . . . . . . . . . 154

8.8. Online Case SsS (Reg=SsS): Relative Bias (%) of the estimated parame-

ter vector θk (w.r.t. the true parameter vector in Table 8.2) computed

at each sampling time tk with k = 1, · · · , Nm after solving parameter es-

timations by using subset selection as regularization for several values of

the regularization parameter ϵ-threshold. Regularization parameters with

the best performance in terms of the accuracy of the estimated parame-

ter vector at the end of the experiment t = 30 min are enclosed in the

red box. “None“ makes reference to parameter estimations without regu-

larization (Reg=None). Note that ϵ-threshold is always defined by γmax

according to Eq. 3.3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156


8.9. Online Case SsS (Reg=SsS): Mean Relative Bias (%) for the whole ex-

periment duration of the Nm estimated parameter vectors θk with k =

1, · · · , Nm (w.r.t. the true parameter vector in Table 8.2) obtained after

solving the Nm parameter estimations by using subset selection as regular-

ization for several values of the regularization parameter ϵ-threshold. Reg-

ularization parameters with the best global performance in terms of the

accuracy of the estimated parameter vectors during the whole experiment

are enclosed in the red box. . . . . . . . . . . . . . . . . . . . . . . . . . . 157

8.10. Online Case SsS (Reg=SsS): Singular value spectrum SVstk of the sensitiv-

ity matrix S−k computed at each sampling time tk = 3, 5, 10, 15, 20, 25, 30

after solving parameter estimations by using subset selection as regularization

for several values of the regularization parameter, i.e., ϵ = 0.1, 2, 500.

Large values of ϵ produce a strong regularization of the PE; otherwise the

regularization is weak. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

8.11. Online Case Tikh (Reg=Tikh) for λ regularization parameter selection:

Relative Bias (%) of the estimated parameter vector θk (w.r.t. the true

parameter vector in Table 8.2) computed at each sampling time tk with

k = 1, · · · , Nm after solving parameter estimations by using Tikhonov

regularization (Reg=Tikh) for several values of the regularization

parameter λ in the course of the online experiment. Regularization param-

eters with the best performance in terms of the accuracy of the estimated

parameter vector at the end of the experiment t = 30 min are enclosed

in the red box. “None“ makes reference to parameter estimations without

regularization (Reg=None). Note that λ is here defined by σθ according to

Eq. 8.11. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159


Mean Relative Bias (%) for the whole experiment duration of the Nm esti-

mated parameter vectors θk with k = 1, · · · , Nm (w.r.t. the true parameter

vector in Table 8.2) obtained after solving Nm parameter estimations by using

Tikhonov regularization for several values of the regularization

parameter λ. Regularization parameters with the best global performance

in terms of the accuracy of the estimated parameter vectors during the

whole experiment are enclosed in the red box. “None“ makes reference to

parameter estimations without regularization (Reg=None). Note that λ is

defined by σθ according to Eq. 8.11. . . . . . . . . . . . . . . . . . . . . . . 160



Singular value spectrum SVsTikhtk and SVstk of the regularized and original

sensitivity matrices S−,Tikhk and S−k computed at each sampling time

tk = 5, 10, 15, 20, 25, 30 after solving parameter estimations by using

Tikhonov as regularization for regularization parameter λ = 50. Notice

that singular values ςi ≤ λ are approximated to values around λ. . . . . . . 161

8.14. D-optimal adaptive input design for feeding strategy FS-1. Subfigures a,

b show the measured sum and predicted individual outlet concentrations.

Subfigure c shows the input design, i.e. the inlet concentrations. Subfigure d

shows the results of the identifiability analysis for the parameters θ1, · · · , θ6.

If a parameter was identifiable and selected by the subset selection (SsS)

algorithm, this parameter was active and its activity was indicated by a

dot. If a dot was missing, the parameter was not active and not identifiable.

Subfigure e shows the course of the parameter estimates. (Figure taken

from publication IV- Barz et al. (2016) in Appendix A.2 - reprinted from

Computers & Chemical Engineering with permission from Elsevier) . . . . 162

8.15. D-optimal adaptive input design for feeding strategy FS-2. Subfigures a,

b show the measured sum and predicted individual outlet concentrations.

Subfigure c shows the input design, i.e. the inlet concentrations. Subfigure d

shows the results of the identifiability analysis for the parameters θ1, · · · , θ6.

If a parameter was identifiable and selected by the subset selection (SsS)

algorithm, this parameter was active and its activity was indicated by a

dot. If a dot was missing, the parameter was not active and not identifiable.

Subfigure e shows the course of the parameter estimates. (Figure taken

from publication IV- Barz et al. (2016) in Appendix A.2 - reprinted from

Computers & Chemical Engineering with permission from Elsevier) . . . . 163

8.16. Validation of the parameter estimates for single component adsorption. Ad-

sorption isotherms obtained by FA are shown by calculated equilibrium

points. Predicted adsorption isotherms using the Langmuir model are

shown by lines. Predictions are made using parameter estimates from D-

optimal designs and standard input designs (uniform and pulse). (Figure


from Computers & Chemical Engineering with permission from Elsevier) . . 164

8.17. Input design (subfigure c) and outlet concentrations (subfigure a, b) for

feeding strategy FS-2 and a standard input design generated by a sum of

sinusoids; ’in silico’ experiment. (Figure taken from publication IV- Barz


Engineering with permission from Elsevier) . . . . . . . . . . . . . . . . . . 165


8.18. Input design (subfigure c) and outlet concentrations (subfigure a, b) for

feeding strategy FS-2 and a standard input design generated by a uniform

sampling; real experiment. (Figure taken from publication IV- Barz et al.

(2016) in Appendix A.2 - reprinted from Computers & Chemical Engineer-

ing with permission from Elsevier) . . . . . . . . . . . . . . . . . . . . . . . 166

A.1. Probability distribution of the estimators Θ1 and Θ2 . . . . . . . . . . . . . 197


List of Tables

3.1. Definition of the specific derivatives used for “SsS“, “TSVD“ and “Tikh“ regularization;

regularization “Reg“ equal to “None“ refers to the original parameter

estimation problem. (Table from publication III - Lopez et al. (2015)

- reprinted with permission from Elsevier Science) . . . . . . . . . . . . . . 46

5.1. Variables in Li-Ion Model. (Table taken from publication I - Lopez et al.

(2016) in Appendix A.2 - reprinted from Industrial & Engineering Chem-

istry Research with permission from American Chemical Society). . . . . . 69

5.2. Estimated parameters in Li-Ion Model. . . . . . . . . . . . . . . . . . . . . . 69

5.3. Operating and design variables, constants and fixed parameters in Li-Ion

Model. (Table taken from publication I - Lopez et al. (2016) in Appendix



5.4. Governing equations for modified Li-Ion PDAE model. (Table taken from




5.5. Auxiliary equations of modified Li-Ion PDAE model. (Table taken from




5.6. Case 1 for Sensitivity method. (Table taken from publication I - Lopez

et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering

Chemistry Research with permission from American Chemical Society). . . 76

5.7. Case 1, 2, and 3: Confidence interval lengths for Sensitivity and Monte

Carlo methods. Lengths are expressed as percentages relative to the true

parameter. (Table taken from publication I - Lopez et al. (2016) in Ap-

pendix A.2 - reprinted from Industrial & Engineering Chemistry Research

with permission from American Chemical Society). . . . . . . . . . . . . . . 77

5.8. Case 1: Summary of results for Monte Carlo. (Table taken from publication

I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial &

Engineering Chemistry Research with permission from American Chemical

Society). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

5.9. Case 2: Summary of results for Sensitivity and Monte Carlo methods.

(Table taken from publication I - Lopez et al. (2016) in Appendix A.2 -

reprinted from Industrial & Engineering Chemistry Research with permis-

sion from American Chemical Society). . . . . . . . . . . . . . . . . . . . . . 84


5.10. Computational results for simulation and parameter estimation problems.

(Table taken from publication I - Lopez et al. (2016) in Appendix A.2 -

reprinted from Industrial & Engineering Chemistry Research with permis-

sion from American Chemical Society). . . . . . . . . . . . . . . . . . . . . . 89

6.1. Experimental Conditions for Bio-Ethanol Production from Sugarcane

Bagasse in the SSF Process. (Table taken from publication II - Lopez et



6.2. Initial guesses taken from literature and generated by MBLHD. (Table taken



ical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

6.3. Estimated parameter vector using E1, E4, and E5 experimental data after

solution of different PE problems. Column labeled “E2&E3“ contains the

estimated parameters after finishing the iterative parameter estimation with

structural analysis (see Figure 6.5). (Table taken from publication II - Lopez

et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with


7.1. Problem description for case studies E1, E2 and E3. (Table taken from

publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from

Computers & Chemical Engineering with permission from Elsevier). . . . . 119

7.2. Thresholds for condition number (κmax) and collinearity index (γmax). (Table

taken from publication III - Lopez et al. (2015) in Appendix A.2 -

reprinted from Computers & Chemical Engineering with permission from

Elsevier). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

8.1. Different feeding strategies (FS) for the chromatography system. Abbrevi-

ations: ethyB = ethyl benzoate, propB = propyl benzoate, butyB = butyl

benzoate. (Table taken from publication IV- Barz et al. (2016) in Appendix

A.2 - reprinted from Computers & Chemical Engineering with permission

from Elsevier) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

8.2. Parameter estimates for the online parameter estimation using different

regularization strategies. ’True values’ refers to the best estimates obtained

from offline estimation without iteration limits. . . . . . . . . . . . . . . . . 153

8.3. Parameter accuracy for experimental data obtained from different input

designs and feeding strategies FS-1 and FS-2. Input designs marked by a

star were realized experimentally; all others are ’in silico’ experiments. (Table


from Computers & Chemical Engineering with permission from Elsevier) . . 165


A.1. Variance-decomposition: E1 - Fed Batch Fermentation. (Table taken from

publication III - Lopez et al. (2015) - reprinted from Computers & Chemical


A.2. Variance-decomposition: E2 - Biochemical network. (Table taken from

publication III - Lopez et al. (2015) - reprinted from Computers & Chemical


A.3. Variance-decomposition: E3 - ASM3. (Table taken from publication III -

Lopez et al. (2015) - reprinted from Computers & Chemical Engineering

with permission from Elsevier). . . . . . . . . . . . . . . . . . . . . . . . . . 182


Nomenclature

Abbreviations

ANOVA Analysis of variance

DAE Differential and algebraic equation

DoF Degrees of freedom

FIM Fisher-information matrix

IG Initial guess

LHD Latin hypercube design

LSQ Least-squares

MBLHD Minimum bias Latin hypercube design

MSE Mean square error

OED Optimal experimental design

PD Positive definite

PDAE Partial differential and algebraic equation

PE Parameter estimation

PSD Positive semi-definite

Pubs. Scientific publications

QRP QR decomposition with column pivoting

SSF Simultaneous saccharification and fermentation process

SsS Identifiable parameter subset selection

SVD Singular value decomposition

SVs Singular value spectrum

Sym Symmetric matrix

Tikh Tikhonov

TSVD Truncated singular value decomposition

Latin symbols

a Scaling factor

C Covariance matrix (which is related to parameters if having no subscript)

CF Cost function (objective function) of the optimization problem either

PE or OED

f Set of DAEs representing the process model

F Fisher-information matrix

h Set of relations between y and x

H0 Null hypothesis in the hypothesis test

H1 Alternative hypothesis in the hypothesis test

H Hessian matrix


INθ Identity matrix with dimension Nθ × Nθ

J Jacobian matrix

L Operator matrix in Tikhonov regularization

LB Lower confidence limit

Ne Number of experiments

NIG Number of IGs

Nm Number of experimental data sampling times

Nmod Number of available models

Nmu Number of input variable switching times

Nu Number of input variables

Nx Number of dependent state variables

Ny Number of measured response variables

Nθ Number of parameters

PD(Nθ) Subspace of the positive definite matrices with dimension Nθ ×Nθ

PSD(Nθ) Subspace of positive semi-definite matrices with dimension Nθ ×Nθ

Q Orthogonal matrix of QRP

r Numerical rank

R Upper triangular matrix of QRP

S Sensitivity matrix

s Component of the sensitivity matrix

Sv Rectangular diagonal matrix of SVD

Sym(Nθ) Space of symmetric matrices with dimension Nθ ×Nθ

t Independent variable time

T Standardized random variable for the confidence interval

texp Experiment duration

TH0 Test statistic for the hypothesis test

tα/2,(Ny·Nm·Ne−Nθ) Critical value of the two-tailed Student’s t-distribution for the confidence

level α and DoF equal to (Ny·Nm·Ne − Nθ)

u Input action or experiment design / left singular vector

U Real or complex unitary matrix of SVD

UB Upper confidence limit

Var Variance metric

V Real or complex unitary matrix of SVD

v Right singular vector

x Dependent state variable

y Observed model response variable

Y Total observed model response vector

ym Measured response variable

Y m Total measured response vector

Z Residual vector of parameter estimation


Greek symbols

α Confidence level

β Bias

γ Collinearity index

δ Sensitivity measure

ϵ Singular value threshold / Measurement error

θ Unknown parameter vector

Θ Parameter estimator

κ Condition number

λ(A) Eigenvalue of matrix A

λ Tikhonov regularization parameter

ξ Experiment

π Parameter variance-decomposition proportion

Π Permutation matrix of QRP

ρ Parameter variance threshold / Step size

σ Standard deviation

σ2 Variance

σ2j |ςi Variance component of the j-th parameter associated to ςi

Σ Standard deviation matrix

ς Singular value

υ Step direction

Φ Cost function of the PE

Ψ Cost function of the OED

Additional Indices and Subscripts

IG Related to the initial guess

i, j Vector/Matrix indices

(j) Related to the j-th replication in Monte Carlo

k Iteration index

mach Related to the machine precision

trsh Related to the threshold

u Related to the input variables

x Related to the state variables

y Related to the measured variables

θ Related to the parameters

κ Related to the condition number

γ Related to the collinearity index

0 Related to the initial condition


Additional Superscripts

crit Related to alphabetic criteria after conducting OED, crit = A, D, E

LSQ Related to the weighted nonlinear least-squares

max Maximum value

min Minimum value

None Related to original problem without regularization

(Nθ−rϵ) Related to the Nθ − rϵ unidentifiable parameters

R Related to the predefined parameter vector in Tikhonov

Reg Related to the regularized problem

(rϵ) Related to the rϵ identifiable parameters

SsS Related to the regularized problem with SsS

Tikh Related to the regularized problem with Tikh

TSVD Related to the regularized problem with TSVD

∗ Related to the true value

− Related to the past intervals/time instants

+ Related to the future intervals/time instants

Related to the normalized value

Related to the point estimate or optimal solution

Related to the linear independence reordering if applied to θ or scaling

if applied to S

Related to the first derivative

Special symbols

E Set of experiments

E [·] Sample expectation

N Normal distribution

R Real numbers

C Complex numbers

T Set of time measurement points

T u Set of time switching input action points

∇ Gradient


Co-authorship

The research presented in this thesis was conducted by myself under the supervision of

Prof. Dr.-Ing. habil. Prof. h.c. Dr. h.c. Gunter Wozny of the Chair of Process Dynamics

and Operation at the Technische Universität Berlin. Research in Chapter 5 was published

in Industrial & Engineering Chemistry Research, research in Chapter 6 was published in

Biotechnology Progress, and research in Chapters 7 and 8 was published in Computers &

Chemical Engineering.

The topics in each peer-reviewed article were investigated by myself. The articles

were then drafted by myself and revised by Professor Gunter Wozny, who is a co-author.

Professor Victor Zavala contributed to Chapter 5 by editing and reviewing it. Professor

Mariana Penuela contributed the experimental data in Chapter 6. Professor Silvia Ochoa

reviewed the results in Chapter 6, and M.Sc. Adriana Villegas contributed to the model

development in the same chapter. Dr. Tilman Barz collaborated in the model development,

the experimental and numerical implementation, and the writing of the peer-reviewed

articles presented in Chapters 7 and 8. Dr. Stefan Korkel reviewed the mathematical

background of Chapters 7 and 8. Dr. M. Nicolas Cruz B. and Dr. Sebastian Walter reviewed

the contents of Chapter 8.


List of publications used for this thesis

This thesis is based on the following publications. The complete list of publications and

contributions in conferences which were co-authored during my PhD can be found in the

Appendix A.1.

Publication I:

D. C. Lopez C., G. Wozny, A. Flores-Tlacuahuac, R. Vasquez-Medrano, and V. M. Zavala.

A computational framework for identifiability and ill-conditioning analysis of lithium-ion

battery models. Industrial & Engineering Chemistry Research, 55(11):3026-3042, 2016.

http://pubs.acs.org/doi/abs/10.1021/acs.iecr.5b03910

Publication II:

D. C. Lopez C., T. Barz, M. Penuela, A. Villegas, S. Ochoa, and G. Wozny. Model-

based identifiable parameter determination applied to a simultaneous saccharification

and fermentation process model for bio-ethanol production. Biotechnology Progress,

29(4):1064-1082, 2013. http://onlinelibrary.wiley.com/doi/10.1002/btpr.1753/abstract

Publication III:

D. C. Lopez C., T. Barz, S. Korkel, and G. Wozny. Nonlinear ill-posed problem analysis

in model-based parameter estimation and experimental design. Computers & Chemical

Engineering, 77:24-42, 2015. http://dx.doi.org/10.1016/j.compchemeng.2015.03.002

Publication IV:

T. Barz, D. C. Lopez C., M. N. Cruz B., S. Korkel, and S. F. Walter. Real-

time adaptive input design for the determination of competitive adsorption isotherms

in liquid chromatography. Computers & Chemical Engineering, 94:104-116, 2016.

http://dx.doi.org/10.1016/j.compchemeng.2016.07.009


1. Introduction

1.1. Research motivation

Recovering unknown parameters (e.g., kinetic, transport and thermodynamic parameters)

from available experimental data is a key task in the development of process models. Their

determination involves the numerical solution of the parameter estimation (PE) problem.

This is typically difficult because of challenges arising from the model structure

and the experimental data. High non-linearity, poor model fitting, over-parameterization

and correlated parameters are challenges related to the model structure. A lack

of informative experimental data, correlated measured outputs, and highly corrupted

experimental data are challenges arising from the experiment.

The aforementioned challenges are well-accepted sources of poor identifiability or even non-identifiability in linear and nonlinear estimation [3, 10, 41, 120, 124]. When this happens, the estimation may be unstable, showing convergence and numerical problems. In addition, the recovered parameters may have large variance and may not be reliable for further use. In fact, identifiability deficiency may be represented mathematically by an ill-conditioned problem. A problem is considered ill-conditioned if small errors in the data produce large errors in the solution. In that case, the problem is said to be ill-posed due to the presence of ill-conditioning. Over the last decades, identifiability and ill-conditioning issues have been addressed separately. However, the combined analysis of both in nonlinear estimation and model-based experimental design has rarely been performed in theoretical studies, and even less so in real applications.
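The notion of ill-conditioning described above can be illustrated with a minimal numerical sketch. The 2x2 matrix, the "true" parameters and the size of the data error below are hypothetical values chosen only for illustration; they are not taken from any case study of this thesis.

```python
import numpy as np

# Hypothetical linear problem A @ theta = y with nearly collinear columns.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
theta_true = np.array([1.0, 1.0])
y = A @ theta_true

# Small error in the data (order 1e-4 in one entry).
y_noisy = y + np.array([0.0, 1e-4])

theta_clean = np.linalg.solve(A, y)
theta_noisy = np.linalg.solve(A, y_noisy)

cond = np.linalg.cond(A)  # ratio of largest to smallest singular value
rel_data_err = np.linalg.norm(y_noisy - y) / np.linalg.norm(y)
rel_sol_err = (np.linalg.norm(theta_noisy - theta_clean)
               / np.linalg.norm(theta_clean))

print(f"condition number:        {cond:.1e}")
print(f"relative data error:     {rel_data_err:.1e}")
print(f"relative solution error: {rel_sol_err:.1e}")
```

In this toy problem the relative error of the solution exceeds the relative error of the data by several orders of magnitude, which is the behavior the large condition number warns about.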

Discrete ill-posed problems are encountered in many fields and arise from discretely available data or from the discretization of ill-posed problems. Examples of ill-posed inverse problems can be found in chemical engineering [5, 4, 8, 79, 78, 80, 83], damage monitoring [73], image processing [59] and geophysics [48, 49], to name just a few. Especially in engineering, their sound analysis is not yet common practice, and difficulties arising from the determination of uncertain parameters are typically not attributed to their proper cause. Regularization techniques for the solution of ill-posed parameter estimation problems have been developed and have found wide application in numerical algorithms [13, 52, 63, 66]. However, a critical analysis of the ill-posedness of the studied problem and of the possible consequences for its solution is frequently not carried out by the user.

To further improve the precision of parameter estimates, model-based or optimal experimental design (OED) techniques have been developed with the aim of maximizing the information content of experimental data [41, 101]. In recent years, approaches to handle ill-posed problems in optimal experimental design have been proposed [4, 15, 48, 49, 59, 93]. Nonetheless, due to their high complexity and computational burden they are rarely used. Consequently, the results of common optimal designs that turn out to be ill-posed might be doubtful. Furthermore, the behavior of the estimator in terms of variance, bias and convergence in parameter estimations using regularized optimal designs has not been investigated either. These matters are addressed throughout this thesis.

1.2. Research scope

The focus of this thesis is to study the features, sources, consequences and treatments of nonlinear ill-posed problems in the context of parameter estimation and optimal experimental design of dynamical systems. Unstable estimations and poor robustness against noise, both generated by ill-conditioning, are the main characteristics of the problems studied throughout this research. Such problems suffer from poor identifiability and large parameter uncertainty.

The backbone of this research is the singular value analysis of the sensitivity matrix, which exploits the link between ill-conditioning and identifiability. With this in mind, a consolidated computational framework to systematically evaluate ill-posedness in model-based parameter estimation and experimental design is proposed. It includes ill-conditioning and identifiability analysis as well as regularization techniques. In summary, this thesis is based on two major axes:

1. detection of ill-conditioning to diagnose identifiability issues (topics treated in Publication I [80], Publication II [79], Publication III [78] and Publication IV)

2. application and analysis of regularization techniques in ill-posed PE and OED problems (topics treated in Publication II [79], Publication III [78] and Publication IV).

The first major axis includes various techniques, such as singular value analysis of the sensitivity matrix to detect ill-conditioning and diagnose local identifiability issues; parameter variance decomposition to determine the effect of ill-conditioning on parameter precision; orthogonal decompositions to rank and determine identifiable parameters; and the study of dynamic sensitivity profiles to assess the excitation provided by different parameters on different outputs, among other techniques. Monte Carlo studies are carried out to support the conclusions obtained from the singular value analysis.

The second major axis discusses different numerical approaches to deal with nonlinear ill-posed problems. Hence, regularization in PE as well as in OED is investigated. Three regularization techniques are treated, namely parameter subset selection (SsS), truncated singular value decomposition (TSVD, generalized inverse) and Tikhonov regularization. For the regularized optimality criteria in OED only the variance contribution is considered [3, 38, 78]. This approach reduces the computational effort compared to the bi-level optimization approach proposed in [49, 59, 73]. Moreover, it allows a better understanding of the effects of ill-posedness and its numerical stabilization treatments on the typical OED design criteria. The performance of the mentioned regularization techniques in PE is assessed by Monte Carlo studies.
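As a rough illustration of two of the regularization techniques named above, the following sketch applies TSVD and Tikhonov regularization to a linear least-squares toy problem. It is not the nonlinear formulation used in this thesis; the matrix `A`, the noise level, the truncation index `k` and the regularization parameter `lam` are hypothetical choices made only for the example.

```python
import numpy as np

def tsvd_solve(A, y, k):
    """Least-squares solution keeping only the k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    coeffs = (U[:, :k].T @ y) / s[:k]
    return Vt[:k].T @ coeffs

def tikhonov_solve(A, y, lam):
    """Minimizer of ||A theta - y||^2 + lam^2 ||theta||^2 via SVD filter factors."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    filt = s / (s**2 + lam**2)
    return Vt.T @ (filt * (U.T @ y))

# Hypothetical ill-conditioned problem: two nearly collinear columns.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)
A = np.column_stack([t, t + 1e-6 * rng.standard_normal(t.size)])
theta_true = np.array([1.0, 1.0])
y = A @ theta_true + 1e-3 * rng.standard_normal(t.size)

theta_ls = np.linalg.lstsq(A, y, rcond=None)[0]   # unregularized estimate
theta_tsvd = tsvd_solve(A, y, k=1)
theta_tik = tikhonov_solve(A, y, lam=1e-2)

print("unregularized:", theta_ls)
print("TSVD (k=1):   ", theta_tsvd)
print("Tikhonov:     ", theta_tik)
```

Both regularized estimates suppress the contribution of the smallest singular value, which is where the measurement noise is amplified in the unregularized solution.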

The above-mentioned axes are fundamental in model-based parameter estimation and experimental design, since ill-posedness is a common problem in both. They can be applied independently in either optimization. Although some case studies show how one or both major axes can be included in either parameter estimation or experimental design, they are combined in the framework of Chapter 4. This general computational framework is conceived for the development of process models using model-based experimentation under ill-posedness. It should be noted that the framework may also be applied to well-posed problems. In fact, it should be applied to general problems regardless of their state of ill-posedness.

The framework considers the solution of parameter estimation and optimal experimental design as its foremost computational tasks. In parameter estimation, two paradigms to analyze an estimator are described. The first paradigm uses parameter-output sensitivity information, whereas the second relies on Monte Carlo studies. Strategies to mitigate ill-conditioning and improve model identification, such as model structure modification, experimental data analysis, parameter initial guess selection and numerical regularization, are illustrated in the case studies. In optimal experimental design, a link between the singular value analysis and the common alphabetic design criteria is established, and information for the solution of the ill-posed optimal experimental design is derived. The implications of performing the typical alphabetic optimal experimental design under ill-conditioning and identifiability problems are also discussed. This situation can occur even after the identifiability issues have been detected. Nonetheless, there is no clear understanding of the behavior of an optimal experimental design arising from an ill-posed parameter estimation and of its actual impact on subsequent estimations. Finally, it is also shown how the experimental design performs when the parameter covariance matrix is approximated by a sensitivity matrix from a regularized estimation.

To accomplish the goal of this thesis, several contributions to scientific journals and conferences achieved during this PhD research are cited. Although the manuscript is essentially based on four full-text peer-reviewed articles (listed in Appendix A.2), the other contributions co-authored by the author of this thesis are mentioned here as further examples and applications. The complete list of contributions to journals and conferences is given in Appendix A.1. Furthermore, supplementary data, computations and analyses that have not yet been published are provided throughout the manuscript. The combination of published and new information is used to establish the entire computational framework and to present an overall discussion and conclusion, which is more than just the sum of the individual publications.


1.3. Research outline

The following chapters contain summaries and a selection of results of the publications on which this thesis is based. The structure of the thesis is displayed in Figure 1.1. Note that each peer-reviewed article listed in Appendix A.2, which was produced during this project, has its own methodology and case study serving the purpose of the specific publication. This thesis, however, is conceived to unify the separate methodologies of these articles in the consolidated computational framework presented in Chapter 4. To do so, specific chapters are required, namely the theoretical background chapters (Chapters 2 and 3) and the computational framework chapter (Chapter 4). Chapter 2 summarizes the mathematical formulations, definitions and techniques related to well-established concepts of parameter estimation and optimal experimental design. Chapter 3 deals with the major concepts regarding ill-posed problems and regularization. The framework in Chapter 4 incorporates various levels with strategies to select the parameter initial guess, to assess the estimator performance, to detect ill-conditioning, to analyze identifiability issues, and to regularize the parameter estimation and the optimal experimental design. These levels are connected by the singular value analysis of the output-sensitivity matrix, which provides an explanation of the parameter variability.

Once the computational framework has been unified, the subsequent chapters present six case studies from bioprocess and chemical engineering, which were the subject of the peer-reviewed articles. Each case study illustrates parts of the computational framework and their application to overcome typical deficiencies or weak practices in parameter estimation and optimal experimental design. In Chapter 5 the impact of poorly informative experimental data on the ill-conditioning and identifiability of an energy storage system, namely a lithium-ion battery, is analyzed. In Chapter 6 the determination of the parameters of an over-parameterized and parameter-correlated model in the context of bio-ethanol production is investigated. In Chapter 7 the optimal experimental design for three ill-posed bio-processes, namely a semi-continuous fermentation, a biochemical growth reactor and a water treatment process, is studied. In Chapter 8 the parameters and the experimental design of a chromatography system are optimized online. Special attention is paid to handling the deficiency of scarce experimental data in online estimation.

Model selection and reduction as well as parameter initial guess selection are briefly discussed in Chapter 6. The ill-conditioning analysis is formally applied in Chapters 5, 7 and 8. Practical identifiability diagnosis and estimator performance assessment are presented in all case studies: energy storage systems in Chapter 5, bio-ethanol production in Chapter 6, bioreactors for several purposes in Chapter 7 and liquid chromatography separation in Chapter 8. Regularization in parameter estimation is treated in Chapters 6, 7 and 8. Moreover, the analysis of regularized optimal experimental design, along with some recommendations for selecting regularization parameters, is shown in Chapters 7 and 8. Dynamic sensitivity profiles to assess the excitation provided by different parameters on different outputs are presented in Chapter 5. Monte Carlo analyses are conducted to support the findings obtained with the sensitivity method in Chapters 5, 7 and 8. In addition, conclusions about the respective case study and the implications of using the framework are given at the end of each chapter. Finally, Chapter 9 summarizes the work and gives an outlook.


[Figure 1.1 (diagram not reproduced). Thesis: Systematic evaluation of ill-posed problems in model-based parameter estimation and experimental design. Chapter 1: Introduction. Chapters 2 and 3: Theoretical background I, II. Chapter 4: Computational framework. Chapters 5, 6, 7, 8: Case studies, namely lithium-ion battery (Chapter 5, Publication I: López et al. 2016), bioethanol (Chapter 6, Publication II: López et al. 2013), diverse bio-processes (Chapter 7, Publication III: López et al. 2015, plus new information) and chromatography system (Chapter 8, Publication IV: Barz et al. 2016, plus new information). Framework elements: model selection (Pub. II); model reduction (Pub. II); parameter initial guess selection (Pub. II); ill-conditioning analysis (Pubs. I, II, III, IV); practical identifiability diagnosis (Pubs. I, II, III, IV); estimator performance assessment (Pubs. I, II, III, IV); PE with regularization (Pubs. II, III, IV); OED with regularization (Pubs. III, IV); regularization parameter selection (Pubs. III, IV). Applications: over-parameterized and parameter-correlated model; scarce experimental data in online estimation; improved selection of experimental data; effects of ill-posed PE on OED.]

Figure 1.1.: Structure of the thesis


2. Theoretical background I: Model-based parameter estimation and experimental design

This chapter summarizes the mathematical background, the basic definitions and the notation of model-based parameter estimation and experimental design that are used in the computational framework of Chapter 4 to systematically evaluate ill-posedness in process models. It is important to point out that all information in this chapter refers to well-posed problems. Ill-posed problems and their implications for model-based parameter estimation are described in Chapter 3.

To support the matrix concepts mentioned in this chapter, the formal definitions, propositions, theorems and lemmas on matrix notions presented in Appendix A.5 are referenced. Special attention is paid to the iterative refinement of the experimental design together with a repeated update of the parameter values. As an extension of this so-called adaptive design, the idea of an online model-based redesign of experiments is also explained.

The chapter is organized as follows. First, the basis for developing models through model-based experimentation is presented. Second, the general form of the models considered in this thesis is given. Third, the optimization formulation for parameter estimation is outlined. Fourth, the components used to evaluate the performance of an estimator, i.e., precision and accuracy, are defined. Fifth, the concepts of identifiability are summarized. Finally, the optimal experimental design, including the sequential (adaptive) and online approaches, is described.

2.1. Model development

When a process is run at any scale, whether in the laboratory, a pilot plant or industry, the advantages of modeling it are substantial. Benefits such as the design of new processes, products and configurations, increased profit, higher product quality, energy savings, cost and loss reduction, mitigation of environmental damage and yield maximization, among others, result from using models in process and product design, control, operation and optimization. Several kinds of approaches to model a (dynamic) process exist in the literature. Black-box models (e.g., neural networks, genetic algorithms, etc.), first-principles models and hybrid models (which combine first-principles and empirical models) are some of them. The challenge is to figure out how to develop a suitable model (in structure and quality of parameters) for the specific process. To do so, experimentation can be used either to select an adequate model structure or to guarantee low uncertainty in the model parameters, instead of directly enhancing the process itself. Figure 2.1 depicts the so-called model-based experimentation for model development.

[Figure 2.1 (diagram not reproduced): work cycle connecting the stages Experimentation, Modeling, Parameter estimation (min. model error) and Optimal experimental design (max. discrimination criterion or min. parameter variance).]

Figure 2.1.: Iterative work cycle of model-based experimentation for model development

The work cycle in this figure comprises the execution of experiments to collect measurements (Experimentation stage), the mathematical formulation of models to compute the process states (Modeling stage), the estimation of parameters (Parameter estimation stage) and, as required, the subsequent design of new experiments (Optimal experimental design stage). This strategy is iterative and integrates experimental techniques, mathematical modeling, model identification and optimal experimental design. The model parameters are estimated by minimizing the model error with respect to the available experimental data. This estimation, along with the selection of the model structure, constitutes the task known as model identification.

Figure 2.1 displays two model-based experimentation work cycles. The inner work cycle (dashed-line circle) deals with the selection of the model structure among several model candidates, whereas the outer work cycle (solid-line circle) treats the improvement of the parameter precision for a fixed model structure. To pass from the inner to the outer level, the best model structure must be selected. Once this structure is fixed, a reliable estimation of the parameters from the available data is carried out. Parameter estimation and optimal experimental design are required in both levels. Nonetheless, the optimal experimental design (which is an optimization problem) has a different cost function (objective function) in each level. In this optimization the process model structure is a constraint and the degrees of freedom are the experimental conditions. For model structure selection (inner work cycle in Figure 2.1) the optimal design is asked to maximize the difference between the predictions of the model candidates. This task is known as model discrimination. For improving the parameter precision, in contrast, the experimental design aims at reducing a parameter variance metric (e.g., the average parameter variance or the largest parameter variance) of a single model. Each level terminates when some stopping criterion, for instance on predictive quality/model fitting or on parameter variance, is met. Note that in the model structure selection level the number of parameter vectors to be estimated depends on the number of model candidates, whereas in the level for improving parameter precision it is only one.

One of the interests of this thesis is to improve the understanding of how the outer cycle in Figure 2.1 works when applied to nonlinear first-principles models, additionally considering the diagnosis and treatment of ill-conditioning and identifiability issues. The basics of optimal experimental design are given in Section 2.7. Further explanations of the outer cycle in Figure 2.1 are summarized in the adaptive or sequential model-based design of experiments of Section 2.7.4. Finally, the extension of the outer cycle to real-time applications is explained in Section 2.7.5.

2.2. Model formulation

Dynamic processes are frequently described by mechanistic models, which may be formulated as systems of ordinary differential equations (ODEs), differential and algebraic equations (DAEs) or partial differential and algebraic equations (PDAEs). A PDAE system is usually discretized in space (for instance, in radial and axial coordinates) to obtain a set of DAEs.

The mechanistic model is represented here by a system of DAEs of the general implicit form in Eq. 2.1. This representation covers ODE models, when there is no algebraic part in the formulation, as well as PDAE models after discretization:

$$0 = f(t, x, \dot{x}, u, \theta) \tag{2.1a}$$

$$x(t_0) = x_0(u_0, \theta), \quad \dot{x}(t_0) = \dot{x}_0(u_0, \theta) \tag{2.1b}$$

where $x, \dot{x} \in \mathbb{R}^{N_x}$ are vectors containing the state variables and their first derivatives, respectively; $t \in \mathbb{R}$ is the independent variable (here, time); $u \in \mathbb{R}^{N_u}$ contains the time-varying input actions (i.e., experiment design vector or controls); and $\theta \in \Omega \subset \mathbb{R}^{N_\theta}$ is the unknown parameter vector. Note that $\Omega$ is known as the parameter space and that $x$ is considered to have sub-vectors, denoted $x_d$ and $x_a$, which are its differential and algebraic parts, respectively. The mapping $f(\cdot): \mathbb{R}^{N_x} \times \mathbb{R}^{N_u} \times \mathbb{R}^{N_\theta} \to \mathbb{R}^{N_x}$ is assumed to be twice continuously differentiable. The initial values $x_0, \dot{x}_0$ of the DAE system (Eq. 2.1b) are both initialized to satisfy $0 = f(t_0, x_0, \dot{x}_0, u_0, \theta)$ for fixed $u_0 = u(t_0)$ and $\theta$. The input actions $u$ can be represented by piecewise polynomial approximations, e.g., constant, linear or quadratic functions. Here, a piecewise constant approximation is considered. In addition to the system in Eq. 2.1, consider the vector of observed model responses $y \in \mathbb{R}^{N_y}$ given by the mapping $h(\cdot): \mathbb{R}^{N_x} \to \mathbb{R}^{N_y}$,

$$y(t, u, \theta) = h(x(t, u, \theta)), \tag{2.2}$$

which is also assumed to be twice continuously differentiable. The observation mapping $h(\cdot)$ relates the measured response variables $y(t, u, \theta)$ to the state variables $x(t, u, \theta)$. Frequently, $h(\cdot)$ is just a function that selects those state variables which are actually measured. The measured variables are collected in the vector $y^m \in \mathbb{R}^{N_y}$. The $i$-th entries of the vectors $y$ and $y^m$ are denoted by $y^{(i)}$ and $y^{m(i)}$, respectively.
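The ingredients of Eqs. 2.1 and 2.2 can be sketched for a small explicit ODE model. Everything in the example below is an assumption made for illustration: the two-state model `f_rhs`, the parameter values, the switching times and input levels, the observation mapping `h` (which selects only the second state) and the use of a hand-written classical Runge-Kutta integrator instead of a DAE solver.

```python
import numpy as np

def f_rhs(t, x, u, theta):
    """Hypothetical explicit ODE: x1 is fed at rate u and converted to x2."""
    k1, k2 = theta
    return np.array([u - k1 * x[0], k1 * x[0] - k2 * x[1]])

def u_piecewise(t, switch_times, levels):
    """Piecewise constant input: levels[i] holds from switch_times[i] onwards."""
    idx = np.searchsorted(switch_times, t, side="right") - 1
    return levels[idx]

def h(x):
    """Observation mapping: only the second state is measured."""
    return x[1:2]

def simulate(theta, x0, t_meas, switch_times, levels, substeps=50):
    """Integrate from t = 0 with classical RK4; return y at the measurement times."""
    x = np.asarray(x0, dtype=float)
    t = 0.0
    responses = []
    for t_next in t_meas:
        dt = (t_next - t) / substeps
        for _ in range(substeps):
            # Input held constant over the step, evaluated at the step midpoint.
            u = u_piecewise(t + 0.5 * dt, switch_times, levels)
            s1 = f_rhs(t, x, u, theta)
            s2 = f_rhs(t + 0.5 * dt, x + 0.5 * dt * s1, u, theta)
            s3 = f_rhs(t + 0.5 * dt, x + 0.5 * dt * s2, u, theta)
            s4 = f_rhs(t + dt, x + dt * s3, u, theta)
            x = x + dt / 6.0 * (s1 + 2 * s2 + 2 * s3 + s4)
            t += dt
        responses.append(h(x))
    return np.array(responses)

t_meas = np.array([0.5, 1.0, 1.5, 2.0])
y = simulate(theta=(1.0, 2.0), x0=[0.0, 0.0], t_meas=t_meas,
             switch_times=np.array([0.0, 1.0]), levels=np.array([1.0, 0.0]))
print(y.ravel())
```

The array returned by `simulate` plays the role of the model responses at the measurement times for a single experiment.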

Since in realistic experiments the response function in Eq. 2.2 is observed only at discrete times, a set of measurement time points $\mathcal{T} = \{t_1, \ldots, t_{N_m}\}$ is established. Moreover, a set of switching time points of the input actions $\mathcal{T}^u = \{t^u_1, \ldots, t^u_{N_{mu}}\}$ and a set of experiments $\mathcal{E} := \{\xi_1, \ldots, \xi_{N_e}\}$ are also considered. Each experiment $\xi_j \in \mathcal{E}$ for $j = 1, \ldots, N_e$ has a corresponding input action vector $u_{\xi_j} \in \mathbb{R}^{N_u \cdot N_{mu}}$ and observed model responses $y_{\xi_j}(t_k, u_{\xi_j}, \theta) \in \mathbb{R}^{N_y}$, $t_k \in \mathcal{T}$, $\xi_j \in \mathcal{E}$. The inputs are collected in the vector¹ $u := (u_{\xi_1}, \ldots, u_{\xi_{N_e}}) \in \mathbb{R}^{N_u \cdot N_{mu} \cdot N_e}$ and the model responses are collected in the vector

$$Y(u, \theta) := \big(y_{\xi_1}(t_1, u_{\xi_1}, \theta), \ldots, y_{\xi_1}(t_{N_m}, u_{\xi_1}, \theta), \ldots, y_{\xi_{N_e}}(t_1, u_{\xi_{N_e}}, \theta), \ldots, y_{\xi_{N_e}}(t_{N_m}, u_{\xi_{N_e}}, \theta)\big), \tag{2.3}$$

where $Y(u, \theta) \in \mathbb{R}^{N_y \cdot N_m \cdot N_e}$. The corresponding vector of observed (measured) responses is $Y^m \in \mathbb{R}^{N_y \cdot N_m \cdot N_e}$. It is also reasonably assumed that each measurement $y^{m(i)}_{\xi_j}(t_k)$ is affected by a random error $\epsilon_i$ such that

$$y^{m(i)}_{\xi_j}(t_k) = y^{(i)}_{\xi_j}(t_k, u_{\xi_j}, \theta^*) + \epsilon_i, \quad i = 1, \ldots, N_y, \; t_k \in \mathcal{T}, \; \xi_j \in \mathcal{E}, \tag{2.4}$$

where $y^{(i)}_{\xi_j}(t_k, u_{\xi_j}, \theta^*)$ is the $i$-th true underlying output at the true parameter vector $\theta^* \in \mathbb{R}^{N_\theta}$, $t_k \in \mathcal{T}$ and $\xi_j \in \mathcal{E}$, which is corrupted by the measurement error $\epsilon_i$. The errors $\epsilon_i$ for $i = 1, \ldots, N_y$ are assumed to be realizations of the random variables $E_i$, which satisfy the following assumptions:

• the errors $E_i$ are normally distributed with zero mean ($\mathrm{E}[E_i] = 0$) and finite known variance ($\mathrm{Var}[E_i] = (\sigma_{y,i})^2 < \infty$), i.e., $E_i \sim \mathcal{N}(0, \sigma^2_{y,i})$,

• the errors $E_i$, $E_j$ are independent, i.e., $\mathrm{Cov}(E_i, E_j) = 0$ whenever $i \neq j$, and are also identically distributed.

¹ For the sake of simplicity, the same symbol $u$ as for the finite-dimensional input variables in Eq. 2.1 is used here.

According to Eq. 2.4 the observations $y^{m(i)}_{\xi_j}(t_k)$ are also normally distributed random variables. Hence, the total measured response vector $Y^m$ has a mean equal to the model output at $\theta^*$, i.e., $\mathrm{E}[Y^m] = Y(u, \theta^*)$, and a covariance matrix that is constant in time, $\mathrm{Var}[Y^m] = C_y$, i.e., $Y^m \sim \mathcal{N}(Y(u, \theta^*), C_y)$. Under these assumptions the measurement covariance matrix $C_y$ is diagonal with entries given by the variances $\sigma^2_{y,i}$. The measurement standard deviation matrix is defined as $\Sigma_y := C_y^{1/2} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_y \cdot N_m \cdot N_e}$.

2.3. Parameter estimation

Parameter estimation is one of the major areas of statistical inference, whose methods use sample information to draw conclusions about a population. In the inverse problem of recovering the unknown parameters $\theta$ of the model described in Eq. 2.1 from finite experimental data $Y^m$, the sample data are used to calculate a reasonable single value of each parameter that is close to its true value in $\theta^*$.

2.3.1. Mathematical formulation

A parameter estimation is an optimization problem of the following form

$$\hat{\theta} := \arg\min_{\theta} \Phi^{\mathrm{LSQ}}(u, \theta) \tag{2.5a}$$

$$\Phi^{\mathrm{LSQ}}(u, \theta) := \frac{1}{2} \big(Y(u, \theta) - Y^m\big)^T C_y^{-1} \big(Y(u, \theta) - Y^m\big), \tag{2.5b}$$

where $\Phi^{\mathrm{LSQ}}$ denotes the cost function (objective function) of the weighted nonlinear least-squares criterion, the weighting matrix $C_y^{-1} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_y \cdot N_m \cdot N_e}$ is the inverse of the experimental error covariance matrix $C_y$ of the measured data, and $\hat{\theta}$ is a solution of the problem in Eq. 2.5, called a point estimate of the population parameter $\theta$ [89]. The estimator of $\theta$ is denoted by $\hat{\Theta}$. The assumptions of Section 2.2, i.e., errors that are unbiased, independent, and identically and normally distributed, guarantee that the estimator $\hat{\Theta}$, and hence the particular vector (or point estimate) $\hat{\theta}$ of Eq. 2.5, is equivalent to the solution of the maximum likelihood problem [3]. This means that minimizing the sum of squared errors in Eq. 2.5 is equivalent to maximizing the likelihood of obtaining the available observed responses for a given parameter value. A maximum likelihood estimator has good statistical properties, i.e., it is unbiased, approximately normally distributed and one with the smallest variance [89]. When the estimation is well-posed, $\hat{\Theta}$ is asymptotically the unbiased maximum likelihood estimator.
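The weighted least-squares criterion of Eq. 2.5b can be sketched as follows for a diagonal measurement covariance matrix. The exponential model, the noise level, the random seed and the coarse grid search (which stands in for a proper optimizer) are assumptions made only for this illustration.

```python
import numpy as np

def phi_lsq(Y_model, Y_meas, sigma_y):
    """Weighted least-squares cost of Eq. 2.5b for C_y = diag(sigma_y**2)."""
    z = (Y_model - Y_meas) / sigma_y      # scaled residuals
    return 0.5 * float(z @ z)

def model(theta, t):
    """Hypothetical response Y(theta) = theta_1 * exp(-theta_2 * t)."""
    return theta[0] * np.exp(-theta[1] * t)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 4.0, 9)
theta_star = np.array([2.0, 0.5])          # assumed "true" parameters
sigma_y = np.full(t.size, 0.05)            # assumed measurement std. deviations
Y_meas = model(theta_star, t) + sigma_y * rng.standard_normal(t.size)

# Coarse grid search used only to locate the minimizer of the cost.
grid1 = np.linspace(1.0, 3.0, 81)
grid2 = np.linspace(0.1, 1.0, 91)
costs = np.array([[phi_lsq(model((a, b), t), Y_meas, sigma_y) for b in grid2]
                  for a in grid1])
i, j = np.unravel_index(np.argmin(costs), costs.shape)
theta_hat = np.array([grid1[i], grid2[j]])
print("point estimate:", theta_hat)
```

Dividing the residuals by the standard deviations before squaring is equivalent to the quadratic form with the weighting matrix $C_y^{-1}$ when $C_y$ is diagonal.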



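For nonlinear parameter estimation, Eq. 2.5 is solved iteratively [3] using

$$\theta_{k+1} = \theta_k + \rho_k \upsilon_k, \tag{2.6}$$

where $\rho_k$ is the step size and $\upsilon_k$ is the step direction. Steepest-descent and Newton-type algorithms (e.g., Gauss-Newton, Levenberg-Marquardt [82] and trust-region methods [110]) are typical gradient-based solution methods. In these approaches, the Jacobian matrix $J_\theta(u, \theta) \in \mathbb{R}^{N_\theta}$

\begin{align}
J_\theta(u, \theta) &= \nabla_\theta \Phi^{\mathrm{LSQ}} \tag{2.7a} \\
&= \nabla_\theta Y(u, \theta)^T C_y^{-1} \big(Y(u, \theta) - Y^m\big), \tag{2.7b}
\end{align}

and the Hessian matrix $H_\theta(u, \theta) \in \mathbb{R}^{N_\theta \times N_\theta}$

\begin{align}
H_\theta(u, \theta) &= \nabla^2_\theta \Phi^{\mathrm{LSQ}} \tag{2.8a} \\
&= \nabla_\theta Y(u, \theta)^T C_y^{-1} \nabla_\theta Y(u, \theta) + \nabla^2_\theta Y(u, \theta)^T C_y^{-1} \big(Y(u, \theta) - Y^m\big) \tag{2.8b}
\end{align}

of $\Phi^{\mathrm{LSQ}}$ in Eq. 2.5 are required to compute the step direction $\upsilon_k$. For instance, in Newton's method $\upsilon_k = -R_k q_k$, where $R_k = H_\theta^{-1}$ and $q_k = J_\theta$ [3].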

As seen in Eqs. 2.7b and 2.8b, the Jacobian matrix $J_\theta(u, \theta)$ and the Hessian matrix $H_\theta(u, \theta)$ are based on the gradient of the predicted outputs with respect to the parameters, $\nabla_\theta Y(u, \theta)$, which is the well-known parameter-output sensitivity matrix $S(u, \theta) \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_\theta}$ (see Section 2.3.2 for details of its calculation),

$$S(u, \theta) := \nabla_\theta Y(u, \theta), \tag{2.9}$$

and on the scaled residual vector $Z = \Sigma_y^{-1} \big(Y(u, \theta) - Y^m\big) \in \mathbb{R}^{N_y \cdot N_m \cdot N_e}$.

The Hessian $H_\theta(u, \theta)$ in Eq. 2.8b can be further simplified by the Gauss approximation (i.e., neglecting the second term in Eq. 2.8b) as

$$H_\theta(u, \theta) \approx \nabla_\theta Y(u, \theta)^T C_y^{-1} \nabla_\theta Y(u, \theta) \tag{2.10}$$

and, using the definition of $S(u, \theta)$, it becomes

$$H_\theta(u, \theta) \approx S^T C_y^{-1} S = (\Sigma_y^{-1} S)^T (\Sigma_y^{-1} S) = \tilde{S}^T \tilde{S}. \tag{2.11}$$

Note that in Eq. 2.11 the sensitivity matrix is taken as

$$\tilde{S} = \Sigma_y^{-1} S, \tag{2.12}$$

which will be called the scaled² form of the sensitivity matrix for the sake of simplicity.

² In the case of measurements with the same measurement variance, i.e., $\sigma_{y,i} = \sigma_y$ for $i = 1, \ldots, N_y$, the matrix $S$ is simply scaled by $\sigma_y^{-1}$, i.e., $\tilde{S} = \sigma_y^{-1} S$.
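A minimal Gauss-Newton sketch shows how Eqs. 2.6, 2.7b, 2.11 and 2.12 fit together. The exponential model, its analytic sensitivities, the noise-free data, the starting point and the fixed full step size `rho = 1` are assumptions made for illustration; a practical solver would add a line search or a trust region.

```python
import numpy as np

def model(theta, t):
    """Hypothetical response Y(theta) = theta_1 * exp(-theta_2 * t)."""
    return theta[0] * np.exp(-theta[1] * t)

def sensitivity(theta, t):
    """Analytic S = dY/dtheta for the model above (one column per parameter)."""
    e = np.exp(-theta[1] * t)
    return np.column_stack([e, -theta[0] * t * e])

def gauss_newton(theta0, t, Y_meas, sigma_y, n_iter=20, rho=1.0):
    theta = np.array(theta0, dtype=float)
    for _ in range(n_iter):
        S_scaled = sensitivity(theta, t) / sigma_y[:, None]   # Eq. 2.12
        Z = (model(theta, t) - Y_meas) / sigma_y               # scaled residuals
        grad = S_scaled.T @ Z                                  # Eq. 2.7b
        H = S_scaled.T @ S_scaled                              # Eq. 2.11
        step = -np.linalg.solve(H, grad)                       # descent direction
        theta = theta + rho * step                             # Eq. 2.6
    return theta

t = np.linspace(0.0, 4.0, 9)
theta_star = np.array([2.0, 0.5])
sigma_y = np.full(t.size, 0.05)
Y_meas = model(theta_star, t)              # noise-free data for a clean check
theta_hat = gauss_newton([1.8, 0.6], t, Y_meas, sigma_y)
print(theta_hat)
```

When the approximate Hessian is ill-conditioned, the linear solve in each iteration amplifies errors in the gradient, which is one way the convergence problems mentioned in Chapter 1 can appear.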



The simplified Hessian in Eq. 2.11 is identical to the Fisher information matrix

$$F(u, \theta) = S^T C_y^{-1} S, \tag{2.13}$$

where $F$ is the $N_\theta \times N_\theta$ dispersion matrix, which belongs to the set of positive semi-definite matrices $PSD(N_\theta)$ (see Section A.5.15 of Appendix A.5).
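The properties of the matrix in Eq. 2.13 can be checked numerically. The sensitivity matrix and the standard deviations below are random placeholders, not values from a real model.

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.standard_normal((12, 3))            # hypothetical sensitivity matrix
sigma_y = rng.uniform(0.1, 1.0, size=12)    # hypothetical standard deviations
C_y_inv = np.diag(1.0 / sigma_y**2)

F = S.T @ C_y_inv @ S                       # Eq. 2.13
S_scaled = S / sigma_y[:, None]             # scaled sensitivities (Eq. 2.12)
eigvals = np.linalg.eigvalsh(F)             # real eigenvalues of the symmetric F

print("eigenvalues of F:", eigvals)
```

The matrix `F` equals the product of the transposed scaled sensitivity matrix with itself, so it is symmetric and its eigenvalues cannot be negative.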

2.3.2. Parameter-output sensitivity matrix

The parameter-output sensitivity matrix $S$ previously defined in Eq. 2.9 contains the local first-order derivatives (sensitivities) of the model responses collected in $Y(u, \theta)$ with respect to the parameters collected in $\theta$. These derivatives show how strongly perturbations of the estimated parameter vector $\hat{\theta}$ affect the predicted outputs of the model $Y(u, \theta)$. The $j$-th column of the sensitivity matrix $S$ contains the dynamic sensitivities of each measured response variable $y_i \in y$ with respect to the parameter $\theta_j$, i.e., $\left.\frac{\partial y_i}{\partial \theta_j}\right|_t$ for $t \in \mathcal{T}$.

The first-order derivative information in $S$ may be computed by finite differences, by integrating the original model (Eq. 2.1) along with the so-called sensitivity equations [7, 81, 121], or by automatic differentiation. When finite differences are used, a differentiation scheme such as central finite differences,

$$\nabla_\theta Y(u, \theta) = \frac{\partial Y(u, \theta)}{\partial \theta} \approx \frac{Y(u, \theta + \Delta\theta) - Y(u, \theta - \Delta\theta)}{2 \Delta\theta}, \tag{2.14}$$

can be applied, where $\Delta\theta = \theta \sqrt{U}$ is the parameter perturbation, which uses the machine unit roundoff $U$.
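The central difference scheme of Eq. 2.14 can be sketched column by column. The helper name `central_diff_sensitivity`, the exponential test model and the use of NumPy's machine epsilon as a stand-in for the unit roundoff $U$ are assumptions of this example; the relative perturbation also presumes nonzero parameter values.

```python
import numpy as np

def central_diff_sensitivity(model, theta):
    """Approximate S = dY/dtheta by central differences, one parameter at a time."""
    theta = np.asarray(theta, dtype=float)
    U = np.finfo(float).eps               # stand-in for the machine unit roundoff
    cols = []
    for j in range(theta.size):
        d = np.zeros_like(theta)
        d[j] = theta[j] * np.sqrt(U)      # relative perturbation, needs theta_j != 0
        cols.append((model(theta + d) - model(theta - d)) / (2.0 * d[j]))
    return np.column_stack(cols)

t = np.linspace(0.0, 4.0, 9)

def model(theta):
    """Hypothetical response Y(theta) = theta_1 * exp(-theta_2 * t)."""
    return theta[0] * np.exp(-theta[1] * t)

theta = np.array([2.0, 0.5])
S_fd = central_diff_sensitivity(model, theta)

# Analytic sensitivities of the same model, used only for comparison.
e = np.exp(-theta[1] * t)
S_exact = np.column_stack([e, -theta[0] * t * e])
print(np.max(np.abs(S_fd - S_exact)))
```

Each column costs two model evaluations, so the effort grows linearly with the number of parameters.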

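When the sensitivity equations are employed, the so-called forward sensitivity analysis is performed. This analysis formulates the (forward) sensitivity equations

$$0 = \underbrace{\frac{\partial f}{\partial x}}_{J_x} \underbrace{\frac{\partial x}{\partial \theta}}_{S_x} + \underbrace{\frac{\partial f}{\partial \dot{x}}}_{J_{\dot{x}}} \underbrace{\frac{\partial \dot{x}}{\partial \theta}}_{S_{\dot{x}}} + \underbrace{\frac{\partial f}{\partial \theta}}_{J_\theta} \tag{2.15a}$$

$$\left.\frac{\partial x}{\partial \theta}\right|_{t_0} = \frac{\partial x_0}{\partial \theta}, \quad \left.\frac{\partial \dot{x}}{\partial \theta}\right|_{t_0} = \frac{\partial \dot{x}_0}{\partial \theta}, \tag{2.15b}$$

which are obtained by applying the chain rule of differentiation to the original DAEs in Eq. 2.1. The variable-coefficient matrices $J_x \in \mathbb{R}^{N_x \times N_x}$, $J_{\dot{x}} \in \mathbb{R}^{N_x \times N_x}$ and $J_\theta \in \mathbb{R}^{N_x \times N_\theta}$ are the respective Jacobians of $f$ with respect to $x$, $\dot{x}$ and $\theta$. The Jacobian of the system is represented by $J_x$. By solving this linear matrix system of differential equations simultaneously with the model in Eq. 2.1, the sensitivity matrix of the state variables $S_x$ is obtained. To compute the sensitivity matrix $S$ (i.e., considering only the observed model responses in $y$), the result $S_x$ from Eq. 2.15 and the chain rule applied to Eq. 2.2 with respect to $\theta$ are used:

$$\underbrace{\frac{\partial y}{\partial \theta}}_{S} = \frac{\partial h}{\partial x} \underbrace{\frac{\partial x}{\partial \theta}}_{S_x} \tag{2.16}$$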

If only state variables are measured, i.e., $y \subseteq x$, then $S = S_x$ (restricted to the rows of the measured states). The scaled form of the sensitivity matrix, i.e., $\tilde{S} = \Sigma_y^{-1} S$ (Eq. 2.12), is also computed when the precision of the measurements is available. Note that $\tilde{S}$ combines the influence of the parameters and the reliability of the available measurements in one matrix.
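For a scalar example, the forward sensitivity equations of Eq. 2.15 can be written out and integrated together with the model. The model $\dot{x} = -\theta x$ (i.e., $f = \dot{x} + \theta x$), the parameter values and the fixed-step Runge-Kutta integrator are assumptions chosen because the exact sensitivity is known in closed form. With $s = \partial x / \partial \theta$, Eq. 2.15a reduces to $\dot{s} = -\theta s - x$ with $s(t_0) = 0$.

```python
import numpy as np

def augmented_rhs(z, theta):
    """Right-hand side for z = (x, s): x' = -theta*x and s' = -theta*s - x."""
    x, s = z
    return np.array([-theta * x, -theta * s - x])

def integrate(theta, x0, t_end, n_steps=400):
    """Classical RK4 on the augmented system of state and sensitivity."""
    z = np.array([x0, 0.0])   # s(t0) = dx0/dtheta = 0, x0 independent of theta
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1 = augmented_rhs(z, theta)
        k2 = augmented_rhs(z + 0.5 * dt * k1, theta)
        k3 = augmented_rhs(z + 0.5 * dt * k2, theta)
        k4 = augmented_rhs(z + dt * k3, theta)
        z = z + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return z

theta, x0, t_end = 0.5, 2.0, 3.0
x_num, s_num = integrate(theta, x0, t_end)

# Closed-form solution: x = x0*exp(-theta*t), hence dx/dtheta = -t*x0*exp(-theta*t).
x_exact = x0 * np.exp(-theta * t_end)
s_exact = -t_end * x0 * np.exp(-theta * t_end)
print(s_num, s_exact)
```

Since the state itself is the measured response here, the computed sensitivity directly provides one entry of the sensitivity matrix.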

In this research, the sensitivity information is used to determine the parameters with the largest influence on the outputs, the most linearly independent parameters, the degree of ill-conditioning of the estimation and the identifiability of the model. In the forthcoming analysis the scaled sensitivity matrix $\tilde{S}$ (Eq. 2.12) is used.

Sensitivity measure

The sensitivity measure δj in Eq. 2.17 is introduced to assess the individual weight of each parameter, in terms of sensitivities, in the parameter estimation problem. It is based on the Euclidean norm of the j-th column of S and measures the mean sensitivity of the total observed model response vector Y to changes in the parameter θj [17]. A high value of δj means that the parameter θj has a considerable impact on the simulated responses, while a value at or near zero means that the simulated responses do not depend on θj. Accordingly, the parameter with the largest δj for j = 1, ..., Nθ is called the most sensitive parameter, whereas the parameter with the smallest δj is called the most insensitive.

\[
\delta_j = \sqrt{\frac{1}{N_y \cdot N_m \cdot N_e} \sum_{i=1}^{N_y \cdot N_m \cdot N_e} s_{ij}^2} \tag{2.17}
\]
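As a small illustration of Eq. 2.17, the sketch below computes δj for every column of a sensitivity matrix and ranks the parameters. The matrix values and the function name are hypothetical and chosen only so that the result can be checked by hand.

```python
import numpy as np

def sensitivity_measure(S):
    """delta_j = sqrt(mean over i of s_ij^2) for every column j of S (Eq. 2.17)."""
    S = np.asarray(S, dtype=float)
    return np.sqrt(np.mean(S**2, axis=0))

# Hypothetical scaled sensitivity matrix: 4 observations, 3 parameters
S = np.array([[3.0, 0.0, 0.1],
              [4.0, 0.0, 0.1],
              [0.0, 0.0, 0.1],
              [0.0, 0.0, 0.1]])

delta = sensitivity_measure(S)
ranking = np.argsort(delta)[::-1]  # parameter indices, most sensitive first
```

Here the second parameter has δ = 0, so the simulated responses do not depend on it at all, whereas the first parameter is the most sensitive one.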

2.4. Singular value decomposition

This section describes the singular value decomposition (SVD) of the sensitivity matrix S, which will be the cornerstone for conducting the identifiability and ill-conditioning analysis of the parameter estimation problem of Eq. 2.5. The connection between S and F by means of the singular values of S and the eigenvalues of F is exploited here.


2.4.1. SVD of the sensitivity matrix

The SVD explained in Section A.5.9 and stated in Definition A.5.9 of Appendix A.5 is used here to decompose the sensitivity matrix S

\[
S = U S_v V^T = \sum_{i=1}^{N_\theta} \varsigma_i u_i v_i^T, \tag{2.18}
\]

where Sv ∈ R^(Ny·Nm·Ne × Nθ) is a diagonal matrix Sv = diag(ς1, ς2, ..., ςNθ) with nonnegative diagonal elements ς1 ≥ ς2 ≥ ··· ≥ ςNθ > 0, which are the singular values of S. The matrices U ∈ C^(Ny·Nm·Ne × Ny·Nm·Ne) and V ∈ C^(Nθ × Nθ) are unitary matrices. The triplet (U, Sv, V) is called the singular value decomposition of S. The columns vi of V with i = 1, ..., Nθ and uj of U with j = 1, ..., Ny·Nm·Ne are called the right and left singular vectors of S, respectively.

2.4.2. SVD of the sensitivity matrix vs the eigensystem of the Fisher-information matrix

The Fisher-information matrix F in Eq. 2.13 may also be expressed as a function of the singular values of S by using the proposition in Section A.5.9 of Appendix A.5, which relates the SVD of S in Eq. 2.18 to the eigendecomposition of S^T S, i.e., F. With this relation between the SVD and the eigendecomposition, F may be written as

\[
F = V (S_v^T S_v) V^T = \sum_{i=1}^{N_\theta} \varsigma_i^2 v_i v_i^T. \tag{2.19}
\]

Eq. 2.19 is the eigensystem of the Fisher-information matrix F: the diagonal elements of the matrix Sv² (the squares of the singular values ςi) are the eigenvalues of the real symmetric matrix F. This relation is shown in Eq. 2.20, where the eigenvalues of F are denoted as λi(F), with i = 1, ..., Nθ.

\[
\varsigma_i^2 = \lambda_i(F) \tag{2.20}
\]
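The relations in Eqs. 2.18 to 2.20 can be checked numerically. The sketch below uses a random matrix as a stand-in for a scaled sensitivity matrix (an assumption for illustration only) and takes F = S^T S.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.normal(size=(12, 3))  # hypothetical sensitivity matrix (12 responses, 3 parameters)

# Full SVD: S = U @ Sv @ Vt, with the singular values returned in descending order
U, sv, Vt = np.linalg.svd(S, full_matrices=True)
Sv = np.zeros_like(S)
Sv[:3, :3] = np.diag(sv)

# Eigenvalues of F = S^T S, reordered to descending to match the singular values
F = S.T @ S
eigvals = np.linalg.eigvalsh(F)[::-1]

# Eq. 2.19: F rebuilt from the right singular vectors and squared singular values
F_from_svd = (Vt.T * sv**2) @ Vt
```

The squared singular values reproduce the eigenvalues of F up to rounding, without F ever being needed to obtain them.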

The SVD of S provides information that encompasses the information given by the eigensystem of F. As a practical matter in identifiability and ill-conditioning analysis (both lumped together as the structural analysis), there are reasons for preferring the SVD of S over the eigenvalues of F [12]. Firstly, it applies directly to the sensitivity matrix S instead of F, and S is also used for local identifiability analysis in nonlinear parameter estimation. Secondly, although the computation of the SVD is typically more expensive than that of the eigensystem, in the context of larger problems with sparse matrices the SVD can be performed more efficiently [14, 130]. Moreover, the computation of F in Eq. 2.13, corresponding to (Ny·Nm)·Nθ² operations, can be saved. Thirdly, when S is ill-conditioned its SVD can be computed with much greater numerical stability than the eigensystem of F [12]. In this context, when the system is extremely ill-conditioned (see Section 3.2.1), the accuracy of the smallest eigenvalues of F is doubtful [12]. Furthermore, additional numerical errors could arise in the computation of F in Eq. 2.13, which could also lead to unexpected negative eigenvalues [78].
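The loss of accuracy caused by forming F explicitly can be reproduced with a small constructed example. The matrix below is a textbook-style ill-conditioned case chosen for illustration, not a matrix from this work; its smallest singular value equals eps exactly.

```python
import numpy as np

eps = 1e-9
# Two nearly collinear columns; the exact singular values are sqrt(2 + eps^2) and eps
S = np.array([[1.0, 1.0],
              [eps, 0.0],
              [0.0, eps]])

# Route 1: singular values taken directly from S
sv = np.linalg.svd(S, compute_uv=False)
sv_min_svd = sv[-1]

# Route 2: eigenvalues of F = S^T S; eps^2 = 1e-18 is lost when added to 1.0
F = S.T @ S
lam = np.linalg.eigvalsh(F)              # ascending order
sv_min_eig = np.sqrt(max(lam[0], 0.0))   # clip a possibly negative eigenvalue

rel_err_svd = abs(sv_min_svd - eps) / eps
rel_err_eig = abs(sv_min_eig - eps) / eps
```

In double precision the product S^T S rounds to a matrix of ones, so the information carried by eps is lost before any eigenvalue routine runs, while the SVD of S still recovers the small singular value to several digits.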

2.5. Parameter estimator analysis

The performance of the estimator Θ given the sample data Y^m of the parameter estimation in Eq. 2.5 may be assessed by analyzing some of its statistical properties. Here, the variance Var(Θ) and the bias β(Θ) are considered. The former arises from the variability generated by measurement errors and the latter is the difference between the expected value of the estimator Θ and the true (or reference) parameter θ∗. The bias (also referred to as accuracy) measures the systematic error, whereas the variance (also referred to as precision) measures the random error [3, 93, 126].

In practice, a good estimator is said to be one with a small bias and a small variance [51, 89]. A desirable estimator should be unbiased (β(Θ) = 0) and have minimum variance (min Var(Θ)). Unfortunately, these two properties are often unattainable (although the theoretical maximum likelihood estimator has these natural properties). The best that can be done is to test them and keep them as small as possible.

2.5.1. Estimator precision

The precision of the parameter estimator Θ may be assessed by using its parameter covariance matrix C and its confidence interval CI [3, 38, 74, 89, 127]. Before establishing the corresponding definitions of C and CI, the main assumption regarding the parameter probability distribution is clarified.

In the parameter estimation problem of Eq. 2.5 the point estimate θ is a function of the total observed measured response vector Y^m. Therefore, θ is considered a random vector because Y^m is also considered random. In that sense, Θ is an estimator of θ such that θ ∈ Θ. Furthermore, under a number of regularity and sampling conditions, as Ny·Nm·Ne → ∞, the estimator Θ approximately follows a normal distribution

\[
\Theta \sim \mathcal{N}(\theta^*, C) \tag{2.21}
\]

whose mean is the true parameter θ∗ and whose covariance matrix is C [74, 89]. The probability distribution of Θ is called a sampling distribution.

Covariance matrix

The general definition of a covariance matrix is given in Section A.5.14 (Eq. A.17) of Appendix A.5. In the context of parameter estimation, the parameter covariance matrix C ∈ R^(Nθ × Nθ) reads as follows:

\[
C := E\left[(\Theta - E(\Theta))(\Theta - E(\Theta))^T\right] \tag{2.22}
\]

This is a symmetric positive semi-definite matrix whose (i, j)-element σ²θij is the covariance of the i-th parameter θi with respect to the j-th parameter θj for i, j = 1, ..., Nθ. The diagonal element σ²θj is the variance of the j-th parameter θj.

Confidence interval

The confidence interval CI supplies the bounds of plausible values of the unknown parameter. The CI is constructed so that there is high confidence that it contains the unknown parameter, although it cannot be stated with certainty that a given interval contains the true, unknown parameter. The interval is a measure of the uncertainty inherent in the estimator Θj. A narrow confidence interval indicates that the parameter is known precisely. On the contrary, wide intervals show little knowledge about the true parameter (in well-posed estimations) and indicate that further information is needed. Another interpretation of a 95% CI is that, for an infinite number of solutions of the estimation problem using the same experimental conditions and generating the same experimental data but considering random measurement errors (normally and independently distributed with mean zero and constant variance), the true value of the parameter θj will be contained within 95% of all observed confidence intervals.

A confidence interval for the parameter θj is an interval of the form

\[
lb_j \le \theta_j^* \le ub_j, \tag{2.23}
\]

where the end-points lbj and ubj are called the lower- and upper-confidence limits, respectively. lbj and ubj change with each data sample that is analyzed. These end-points are values of the random variables LBj and UBj, and the CI is considered a random interval [89]. The probability of selecting a sample for which the interval contains the true value of θj is called 1 − α [84, 89]:

\[
P\{LB_j \le \theta_j^* \le UB_j\} = 1 - \alpha. \tag{2.24}
\]

This interval is known as the 100(1 − α)% confidence interval, with 1 − α being called the confidence level. Because the CI in Eq. 2.24 provides both the lower and upper confidence limits, it is also referred to as a two-sided CI.

In parameter estimation the true value and the variance of θj are typically unknown and, moreover, the experimental data could be scarce. In that case, the form of the underlying parameter distribution must be assumed in order to obtain a valid CI. A reasonable assumption is a normal distribution [89]. The expected value or mean of the estimator Θj and its variance, i.e., E[Θj] and σ²θj, respectively, are then used to characterize this normal distribution. In order to construct a two-sided CI on θj, the normally (but not standard normally) distributed E[Θj] with unknown mean θj and unknown variance is usually standardized with the random variable

\[
T = \frac{E[\Theta_j] - \theta_j^*}{\sigma_{\theta_j}} \tag{2.25}
\]

which has a t-distribution with Ny·Nm·Ne − Nθ degrees of freedom (DoF). σθj is the standard deviation of Θj, computed as the square root of the j-th diagonal element of C. If t_(α/2, Ny·Nm·Ne−Nθ) is the upper 100α/2 percentage point of the t-distribution with DoF equal to Ny·Nm·Ne − Nθ, then the 1 − α probability for the t-distribution is

\[
P\{-t_{\alpha/2,\,N_y \cdot N_m \cdot N_e - N_\theta} \le T \le t_{\alpha/2,\,N_y \cdot N_m \cdot N_e - N_\theta}\} = 1 - \alpha. \tag{2.26}
\]

Substituting Eq. 2.25 into Eq. 2.26 and rearranging results in

\[
P\{E[\Theta_j] - t_{\alpha/2,\,N_y \cdot N_m \cdot N_e - N_\theta}\,\sigma_{\theta_j} \le \theta_j^* \le E[\Theta_j] + t_{\alpha/2,\,N_y \cdot N_m \cdot N_e - N_\theta}\,\sigma_{\theta_j}\} = 1 - \alpha. \tag{2.27}
\]

The lower and upper limits of the inequalities in Eq. 2.27 correspond to the lower- and upper-confidence limits LBj and UBj in Eq. 2.24, respectively. This leads to the formulation of the 100(1 − α) percent two-sided confidence interval on θj from a data sample

\[
E[\Theta_j] - t_{\alpha/2,\,N_y \cdot N_m \cdot N_e - N_\theta}\,\sigma_{\theta_j} \le \theta_j^* \le E[\Theta_j] + t_{\alpha/2,\,N_y \cdot N_m \cdot N_e - N_\theta}\,\sigma_{\theta_j} \tag{2.28}
\]

where lbj = E[Θj] − t_(α/2, Ny·Nm·Ne−Nθ) σθj and ubj = E[Θj] + t_(α/2, Ny·Nm·Ne−Nθ) σθj.
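A minimal sketch of Eq. 2.28 is given below. It assumes that the point estimate stands in for E[Θj], that SciPy is available for the t-distribution quantile, and that the numerical values (estimate, covariance matrix, number of observations) are purely illustrative.

```python
import numpy as np
from scipy import stats

def confidence_intervals(theta_hat, C, n_obs, alpha=0.05):
    """Two-sided 100(1 - alpha)% confidence limits per parameter (Eq. 2.28)."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    sigma = np.sqrt(np.diag(C))                   # standard deviations from diag(C)
    dof = n_obs - theta_hat.size                  # Ny*Nm*Ne - Ntheta
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, dof)  # upper alpha/2 percentage point
    return theta_hat - t_crit * sigma, theta_hat + t_crit * sigma

# Hypothetical estimate and covariance matrix for two parameters, 12 observations
theta_hat = np.array([2.0, 0.5])
C = np.array([[0.04, 0.001],
              [0.001, 0.09]])
lb, ub = confidence_intervals(theta_hat, C, n_obs=12)
```

With these numbers the interval of the second parameter includes zero, while the interval of the first does not.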

2.5.2. Estimator accuracy

Accuracy is a qualitative term defined as the distance between the estimated (or observed) value and the true or reference value. Bias, on the other hand, is the quantitative term describing this difference. The bias is given by

\[
\beta(\Theta) = E[\Theta] - \theta^*. \tag{2.29}
\]

If an estimator is unbiased, the bias is zero and then E[Θ] = θ∗. However, an estimator based on a finite sample can additionally be expected to differ from the true parameter due to systematic errors, for instance. Such an estimator is called a biased estimator.

A measure of the expected performance of the biased parameter estimator Θ may be computed by means of the mean square error (MSE), which is the expected squared difference between Θ and θ∗. The MSE simultaneously considers the effect of the bias β(Θ) and the variance Var(Θ) and is given by

\[
\mathrm{MSE} = \|\beta(\Theta)\|^2 + \|\mathrm{var}(\Theta)\|^2. \tag{2.30}
\]

The first term on the right-hand side is the squared norm of the bias and the second term is a metric of the parameter variance. One scalar measure of the parameter variance is the trace of the parameter covariance matrix Tr[C] = sum(diag(C)) [57, 83]. In fact, Tr[C] corresponds to the A-optimality criterion used in optimal experimental design [101].
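The split of the mean square error into a bias part and a variance part can be illustrated with simulated estimates. The sketch below takes Tr[C] as the scalar variance measure, as suggested in the text, and uses synthetic estimates with a deliberate offset; all numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
theta_true = np.array([2.0, 0.5])

# Hypothetical repeated estimates: systematic offset plus random scatter
offset = np.array([0.1, -0.05])
estimates = theta_true + offset + rng.normal(scale=[0.2, 0.3], size=(5000, 2))

bias = estimates.mean(axis=0) - theta_true       # sample version of Eq. 2.29
C = np.cov(estimates, rowvar=False, ddof=0)      # sample covariance matrix

# Mean squared distance to the true parameter, computed directly ...
mse_direct = np.mean(np.sum((estimates - theta_true) ** 2, axis=1))
# ... and as squared bias norm plus trace of the covariance matrix
mse_decomposed = bias @ bias + np.trace(C)
```

Both routes give the same number, since the sample mean squared distance splits exactly into the squared norm of the sample bias and the trace of the sample covariance (with ddof=0).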

2.5.3. Reliability tests

Two statistical procedures are considered to assess the reliability of the model parameters

after performing an estimation, namely the hypothesis and confidence interval tests. Both

tests evaluate the statistical significance of the parameter θj by analyzing its corresponding

estimator Θj .

Hypothesis test

This test evaluates whether the parameter θj might be dropped from the model. The statistical significance of θj with j = 1, ..., Nθ is obtained by the hypothesis-testing procedure in which the null hypothesis H0 is contrasted with its negation, the alternative hypothesis H1:

\[
H_0 : \theta_j = 0 \tag{2.31a}
\]

\[
H_1 : \theta_j \neq 0. \tag{2.31b}
\]

This test relies on the comparison of a test statistic with a reference value from the two-tailed Student's t-distribution with Ny·Nm·Ne − Nθ degrees of freedom [41, 84]. In order to accept or reject the null hypothesis H0, the Student t-value (or test statistic) is defined as

\[
T_{H_0} = \frac{E[\Theta_j]}{\sigma_{\theta_j}}, \tag{2.32}
\]

which uses the expected value E[Θj] of the estimator of the j-th parameter θj and its corresponding standard deviation σθj (the square root of the j-th diagonal entry of the parameter covariance matrix C). The null hypothesis is rejected if |TH0| > t_(α/2, Ny·Nm·Ne−Nθ), meaning that the parameter θj contributes significantly to the model. On the contrary, very low |TH0|-values are evidence that the parameter could be statistically excluded from the model. It should be noted that in Eq. 2.32 the value of TH0 is inversely proportional to the parameter standard deviation σθj. Hence, if a given parameter has a large variance or uncertainty, it might not pass the test and is a candidate to leave the model. That is actually true when the observable predicted variables are not sensitive to a parameter. Insensitive parameters will by definition have a large variance (small |TH0|-value) and might then leave the model without any effect. A sensitivity analysis should therefore be performed with care. Notice that a local sensitivity analysis depends on the current experimental data set; therefore, any decision based on it should include a wide range of experimental conditions. In general, ill-conditioned estimates (generated by identifiability problems) prompt low |TH0|-values. This fact was already established in Refs. [41, 79], specifically with reference to the effects of high correlation between parameters (one cause of identifiability problems). Nevertheless, any claim based on low |TH0|-values should be carefully formulated because some parameters might just have a badly computed variance generated by the influence of the ill-conditioning (see variance-decomposition in Section 3.2.2). In that case, the parameters with large |TH0|-values (small parameter variances) should be considered statistically important, but no further conclusions about those with small values should be drawn without an appropriate ill-conditioning analysis.

Confidence interval test

The second measure of reliability of the parameter θj is the 100(1 − α)% confidence interval obtained in Section 2.5.1. The rationale is that parameters whose confidence intervals contain zero do not have statistical significance at the α-level and might be dropped from the model. According to Eq. 2.28, parameters with large variance have the largest probability of including zero in their confidence interval. In that case the close relationship between confidence intervals and hypothesis testing indicates that the null hypothesis cannot be rejected at the same α-level. On the contrary, if a parameter is significantly different from zero at the α-level (null hypothesis rejected), then the 100(1 − α)% CI will not contain zero and the parameter should remain in the model structure.
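The agreement between the two reliability tests can be illustrated with a short sketch. It assumes SciPy for the t-quantile, uses the point estimate in place of E[Θj], and works on hypothetical numbers; the function name is likewise an assumption.

```python
import numpy as np
from scipy import stats

def significance_tests(theta_hat, C, n_obs, alpha=0.05):
    """Return t-values (Eq. 2.32), H0 rejection flags and CI-excludes-zero flags."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    sigma = np.sqrt(np.diag(C))
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, n_obs - theta_hat.size)
    t_values = theta_hat / sigma
    reject_h0 = np.abs(t_values) > t_crit            # hypothesis test
    lb = theta_hat - t_crit * sigma                  # Eq. 2.28, lower limit
    ub = theta_hat + t_crit * sigma                  # Eq. 2.28, upper limit
    ci_excludes_zero = (lb > 0.0) | (ub < 0.0)       # confidence interval test
    return t_values, reject_h0, ci_excludes_zero

# Hypothetical estimate with standard deviations 0.2 and 0.3, 12 observations
t_values, reject_h0, ci_excludes_zero = significance_tests(
    [2.0, 0.5], np.diag([0.04, 0.09]), n_obs=12)
```

For these values the first parameter is significant under both tests and the second under neither, which reflects that both tests use the same critical value.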

2.6. Identifiability

The parameterization of the nonlinear process model described by Eq. 2.1 is carried out by solving the parameter estimation problem in Eq. 2.5. This solution (the parameters) might not be unique; uniqueness is a specific attribute of the mathematical model structure and the set of observations used for the fitting. These observations are collected in well-defined stimulus-response experiments performed on the dynamic system [31]. On the one hand, suppose the model structure is distinguishable [124] and has already been selected as the best candidate for this dynamic system. The question of whether the corresponding parameters can be uniquely determined from the experimental data is then studied by model identifiability. On the other hand, if the model structure has not yet been selected, model identifiability might also answer the question of whether the experimental set-up enables the selection of the best model structure [124]. If parameter uniqueness cannot be guaranteed, then either the mathematical model or the experiment itself should be modified [106]. Only a well-posed estimation problem will be free of identifiability issues.

There are two ways to consider identifiability, namely qualitatively and quantitatively. Qualitative identifiability assumes ideal conditions such as a correct model and ideal, continuous experiments [120]. Quantitative identifiability, in contrast, deals with more realistic experimental conditions with noise and discrete measurements.

2.6.1. Qualitative identifiability

The basic assumptions of qualitative or deterministic identifiability are the ideal conditions of an error-free model structure and noise-free, continuous observations of measurable quantities [120]. In terms of the model in Eq. 2.1, it is assumed that the mappings F and h are real analytic functions at every θ ∈ Ω. Besides that, these assumptions imply that the observed model responses in y at some nominal parameter θ1, for all t over the time interval [0, texp], are equal to the corresponding observed response variables in y^m under the operating conditions in u, i.e.,

\[
y(t, u, \theta_1) = y^m(t). \tag{2.33}
\]

If there exists another (nominal) parameter θ2 which yields the same predicted response in all feasible experiments, i.e.,

\[
y(t, u, \theta_1) = y(t, u, \theta_2), \tag{2.34}
\]

for all t ∈ [0, texp], it is said that θ2 is indistinguishable from θ1. Identifiability analysis attempts to find the number of different solutions θ2 of Eq. 2.34, first at the fixed nominal value θ1. Accordingly, the model in Eq. 2.1 is considered globally identifiable at θ1 if, for any θ2 ∈ Ω, Eq. 2.34 implies θ1 = θ2. Furthermore, the model in Eq. 2.1 is locally identifiable at θ1 if there exists an open neighborhood V of θ1 ∈ Ω such that for any θ2 ∈ V Eq. 2.34 implies θ2 = θ1.

Moreover, if the model is identifiable for almost every θ1 ∈ Ω it is called structurally identifiable. In terms of global and local identifiability, the model is said to be structurally globally (locally) identifiable if it is globally (locally) identifiable for almost any θ1 ∈ Ω.

Techniques to diagnose global identifiability of nonlinear models, namely Taylor series, power series expansions, local state isomorphism or the similarity transformation approach, transform the differential system into an algebraic equation system to be solved for the parameters. If this system has a unique solution, the model is shown to be globally identifiable [120, 124]. In order to diagnose local identifiability, the linear independence at θ1 of the sensitivity functions ∂y(t, u, θ1)/∂θ1 could be analyzed. If the sensitivity functions are linearly independent at θ1, then the model is locally identifiable at θ1. Nevertheless, the linear independence of numerically computed sensitivity functions is difficult to establish and this condition is not reliable [120].

2.6.2. Quantitative identifiability

This analysis applies to discrete systems, which is the typical case in real experiments, and has a local nature. Therefore, the model is considered either locally identifiable or not. It deals with parameters (from a possibly qualitatively identifiable model) with large variance generated, for instance, by low measurement precision, limited experimental data or a poor choice of sampling points [120].

Methods to detect this quantitative local identifiability analyze the singularity or near singularity of the Fisher-information matrix F in Eq. 2.13 [1, 31, 120], or the linear dependence or near linear dependence of the columns of the sensitivity matrix S in Eq. 2.9 (or its scaled form S in Eq. 2.12) [31, 78, 120]. The most common local identifiability condition in the literature is based on the analysis of the Fisher-information matrix. This identifiability condition evaluated at some θ may be summarized as follows:

Definition 2.1 Local identifiability condition based on the Fisher matrix.

The model M is locally identifiable if and only if the Fisher-information matrix F is

nonsingular [31, 120].

The nonsingularity of F is typically assessed by testing that its determinant satisfies det(F) ≠ 0.

On the other hand, its equivalence in terms of the sensitivity matrix S evaluated at

some θ reads:

Definition 2.2 Local identifiability condition based on the Sensitivity matrix.

The model M is locally identifiable if and only if the matrix S has numerical rank rϵ equal

to the number of parameters Nθ [31, 78, 120].

It is important to highlight that the rank of a matrix gives the number of its linearly independent columns (if the matrix has fewer columns than rows). Consequently, evaluating the rank of S means determining the linear independence of its columns (see QR method in Section 4.4.2).
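The rank condition of Definition 2.2 can be sketched numerically with the singular values of S. The tolerance rule used below (largest singular value times the larger matrix dimension times machine epsilon) is an assumption borrowed from common numerical practice, not a threshold prescribed in this work, and the matrix is a constructed example.

```python
import numpy as np

def numerical_rank(S, tol=None):
    """Count the singular values of S above a tolerance (numerical rank)."""
    sv = np.linalg.svd(S, compute_uv=False)
    if tol is None:
        # Assumed default threshold, the same rule np.linalg.matrix_rank uses
        tol = sv[0] * max(S.shape) * np.finfo(float).eps
    return int(np.sum(sv > tol)), sv

t = np.linspace(0.1, 1.0, 10)
# Hypothetical sensitivities: the second column is exactly twice the first,
# so the first two parameters cannot be estimated independently
S = np.column_stack([t, 2.0 * t, t**2])

rank, sv = numerical_rank(S)
locally_identifiable = rank == S.shape[1]
```

The numerical rank is 2 for three parameters, so the rank condition is violated and the model would not be considered locally identifiable at this parameter value.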

Both identifiability conditions are equivalent according to matrix theory and are called

(local) identifiability conditions (see Appendix A.4 for further justification of the link

between S and F ). They are practical tools to measure the quality of the information

available from real experimental data. In this thesis the structural analysis of the sensi-

tivity matrix S is the cornerstone, hence the identifiability will be analyzed based on the

structural properties of this matrix.


2.7. Optimal experimental design


This section focuses on the mathematical formulation of the optimal experimental design

(OED) for improving parameter precision of the fixed model structure given by Eq. 2.1.

It is an optimization problem formulated as

\[
u_{crit} := \arg\min_{u} \Psi_{crit}(C), \tag{2.35}
\]

where ucrit contains the optimal experimental conditions and the cost function Ψcrit

is the information function defined on the positive semi-definite domain PSD(Nθ) into

the real numbers, i.e., Ψcrit : PSD(Nθ) −→ R. The information function is positively

homogeneous, superadditive, nonnegative, nonconstant, and upper semicontinuous [101].

The vector ucrit minimizes the size of the confidence region of the model parameters

through the metric of C (given by Ψcrit). The target of Ψcrit(C) is to capture the largeness

of the parameter covariance matrix C. This transformation in Ψcrit(C) from a high-

dimensional matrix to a real number unfortunately cannot completely retain all aspects

of the variance of multiple parameters and therefore it cannot suit all demands. The

question is to find the appropriate metric for the respective application. Furthermore, because C is approximated via the Fisher-information matrix F in Eq. 4.3, the optimization problem in Eq. 2.35 can also be seen as the maximization of the information content of the experiment, captured by a metric of F.

2.7.1. OED design criteria

The information function Ψcrit(C) is called the OED criterion, which may be one of the alphabetic criteria A, D and E such that crit ∈ {A, D, E}:

\[
\Psi_A(C) = \frac{1}{N_\theta}\,[\mathrm{tr}(C)] \tag{2.36}
\]

\[
\Psi_D(C) = [\det(C)]^{1/N_\theta} \tag{2.37}
\]

\[
\Psi_E(C) = \lambda_{max}(C), \tag{2.38}
\]

where tr(C) in A-criterion (also called the average-variance criterion) is the trace of the

parameter covariance matrix C, det(C) in D-criterion (also called the generalized variance

criterion) is its determinant and λmax(C) in criterion E (also called the minimax criterion)

is its largest eigenvalue. As a matter of fact, the criteria A and D can also be declared as

a function of the eigenvalues of C. The former is a summation of eigenvalues whilst the

latter is a product of them (see first equality in Eqs. 2.39 and 2.40, respectively).

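\[
\Psi_A = \frac{1}{N_\theta}\left[\sum_{i=1}^{N_\theta} \lambda_i(C)\right] = \frac{1}{N_\theta}\left[\sum_{i=1}^{N_\theta} \frac{1}{\lambda_i(F)}\right] = \frac{1}{N_\theta}\left[\sum_{i=1}^{N_\theta} \frac{1}{\varsigma_i^2}\right] \tag{2.39}
\]

\[
\Psi_D = \left[\prod_{i=1}^{N_\theta} \lambda_i(C)\right]^{1/N_\theta} = \left[\prod_{i=1}^{N_\theta} \frac{1}{\lambda_i(F)}\right]^{1/N_\theta} = \left[\prod_{i=1}^{N_\theta} \frac{1}{\varsigma_i^2}\right]^{1/N_\theta} \tag{2.40}
\]

\[
\Psi_E = \lambda_{max}(C) = \frac{1}{\lambda_{min}(F)} = \frac{1}{\varsigma_{min}^2} \tag{2.41}
\]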

Taking into account the inverse relation between C, F and S (see Section 4.3.1) and by applying Theorem A.5.5 of Appendix A.5, the eigenvalues of C and F and the singular values of S are related by

\[
\lambda_i(C) = \frac{1}{\lambda_i(F)} = \frac{1}{\varsigma_i^2}, \quad \forall i = 1, \cdots, N_\theta. \tag{2.42}
\]

Eq. 2.42 uses the squared reciprocals of the singular values of S as a simple approximation of the eigenvalues of C of the maximum likelihood estimator [15]. Thus, all mentioned OED criteria may also be defined as functions of the eigenvalues of F (see second equality of Eqs. 2.39-2.41) and of the singular values of the sensitivity matrix S (see third equality of Eqs. 2.39-2.41).

Finally, the invariance property under re-parameterizations of D [101], the reduced

computational complexity of A (only requiring the computation of the diagonal entries of

C) [101] and the ability to measure ill-conditioning of E [15] (see Section 3.2.1) are the

most important qualities of each criterion.
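The equivalence between the criteria written in terms of C and in terms of the singular values of S can be sketched as follows. The sketch assumes a full-column-rank S, takes C as the inverse of S^T S, and uses a random matrix as a hypothetical stand-in for a scaled sensitivity matrix.

```python
import numpy as np

def oed_criteria_from_singular_values(S):
    """A-, D- and E-criteria from the singular values of S (third equalities of Eqs. 2.39-2.41)."""
    sv = np.linalg.svd(S, compute_uv=False)
    lam_C = 1.0 / sv**2                  # Eq. 2.42: eigenvalues of C
    n_theta = sv.size                    # equals Ntheta when S has at least Ntheta rows
    psi_A = lam_C.sum() / n_theta
    psi_D = np.exp(np.log(lam_C).sum() / n_theta)  # geometric mean, avoids overflow in the product
    psi_E = lam_C.max()
    return psi_A, psi_D, psi_E

rng = np.random.default_rng(2)
S = rng.normal(size=(20, 3))             # hypothetical full-rank sensitivity matrix
psi_A, psi_D, psi_E = oed_criteria_from_singular_values(S)

# Reference values computed from C = (S^T S)^(-1), following Eqs. 2.36-2.38
C = np.linalg.inv(S.T @ S)
```

For a well-conditioned S both routes agree; for an ill-conditioned S the route via the singular values avoids forming and inverting F.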

2.7.2. Graphical interpretation of OED design criteria

Since the Fisher-information matrix F is positive definite, the approximation of the confidence region in Eq. (17) has an ellipsoidal shape which is centered at the estimate θ [3]. The Nθ eigenvectors of C are the Nθ principal axes of the ellipsoid, whose lengths are proportional to √λi(C) for i = 1, ..., Nθ, i.e., to the reciprocals of the singular values of S. The longest axis (associated with the largest eigenvalue of C, i.e., λmax(C)) defines the worst-determined direction in the parameter space Ω, and the shortest axis (associated with the smallest eigenvalue of C, i.e., λmin(C)) defines the best-determined direction. The target of the design problem in Eq. 2.35 is to shrink this ellipsoidal region to an acceptable size.

The application of the different design criteria in Eqs. 2.36-2.38 can be interpreted

geometrically [41, 109]. The D-criterion aims at minimizing the volume of this confidence

region, the A-criterion the dimensions of the enclosing box around the confidence region

and the E-criterion the size of its major axis (i.e., λmax(C)).

[Figure 2.2 plot: singular values ςi of S on a logarithmic scale (1E-02 to 1E+02) for i = 1, ..., 7, with the portions of the spectrum addressed by the D-design, A-design and E-design marked.]

Figure 2.2.: Influence of alphabetic experimental design criteria on the singular value spectrum (SVs) of the sensitivity matrix S. (Figure from publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)

It is worthwhile to point out that each criterion in Eqs. 2.36-2.38 directly affects the singular values ςi for i = 1, ..., Nθ of S, because of the inverse relation between ςi² and λi(C) in Eq. 2.42. The reciprocal of the square of the smallest singular value of S, i.e., 1/ς²min = 1/ς²Nθ, is equal to the largest eigenvalue of C, i.e., λmax(C) = 1/ς²min. Therefore, some or all singular values ςi of the sensitivity matrix S are maximized, and the singular value spectrum (SVs) of S is partially or completely improved, when the eigenvalues of C are minimized. For instance, the A-criterion preferentially maximizes the smallest singular values (especially those smaller than 1) because their reciprocals contribute more to the summation in the third equality of Eq. 2.39. The multiplicative nature of the D-criterion in Eq. 2.40 tends to maximize the largest singular values (especially those greater than 1). However, the smallest singular values of the sensitivity matrix S could be indirectly reduced. In this case, S would become worse-conditioned (see Section 3.2.1), possibly because of an intensification of parameter correlations. This is a well-known feature of the D-optimal criterion [41, 109]. Accordingly, the D-design can lead to very large, stretched confidence regions with a high correlation of parameters, but with a small volume [109]. Moreover, the D-design tends to give excessive importance to the parameters to which the model is most sensitive [41]. It is important to highlight that most of the sensitive parameters are related to the largest singular values of S. Finally, the E-criterion in Eq. 2.41 increases the smallest singular value of S because of its effect on the largest eigenvalue of C. The tendency to improve the small singular values of the SVs of S can therefore drive A- and E-optimal designs to similar solutions. This is especially true for strongly correlated parameters, see Ref. 109.

Fig. 2.2 shows the influence of the different criteria on the SVs of S in the presence of singular values smaller than 1. The largest values (top section) of the SVs determine the D-criterion value and the smallest values (bottom section) the A- and E-criterion values, with the E-criterion being solely determined by the smallest one. Note that the respective portion of the SVs whose contribution can be neglected is improved only by chance.

2.7.3. Observation about singular matrices in OED

Most of the development in model-based experimental design is carried out for systems of full-rank matrices [101]. In this context, the Fisher-information matrix F and consequently the covariance matrix C (by the properties of an invertible matrix in Section A.5.6 of Appendix A.5) should be nonsingular. In this scenario the calculation of all design criteria in Eqs. 2.36-2.38 is consistent.

However, this is not necessarily true, considering that F in Eq. 2.13 is a positive semi-definite matrix. This means that F might be singular because of the presence of zero eigenvalues (see Proposition A.5.5 of Appendix A.5). If zero is an eigenvalue of F, then F is not invertible and its determinant is equal to zero (see Theorem A.5.6 of Appendix A.5). In that scenario, C cannot be approximated by the inverse of F (see Eq. 4.3) and the design criteria in Eqs. 2.36-2.38 are undefined. Furthermore, the assumption that the estimator is unbiased no longer holds [114] and the squared distance between the estimator and the true parameter value θ∗ is unbounded [120]. Thus, the parameters θ will be highly uncertain (even biased) and the model is not locally identifiable (see Section 2.6).

These mathematically supported facts may be a common computational reality because of numerical errors related to rounding and machine precision when matrix operations are performed on ill-conditioned matrices. For instance, the computation of C requires the matrix product in Eq. 2.13 to obtain the Fisher-information matrix F and then its subsequent inversion. If the sensitivity matrix S in Eq. 2.13 is ill-conditioned (the causes of ill-conditioned matrices related to identifiability problems in parameter estimation are addressed in Section 3.2.1), F might be numerically inaccurate and lose its positive semi-definiteness, exhibiting some negative eigenvalues. The same applies to the parameter covariance matrix C. An inaccurate C harms the parameter precision assessment and all subsequent analysis. Inaccurate eigenvalues of C significantly affect the computation of the OED (see Section 2.7.1). This fact will be illustrated in Section 7.3.3 of Chapter 7 for a severely ill-posed problem on a sequencing batch reactor for water treatment.

2.7.4. Sequential OED

The quality of a computed optimal design ucrit in Eq. 2.35 depends on the quality of the

unknown parameters [40]. Usually previously estimated parameters which depended on

the initial guesses determine the future optimal design. Thus, the information content

of an OED is attached to the accuracy of the assumed parameter initial guess. In order

to diminish the effect of these uncertainties and guarantee robustness in computing an

OED for nonlinear models two methods are typically implemented, namely direct and

28


indirect approaches. In the direct method, uncertainties are defined and considered a priori, whereas in the indirect method an iterative refinement of the experimental design together with a repeated update of the parameter values is employed. The indirect method, also called the "sequential" or "adaptive" method, is an iterative procedure in which a locally optimal design is computed in each iteration. The experiment is carried out, an initial parameter guess is defined, parameters are re-estimated from the available observations, and subsequent optimized experiments are computed, executed and analyzed [8], as shown in the outer cycle of Figure 2.1. The design is obtained by solving the optimization problem in Eq. 2.35 using the parameter variance criteria in Eqs. 2.36, 2.37 and 2.38. This sequential solution of experimental design and parameter estimation problems is then continued until the estimated parameters meet predefined parameter precision specifications (see Refs. 69, 6

for examples). The sequential approach can provide reliable estimates (high parameter

precision with small confidence regions) in the presence of uncertainties. For a dynamic process, collecting measurements over time opens the possibility of a real-time redesign, which might drastically reduce the experimental effort. The basis of the real-time (online) design of experiments is explained in the following.

2.7.5. Online OED

The online model-based redesign of experiments is an extension of the adaptive design,

which was first presented in the 1970s in Ref. 88. This approach was originally proposed for nonphysical models; however, various studies have since been carried out for mechanistic models [113, 68, 43, 62, 136, 44]. In contrast to the offline adaptive design, in

the online approach, new measurement data is exploited as soon as it is available and before

designing a new experiment. Hence, the experiment is executed collecting one or several

measurement sampling points, the parameter values are updated and the experiment is

redesigned. This procedure iterates between local solutions of the OED and PE problems and is repeated until the end of the experiment. It results in an iterative refinement of both parameter estimates and input actions.

Some authors have adopted concepts from model predictive control (MPC) (see, e.g.

Ref. 22) and recursive state and parameter estimation methods for the implementation

of the online redesign of experiments. Those implementation schemes are also referred

to as receding horizon experiment design (see Ref. 113, 62, 136). In the following the

mathematical formulation of real-time experimentation, parameter estimation and design

of experiments (proposed and used in Refs. 8, 9) as well as the corresponding notation

are introduced.

Finite time horizon schemes

The real-time implementation is based on a finite time horizon scheme, see Fig. 2.3. It considers a repeated update of current parameter estimates (solution of the PE problem)


and planned input actions (solution of the OED problem). Limited time horizons are

defined and measurement data is assumed to be taken at equally spaced sampling times

tk. Moreover, input actions are implemented as piece-wise constant trajectories which

match the time grid defined by the measurement sampling times.

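[Figure 2.3: time axis with sampling instants $t_0, t_1, \ldots, t_{k-1}, t_k, t_{k+1}, \ldots, t_{k+h-1}, t_{k+h}$, divided into the increasing horizon, the implementation time and the receding horizon, and annotated with the inputs $u_{0|k}, \ldots, u_{k-1|k}, u_{k|k}, u_{k+1|k}, \ldots, u_{k+h-1|k}$ and the outputs $y_{1|k}, \ldots, y_{k-1|k}, y_{k|k}, y_{k+1|k}, \ldots, y_{k+h|k}$.]

Figure 2.3.: Discretization grids and time horizons used in the online algorithm. (Figure taken from Barz et al. (2013) [8] - reprinted with permission from AIChE Journal)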

During the experimentation, for any sampling time instant $t_k$ a new measurement $y^m_k = y^m(t_k)$ is obtained. Considering all prior measurements at instants $t_1, \cdots, t_{k-1}$, the current measurement vector reads $Y^m_k = (y^m_1, y^m_2, \cdots, y^m_{k-1} \,|\, y^m_k)^T$. Due to the growing number of elements in the measurement vector with ongoing experiment time, the time horizon $(t_0, t_k)$ is referred to here as increasing horizon. In Figure 2.3 a discretization grid with elements of length $\Delta t = (t_k - t_{k-1})$ is shown which meets the measurement sampling instants $t_k$.

In every instant $t_k$ all available measurements $Y^m_k$ are used to update the current parameter estimate $\theta_k$. This is done by fitting the vector of simulated response variables $Y^-_k = (y_{1|k}, y_{2|k}, \cdots, y_{k|k})^T$ (see footnote 3) to the measurement vector $Y^m_k$ (solution of the problem in Eq. 2.43). The simulated response variables $Y^-_k$ are obtained from the solution of Eq. 2.1 for $t = [t_0, t_k]$, taking into account the discrete input actions in all past intervals within the increasing horizon $U^-_k = [u_{0|k}, u_{1|k}, \cdots, u_{k-1|k}]^T$ as well as the initial states $x_0 = x(t_0)$.

In the same way, for any sampling instant $t_k$, a prediction or receding horizon $[t_{k+1}, t_{k+h}]$ is considered, where the predicted outputs $Y^+_k = [y_{k+2|k}, y_{k+3|k}, \cdots, y_{k+h|k}]^T$ are obtained from the solution of Eq. 2.1 for $t = [t_{k+1}, t_{k+h}]$, taking into account the future discrete input actions $U^+_k = [u_{k+1|k}, u_{k+2|k}, \cdots, u_{k+h-1|k}]^T$ and the initial states $x_0 = x(t_{k+1})$. All future input actions $U^+_k$ are updated for each $t_k$ by the solution of an OED problem based on the current parameter estimate $\theta_k$.

3 The notation $y_{i|j}$ indicates the values of the predicted variable vector at sampling instant $i$ which is calculated at instant $j$.

According to the iterative implementation strategy, the parameter estimation as well as the generation of new input actions are repeated at each time instant $t_k$. Thus, all corresponding computations have to be finished within one sampling interval $\Delta t$. In Figure 2.3 this sampling interval is denoted as implementation time. It is important to point out that during the implementation time the corresponding input actions $u_{k|k}$ are executed in the experiment, so these are not considered in the OED problem formulation (update of the future input actions in the time horizon $[t_{k+1}, t_{k+h}]$). Moreover, the predicted output $y_{k+1|k}$ is not considered in the PE problem (update of the current parameter estimate using measurements from the time horizon $[t_0, t_k]$) [8]. However, initial states $x_0 = x(t_{k+1})$ are needed for the definition of the OED problem. The initial states $x(t_{k+1})$ are obtained from a simulation step, solving Eq. 2.1 for $t = [t_k, t_{k+1}]$ with $u_{k|k}$, $\theta_k$ and $x_0 = x(t_k)$.
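The index bookkeeping of the finite time horizon scheme can be summarized in a few lines. The following is only a sketch of the notation used above; the function name `horizon_indices` and the dictionary keys are hypothetical and not part of the original implementation.

```python
def horizon_indices(k, h):
    """Index sets at sampling instant t_k for a prediction horizon of h steps."""
    if k < 1 or h < 2:
        raise ValueError("expects k >= 1 and h >= 2")
    return {
        # U_k^-: u_0|k ... u_{k-1}|k, inputs already applied
        "past_inputs": list(range(0, k)),
        # Y_k^-: y_1|k ... y_k|k, outputs fitted in the PE problem
        "past_outputs": list(range(1, k + 1)),
        # u_k|k: applied during the implementation time [t_k, t_{k+1}]
        "current_input": k,
        # U_k^+: u_{k+1}|k ... u_{k+h-1}|k, decision variables of the OED
        "future_inputs": list(range(k + 1, k + h)),
        # Y_k^+: y_{k+2}|k ... y_{k+h}|k, predicted outputs
        "predicted_outputs": list(range(k + 2, k + h + 1)),
    }

idx = horizon_indices(k=3, h=4)
print(idx)
```

Note that the current input and the output y_{k+1|k} belong to neither the PE nor the OED problem, which is why they appear in none of the fitted or optimized index sets.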

Online mathematical formulation of PE and OED

In the adaptive online OED for increasing parameter precision, a parameter estimation problem and an optimal experimental design problem are solved at each sampling instant $t_k$. These enhance the current parameter estimate $\theta_k$ and update the planned input actions $U^+_k$, respectively. The discrete formulation of the PE problem in Eq. 2.5 reads

$$\theta_k = \arg\min_{\theta_k} \Phi^{LSQ}_k(U^-_k, \theta_k) \qquad (2.43)$$

with

$$\Phi^{LSQ}_k(U^-_k, \theta_k) = \left(Y^-_k(U^-_k, \theta_k) - Y^m_k\right)^T \cdot \left(C^-_{Y,k}\right)^{-1} \cdot \left(Y^-_k(U^-_k, \theta_k) - Y^m_k\right),$$

where $Y^-_k \in \mathbb{R}^{N_y \cdot k}$ is the vector of past outputs and $Y^m_k$ and $C^-_{Y,k}$ are the available measurement vector and the covariance matrix of the experimental errors, respectively. $C^-_{Y,k}$ is assumed to be a diagonal matrix with the variance $\sigma^2_{y,i}$ of each measurement $i$ in its diagonal entries. Moreover, $\theta_k \in \mathbb{R}^{N_\theta}$ is the parameter (point) estimate at the k-th time instant and $U^-_k \in \mathbb{R}^{N_u \cdot k}$ are the input actions which were implemented in all past intervals.

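The corresponding sensitivity matrix $S^-_k(U^-_k, \theta_k) \in \mathbb{R}^{N_y \cdot k \times N_\theta}$ is given as

$$S^-_k(U^-_k, \theta_k) = (\Sigma^-_{Y,k})^{-1} \nabla_{\theta_k} Y^-_k(U^-_k, \theta_k), \qquad (2.44)$$

where the weighting matrix $\Sigma^-_{Y,k} \in \mathbb{R}^{N_y \cdot k \times N_y \cdot k}$ is the standard deviation matrix of the experimental errors of the measurements such that $C^-_{Y,k} = \Sigma^-_{Y,k} \Sigma^-_{Y,k}$.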
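A minimal sketch of Eqs. 2.43 and 2.44 for a diagonal error covariance is given below. The response model `simulate`, the parameter values and the forward finite-difference approximation of the gradient are assumptions made for illustration; the thesis does not prescribe them at this point.

```python
import numpy as np

def simulate(theta, t):
    # Hypothetical response model, used only for illustration.
    return theta[0] * (1.0 - np.exp(-theta[1] * t))

def lsq_objective(theta, t, y_meas, sigma):
    # Phi = r^T C^{-1} r with diagonal C = diag(sigma**2), cf. Eq. 2.43
    r = simulate(theta, t) - y_meas
    return float(np.sum((r / sigma) ** 2))

def scaled_sensitivity(theta, t, sigma, rel_step=1e-6):
    # S = Sigma^{-1} * dY/dtheta (Eq. 2.44), gradient by forward differences
    y0 = simulate(theta, t)
    S = np.empty((t.size, theta.size))
    for j in range(theta.size):
        step = rel_step * max(abs(theta[j]), 1.0)
        theta_pert = theta.copy()
        theta_pert[j] += step
        S[:, j] = (simulate(theta_pert, t) - y0) / step / sigma
    return S

t = np.linspace(0.5, 5.0, 10)
theta = np.array([2.0, 0.8])
sigma = np.full(t.size, 0.05)           # standard deviation of each measurement
y_meas = simulate(theta, t) + 0.01      # synthetic data with a constant offset

phi = lsq_objective(theta, t, y_meas, sigma)
S = scaled_sensitivity(theta, t, sigma)
print(phi, S.shape)
```

Dividing each row of the gradient by the corresponding standard deviation is equivalent to the multiplication by the inverse of the diagonal standard deviation matrix in Eq. 2.44.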

The discrete formulation of the OED problem in Eq. 2.35 reads

$$U^{+*}_k = \arg\min_{U^+_k} \Psi^{crit}_k(U^+_k, u_{k|k}, U^-_k, \theta_k) \qquad (2.45)$$

with $\Psi^{crit}_k \in \{\Psi^A, \Psi^D, \Psi^E\}$,

where $U^{+*}_k \in \mathbb{R}^{N_u \cdot (h-1)}$ is the optimal input vector at the k-th time instant, $\Psi^{crit}_k$ is one of the criteria described in Eqs. 2.36-2.38 evaluated in the k-th time instant, and $\theta_k$ is the parameter estimate at the k-th time instant. The Fisher-information matrix $H_{\theta,k}$ at the k-th iteration for the approximation of the parameter covariance matrix $C_k$ according to Eq. 4.3 is calculated as

$$H_{\theta,k}(U^+_k, u_{k|k}, U^-_k, \theta_k) = H^-_{\theta,k}(U^-_k, \theta_k) + H_{\theta,k|k}(u_{k|k}, \theta_k) + H^+_{\theta,k}(U^+_k, \theta_k), \qquad (2.46)$$

where the last term $H^+_{\theta,k}$ is a non-constant contribution in Eq. 2.45 because it depends on the future input actions $U^+_k$, i.e., the decision variables in the optimization of Eq. 2.45. The first and second terms $H^-_{\theta,k}$ and $H_{\theta,k|k}$ are constant contributions in the OED because they depend on past and currently implemented input actions, respectively. In any case, all contributions in Eq. 2.46 have to be updated for the current parameter estimate $\theta_k$. It is important to highlight that the computation of $H_{\theta,k}(U^+_k, u_{k|k}, U^-_k, \theta_k)$ according to Eq. 2.13 is based on the corresponding scaled sensitivity matrix

$$S_k(U^+_k, u_{k|k}, U^-_k, \theta_k) = \begin{bmatrix} S^-_k(U^-_k, \theta_k) \\ S_{k|k}(u_{k|k}, \theta_k) \\ S^+_k(U^+_k, \theta_k) \end{bmatrix}. \qquad (2.47)$$

$S_k(U^+_k, u_{k|k}, U^-_k, \theta_k) \in \mathbb{R}^{N_y \cdot (k+h) \times N_\theta}$ is calculated using the last available estimated parameter vector $\theta_k$ at the time instant $t_k$, considering in its structure the past, present and future measurements generated by the input actions $U^-_k$, $u_{k|k}$ and $U^+_k$, respectively.
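The additive structure of Eq. 2.46 follows directly from the row-wise stacking in Eq. 2.47, which the following sketch checks numerically. It assumes that Eq. 2.13 has the product form H = S^T S; the block sizes and the random entries are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_theta = 3

# Hypothetical scaled sensitivity blocks (rows: measurements, columns: parameters)
S_past = rng.normal(size=(8, n_theta))     # S_k^-   (past inputs)
S_now = rng.normal(size=(2, n_theta))      # S_k|k   (currently applied input)
S_future = rng.normal(size=(6, n_theta))   # S_k^+   (future inputs, OED decisions)

# Stacked matrix of Eq. 2.47 and the resulting information matrix
S_k = np.vstack([S_past, S_now, S_future])
H_total = S_k.T @ S_k

# Sum of the three contributions of Eq. 2.46
H_sum = S_past.T @ S_past + S_now.T @ S_now + S_future.T @ S_future

print(np.allclose(H_total, H_sum))
```

In an online setting this means that only the future block has to be re-evaluated inside the OED optimization, while the other two blocks are fixed for a given parameter estimate.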

2.8. Initial guess sampling

In nonlinear optimization problems there exists a large probability of finding multiple local

minima. The nonlinear least squares estimation in Eq. 2.5 and the optimal experimental

design in Eq. 2.35 are subject to this issue. When gradient-based algorithms are applied, finding the global solution depends mainly on the selection of the parameter initial guess (IG). The high nonlinearity of the chemical and bioprocess models of interest in this thesis makes this issue particularly important. Strategies to tackle this weakness consider data from literature, a priori calculations using the model, modeler experience or solutions of different parameter estimation problems using several IGs as starting points. This section explains a sampling strategy called Minimum bias Latin hypercube design to statistically generate initial guesses for either PE or OED.

2.8.1. Minimum bias Latin hypercube design (MBLHD)

In order to find an appropriate IG for parameters in parameter estimation a sampling

type algorithm within a prescribed parameter range might be applied. The Minimum bias

Latin hypercube design (MBLHD) is based on Latin hypercube sampling (LHS), which

provides unique values for each point and exhibits better dispersion than other sampling

procedures such as the random and grid sampling [27, 86]-[94].


According to Ref. [86] a LHS is constructed by dividing the range for each of the m

input variables into N strata of equal marginal probability 1/N . If each input variable

is assumed to be distributed uniformly on the interval [−1, 1], then the strata are all of

width 2/N . A single value is selected randomly from each stratum, producing N sample

values for each input variable. The values are randomly matched to create N sets of

values for the N simulator runs. MBLHD follows the same sampling structure as LHS, but additionally it has been found to satisfy the minimum bias conditions of symmetry and orthogonality described by the authors in Ref. [94], with sample values for each input variable $x_{iu}$ with $i = 1, \cdots, m$ and sample point $u = 1, \cdots, N$ given as follows:

$$x_{iu} = \frac{2u - N - 1}{\sqrt{N^2 - 1}}, \qquad (2.48)$$

where N is the number of total sampling elements, which is chosen for each sample value.

The results of applying the MBLHD approach will be discussed in Chapter 6.
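As an illustration, the levels of Eq. 2.48 can be generated and matched at random as in a standard LHS. This is only a sketch: the function name `mblhd`, the linear mapping from [-1, 1] to a prescribed parameter range and the independent random permutation per variable are assumptions, and orthogonality between the columns is not enforced here.

```python
import numpy as np

def mblhd(n_points, n_vars, lower, upper, seed=0):
    """Levels from Eq. 2.48, shuffled independently for each variable."""
    rng = np.random.default_rng(seed)
    u = np.arange(1, n_points + 1)
    # Symmetric levels in (-1, 1), Eq. 2.48
    levels = (2 * u - n_points - 1) / np.sqrt(n_points**2 - 1)
    # Random matching of the levels, one permutation per input variable
    design = np.column_stack([rng.permutation(levels) for _ in range(n_vars)])
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Assumed linear map from [-1, 1] onto the prescribed parameter range
    scaled = lower + 0.5 * (design + 1.0) * (upper - lower)
    return design, scaled

design, initial_guesses = mblhd(n_points=5, n_vars=2,
                                lower=[0.1, 1.0], upper=[1.0, 10.0])
print(initial_guesses)
```

Each row of `initial_guesses` can then serve as one starting point for the PE or OED solver. By construction, every column of `design` has zero mean and a mean square of 1/3, the second moment of a uniform distribution on [-1, 1].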


3. Theoretical background II: Ill-posed

problems and numerical regularization

In this chapter, the main characteristics of ill-posed problems and the most common regularization strategies are discussed.

3.1. Direct and inverse problems

Before giving the formal definition of well-posedness in the sense of Hadamard [50], let us define the general problem

g = A(q) (3.1)

in the functional spaces G,Q (e.g., Hilbert spaces), where q ∈ Q and g ∈ G and A (linear

or nonlinear) is an operator from Q into G. The data and the unknown space are G and

Q, respectively. It should be noted that g is a function of q or that A denotes the operator which acts on q and produces g. In the linear

case Eq. 3.1 transforms into

g = Aq. (3.2)

From the formulation in Eq. 3.1 two problems can be considered:

1. the direct problem in which g is computed given q,

2. the inverse problem in which for some prescribed g the target is to find the solution

q.

Direct problems are basically concerned with finding a function that represents (models) a process, phenomenon or physical field at any point of a given domain at any instant of time (if the field is nonstationary). Inverse problems, in contrast, deal with determining the causes of a desired or an observed effect [64]. Those causes might be the present state or physical parameters of a system, and the effects might be future observations or measurement data

of a process. A solution of Eq. 3.1 exists if and only if g is in the range of the operator A

[13]. Generally direct problems of mathematical physics are well-posed, whereas inverse

problems turn out to be ill-posed [13, 64].

Examples of inverse problems are present in physics (astronomy, quantum mechanics, acoustics, electrodynamics, etc.), medicine (X-ray and NMR tomography, ultrasound testing, etc.), geophysics (seismic exploration, electrical, magnetic and gravimetric prospecting, logging, magnetotelluric sounding, etc.), economics (optimal control theory, financial mathematics, etc.), ecology (air and water quality control, space monitoring, etc.) and chemical engineering (chemical reactors, bioreactors, etc.) [64].

In the context of nonlinear modeling, the simulation using the model for a given parameter vector is considered the direct problem (see Figure 3.1). The parameter estimation (identification) problem in Eq. 2.1, on the other hand, is considered the inverse problem. The experimental measurement vector $Y^m$ (as the data of the inverse problem, i.e., $g = Y^m$) is used to recover the unknown parameter vector θ, i.e., q = θ (see Figure 3.1), which depends on the structure of the operator A, i.e., the Fisher-information matrix F as a function of the sensitivity matrix S. In Figure 3.1 the relation between F and the sensitivity matrix S is denoted by ϕ.

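[Figure 3.1: block diagrams of (a) the direct problem, input (parameters) -> block depending on S -> output (measurements), and (b) the inverse problem, input (measurements) -> block depending on S -> output (parameters).]

Figure 3.1.: Graphical representation of a) the direct problem and b) the inverse problem in nonlinear modeling.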

3.2. Ill-posed problems

At the beginning of the 20th century, Hadamard [50] established the definition of a well-posed problem. Such a problem must simultaneously fulfill three conditions, namely existence, uniqueness and continuity.

Definition 3.1 Well-posed problem in the sense of Hadamard [13, 50, 64].

The problem in Eq. 3.1 is well-posed if the following three conditions hold:

1. the existence condition: for each data g in a given class of functions G there exists

a solution q in a prescribed class Q,

2. the uniqueness condition: the solution q is unique in Q, i.e., there exists an inverse

operator A−1 from G into Q, and

3. the continuity condition: the dependence of q upon g is continuous, i.e., when the

error on the data g tends to zero, the induced error on the solution q tends also to

zero.


If at least one of the three previous conditions in Definition 3.1 is violated, the problem is called ill-posed in the sense of Hadamard [50]. In other words, a problem is ill-posed if it either has no solutions in the desired class, or has many (two or more) solutions, or the solution procedure is unstable (i.e., arbitrarily small changes of the experimental data may lead to large perturbations of the solution).

Continuity is a necessary condition for stability or robustness of the solution [13]. Never-

theless, it is not sufficient especially for discrete ill-posed problems (coming from discrete

available data or discretization of ill-posed problems), where lack of robustness against

noise is usually evidenced [13]. Most difficulties in solving ill-posed problems are caused

by the instability of the solution. Consequently, the term ill-posed problems frequently

(but not only) makes reference to unstable problems.

In terms of stability, the third condition in Definition 3.1 can be formulated as a stability condition which states that for any neighborhood O(q) ⊂ Q of the solution q to Eq. 3.1, there is a neighborhood O(g) ⊂ G of the data g such that for all gδ ∈ O(g) the corresponding solution qδ belongs to the neighborhood O(q) [64].

When the stability condition does not hold true, the error in the data is amplified without bound in the solution because the inverse of the operator in Eq. 3.1 is unbounded [13, 64].

However, some problems although formally well-posed (i.e., the three conditions in Defini-

tion 3.1 are fulfilled) suffer from practical instability when the system has ill-conditioned

matrices [13, 64]. In that case, a careful investigation about error propagation should be

conducted. To do so, the description of ill-conditioned problems and the concept of the

condition number as a measure of numerical stability of the problem must be introduced.

Other quantities as auxiliary ill-conditioning metrics [78] useful in the computational frame-

work of Chapter 4, namely the collinearity index and the sensitivity measure, will be also

introduced.

3.2.1. Ill-conditioned problems

Noisy measurement data may be harmfully propagated in the solution and lead to signifi-

cant misinterpretations of it [13, 64, 66]. A problem is considered ill-conditioned if small

errors in the data produce large errors in the solution. On the contrary, if small errors in

the data produce small errors in the solution, a problem is considered well-conditioned.

Ill-conditioning promotes numerical instability. In parameter estimation, the ill-conditioning is determined by the linear dependencies or collinearity (see Definition A.5.1 of Appendix A.5) among the rows/columns of a specific matrix. Thus, a practical analysis of the ill-posedness of an estimation problem might be related to the ill-conditioning of this specific matrix. If this matrix is severely ill-conditioned, the estimation problem is treated as ill-posed. However, well-posedness in the sense of Hadamard is not a guarantee

of robustness against noise. Consequently, an ill-conditioning analysis is always highly

recommended.


In the general inverse problem of Eq. 3.1, the problem is considered ill-conditioned if

the operator A is ill-conditioned [12, 13, 52, 57, 64, 83]. For instance, in linear estimation

(the linear regression model y = Xβ + ϵ) the ill-conditioning is analyzed on the data

matrix X. On the other hand, in the nonlinear case, such as the least-squares estimation,

the ill-conditioning is locally analyzed based on the Fisher-information matrix F (see

Eq. 2.13) [1, 3, 10, 42, 60, 63, 83, 120]. However, because of the construction of F

in Eq. 2.13 the ill-conditioning properties of F are generated by the properties of the

sensitivity matrix S. Thus, in the nonlinear parameter identification context the matrix

A is the sensitivity matrix S. That means if S is far from orthogonal, then F is highly

ill-conditioned [78, 83, 120] and the estimation is considered ill-posed.

This ill-conditioning (generated by exact or near collinearity) is of paramount impor-

tance to the efficacy of least-squares estimation though it is a non-statistical problem [12].

In the presence of collinearity the uniqueness and stability of the least-squares estimator

may not be guaranteed. Moreover, the effect of linear dependencies (or collinearity) on model regression is well known to statisticians, who point out that they inflate the variances of regression coefficients and magnify the impact of errors on the regression variables [112].

Notice that, for instance, in the nonlinear estimation of Eq. 2.5 by using Newton-type

algorithms, the solution in Eq. 2.6 might be inflated and thus, unstable by inverting an

ill-conditioned Hessian (which is the Fisher-information matrix).

Ill-conditioning and collinearity measures

The indicator of ill-conditioning is the condition number κ (see Section A.5.12 of Appendix A.5). When κ is not far from 1, the matrix is said to be well-conditioned; on the contrary, when κ is large the matrix is ill-conditioned. Only a matrix with orthonormal (perfectly linearly independent) columns has κ = 1. Otherwise, if the matrix has exact linear dependencies among its columns it has at least one zero singular value [12, 11] and then κ = ∞.

An ill-posed problem has an infinite condition number; thus, extremely ill-conditioned problems behave in practice as ill-posed problems and have to be treated by the same techniques [13]. In most applications of nonlinear parameter estimation all singular values of the sensitivity matrix are nonzero, because its columns are rarely exactly linearly dependent (except in the case of structural non-identifiabilities). Nonetheless, if among the columns of the sensitivity matrix there exists a near linear dependence (also called collinearity) [12], this will manifest itself in a "small" singular value and hence a large condition number. This also makes the condition number a measure of near collinearity [112].

The collinearity index (see Section A.5.13 of Appendix A.5), as its name suggests, measures the collinearity of the columns of a matrix. In the extreme case, when A is singular, γ(A) = ∞. In other cases, when this quantity is large, it indicates that A has (nearly) linearly dependent columns and is ill-conditioned, and that the parameter vector is poorly identifiable [17].

Empirical upper bounds for the condition number and the collinearity index are defined by κmax ≈ 1000 [46] and γmax ≈ 10 to 15 [17], respectively. The maximum condition

number κmax and collinearity index γmax are used to define a well-conditioned sensitivity

matrix and a well-posed problem. Therefore, this problem should not have numerical

instabilities due to error propagation and nearly linear dependencies. Accordingly, γmax

controls linear dependencies and κmax assures numerical stability. Furthermore, because

these thresholds are related to the definition of a “large“ ratio in the SVs and “smallness“

of singular values, the user should tune them according to the magnitude (scaling) of

singular values in the particular application. However the aforementioned values are good

references and could be used as a starting point.

When the matrix S is scaled by the scalar a, this scaling does not affect the condition

number because it is scale invariant, i.e., κ(aS) = (aς1)/(aςNθ) = κ(S). However, it

modifies the collinearity index, i.e., γ(aS) = 1/(aςNθ) = γ(S)/a. The scale invariance makes the condition number a robust indicator of ill-conditioning.
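The scaling behavior of both indicators can be checked with a short sketch. It uses the expressions stated above, κ(S) = ς1/ςNθ and γ(S) = 1/ςNθ; the matrix S and the scaling factor a are hypothetical.

```python
import numpy as np

def condition_number(S):
    sv = np.linalg.svd(S, compute_uv=False)   # singular values, descending
    return sv[0] / sv[-1]

def collinearity_index(S):
    sv = np.linalg.svd(S, compute_uv=False)
    return 1.0 / sv[-1]

# Hypothetical sensitivity matrix with two similar columns
S = np.array([[1.0, 0.9],
              [1.0, 1.1],
              [1.0, 1.0]])
a = 100.0

print(condition_number(S), condition_number(a * S))      # unchanged by scaling
print(collinearity_index(S), collinearity_index(a * S))  # divided by a
```

This is the reason why a threshold on the collinearity index has to be adapted whenever the sensitivity matrix is rescaled, whereas a threshold on the condition number does not.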

Classification of ill-conditioned problems

For ill-conditioned problems a distinction is made between rank-deficient problems and problems of ill-determined rank [52, 53]. If the numerical rank (see Section A.5.11 of Appendix A.5) of S can be reliably calculated, the problem is considered rank-deficient, otherwise it is of ill-determined rank [52]. Problems whose SVs decays gradually to zero without a gap between singular values belong to the ill-determined rank problem class [52, 53].

In order to define a problem as one of the above classes, the first step is to compute

its numerical rank, specifically its ϵ-numerical rank (see Definition A.5.11). To do so, the

lower bound on the SVs, namely the ϵ-threshold, is calculated from the maximum condition

number (κmax) and collinearity index (γmax) defined in Section A.5.12 such that

$$\epsilon = \max\{\epsilon_\kappa = \varsigma_1/\kappa_{\max},\; \epsilon_\gamma = 1/\gamma_{\max}\} \qquad (3.3)$$
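A minimal sketch of Eq. 3.3 is given below. It assumes that the ϵ-numerical rank of Definition A.5.11 counts the singular values that are larger than ϵ; the function name `epsilon_rank`, the default thresholds and the test matrix are hypothetical.

```python
import numpy as np

def epsilon_rank(S, kappa_max=1000.0, gamma_max=10.0):
    sv = np.linalg.svd(S, compute_uv=False)          # descending order
    # Eq. 3.3: lower bound on the singular values
    eps = max(sv[0] / kappa_max, 1.0 / gamma_max)
    # Assumed definition: number of singular values above the threshold
    return int(np.sum(sv > eps)), eps, sv

# Hypothetical matrix with singular values 50, 5 and 1e-3 (clear gap)
S = np.diag([50.0, 5.0, 1e-3])
r_eps, eps, sv = epsilon_rank(S)
print(r_eps, eps)
```

For this example the threshold is set by the collinearity bound, since 1/γmax = 0.1 exceeds ς1/κmax = 0.05, and two singular values lie above it.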

In Fig. 3.2 and 3.3, SVs for both, rank-deficient and ill-determined rank problems are

depicted. For the rank-deficient problem in Fig. 3.2, the numerical ϵ-rank (i.e., number of

linearly independent columns of S) can be well-determined by using the ϵ-threshold. Rank-deficient problems have a large condition number κ(S) and/or a large collinearity index γ(S). Furthermore, the singular value ς1 is larger than 1, although the scaling of S

determines its magnitude.

For the ill-determined rank problem in Fig. 3.3, the dominant feature is the presence of

singular values close to the conventional or selected “zero“. Large values of both κ(S) and

γ(S) are expected. Moreover, almost the complete SVs is below 1 and even the largest

singular value ς1 is small. Although ill-determined rank problems do not possess a well-defined rϵ, the ϵ-threshold can still be used to establish a new well-conditioned problem (with only well-conditioned singular values) which is related to the amount of available linearly independent information.

[Figure 3.2: log-scale plot of the singular values ςi (axis from 1E-06 to 1E+03) versus index i = 1, ..., 10, showing large (well-conditioned) and small (ill-conditioned) values separated by a gap around the ϵ-threshold.]

Figure 3.2.: Singular value spectrum (SVs) of a rank-deficient problem. (Figure taken from publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)

As mentioned for the condition number, the rank of S is also scale invariant. However, when applying the collinearity index to a scaled matrix aS, the ϵ-threshold for computing rϵ must be modified to aϵ (see footnote 1) such that r_{aϵ}(aS) = rϵ(S). This fact shows that the ill-conditioning of a problem is not affected by this scaling, but indicators based on the singular values (and also eigenvalues) need to be applied carefully. On the other hand,

it is recommended to conduct PE and OED normalizing predicted response variables

and parameters (see Eqs. 4.1 and 4.2, respectively) to decrease the influence of different

magnitudes in the optimization.

Relationship between identifiability problems and ill-conditioning

Typically, identifiability problems in nonlinear parameter estimation are detected by analyzing whether the Fisher-information matrix F (see Eq. 2.13) is singular or "almost" singular from a numerical point of view [1, 42, 120, 124]. When F is singular (a case rarely encountered in actual practice), this can be stated equivalently as: F is not invertible, det(F) = 0, the smallest eigenvalue λmin(F) = 0, F is not of full rank, or F has exact linear dependence among its columns or some column norms equal to zero (see Theorem A.5.6). When F is almost singular, its columns have nearly linear dependencies or column norms close to zero [12], and det(F) and λmin(F) are close to zero. In that case, F is still invertible, but it

might numerically run into problems (ill-conditioning). Thus, the estimation problem is exactly or nearly singular. In those cases the assumption that F is of full rank, belonging to the set of positive definite matrices PD(Nθ) (see Lemma A.5.5), does not hold [3].

1 For the scaled sensitivity matrix aS the maximum collinearity index should also be scaled as γmax/a such that ϵγ(aS) = aϵγ and ϵ(aS) = max{ϵκ, aϵγ}.

[Figure 3.3: log-scale plot of the singular values ςi (axis from 1E-16 to 1E+02) versus index i = 1, ..., 43, with the ϵ-threshold separating the well-conditioned ςi from the small values.]

Figure 3.3.: Singular value spectrum (SVs) of an ill-determined rank problem. (Figure taken from publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)

The eigensystem (eigenvalues and eigenvectors) of F has been used to analyze local

identifiability problems [17, 19, 120, 122, 123]. The presence of a small eigenvalue of F is a clear sign of identifiability problems. Indeed, small eigenvalues of F are evidence of its ill-conditioning [12]. Nonetheless, in cases where F is extremely ill-conditioned the computation of its eigensystem has weak numerical stability [12].

At this point, it is necessary to return to the close connection between F and S in Eqs. 2.11 and 2.19. As explained in Section 2.6, the structural problems of S are transferred to F. Consequently, if F is ill-conditioned, this is a consequence of an ill-conditioned S, which is likely not of full (numerical) rank. This happens, for example, for columns of S with small values (even close to or equal to zero) due to insensitive parameters, or for nearly identical columns or columns which are linear combinations of others because of correlated parameters. Expressed differently, if S is ill-conditioned then F is ill-conditioned, the parameter estimation is ill-posed and has at least local identifiability problems.

Effect of ill-conditioning on parameter estimation

Indefinite Hessians (and even positive definite Hessians with small eigenvalues) in nonlinear estimation can cause unacceptable (inflated) search directions [83], which increase the estimation variability and instability. To better understand this claim, see Eq. 2.6, where the step direction υk along with the step size control the value of the new solution. In gradient-based methods υk is typically determined by a function of the inverse of Hθ in Eq. 2.11. Therefore, if Hθ is ill-conditioned this inverse degenerates, leading to unstable solutions. Moreover, for problems where Hθ is ill-conditioned (nearly singular problems), a large variance (uncertainty) in some sub-space of the parameter space Ω can be found [3, 120, 63]. Estimates may be obtained far away from the true parameter values and thus some parameters cannot be accurately estimated and are non-identifiable [120]. This may be explained by looking at the approximation of the parameter covariance matrix C in Eq. 4.3, where the inverse of F ≈ Hθ is also required.

It has to be noted that well-established solution methods such as Levenberg-Marquardt [82, 83] and trust region [90, 110] are adapted to handle indefinite matrices [3, 66]. The Levenberg-Marquardt method is based on the Tikhonov regularization of the linearized problem in Newton's method, and the trust-region method is a modification of Newton's method that originates from the Levenberg-Marquardt method. However, for nonlinear PE problems with highly ill-conditioned matrices, Levenberg-Marquardt methods exhibit reduced convergence rates [66]. For trust-region methods, efficiency and stability are negatively affected.
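The inflation of the search direction and the stabilizing effect of a Levenberg-Marquardt-type damping term can be illustrated as follows. This is a sketch under assumptions: the Hessian approximation H, the gradient and the damping parameter lam are hypothetical, and the step is taken in the generic Newton-type form, which may differ in detail from Eq. 2.6.

```python
import numpy as np

# Hypothetical ill-conditioned Hessian approximation and gradient
H = np.diag([1.0, 1e-12])
grad = np.array([1.0, 1.0])

# Undamped Newton-type step: solve H * step = -grad
step_newton = np.linalg.solve(H, -grad)

# Levenberg-Marquardt-type step: solve (H + lam * I) * step = -grad
lam = 1e-3
step_lm = np.linalg.solve(H + lam * np.eye(2), -grad)

print(np.linalg.norm(step_newton), np.linalg.norm(step_lm))
```

The damping bounds the step along the poorly determined direction at roughly 1/lam, at the price of a bias in that direction, which is the trade-off typical of Tikhonov-type regularization.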

Effect of ill-conditioning on optimal experimental design

It is strongly recommended to carefully analyze the conditioning of S when applying the alphabetic design criteria in Section 2.7.1. For example, the D-criterion cannot be calculated if the SVs of S contains at least one zero value. This means that an eigenvalue of F is zero (i.e., λi(F) = 0), F is not invertible and hence the covariance matrix C in Eq. 4.3 cannot be evaluated. Even if the SVs of S does not contain exactly zero singular values, the presence of one or several values near zero could degenerate the inversion and produce criterion values approaching infinity. Moreover, negative eigenvalues could appear because of inherent numerical errors in the computation of F (Eq. 2.10), which can affect the positive semi-definiteness of that matrix. Indeed, the theoretically positive semi-definite matrix F (see Definition A.5.15) could lose its positive semi-definite property despite conserving the symmetry (Definition A.5.3) in its structure. In those cases, the calculated design criteria are meaningless and an optimal experimental design should not be performed.
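The sensitivity of the design criteria to small singular values can be made concrete with a sketch. Since Eqs. 2.36-2.38 are not reproduced here, the common textbook forms are assumed (A: trace of C, D: determinant of C, E: largest eigenvalue of C, with C = (S^T S)^{-1}); the thesis may use scaled or normalized variants. The singular value spectra are hypothetical.

```python
import numpy as np

def design_criteria_from_sv(sv):
    """A-, D- and E-type values of C = (S^T S)^{-1} from the singular values of S."""
    var = 1.0 / np.asarray(sv, dtype=float) ** 2      # eigenvalues of C
    return {
        "A": float(np.sum(var)),                      # trace(C)
        "log10_D": float(np.sum(np.log10(var))),      # log10 of det(C)
        "E": float(np.max(var)),                      # largest eigenvalue of C
    }

well = design_criteria_from_sv([10.0, 5.0, 1.0])      # well-conditioned spectrum
ill = design_criteria_from_sv([10.0, 5.0, 1e-8])      # one near-zero singular value
print(well)
print(ill)
```

A single singular value close to zero dominates all three criteria, so that the optimizer effectively sees only the worst-determined direction and, with additional rounding errors, possibly meaningless values.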

3.2.2. Parameter variance-decomposition

In Eq. 2.19 the Fisher-information matrix F is defined in terms of the singular values of the sensitivity matrix S. Moreover, in Eq. 4.3 the relationship between F and C is described. Combining these equations, C can also be expressed in terms of the singular values of S as follows:

$$C = V (S_v^T S_v)^{-1} V^T = \sum_{i=1}^{N_\theta} \frac{v_i v_i^T}{\varsigma_i^2} \qquad (3.4)$$

where $\varsigma_i$ and $v_i \in \mathbb{R}^{N_\theta}$ with $i = 1, ..., N_\theta$ are the i-th singular value and the i-th right singular vector of S, respectively. Once the parameter covariance matrix C is defined as a function of $\varsigma_i$, the decomposition of the variance can be achieved. As can be observed in Eq. 3.4, the inverse of each singular value $\varsigma_i$ contributes to the parameter covariance matrix C. Clearly, the variance $\sigma^2_{\theta_j}$ of the j-th parameter $\theta_j$ is inversely affected by $\varsigma_i$ in the following form

$$\sigma^2_{\theta_j} = \sum_{i=1}^{N_\theta} \frac{v_{i,j} \cdot v_{i,j}}{\varsigma_i^2} \qquad (3.5)$$

where $v_{i,j}$ is the j-th element of the i-th right singular vector of S. The variance $\sigma^2_{\theta_j}$ in Eq. 3.5 is obtained as a sum of components $v_{i,j} \cdot v_{i,j} / \varsigma_i^2$, which are here called variance components $\sigma^2_{\theta_j|\varsigma_i}$, each associated with one of the singular values $\varsigma_i$ of S for $i = 1, \cdots, N_\theta$ such that

$$\sigma^2_{\theta_j} = \sum_{i=1}^{N_\theta} \sigma^2_{\theta_j|\varsigma_i}. \qquad (3.6)$$

Eq. 3.6 is known as the parameter variance-decomposition. Finally, in order to quantify the contribution of each singular value $\varsigma_i$, $i = 1, \cdots, N_\theta$ to the variance $\sigma^2_{\theta_j}$ of each j-th parameter $\theta_j$, $j = 1, \cdots, N_\theta$ the proportions

$$\pi_{i,j} = \frac{\sigma^2_{\theta_j|\varsigma_i}}{\sigma^2_{\theta_j}} \qquad (3.7)$$

are computed.


Handling ill-posed PE problems aims at "trying to control the parameter variance", either by imposing upper and lower limits on the admissible parameter values [25] or by using regularization techniques [13, 66, 52]. Although the former seems to control the instability (at an amplitude as large as that allowed by the imposed limits), the solution may remain meaningless, merely oscillating between the limits [25]. The latter eliminates unwarranted oscillations (with a meaningful reduction of the variance level), but the estimated parameters can be biased [4, 48, 49, 52, 63, 83].

All prior knowledge or information about the desired solution may be incorporated into the identification problem of Eq. 2.5 in the form of constraints. This is the basic idea of any regularization technique, which makes it numerically possible to deal with ill-posed parameter estimations. These constraints or transformations stabilize the problem (because it becomes better conditioned) and lead the regularized solution θReg to a useful and stable region which is hopefully near the desired solution (θ∗) [52, 63, 66].

When a regularization technique (Reg) is applied to Eq. 2.5, the solution norm, either the 2-norm (i.e., $\|\theta^{Reg}\|_2$) or an appropriate seminorm, and the residual (i.e., $Z = Y(u, \theta^{Reg}) - Y^m$) should remain small. Nevertheless, the regularized solution $\theta^{Reg}$ is obtained from a modified problem. Hence $\theta^{Reg}$ is generally biased, but it also has a reduced variance because the regularization decreases the size of the solution's covariance matrix [38, 52, 63].

For rank-deficient problems, i.e., problems with a matrix S whose numerical rank rϵ exists, regularization strategies such as parameter Subset Selection (SsS) and Truncated Singular Value Decomposition (TSVD), which filter a subset of either the parameters or the singular values, can be applied. The key feature of these strategies is to extract the linearly independent information of S and to transform the problem into a well-conditioned one [52]. Ill-determined rank problems (i.e., rϵ of S cannot be determined exactly, independently of the value of ϵ used) can in general be treated with the strategies for rank-deficient problems. However, regularization techniques which introduce a penalty term to ensure smoothness of the solution are better suited, for instance the Tikhonov regularization (Tikh), known in linear cases as ridge regression [4, 52, 63, 83]. Here the ill-conditioning of S is also directly modified. Instead of selecting its well-conditioned information (maintaining the formulation in Eq. 2.5), a new well-conditioned matrix (i.e., the regularization parameter multiplied by a predefined full-row-rank matrix) is appended to S [52, 57, 56, 83].

The three regularization techniques SsS, TSVD and Tikh are studied in the sequel. They were chosen because of their broad application, good performance and easy implementation. Moreover, SsS and TSVD are closely related to the ill-conditioning analysis referred to here because they are also based on the singular value analysis. It is important to point out that all the mentioned regularization techniques aim at getting rid of the ill-conditioning of the original problem and turning it into a new well-posed problem [52, 70]. In the field of parameter subset selection, more sophisticated methods have recently been proposed which are also based on orthogonalization methods [28, 30, 129]. The main differences lie in the chosen orthogonalization algorithm (e.g., Gram-Schmidt or Householder transformation) and in how parameter combinations are selected for estimation (possible criteria are the maximum determinant of the reduced FIM, the largest column norm of the reduced sensitivity matrix or the lowest MSE). In Ref. 30 the selection is formulated as an optimization problem. However, all strategies coincide in finding a parameter subset which generates a well-conditioned sensitivity matrix in the reduced parameter space.

Furthermore, each technique uses a regularization parameter (the ϵ-threshold for SsS and TSVD, and λ for Tikhonov) which must be tuned according to the severity of the ill-conditioning in the particular application. The regularized sensitivity matrix $S^{Reg}$ for Reg = SsS, TSVD, Tikh is computed and used to obtain $H_\theta^{Reg}$ (see Eq. 2.11) and $F^{Reg} \approx H_\theta^{Reg}$. The inverse of $H_\theta^{Reg}$ (which is more stable) is the basis not only for the calculation of the step direction $\upsilon_k$ in Eq. 2.6 but also for the construction of the parameter covariance matrix in Eq. 4.3.



3.3.1. Parameter subset selection (Reg=SsS)

The idea of dealing with an ill-conditioned estimation by selecting a well-conditioned parameter subset and solving a new (but reduced) estimation was first applied to linear least squares problems with non-orthogonal data [56] and to nonlinear estimation with inflated step directions υk in Eq. 2.6 [83]. More recently, systematic strategies for parameter subset selection have been proposed (see for instance Refs. 18, 19, 26, 28, 30, 45, 52, 87, 79, 122, 123, 129, 134).

In this thesis, the identifiable parameters used in this regularization are the most linearly independent parameters, selected by applying orthogonal projections to the columns of S [45]. This orthogonal method (here called the QR method) is explained in Section 4.4.2. It is a popular way of determining identifiable parameters by computing the numerical rank rϵ (see Section A.5.11 of Appendix A.5) of the sensitivity matrix [70, 78]. Using the results of the QR method guarantees a reliable inversion of the Fisher-information matrix approximated in Eq. 2.13. The regularization consists in estimating only the so-called identifiable parameters. Although this means that only certain parameters are estimated, their values are obtained with improved reliability [120].
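The idea of selecting the most linearly independent columns of S by successive orthogonal projections can be illustrated with a greedy, column-pivoted Gram-Schmidt sketch. This is only an assumed stand-in: the pivot rule (largest residual column norm), the relative threshold `eps` and the function name are choices made here, and the QR method of Section 4.4.2 may differ in all of them.

```python
import numpy as np

def select_identifiable(S, eps=1e-6):
    """Greedy column selection by successive orthogonal projections.

    Returns the indices of the columns judged linearly independent.
    """
    R = np.array(S, dtype=float)               # residual columns, deflated step by step
    scale = np.linalg.norm(R, axis=0).max()    # reference magnitude for the threshold
    active = []
    for _ in range(R.shape[1]):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))              # most independent remaining column
        if norms[j] <= eps * scale:            # the rest is numerically dependent
            break
        q = R[:, j] / norms[j]                 # unit vector of the selected direction
        R -= np.outer(q, q @ R)                # project that direction out of all columns
        active.append(j)
    return active

# Illustrative matrix: the third column is exactly twice the first one.
S = np.array([[1.0, 0.0, 2.0],
              [2.0, 0.0, 4.0],
              [2.0, 0.0, 4.0],
              [0.0, 5.0, 0.0]])
active = select_identifiable(S)
print(sorted(active))
```

The number of selected columns plays the role of the numerical rank; the parameters whose columns are not selected would be the ones kept fixed.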

For the solution of nonlinear problems this local regularization technique should be applied repeatedly during the search. The status of the parameters (active or nonactive) and the estimator performance analysis are then iteratively refined, and the nonactive or unidentifiable parameters $\theta^{(N_\theta - r_\epsilon)}$ are fixed at their currently best available estimates $\hat{\theta}^{(N_\theta - r_\epsilon)}$ [79]. The corresponding reduced PE problem takes the following form:

$$ \theta^{SsS} := \arg\min_{\theta^{(r_\epsilon)}} \Phi^{SsS}(u, \theta^{(r_\epsilon)}) \tag{3.8a} $$

$$ \Phi^{SsS}(u, \theta^{(r_\epsilon)}) := \Phi^{LSQ}(u, \theta^{(r_\epsilon)}). \tag{3.8b} $$

The regularized Jacobian $J_\theta^{SsS} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times r_\epsilon}$ and Hessian $H_\theta^{SsS} \in \mathbb{R}^{r_\epsilon \times r_\epsilon}$ of the cost function $\Phi^{SsS}$ in Eq. 3.8a are given in Table 3.1. They are obtained from the reduced (regularized) sensitivity matrix $S^{SsS} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times r_\epsilon}$, which stores the linearly independent columns of S (i.e., the sensitivities corresponding to the $r_\epsilon$ active parameters of θ, see Section 4.4.2). Therefore, the regularized matrix $S^{SsS}$ is well-conditioned and has full column rank, with $\kappa(S^{SsS}) \le \kappa_{max}$ and $\gamma(S^{SsS}) \le \gamma_{max}$ [79].

Furthermore, according to the interlacing inequality of the singular values, each i-th singular value of the sub-matrix $S^{SsS}$ is at most as large as the corresponding i-th singular value of the original matrix S, i.e., $\varsigma_i(S^{SsS}) \le \varsigma_i(S)$ [117, 24]. Also note that $H_\theta^{SsS}$ is computed according to the assumptions in Eq. 2.10. The application-dependent threshold ϵ (see Eq. 3.3) is considered the regularization parameter in this technique, which should be tuned according to the severity of the ill-conditioning in each case study.
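The interlacing property can be checked numerically on a small example. The sketch below assumes that a subset selection has already returned the active columns; the matrix entries and the index list `active` are made up for illustration only.

```python
import numpy as np

# Illustrative sensitivity matrix; columns 0 and 1 are strongly correlated.
S = np.array([[1.0, 0.9, 0.0],
              [0.0, 0.1, 1.0],
              [1.0, 1.0, 0.5],
              [0.5, 0.4, 0.2]])
active = [0, 2]                       # hypothetical outcome of a subset selection
S_sss = S[:, active]                  # reduced (regularized) sensitivity matrix

sv_full = np.linalg.svd(S, compute_uv=False)
sv_red = np.linalg.svd(S_sss, compute_uv=False)

# Interlacing: the i-th singular value of the column sub-matrix cannot exceed
# the i-th singular value of the full matrix.
interlaced = bool(np.all(sv_red <= sv_full[:len(active)] + 1e-12))
cond_full = sv_full[0] / sv_full[-1]
cond_red = sv_red[0] / sv_red[-1]
print(interlaced, cond_full, cond_red)
```

Because the smallest singular value of a column sub-matrix is also bounded from below by the smallest singular value of the full matrix, removing columns can never increase the condition number.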



3.3.2. Truncated singular value decomposition (Reg=TSVD)

In this regularization a new problem with a well-conditioned but rank-deficient sensitivity matrix ($S^{TSVD} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_\theta}$) is derived [52, 131]. The matrix $S^{TSVD}$ is obtained by truncating the SVs of S at $r_\epsilon$ and substituting the small nonzero singular values $\varsigma_{r_\epsilon+1}, \cdots, \varsigma_{N_\theta}$ with exact zeros (see Table 3.1). It should be noted that the same truncation principle is used to compute the generalized inverse of $A = X^T X$ when a distribution of eigenvalues near zero complicates the calculation of its inverse [83]. In TSVD, the numerical rank $r_\epsilon$ is computed as described in Section A.5.11. Thus, the regularization parameter in this technique is once again the ϵ-threshold (see Eq. 3.3 for its determination). The matrix $S^{TSVD}$ is rank-deficient. It is the closest rank-$r_\epsilon$ approximation to S, with residual 2-norm given by $\|S - S^{TSVD}\|_2 = \varsigma_{r_\epsilon+1}$ [52].

By applying the TSVD technique, the dimension of the parameter space Ω and the cost function of the original PE problem in Eq. 2.5b remain unchanged:

$$ \theta^{TSVD} := \arg\min_{\theta} \Phi^{TSVD}(u, \theta) \tag{3.9a} $$

$$ \Phi^{TSVD}(u, \theta) := \Phi^{LSQ}(u, \theta). \tag{3.9b} $$

The regularized Jacobian $J_\theta^{TSVD} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_\theta}$ and Hessian $H_\theta^{TSVD} \in \mathbb{R}^{N_\theta \times N_\theta}$ matrices are calculated from the regularized sensitivity matrix $S^{TSVD}$ according to Eqs. 2.7b and 2.11, respectively (see Table 3.1). It has to be noted that, since in most parameter estimation problems the parameters are correlated to some extent, all parameter sensitivities are affected by the truncation of one or several singular values of S.
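The truncation and the residual-norm property can be sketched as follows. The relative threshold `eps` used to determine the numerical rank is an assumption of this sketch (the thesis determines ϵ via Eq. 3.3), and the function name and test matrix are hypothetical.

```python
import numpy as np

def tsvd(S, eps=1e-6):
    """Truncate the SVD of S at the numerical rank r (assumed relative threshold eps)."""
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    r = int(np.sum(sv > eps * sv[0]))              # numerical rank
    S_tsvd = (U[:, :r] * sv[:r]) @ Vt[:r, :]       # sum_{i<=r} sv_i u_i v_i^T
    C_tsvd = (Vt[:r, :].T / sv[:r] ** 2) @ Vt[:r, :]   # sum_{i<=r} v_i v_i^T / sv_i^2
    return S_tsvd, C_tsvd, sv, r

# Illustrative matrix with two almost collinear columns.
S = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-9],
              [1.0, 1.0 - 1e-9]])
S_tsvd, C_tsvd, sv, r = tsvd(S)
residual = np.linalg.norm(S - S_tsvd, 2)           # spectral norm of the discarded part
print(r, residual, sv[r])
```

In this example the residual 2-norm agrees with the first discarded singular value, and the covariance expression of Table 3.1 only contains the retained singular values.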

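Table 3.1.: Definition of the specific derivatives used for "SsS", "TSVD" and "Tikh" regularization; regularization "Reg" equal to "None" refers to the original parameter estimation problem. (Table from Publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)

$$
\begin{array}{l|lllll}
\text{Reg} & S^{Reg} & J_\theta^{Reg} & H_\theta^{Reg} & \partial^2\Phi^{Reg}/\partial\theta\,\partial Y^m & C^{Reg} \\
\hline
\text{None} & (\Sigma_y)^{-1}\,\partial Y/\partial\theta & (S^{None})^T Z & (S^{None})^T S^{None} & -(S^{None})^T & [H_\theta^{None}]^{-1} \\
\text{SsS} & (\Sigma_y)^{-1}\,\partial Y/\partial\theta^{(r_\epsilon)} & (S^{SsS})^T Z & (S^{SsS})^T S^{SsS} & -(S^{SsS})^T & [H_\theta^{SsS}]^{-1} \\
\text{TSVD} & \sum_{i=1}^{r_\epsilon} \varsigma_i u_i v_i^T & (S^{TSVD})^T Z & \sum_{i=1}^{r_\epsilon} \varsigma_i^2 v_i v_i^T & -\sum_{i=1}^{r_\epsilon} \varsigma_i u_i v_i^T & \sum_{i=1}^{r_\epsilon} v_i v_i^T / \varsigma_i^2 \\
\text{Tikh} & \begin{bmatrix} S^{None} \\ \lambda L \end{bmatrix} & J_\theta^{None} + \tfrac{1}{2}\lambda^2 J_\Omega(\theta) & H_\theta^{None} + \tfrac{1}{2}\lambda^2 H_\Omega(\theta) & -(S^{None})^T & [H_\theta^{Tikh}]^{-1}
\end{array}
$$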

3.3.3. Tikhonov regularization (Reg=Tikh)

This regularization modifies the identification problem in Eq. 2.5 by incorporating assumptions about the size and smoothness of the desired solution [52, 63, 118]. This a priori information is considered as an additional penalty (regularizing) term in the cost function of the regularized parameter estimation problem

$$ \theta^{Tikh} := \arg\min_{\theta} \Phi^{Tikh}(u, \theta) \tag{3.10a} $$

$$ \Phi^{Tikh}(u, \theta) := \Phi^{LSQ}(u, \theta) + \frac{1}{2}\lambda^2 \Omega(\theta), \tag{3.10b} $$

where $\Omega(\theta) := \|L(\theta - \theta^R)\|_2^2$ is the penalty term, also called the discrete smoothing norm, which is assumed to be twice continuously differentiable. Its purpose is to attract non-identifiable parameters towards their corresponding values in a predefined vector $\theta^R \in \mathbb{R}^{N_\theta}$ [63]. $\theta^R$ should contain the most reliable parameter values (prior information) available. The scalar parameter λ controls the contribution of the regularization term in the regularized cost function $\Phi^{Tikh}$ relative to the original and ill-posed cost function $\Phi^{LSQ}$, and it is known as the regularization parameter. If λ is close to zero the regularization is weak and the problem again becomes ill-posed; on the other hand, if λ is large the regularization is strong and the solution degenerates to $\theta = \theta^R$. Heuristic and systematic techniques to determine λ for the linear case are available in the literature (e.g., the L-curve [52] and the METER criterion [4]). The matrix $L \in \mathbb{R}^{N_\theta \times N_\theta}$ is a diagonal matrix, typically the identity matrix $I_{N_\theta}$. Other approaches [4, 52, 63, 118] define L as a discrete approximation of a derivative operator (e.g., first or second derivative operators) or λL as the inverse of the parameter variances [52]. The most common choice of the functional Ω(θ) is the ridge regression stabilizer [82, 57, 56], where $\theta^R = 0_{N_\theta}$. In this work, $L = I_{N_\theta}$ and no a priori information, i.e., $\theta^R = 0_{N_\theta}$ (zero vector), are considered. It is important to point out that a priori information refers to the best available estimate or value of the parameters. The Jacobian $J_\theta^{Tikh} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_\theta}$ and Hessian $H_\theta^{Tikh} \in \mathbb{R}^{N_\theta \times N_\theta}$ matrices are given in Table 3.1. Therein, the first and second derivatives of Ω(θ) with respect to θ read $J_\Omega(\theta) = \partial\Omega(\theta)/\partial\theta = 2\theta$ and $H_\Omega(\theta) = \partial^2\Omega(\theta)/\partial\theta^2 = 2 I_{N_\theta}$, respectively.

In this regularization the model can be improved without modifying the model structure, which means that the parameter space remains unchanged. Indeed, the ill-conditioning of S is eliminated by introducing a new well-conditioned matrix $S^{Tikh}$ of full rank (see Table 3.1). The regularized sensitivity matrix $S^{Tikh} \in \mathbb{R}^{(N_y \cdot N_m \cdot N_e + N_\theta) \times N_\theta}$ is a row-augmented matrix whose last rows are formed by a user-defined full-row-rank matrix [52, 57, 56, 83].
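The row augmentation and its effect on the conditioning can be sketched as follows. The sketch assumes a Gauss-Newton Hessian S^T S for the unregularized problem and uses H_Ω = 2 L^T L (which reduces to 2 I for L = I); the function name, the value of λ and the test matrix are illustrative assumptions.

```python
import numpy as np

def tikhonov_matrices(S, lam, L=None):
    """Row-augmented sensitivity matrix and the corresponding regularized Hessian."""
    n_theta = S.shape[1]
    L = np.eye(n_theta) if L is None else L
    S_tikh = np.vstack([S, lam * L])           # last rows: lam * L
    H_tikh = S.T @ S + lam ** 2 * (L.T @ L)    # H^None + (1/2) lam^2 H_Omega
    return S_tikh, H_tikh

# Illustrative ill-conditioned matrix with two almost collinear columns.
S = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-9],
              [1.0, 1.0 - 1e-9]])
S_tikh, H_tikh = tikhonov_matrices(S, lam=1e-3)

cond_S = np.linalg.cond(S)
cond_S_tikh = np.linalg.cond(S_tikh)
print(cond_S, cond_S_tikh)
```

The product of the augmented matrix with its transpose reproduces the regularized Hessian, which is why the penalty term can equivalently be read as extra rows of the sensitivity matrix.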

3.3.4. Regularized parameter covariance matrix

The computation of the covariance matrix of the biased estimator θReg depends on the chosen regularization technique. Here this matrix is denoted as CReg. The general approximation, from which Eq. 4.3 also originates, is given in Eq. 3.11 (Bard, 1974; Fessler, 1996). The specific terms used for each regularization technique are given in Table 3.1.

$$ C^{Reg} \approx \left[H_\theta^{Reg}\right]^{-1} \left[\frac{\partial^2 \Phi^{Reg}}{\partial\theta\,\partial Y^m}\right] \left[\frac{\partial^2 \Phi^{Reg}}{\partial\theta\,\partial Y^m}\right]^T \left[H_\theta^{Reg}\right]^{-1} \tag{3.11} $$

Using Tikhonov regularization, especially for small regularization parameter values (λ → 0), the parameter covariance matrix $C^{Tikh}$ in Eq. 3.11 can be approximated by $[H_\theta^{Tikh}]^{-1}$ (as shown in Table 3.1). Hereafter, this simplified version is referred to throughout this manuscript as Tikhonov regularization ("Reg=Tikh").
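The quality of this simplification can be explored numerically. The sketch below evaluates Eq. 3.11 with the terms of Table 3.1 for L = I and compares it with the inverse of the regularized Hessian for decreasing λ; the matrix S, the λ values and the helper names are assumptions made for illustration.

```python
import numpy as np

def covariance_sandwich(H_reg, d2phi):
    """Eq. 3.11: H^-1 [d2Phi/(dtheta dY^m)] [d2Phi/(dtheta dY^m)]^T H^-1."""
    H_inv = np.linalg.inv(H_reg)
    return H_inv @ d2phi @ d2phi.T @ H_inv

S = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.7, 0.3]])            # illustrative, well-conditioned sensitivities
H_none = S.T @ S
C_none = covariance_sandwich(H_none, -S.T)

def tikh_gap(lam):
    """Distance between the full Eq. 3.11 and the simplified [H^Tikh]^-1."""
    H_tikh = H_none + lam ** 2 * np.eye(S.shape[1])
    C_full = covariance_sandwich(H_tikh, -S.T)
    C_simple = np.linalg.inv(H_tikh)
    return float(np.linalg.norm(C_full - C_simple))

gaps = [tikh_gap(lam) for lam in (1e-1, 1e-2, 1e-3)]
print(gaps)
```

Without regularization the sandwich formula collapses to the inverse Hessian, and for the Tikhonov case the difference between both expressions equals λ² times the squared inverse Hessian, so it vanishes as λ → 0.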


4. Computational framework

This chapter presents a computational framework for the analysis of ill-conditioning and

identifiability issues as well as the implementation of numerical regularization in several

stages of the model development (see Figure 4.1).

The framework may be used as a whole or in parts, to work on either the parameter estimation or the optimal experimental design of a previously selected model structure. It is also conceived to treat adaptive designs and online parameterizations. The existence of ill-posed problems is a challenge in the experimental implementation. To make PE and OED robust for ill-posed problems, the purpose of this chapter is to systematically combine the techniques described in Chapters 2 and 3 to enable the main procedural components of the computational framework (see Figure 4.1), i.e.,

1. the estimator performance assessment

2. the ill-conditioning analysis, and

3. the identifiability diagnosis.

The framework offers the option of analyzing the estimator by using two major paradigms, namely the sensitivity and Monte Carlo methods described in Publication I [80]. The former uses numerical techniques that perform a singular value analysis of the sensitivity matrix. The latter uses statistical techniques based on Monte Carlo simulations. Furthermore, in model-based experimental design, the framework considers the most common optimality criteria for parameter variance reduction (i.e., the alphabetic criteria), even under regularization.

In the sequel the algorithm of the consolidated global computational framework displayed in Figure 4.1 is described. Additional theoretical details for assessing the estimator according to the sensitivity and Monte Carlo paradigms of analysis are given. Then, guidelines for conducting the ill-conditioning analysis, the local identifiability diagnosis and the implementation of regularization are presented. The chapter ends with two additional guidelines regarding sensitivity analysis and the selection of the parameter initial guess.

4.1. Algorithm

In Figure 4.1 the algorithm of the mentioned computational framework is outlined. It is based on the iterative work cycle of model-based experimentation for model development explained in Section 2.1. Consequently, it considers the four major phases, namely the experiment, modeling, parameter estimation and optimal experimental design. Furthermore, it includes the evaluation of the quality of the estimator, the analysis of latent ill-posedness and the constant diagnosis of identifiability before and after the parameter estimation. The implementation of regularization in parameter estimation as well as in optimal design is included. The iterative algorithm finishes when the desired parameter precision is reached. The complete framework is originally depicted for the sensitivity method (see Section 4.2.1); thus, in Figure 4.1 the three components (the estimator performance assessment, the ill-conditioning analysis, and the identifiability diagnosis) are based on the sensitivity matrix. It is structured in this way to allow for the online redesign of experiments (Section 2.7.5), which needs fast strategies for the experimental implementation. Nevertheless, the algorithm may also be executed using the Monte Carlo method described in Section 4.2.2. In that case, the sample mean (expected value) of the parameter estimates (Eq. 4.5) is used to optimally design the new experiment in the OED sequential approach (Section 2.7.4).

[Figure 4.1: flowchart with the blocks Experiment, Modeling, Parameter estimation and Optimal experimental design, the evaluation blocks 1 (Estimator performance assessment), 2 (Ill-conditioning analysis) and 3 (Identifiability diagnosis) applied to the original and regularized sensitivity matrices, and the decision nodes "Precision reached?" and "Ill-posed problem?" (Yes/No, selecting Reg).]

Figure 4.1.: Consolidated framework for development and experimental validation of process models with possible ill-posed problems
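The control flow of this cycle, which is detailed step by step in the remainder of this section, can be sketched in a strongly simplified form. Everything in the sketch is an assumption made for illustration: the callables `experiment`, `sensitivity`, `estimate` and `design`, the condition-number test used as ill-posedness flag, the thresholds, and the toy linear model. The identifiability diagnosis (step 3), the precision check of the initial guess and the repeated estimation loop are omitted.

```python
import numpy as np

def assess(S, cond_max):
    """Steps 1 and 2: variances from (S^T S)^-1 and an ill-posedness flag."""
    sv = np.linalg.svd(S, compute_uv=False)
    ill_posed = bool(sv[-1] == 0.0 or sv[0] / sv[-1] > cond_max)
    if ill_posed:
        return np.full(S.shape[1], np.inf), True
    return np.diag(np.linalg.inv(S.T @ S)), False

def calibration_cycle(experiment, sensitivity, estimate, design,
                      u, theta, var_max, cond_max=1e6, max_cycles=20):
    history = []
    for _ in range(max_cycles):
        y_meas = experiment(u)                                   # Experiment
        _, ill_posed = assess(sensitivity(u, theta), cond_max)   # Modeling and analysis
        reg = "Active" if ill_posed else "None"                  # Ill-posed problem?
        theta = estimate(u, y_meas, theta, reg)                  # Parameter estimation
        variances, ill_posed = assess(sensitivity(u, theta), cond_max)
        history.append((reg, float(variances.max())))
        if variances.max() < var_max:                            # Precision reached?
            break
        reg = "Active" if ill_posed else "None"
        u = design(u, theta, reg)                                # Optimal experimental design
    return theta, u, history

# Toy problem (hypothetical): y = theta_0 + theta_1 * u, noise-free data.
theta_true = np.array([2.0, -1.0])

def sensitivity(u, theta):
    return np.column_stack([np.ones_like(u), u])

def experiment(u):
    return sensitivity(u, theta_true) @ theta_true

def estimate(u, y_meas, theta, reg):
    S = sensitivity(u, theta)
    lam = 1e-3 if reg == "Active" else 0.0       # crude Tikhonov stand-in
    return np.linalg.solve(S.T @ S + lam ** 2 * np.eye(S.shape[1]), S.T @ y_meas)

def design(u, theta, reg):
    return np.append(u, u[-1] + 1.0)             # placeholder for an OED solver

theta_hat, u_final, history = calibration_cycle(
    experiment, sensitivity, estimate, design,
    u=np.array([0.0, 0.1]), theta=np.array([1.0, 1.0]), var_max=0.5)
print(theta_hat, u_final, history)
```

In the toy run the design is extended by one sampling point per cycle until the largest parameter variance falls below the threshold; a real implementation would replace `design` by the solution of the OED problem and `estimate` by the (regularized) PE problem.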

The algorithm is presented as follows:

Experiment: Before starting an experiment, an initial experimental design vector uIG has to be chosen. These experimental conditions are then implemented and the experimental data Y m are collected.

Modeling: The structure of the model m should have been previously selected and fixed. An initial parameter guess θIG should be defined and the model should be solved to get the corresponding model outputs Y (uIG, θIG) as well as the corresponding sensitivity matrix S(uIG, θIG). With this information the evaluation of estimator quality, ill-conditioning and identifiability starts.

1. Estimator performance assessment: The precision measures, namely the parameter variances obtained from the covariance matrix and the confidence intervals (both in Section 4.3.1), are computed.

2. Ill-conditioning analysis: The singular value analysis of the sensitivity matrix in Section 4.4.1 is conducted to detect structural problems. If the sensitivity matrix is ill-conditioned, the ill-conditioned singular values, the numerical rank and the class of ill-posedness are determined to be used in the identifiability diagnosis in 3. (when applicable).

3. Identifiability diagnosis: The unidentifiable parameters are determined (by using the guidelines in Section 4.4.2). The results from 2. (when applicable) are used here.

Precision reached?: The precision of the initial guess θIG (parameter variances calculated in 1.) is compared to a predefined variance threshold. Yes: The current parameter variances are below the threshold; the available parameter initial guess is then adequate for the model and the model calibration is finished. No: The current parameter initial guess is still uncertain and new estimates should be obtained.

Ill-posed problem?: The results from 2. about ill-posedness are here employed. Yes:

If the problem is ill-posed then the parameter estimation should be regularized, i.e., Reg=

Active. One of the regularization techniques explained in Section 3.3 according to the

kind of ill-posed problem should be applied. No: If the problem is not ill-posed then the

parameter estimation can be performed without regularization, i.e., Reg=None.

Parameter estimation: The solution of the PE problem in Eq. 2.5, either with or without regularization, is calculated. The parameter estimate θReg, the corresponding regularized sensitivity matrix SReg(uIG, θReg) in Table 3.1 and the original sensitivity matrix S(uIG, θReg)¹, both evaluated at uIG and θReg, are then obtained.

Once more the three evaluations (1. estimator performance assessment, 2. ill-conditioning analysis and 3. identifiability diagnosis) are carried out, now applied to the regularized and original sensitivity matrices SReg(uIG, θReg) and S(uIG, θReg), respectively.

Ill-posed problem?: The results from 2. about the ill-posedness of SReg(uIG, θReg)

are here employed. Yes: If the problem is ill-posed then the parameter estimation should

be regularized, i.e., Reg= Active. A regularization technique explained in Section 3.3 ac-

cording to the kind of ill-posed problem should be implemented. The parameter estimation

should be iteratively repeated with θIG = θReg till this condition is not fulfilled anymore.

No: If the problem is not ill-posed the evaluation of the parameter precision should be

performed.

Precision reached?: The precision of the estimate θReg is evaluated. To do so, the parameter variances calculated in 1. after the parameter estimation are compared to a predefined variance threshold. Yes: The current parameter variances are below the threshold; the available parameter estimates are then adequate for the model and the model calibration is finished. No: The current parameter estimates are still uncertain and a new experiment should be designed.

Ill-posed problem?: The results from 2. about the ill-posedness of S(uIG, θReg)

after parameter estimation are here employed. Yes: If the problem is still ill-posed then

the optimal experimental design should be maintained regularized, i.e., Reg= Active

according to the regularization chosen in the parameter estimation. No: If the problem

is not ill-posed the optimal experimental design is performed without regularization, i.e.,

Reg=None.

Optimal experimental design: The optimal design problem in Eq. 2.35 is solved for the corresponding regularized parameter covariance matrix or sensitivity matrix SReg(uIG, θReg) in Table 3.1. The optimal design uReg and the corresponding updated original and regularized sensitivity matrices S(uReg, θReg) and SReg(uReg, θReg), respectively, are then obtained. At this point the possibility of using several OED criteria should be exploited. The selection of the best optimal design should be based not only on the parameter variance reduction but also on ill-conditioning and identifiability improvements.

Updating of the input design and parameter vectors: In the sequential and online approaches for optimal experimental design in Sections 2.7.4 and 2.7.5 these vectors are repeatedly updated. Accordingly, the initial input design and parameter vectors are updated with the optimal design uReg and the parameter estimate θReg, respectively. The cycle continues until the desired precision is reached.

¹ This matrix is obtained without regularization.

4.2. Analysis paradigms

This section describes two paradigms to analyze estimated parameters. The first paradigm is based on a single parameter estimation and the inversion of the Fisher-information matrix to form the parameter covariance matrix C. The Fisher-information matrix is computed from parameter sensitivities (Section 4.3.1); therefore this approach is called the sensitivity method. The second paradigm, on the other hand, requires multiple parameter estimations and assesses the parameter uncertainty with data from finite samples. It needs repeated optimizations and is called the Monte Carlo method.

Both strategies follow the same structure of analysis: they solve the parameter estimation problem, analyze the ill-conditioning and identifiability issues, evaluate the performance of the estimator and (where applicable) conduct the optimal experimental design. Moreover, all computations (PE, OED, ill-conditioning and identifiability analysis) in the case studies analyzed in this thesis use normalized predicted response variables

$$ \tilde{y}_i(u, \theta, t_k) = \frac{y_i(u, \theta, t_k)}{\max_{t_k \in T} y_i^m(t_k)}, \tag{4.1} $$

with $i = 1, ..., N_y$ and for all $t_k \in T$, and normalized parameters

$$ \tilde{\theta}_j = \frac{\theta_j}{\theta_{j,IG}}, \tag{4.2} $$

with $j = 1, ..., N_\theta$. Hereinafter, for notational simplicity, $y_i \leftarrow \tilde{y}_i$ and $\theta_j \leftarrow \tilde{\theta}_j$. In this way the estimation problem is normalized and the computed sensitivity matrix is considered in its standard form. This matrix preserves all information of the system [20]. In a standardized sensitivity matrix² the influence of parameters and predicted response variables with different orders of magnitude, which may mask the ill-conditioning analysis and identifiability diagnosis, is reduced.
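By the chain rule, the normalization of Eqs. 4.1 and 4.2 rescales each raw sensitivity ∂y_i/∂θ_j by the factor θ_{j,IG} / max y_i^m. A minimal sketch of this scaling is given below; the array layout (time, response, parameter), the function name and the numbers are assumptions chosen for illustration.

```python
import numpy as np

def normalize_sensitivity(dY_dtheta, y_meas, theta_ig):
    """Scale raw sensitivities to the normalized variables of Eqs. 4.1 and 4.2.

    dY_dtheta : array (N_t, N_y, N_theta), raw sensitivities dy_i/dtheta_j at each t_k
    y_meas    : array (N_t, N_y), measured responses y_i^m(t_k)
    theta_ig  : array (N_theta,), parameter initial guess
    """
    y_max = y_meas.max(axis=0)                     # max over t_k for each response
    return dY_dtheta * theta_ig[None, None, :] / y_max[None, :, None]

# Illustrative data: one response, three sampling times, two parameters of very
# different magnitude.
dY_dtheta = np.ones((3, 1, 2))
y_meas = np.array([[1.0], [2.0], [4.0]])
theta_ig = np.array([10.0, 0.1])
S_norm = normalize_sensitivity(dY_dtheta, y_meas, theta_ig)
print(S_norm[:, 0, :])
```

Stacking the first two axes of the result gives the normalized sensitivity matrix with one row per response and sampling time.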

² In linear estimation one recommended column equilibration is to scale the columns of the sensitivity matrix to unit length [11].



4.2.1. Sensitivity method

This is the most common strategy to estimate parameters in nonlinear models. The inputs

of this paradigm are a fixed model structure, experimental data (Y m) collected under

the experiment design (u), and the initial guess of parameters (θIG). The parameter or

point estimate θ is computed by solving the parameter estimation problem in Eq. 2.5.

The sensitivity matrix S (obtained either by using finite differences in Eq. 2.14 or the

sensitivity equations 2.15), the Fisher-information matrix F (Eq. 2.10) and the parameter

covariance matrix (Eq. 4.3) are then obtained.

The sensitivity-based method is straightforward to implement and provides fast local information on the model behavior around the parameter estimates. Sensitivity information can also be used in conjunction with singular value analysis to perform a structural model analysis that detects ill-conditioning and diagnoses identifiability problems. Because of its inherently local nature, however, the covariance matrix obtained from sensitivities might not provide a good covariance approximation (especially when S is ill-conditioned). Consequently, this method is most useful for the qualitative analysis of parameter uncertainty, of the presence of ill-conditioning and of (at least practical) identifiability problems.
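The chain from model to covariance matrix in the sensitivity method can be sketched as follows. The sketch assumes forward finite differences with a relative step, unit measurement variances (no weighting) and a hypothetical first-order decay model; none of these choices is prescribed by the thesis, whose Eqs. 2.14 and 2.15 are not reproduced here.

```python
import numpy as np

def fd_sensitivity(model, u, theta, rel_step=1e-6):
    """Forward-difference approximation of S = dY/dtheta (one column per parameter)."""
    theta = np.asarray(theta, dtype=float)
    y0 = model(u, theta)
    S = np.empty((y0.size, theta.size))
    for j in range(theta.size):
        h = rel_step * max(abs(theta[j]), 1.0)     # step scaled to the parameter size
        shifted = theta.copy()
        shifted[j] += h
        S[:, j] = (model(u, shifted) - y0) / h
    return S

def model(u, theta):
    """Hypothetical model: first-order decay y = theta_0 * exp(-theta_1 * u)."""
    return theta[0] * np.exp(-theta[1] * u)

u = np.linspace(0.0, 2.0, 9)                       # sampling points (illustrative)
theta_hat = np.array([2.0, 0.8])                   # assumed point estimate
S = fd_sensitivity(model, u, theta_hat)
F = S.T @ S                                        # Fisher-information matrix
C = np.linalg.inv(F)                               # covariance approximation of Eq. 4.3
std = np.sqrt(np.diag(C))
print(std)
```

Since S is evaluated at a single point estimate, the resulting covariance is a local approximation, in line with the limitation stated above.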

The techniques applied when the sensitivity method is used to determine model parameters are summarized in Figure 4.2. Notice that precision (see Section 4.3.1) but not accuracy is considered, that the ill-conditioning is analyzed by one method (see Section 4.4.1), and that the identifiability diagnosis may be performed by three different methods, namely the variance, SVD and QR methods (see Section 4.4.2).

4.2.2. Monte Carlo method

This method is a simple and useful but computer-intensive technique to explore the unknown form of the parameter probability distribution [89] and to reach conclusions about ill-conditioning and identifiability [11, 80]. The idea behind this method is to propagate random measurement errors in the experimental data through the parameter estimation and to recover the corresponding parameters, which will have a specific probability distribution. The measurement errors are sampled several times from a known probability distribution such as a normal distribution. The various experimental data sets are considered virtual replications of the same experiment.

Two versions of this paradigm can be implemented, depending on the available experimental information. In both versions the results are employed to compute the expected value E[Θ] in Eq. 4.5, the precision measures (i.e., the approximate covariance matrix C in Eq. 4.4 and the confidence intervals in Eq. 2.28) and, when applicable, the accuracy measures (i.e., the bias in Eq. 2.29 and the empirical averaged MSE in Eq. 4.7). Both versions are based on a fixed structure m, a fixed experiment design u and a fixed parameter initial guess θIG (see Figure 4.3).

[Figure 4.2: scheme with the blocks Experiment (initial guess, experimental data), Modeling (model, initial guess θ, model outputs), Parameter estimation (sensitivity matrix at the parameter initial guess and at the estimated parameter) and the sensitivity-method evaluations 1 Estimator performance assessment (precision: covariance matrix, confidence interval; reliability tests: hypothesis test), 2 Ill-conditioning analysis and 3 Identifiability analysis (variance, SVD and QR methods).]

Figure 4.2.: Model parameters estimated and analyzed based on the sensitivity method

The first version uses the available experimental data vector $Y^m$ and the measurement error covariance matrix $C_y$ to generate synthetic data. In this setting, $N_{MC}$ data sets denoted as $Y^{m(j)}$, $j = 1, ..., N_{MC}$, are obtained from the available observations $Y^m$ by sampling $\mathcal{N}(Y^m, C_y)$. For each synthetic data set $Y^{m(j)} \sim \mathcal{N}(Y^m, C_y)$, the problem in Eq. 2.5 is solved to obtain the estimates $\theta^{(1)}, \cdots, \theta^{(N_{MC})}$.

In the second version, the $N_{MC}$ data sets are sampled from the normal distribution with mean $Y(u, \theta^*)$ and covariance $C_y$, i.e., $Y^{m(j)} \sim \mathcal{N}(Y(u, \theta^*), C_y)$. The true parameter vector (or reference vector) $\theta^*$ is used to obtain the model responses $Y(u, \theta^*)$ at the fixed design u. The estimation in Eq. 2.5 is then performed for each synthetic data set $Y^{m(j)}$ and the j-th point estimate $\theta^{(j)}$ is recovered.

The replications can also be interpreted as perturbations of the data that help to assess the stability of the estimates. When the true parameter θ∗ is known (as in theoretical studies), the accuracy of the estimator can also be assessed by computing the bias and the empirical mean squared error of the estimate. The Monte Carlo approach enables a more quantitative analysis, but it is computationally demanding. It is a purely numerical procedure which allows the parameter uncertainties to be quantified more precisely [100]. Moreover, it is a more generally applicable computational alternative to diagnose ill-conditioning [11] and analyze identifiability problems [125].
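Both versions of the Monte Carlo paradigm differ only in the mean used for sampling, which the following sketch makes explicit. It assumes a linear model Y = Xθ so that every estimation reduces to an ordinary least-squares solve; the model, noise level, sample size and function name are illustrative assumptions, and the bias and mean squared error are computed with their standard definitions rather than with the exact expressions of Eqs. 2.29 and 4.7.

```python
import numpy as np

def monte_carlo_estimates(X, y_center, C_y, n_mc, rng):
    """Sample Y^m(j) ~ N(y_center, C_y) and re-estimate the parameters for each sample."""
    samples = rng.multivariate_normal(y_center, C_y, size=n_mc)   # rows are Y^m(j)
    # Linear model: all least-squares problems are solved in one call.
    theta_mc, *_ = np.linalg.lstsq(X, samples.T, rcond=None)
    return theta_mc.T                                             # shape (n_mc, N_theta)

rng = np.random.default_rng(0)
u = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(u), u])       # hypothetical model y = theta_0 + theta_1 * u
theta_star = np.array([1.0, 2.0])               # reference (true) parameter vector
C_y = 0.1 ** 2 * np.eye(u.size)                 # assumed measurement error covariance

# Second version: sample around the model response at the reference parameters.
# For the first version, the measured data vector would be passed as y_center instead.
theta_mc = monte_carlo_estimates(X, X @ theta_star, C_y, n_mc=2000, rng=rng)

expected = theta_mc.mean(axis=0)                # sample mean of the estimates
bias = expected - theta_star                    # accuracy: bias
mse = np.mean((theta_mc - theta_star) ** 2, axis=0)   # accuracy: empirical MSE
print(expected, bias, mse)
```

For a nonlinear model the least-squares call would be replaced by one optimization run per synthetic data set, which is the source of the computational cost mentioned above.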

The techniques applied when the Monte Carlo method is used to determine model parameters are summarized in Figure 4.3. Notice that precision (see Section 4.3.1) and accuracy (see Section 4.3.2) are considered, that the ill-conditioning is analyzed by only one method (see Section 4.4.1), and that the identifiability diagnosis may be performed only by the variance method (see Section 4.4.2).

[Figure 4.3: scheme with the blocks Experiment (initial guess, experimental data), Modeling (model, initial guess θ, model outputs), the sampling of the synthetic data sets for j = 1, ..., NMC, Parameter estimation (parameter estimates θ(j)) and the Monte Carlo evaluations 1 Estimator performance assessment (precision: covariance matrix, confidence interval; accuracy: bias, empirical MSE; reliability tests: hypothesis test), 2 Ill-conditioning analysis and 3 Identifiability analysis (variance method).]

Figure 4.3.: Model parameters estimated and analyzed based on the Monte Carlo method

4.3. Estimator performance assessment

Precision and accuracy are the properties used in this framework to assess the performance of an estimator. The theory of these statistical properties is summarized in Sections 2.5.1 and 2.5.2 in Chapter 2. In the following, the computation of the precision and accuracy measures corresponding to the sensitivity and Monte Carlo methods is given. Notice that the accuracy computation in terms of bias and MSE is only possible when a reference vector is available.

4.3.1. Estimator precision

The parameter variance σ²θj, the standard deviation σθj and the confidence interval of the j-th parameter are calculated. Two strategies to compute the covariance matrix are described here. The first strategy computes the so-called average estimated covariance matrix based on sensitivity information. The second strategy calculates the so-called true covariance matrix based on Monte Carlo simulations.

Covariance based on the Sensitivity Matrix

The matrix C is approximated by the inverse of the Fisher-information matrix in Eq. 2.13, which is denoted as F. This matrix is known as the average estimated covariance matrix [3],

\[
C \approx F^{-1} = \left[ S^{T} S \right]^{-1}. \tag{4.3}
\]

The Fisher-information matrix F is guaranteed to be positive definite only when the sensitivity matrix S has full (column) rank [101]. The relationship between the Fisher matrix and the covariance matrix is derived by applying a first-order Taylor expansion of the maximum likelihood function around θ, assuming a Gauss-Newton approximation of its Hessian and using the Cramer-Rao bound [3, 38, 74]. Although this relationship holds only approximately for nonlinear models, it is widely used [3, 38].
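As an illustration of Eq. 4.3, the following minimal sketch computes the covariance approximation from a given sensitivity matrix. It assumes that S has already been scaled as required by Eq. 2.13 (which is not reproduced here); the function name and the small demo matrix are hypothetical.

```python
import numpy as np

def covariance_from_sensitivity(S):
    """Approximate C by the inverse of F = S^T S (Eq. 4.3)."""
    S = np.asarray(S, dtype=float)
    # F is only invertible when S has full column rank.
    if np.linalg.matrix_rank(S) < S.shape[1]:
        raise np.linalg.LinAlgError("S is rank-deficient; F is singular")
    F = S.T @ S                      # Fisher-information matrix
    return np.linalg.inv(F)

# Hypothetical 4 x 2 sensitivity matrix (4 measurements, 2 parameters)
S_demo = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 0.0], [0.0, 2.0]])
C_demo = covariance_from_sensitivity(S_demo)
sigma_demo = np.sqrt(np.diag(C_demo))  # parameter standard deviations
```

For strongly ill-conditioned S the explicit inverse is numerically fragile, which is one motivation for the singular value analysis in Section 4.4.1.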

Covariance based on Monte Carlo

In this approach the so-called true parameter covariance matrix C is estimated as the sample covariance of the NMC parameter point estimates θ(j), j = 1, · · · , NMC, obtained from the Monte Carlo method (see Section 4.2.2),

\[
C = \frac{1}{N_{MC} - 1} \sum_{j=1}^{N_{MC}} \left( \theta^{(j)} - E[\Theta] \right) \left( \theta^{(j)} - E[\Theta] \right)^{T} \tag{4.4}
\]

where E[Θ] is approximated using the sample mean

\[
E[\Theta] \approx \frac{1}{N_{MC}} \sum_{j=1}^{N_{MC}} \theta^{(j)}. \tag{4.5}
\]

This statistical approach can better capture the effect of nonlinearities on the posterior distribution of the parameters, but it provides little insight into the specific sources of identifiability issues.
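Eqs. 4.4 and 4.5 can be sketched in a few lines, assuming the Monte Carlo point estimates are stored row-wise in an array of shape (NMC, Nθ). The function name and the sample values are hypothetical.

```python
import numpy as np

def monte_carlo_covariance(theta_samples):
    """Sample mean (Eq. 4.5) and sample covariance (Eq. 4.4)."""
    theta_samples = np.asarray(theta_samples, dtype=float)  # (N_MC, N_theta)
    n_mc = theta_samples.shape[0]
    mean = theta_samples.mean(axis=0)          # Eq. 4.5
    dev = theta_samples - mean                 # deviations from the mean
    cov = dev.T @ dev / (n_mc - 1)             # Eq. 4.4
    return mean, cov

# Hypothetical estimates from four Monte Carlo replications
theta_mc = np.array([[1.0, 2.0], [1.2, 1.9], [0.8, 2.1], [1.0, 2.0]])
mean_mc, C_mc = monte_carlo_covariance(theta_mc)
```

The same result is returned by `np.cov(theta_mc, rowvar=False)`, which uses the same (NMC − 1) normalization by default.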

Confidence intervals

The computation of the confidence intervals according to Section 2.5.1 is based on the estimator Θj and the standard deviation σθj of the j-th parameter, the significance level α (confidence level 1 − α) and the degrees of freedom. In the sensitivity method the j-th estimator Θj is the point estimate θj, while in the Monte Carlo method the j-th element of the sample mean E[Θ] in Eq. 4.5 is used. σθj is extracted from the covariance matrix in Eq. 4.3 or Eq. 4.4, depending on the selected method. α is typically 0.05.

As a further metric, the confidence interval length

\[
CIL(j) = UB(j) - LB(j) \tag{4.6}
\]

is defined, where UB(j) and LB(j) are the upper and lower bounds of the confidence interval of the j-th parameter. It is conveniently expressed relative to the estimated value as 100 · CIL(j)/θj.
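A minimal sketch of the interval length in Eq. 4.6 follows. It assumes a symmetric interval of the form θj ± crit · σθj, where the critical value `crit` (for instance a Student-t quantile for the chosen α and degrees of freedom) is supplied by the user; the exact expression of Section 2.5.1 is not reproduced here, so this form is an assumption.

```python
def confidence_interval(theta_j, sigma_j, crit):
    """Bounds, length (Eq. 4.6) and relative length in percent."""
    half_width = crit * sigma_j          # assumed symmetric interval
    lb = theta_j - half_width            # LB(j)
    ub = theta_j + half_width            # UB(j)
    cil = ub - lb                        # CIL(j) = UB(j) - LB(j)
    rel = 100.0 * cil / theta_j          # length relative to the estimate
    return lb, ub, cil, rel

# Hypothetical values; 1.96 is the large-sample critical value for alpha = 0.05
lb, ub, cil, rel = confidence_interval(2.0, 0.1, 1.96)
```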

4.3.2. Estimator accuracy

The evaluation of the accuracy requires a reference parameter vector (preferably the true parameter vector, when available) to compute the bias in Eq. 2.29 and the MSE in Eq. 2.30. In the sensitivity method the expected value E[Θ] is taken to be the point estimate θ in Eq. 2.5a, whilst in the Monte Carlo method the empirical sample mean in Eq. 4.5 is required.

Although there are well-known analytical derivations of the MSE for linear problems, in nonlinear cases this expression can only be estimated numerically [3, 93]. To do so, a series of Monte Carlo replications can be performed to obtain the empirical averaged MSE as follows,

\[
\mathrm{MSE} \approx \frac{1}{N_{MC}} \sum_{j=1}^{N_{MC}} \left\lVert \theta^{(j)} - \theta^{*} \right\rVert^{2}. \tag{4.7}
\]
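The empirical accuracy measures can be sketched as follows. The MSE follows Eq. 4.7; the bias is assumed here to be the difference between the sample mean and the reference vector, since Eq. 2.29 is not reproduced in this chapter. Names and sample values are hypothetical.

```python
import numpy as np

def bias_and_mse(theta_samples, theta_ref):
    """Empirical bias (assumed form) and averaged MSE (Eq. 4.7)."""
    theta_samples = np.asarray(theta_samples, dtype=float)  # (N_MC, N_theta)
    theta_ref = np.asarray(theta_ref, dtype=float)
    bias = theta_samples.mean(axis=0) - theta_ref
    # Mean over replications of the squared Euclidean distance to theta_ref
    mse = np.mean(np.sum((theta_samples - theta_ref) ** 2, axis=1))
    return bias, mse

# Hypothetical estimates from two replications and a reference vector
bias_demo, mse_demo = bias_and_mse([[1.1, 2.0], [0.9, 2.2]], [1.0, 2.0])
```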

4.4. Structural Analysis

The previous section evaluates estimator performance by computing parameter variances, confidence intervals, biases and MSEs. These computations by themselves, however, do not detect the ill-conditioning that results from structural deficiencies of the parameter covariance and Fisher matrices [1, 31, 41, 70, 120, 124]. Therefore, guidelines to evaluate this ill-conditioning and to detect identifiability problems are presented here. The guidelines to detect ill-conditioning are given separately for the sensitivity method and for the Monte Carlo method in the two parts of Section 4.4.1. All identifiability methods in Section 4.4.2 are suitable for the sensitivity method, but only the variance method also applies to the Monte Carlo approach. Figures 4.4 and 4.5 display the ill-conditioning strategy and all identifiability techniques, respectively, based on the sensitivity method explained in Section 4.2.1.

4.4.1. Ill-Conditioning Analysis

Sensitivity method

The guideline outlined in Publication III [78] to analyze ill-conditioning is followed here. It is a singular value analysis summarized as follows:

1. Perform the SVD of S (see Section 2.4.1) to obtain the singular values SVs = ς1, · · · , ςNθ.

2. Compute the condition number κ(S) (Eq. A.14) and the collinearity index γ(S) (Eq. A.16).

3. Check if κ(S) and γ(S) satisfy κ(S) ≤ κmax or γ(S) ≤ γmax. If false, S is said to be ill-conditioned and the problem is ill-posed.

4. Analyze the kind of ill-conditioning of S according to Figures 3.2 and 3.3, i.e., rank-deficiency or ill-determined rank class.

5. Define the ϵ-threshold (the lower bound on the SVs) according to Eq. 3.3.

6. Cut the singular value spectrum SVs at the ϵ-threshold. Singular values above the ϵ-threshold are said to be well-conditioned and those below are said to be ill-conditioned. The number of well-conditioned singular values gives the numerical rank rϵ of the sensitivity matrix, which ideally should be Nθ.

[Figure 4.4: flowchart. Sensitivity matrix S → obtain singular values by SVD → compute κ(S) and γ(S) → is S well-conditioned? YES: well-posed problem, identifiable. NO: classify ill-posed problem → set ϵ-threshold of SVs → cut the SVs at the ϵ-threshold and determine the numerical rank rϵ → ill-posed problem → identifiability diagnosis.]

Figure 4.4.: Ill-conditioning analysis based on the sensitivity method.
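The core of the singular value analysis above (steps 1, 2 and 6) can be sketched as follows. The condition number is taken here as the usual ratio of the largest to the smallest singular value, which is assumed to match Eq. A.14; the collinearity index of Eq. A.16 and the threshold rule of Eq. 3.3 are not reproduced, so the ϵ-threshold is passed in by the user. Names and demo values are hypothetical.

```python
import numpy as np

def singular_value_analysis(S, eps_threshold):
    """Singular values, condition number and numerical rank of S."""
    sv = np.linalg.svd(np.asarray(S, dtype=float), compute_uv=False)
    # np.linalg.svd returns the singular values in descending order.
    kappa = sv[0] / sv[-1] if sv[-1] > 0 else np.inf
    # Numerical rank: number of singular values above the threshold.
    r_eps = int(np.sum(sv > eps_threshold))
    return sv, kappa, r_eps

# Hypothetical 3 x 2 sensitivity matrix with one tiny singular value
S_sv = np.array([[3.0, 0.0], [0.0, 1e-6], [0.0, 0.0]])
sv_demo, kappa_demo, r_eps_demo = singular_value_analysis(S_sv, 1e-3)
```

Here the numerical rank is 1 although Nθ = 2, so the second singular value would be classified as ill-conditioned.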

Monte Carlo method

Like all Monte Carlo studies, this straightforward alternative is computationally intensive. The basic idea is to select "reasonable" perturbations for creating many fictitious experimental data sets Y m(j) by using prior information on the measurement errors (ϵ(j)), to estimate the parameter vector (θ(j)) corresponding to each fictitious data set, and to check whether the estimator remains stable [11, 100, 125]. Small parameter variability indicates a stable estimator. Otherwise, the estimator is unstable and the problem is ill-conditioned. Reasonable a priori limits for this variability should be set.

[Figure 4.5: flowchart. Starting from the sensitivity matrix S, the parameter vector θ and the ill-conditioning analysis, three branches lead to the unidentifiable parameters. QR method: compute Π by QRP (SΠ = QR), reorder the parameter vector with Π, cut at rϵ, select the last (Nθ − rϵ) elements. SVD method: calculate the parameter variances from the SVD, compute the variance-decomposition proportions πi,j, define the maximum variance proportion threshold πmax, select the parameters most influenced by ill-conditioning. Variance method: calculate the parameter variances from the Fisher information matrix, define the variance threshold ρ, select the parameters with variance larger than ρ.]

Figure 4.5.: Identifiability diagnosis techniques based on the sensitivity method.
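The Monte Carlo stability check can be illustrated with a deliberately simple stand-in problem. The sketch below assumes a linear model y = Sθ with additive Gaussian noise and uses ordinary least squares in place of the nonlinear estimation in Eq. 2.5; both the model and all names are assumptions made for illustration only.

```python
import numpy as np

def monte_carlo_estimates(S, theta_ref, noise_std, n_mc, seed=0):
    """Re-estimate theta from n_mc noisy synthetic data sets."""
    rng = np.random.default_rng(seed)
    y_ref = S @ theta_ref                       # noise-free model response
    estimates = np.empty((n_mc, S.shape[1]))
    for j in range(n_mc):
        y_m = y_ref + rng.normal(0.0, noise_std, size=y_ref.shape)
        estimates[j] = np.linalg.lstsq(S, y_m, rcond=None)[0]
    return estimates

t = np.linspace(0.0, 1.0, 20)
S_good = np.column_stack([np.ones_like(t), t])      # independent columns
S_bad = np.column_stack([t, t + 1e-4 * t**2])       # nearly collinear columns
theta_ref = np.array([1.0, 2.0])

# Relative spread of the estimates for each design
spread_good = monte_carlo_estimates(S_good, theta_ref, 0.01, 200).std(axis=0, ddof=1) / np.abs(theta_ref)
spread_bad = monte_carlo_estimates(S_bad, theta_ref, 0.01, 200).std(axis=0, ddof=1) / np.abs(theta_ref)
```

With the nearly collinear design the relative spread of the estimates is orders of magnitude larger than with the well-conditioned one, which is the instability the procedure is meant to reveal.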

4.4.2. Identifiability diagnosis

In this section, techniques to detect unidentifiable parameters based on parameter variances and orthogonal decompositions of the sensitivity matrix are discussed, namely the variance, SVD and QR methods. The last technique is also called parameter subset selection or orthogonalization method. Each technique constructs an identifiability ranking based on a different metric.

Variance Method

Unidentifiable parameters may be defined as those with variances outside some given range [125], following this simple procedure:

1. Compute C either from the sensitivity method as shown in Eq. 4.3 or Eq. 3.5, or from the Monte Carlo method as shown in Eq. 4.4.

2. Take the diagonal elements of C as the variances σ²θj of each parameter, j = 1, · · · , Nθ.

3. Rank the parameters of θ in ascending order of their variances.

4. Define the variance threshold ρ.

5. Select the unidentifiable parameters as those with a parameter variance larger than the variance threshold (i.e., σ²θj > ρ).
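The variance method reduces to a few array operations. In the sketch below the covariance matrix and the threshold ρ are hypothetical inputs, and parameters are identified by their zero-based index.

```python
import numpy as np

def variance_method(C, rho):
    """Rank parameters by variance and flag those above the threshold rho."""
    variances = np.diag(np.asarray(C, dtype=float))   # step 2
    ranking = np.argsort(variances)                   # step 3, ascending
    unidentifiable = [int(j) for j in ranking if variances[j] > rho]  # step 5
    return variances, ranking, unidentifiable

# Hypothetical covariance matrix of three parameters and threshold
C_var = np.diag([0.01, 5.0, 0.2])
variances_demo, ranking_demo, unident_var = variance_method(C_var, rho=1.0)
```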

SVD Method

As seen in the singular value analysis of Section 4.4.1 and in the variance-decomposition in Section 3.2.2, small (ill-conditioned) singular values inflate parameter variances. The following procedure relies on a parameter variance-decomposition that quantifies the contribution of each singular value to the variance of each parameter (see Section 3.2.2). The rationale behind this ranking is that ill-conditioned singular values contribute strongly to parameters with large variance. In other words, a strong influence of ill-conditioned singular values on a parameter is evidence of identifiability issues. The procedure reads:

1. Compute the parameter variances σ²θj for j = 1, · · · , Nθ according to the SVD in Eq. 3.5. Perform the procedure of Section 4.4.1 to obtain the numerical rank rϵ and the ill-conditioned singular values ςi, i = rϵ + 1, · · · , Nθ.

2. Compute the variance-decomposition proportions πi,j, i = 1, · · · , Nθ, for each j-th parameter by using Eq. 3.7 in Section 3.2.2.

3. Rank the parameters of θ in ascending order of the sum of their proportions \sum_{i=r_\epsilon+1}^{N_\theta} \pi_{i,j}. Parameters at the top of the list are not strongly influenced by ill-conditioned singular values.

4. Define the maximum proportion threshold πmax (typically set to 0.5).

5. Select the unidentifiable parameters as those satisfying \sum_{i=r_\epsilon+1}^{N_\theta} \pi_{i,j} \geq \pi_{max}, for j = 1, · · · , Nθ. If rϵ = Nθ then all parameters are identifiable.
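The SVD method can be sketched as below. Eq. 3.7 is not reproduced in this chapter, so the proportions are assumed to take the classical variance-decomposition form πi,j = (v²j,i/ς²i) / Σk (v²j,k/ς²k), with vj,i the j-th component of the i-th right singular vector; all singular values are assumed to be nonzero. Names and the demo matrix are hypothetical.

```python
import numpy as np

def svd_method(S, eps_threshold, pi_max=0.5):
    """Variance-decomposition proportions and flagged parameters."""
    _, sv, Vt = np.linalg.svd(np.asarray(S, dtype=float), full_matrices=False)
    r_eps = int(np.sum(sv > eps_threshold))      # numerical rank
    # phi[i, j]: contribution of singular value i to the variance of parameter j
    phi = (Vt ** 2) / (sv ** 2)[:, None]
    pi = phi / phi.sum(axis=0)                   # each column sums to 1
    # Share of each parameter's variance due to ill-conditioned singular values
    ill_share = pi[r_eps:, :].sum(axis=0)
    ranking = np.argsort(ill_share)              # step 3, ascending
    unidentifiable = [int(j) for j in np.flatnonzero(ill_share >= pi_max)]
    return pi, ill_share, ranking, unidentifiable

# Hypothetical matrix: columns 0 and 2 are nearly identical, column 1 is independent
S_vd = np.array([[1.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0 + 1e-6],
                 [0.0, 1.0, 0.0]])
pi_demo, share_demo, ranking_vd, unident_vd = svd_method(S_vd, eps_threshold=1e-3)
```

In this example the two nearly collinear parameters (indices 0 and 2) are flagged, while parameter 1 is ranked first as unaffected by the ill-conditioned singular value.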

QR Method

In this work the algorithm for local identifiability analysis presented in Refs. 45, 122, 123 is used. The original algorithm is based on orthogonal projections of the Hessian matrix (see Eq. 2.10) and is one of the most widely used among several approaches for parameter subset selection [70]. It was recently modified and applied in Ref. [79] using the sensitivity matrix.

This method selects a subset of linearly independent parameters based on orthogonal projections of the columns of the sensitivity matrix [70, 79, 78]. It is also called parameter subset selection or orthogonalization method and is a popular way of determining unidentifiable parameters. Other studies dedicated to parameter subset selection can be found in Refs. 17, 19, 29, 30, 45, 79, 87, 123, 129, 134. The rationale behind this ranking is that the unidentifiable parameters are expected to have the largest variances in the whole parameter vector, and that they can be detected by identifying the linearly dependent columns of the sensitivity matrix.

The algorithm involves the calculation of the numerical rank rϵ of S (see Section 4.4.1), which is the dimension of the identifiable parameter subset. In addition, the analyzed parameter vector θ is reordered according to the linear independence of the columns of S by applying the QRP decomposition (via Householder transformations). This decomposition (see Definition A.5.10 of Appendix A.5) can be written as:

\[
S \Pi = Q R, \tag{4.8}
\]

where Q ∈ R^{(Ny·Nm·Ne)×(Ny·Nm·Ne)} is an orthogonal matrix, R ∈ R^{(Ny·Nm·Ne)×Nθ} is an upper-triangular matrix with decreasing diagonal elements and Π ∈ R^{Nθ×Nθ} is a permutation matrix. Π orders the columns of S according to their linear independence. This means that the first columns of SΠ are the largest independent set of columns of S. The reordered parameter vector, denoted here by \tilde{\theta}, reads

\[
\tilde{\theta} = \Pi^{T} \theta. \tag{4.9}
\]

The result is \tilde{\theta} = \left( (\tilde{\theta}^{(r_\epsilon)})^{T}, (\tilde{\theta}^{(N_\theta - r_\epsilon)})^{T} \right)^{T}. Here, \tilde{\theta}^{(r_\epsilon)} contains the first rϵ elements of \tilde{\theta}, which are the identifiable parameters, and \tilde{\theta}^{(N_\theta - r_\epsilon)} contains the last Nθ − rϵ elements of \tilde{\theta}, which are not identifiable.

The summarized procedure is:

1. Perform the procedure of Section 4.4.1 to obtain the numerical rank rϵ. The dimension of the identifiable parameter subset is rϵ.

2. Compute the permutation matrix Π from the QRP decomposition of S according to Eq. 4.8.

3. Build the identifiability ranking as the reordered parameter vector Π^T θ.

4. Select the unidentifiable parameters as the last (Nθ − rϵ) entries of the reordered vector.
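The column ordering produced by the permutation in Eq. 4.8 can be illustrated without a dedicated QRP routine. The sketch below swaps in a greedy Gram-Schmidt orthogonalization with column pivoting instead of the Householder-based QRP named in the text; it is meant to show the selection logic only, and all names are hypothetical.

```python
import numpy as np

def pivoted_column_order(S):
    """Greedy pivoting: repeatedly pick the column with the largest norm
    after projecting out the columns that were already selected."""
    R = np.array(S, dtype=float)          # working copy of residual columns
    remaining = list(range(R.shape[1]))
    order = []
    while remaining:
        norms = [np.linalg.norm(R[:, j]) for j in remaining]
        k = remaining[int(np.argmax(norms))]
        order.append(k)
        remaining.remove(k)
        nk = np.linalg.norm(R[:, k])
        if nk > 0:
            q = R[:, k] / nk
            for j in remaining:           # remove the selected direction
                R[:, j] -= (q @ R[:, j]) * q
    return order

def qr_method(S, r_eps):
    """Split the reordered parameter indices at the numerical rank."""
    order = pivoted_column_order(S)       # order[i] is the index of the i-th reordered parameter
    return order[:r_eps], order[r_eps:]

# Hypothetical matrix: columns 0 and 2 are nearly identical
S_qr = np.array([[1.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0 + 1e-6],
                 [0.0, 1.0, 0.0]])
identifiable_qr, unidentifiable_qr = qr_method(S_qr, r_eps=2)
```

Of the two nearly collinear columns, only the one with the slightly larger norm is kept in the identifiable subset; the other ends up last in the ranking.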


4.5. Regularization

When the problem is diagnosed as ill-posed due to the ill-conditioning of the corresponding sensitivity matrix (see Section 4.4.1), a treatment is required. Various possibilities are available, for instance conducting more experiments, adding new experimental information, restructuring the model or treating the deficiency numerically. This last, numerical treatment is regularization. In the following, short guidelines to implement three regularization techniques in parameter estimation and optimal experimental design are given.

4.5.1. Regularization in parameter estimation

In this section a general guideline to apply regularization to the parameter estimation is formulated, which is suitable for the sensitivity method.

1. Choose a convenient regularization according to the kind of ill-posed problem (i.e., rank-deficient or ill-determined rank) obtained in Section 4.4.1. Rank-deficient problems are handled well by either SsS or TSVD, whilst problems with ill-determined rank are handled well by Tikhonov regularization. Nevertheless, any of the techniques may be applied to either class.

2. Tune the corresponding regularization parameter following the recommendations in Section 4.5.3, i.e., the ϵ-threshold for SsS or TSVD, and λ for Tikhonov.

3. Define the regularized cost function depending on the chosen regularization. If SsS is applied, the cost function in Eq. 3.8 is selected and the unidentifiable parameters determined by the QR method (see Section 4.4.2) are required. If TSVD is applied, the cost function in Eq. 3.9 is selected. If Tikhonov is applied, the cost function in Eq. 3.10 is selected and a priori information on the parameters, expressed as the variance σθ and reliable nominal values θR, is required.

4. Calculate the regularized sensitivity matrix SReg and the corresponding Jacobian based on SReg for each regularization according to Table 3.1. If TSVD is applied, the SVD of S and its numerical rank rϵ determined in the ill-conditioning analysis (see Section 4.4.1) are required to build the respective regularized sensitivity matrix. The residual Z is needed to compute the Jacobian in all regularizations.

5. Compute the solution of the corresponding regularized optimization problem in Eqs. 3.8, 3.9 and 3.10. The Jacobian corresponding to each regularization should be supplied to the algorithm.
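To make the effect of two of these regularizations concrete, the sketch below applies TSVD and Tikhonov regularization to a linear least-squares problem Sθ ≈ z. This is a generic textbook illustration under the assumption of a linear model; it is not the regularized nonlinear problems of Eqs. 3.9 and 3.10, whose exact cost functions are not reproduced here, and all names are hypothetical.

```python
import numpy as np

def tsvd_solve(S, z, r_eps):
    """Least-squares solution using only the r_eps largest singular values."""
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    coeff = (U[:, :r_eps].T @ z) / sv[:r_eps]
    return Vt[:r_eps].T @ coeff

def tikhonov_solve(S, z, lam, theta_nominal=None):
    """Minimize ||S theta - z||^2 + lam^2 ||theta - theta_nominal||^2."""
    n = S.shape[1]
    if theta_nominal is None:
        theta_nominal = np.zeros(n)
    A = S.T @ S + lam ** 2 * np.eye(n)
    b = S.T @ z + lam ** 2 * np.asarray(theta_nominal, dtype=float)
    return np.linalg.solve(A, b)

# Hypothetical well-conditioned problem used as a sanity check
S_reg = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
z_reg = np.array([1.0, 2.0, 3.0])
theta_ls = np.linalg.lstsq(S_reg, z_reg, rcond=None)[0]
theta_tsvd_full = tsvd_solve(S_reg, z_reg, r_eps=2)   # no truncation
theta_tik_zero = tikhonov_solve(S_reg, z_reg, lam=0.0)  # no penalty
theta_tik = tikhonov_solve(S_reg, z_reg, lam=1.0)
```

Without truncation (r_eps = Nθ) or penalty (lam = 0) both functions reduce to ordinary least squares; a positive lam shrinks the solution towards the nominal values.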



4.5.2. Regularization in optimal experimental design

In this optimization, the regularized sensitivity matrix SReg or the regularized covariance matrix CReg from Table 3.1, computed at the parameter estimate θReg, is used. The criteria in Eqs. 2.39, 2.40 and 2.41 are then evaluated and the optimal experimental design problem in Eq. 2.35 is solved.

4.5.3. Selection of the regularization parameter

This section outlines a simple procedure to select regularization parameters suitable for the SsS, TSVD and Tikhonov regularizations. The idea is to explore the effect of a discrete set of regularization parameters on fitting, parameter precision, parameter accuracy (when applicable) and ill-conditioning. The analyzed regularization parameter range is determined by the span of the singular values of the sensitivity matrix. The procedure reads as follows:

• Determine the regularization parameter range in which to conduct the search. To do so, the maximum and minimum singular values of the sensitivity matrix, i.e., ς1 and ςNθ, respectively, are used. They are the limits for the regularization parameters ϵ for SsS and TSVD, and λ for Tikhonov. Note that regularization parameters smaller than ςNθ do not generate a regularization effect, whereas values larger than ς1 over-regularize the problem and introduce more bias.

• Sample some values from this range. The MBLHD in Section 2.8.1 would be an appropriate sampling algorithm.

• Solve the corresponding regularized PE problem, i.e., Eqs. 3.8, 3.9 and 3.10, for each selected regularization parameter.

• Take the regularization parameters with the best simultaneous performance regarding fitting, precision, ill-conditioning and, when applicable, accuracy. Regularization parameters which yield an adequate model fit, a small parameter variance, an improvement in ill-conditioning and a small bias (when applicable) are selected as suitable.
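The search over regularization parameters can be sketched for the Tikhonov case on a linear stand-in problem Sθ ≈ z. The sketch replaces the MBLHD sampling by a simple logarithmically spaced grid between the smallest and largest singular values, and it reports only the residual norm and the condition number of the regularized problem; these choices and all names are assumptions for illustration.

```python
import numpy as np

def tikhonov_sweep(S, z, n_values=5):
    """Residual and regularized condition number over a grid of lambda."""
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Search range spanned by the singular values (requires sv[-1] > 0)
    lambdas = np.geomspace(sv[-1], sv[0], n_values)
    beta = U.T @ z
    rows = []
    for lam in lambdas:
        # Tikhonov solution via filter factors sv^2 / (sv^2 + lam^2)
        theta = Vt.T @ (sv * beta / (sv ** 2 + lam ** 2))
        residual = np.linalg.norm(S @ theta - z)
        kappa_reg = np.sqrt((sv[0] ** 2 + lam ** 2) / (sv[-1] ** 2 + lam ** 2))
        rows.append((lam, residual, kappa_reg))
    return rows

# Hypothetical nearly collinear problem
S_sw = np.array([[1.0, 1.0], [1.0, 1.001], [1.0, 0.999]])
z_sw = np.array([2.0, 2.1, 2.0])
sweep = tikhonov_sweep(S_sw, z_sw)
```

Larger values of lambda improve the conditioning but worsen the fit, which is the trade-off the selection procedure has to balance.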

4.6. Other analysis

4.6.1. Parameter sensitivity analysis

In this section, the parameter-output dynamic sensitivity information is exploited together with a suitable sensitivity metric for each j-th parameter, i.e., the sensitivity measure δj. Parameter-output dynamic sensitivity information can be found in the sensitivity matrix of Eq. 2.9. The sensitivity measure δj in Eq. 2.17 allows the parameters to be ranked according to their overall influence on the whole vector Y.


In the following, a simple procedure to rank sensitive parameters according to their

overall effect on the outputs is described:

1. Compute the sensitivity matrix either by finite differences in Eq. 2.14 or solving the

sensitivity equations in Eq. 2.15.

2. Take the j-th column of the sensitivity matrix and compute δj in Eq. 2.17 for

j = 1, · · · , Nθ.

3. Rank the parameters of θ in ascending order of the sensitivity measure δj, j = 1, · · · , Nθ.
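Steps 2 and 3 can be sketched as follows. Eq. 2.17 is not reproduced in this chapter, so a root-mean-square measure over each column of the sensitivity matrix is assumed here for δj; the function name and demo values are hypothetical.

```python
import numpy as np

def sensitivity_ranking(S):
    """Column-wise sensitivity measure (assumed RMS form) and ranking."""
    S = np.asarray(S, dtype=float)
    delta = np.sqrt(np.mean(S ** 2, axis=0))   # one value per parameter
    ranking = np.argsort(delta)                # ascending: least sensitive first
    return delta, ranking

# Hypothetical 2 x 2 sensitivity matrix
delta_demo, ranking_sens = sensitivity_ranking([[3.0, 0.1], [4.0, 0.1]])
```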

For simple problems, the individual dynamic effect of each parameter θj on each output yi ∈ y may be represented graphically. To this end, the corresponding sensitivity profile ∂yi/∂θj|t for t ∈ T is drawn. There are Ny · Nθ sensitivity time profiles for each experiment ξ ∈ E. Comparing the profiles shows which parameters excite the outputs most strongly. In this way, the information that the corresponding measurable variable ymi ∈ ym provides to recover each parameter can be assessed. Insensitive parameters cannot be reliably recovered from the available data.

Finally, the ranking computed by the QR method in Section 4.4.2 may be connected with the sensitivity analysis. This makes it possible to determine whether the unidentifiable parameters, selected according to their linear dependence, may introduce a large bias in the solution. The rationale is that unidentifiable parameters (in the sense of linear dependence) with small sensitivities will introduce less bias than those with large sensitivities. It is important to point out that this analysis is also local in nature, since it is based on the current experimental information.

4.6.2. Selection of a parameter initial guess

To define an adequate initial guess (IG) in nonlinear optimization problems, several possibilities may be taken into account. Some of them are values from the literature, values from a priori calculations using the model, or the best solution obtained from several randomly generated IGs used as starting points. The latter can be based on a sampling algorithm, e.g., MBLHD (see Section 2.8.1), to obtain initial guess candidates within a prescribed parameter range. The rationale is that, among different parameter initial guesses, the one whose PE problem in Eq. 2.5 yields the best solution in terms of good fitting and low ill-conditioning is selected as the appropriate parameter initial guess θIG.

The procedure reads as follows:

1. Define several initial guess candidates IGi, for instance from the literature, MBLHD, etc.

2. Solve the PE problem in Eq. 2.5 starting from each IGi and collect the corresponding i-th cost function CFi.

3. Follow the ill-conditioning analysis in Section 4.4.1 after each i-th parameter estimation and gather the numerical rank rϵ,i.

4. Select the best solution as the one that simultaneously has a low cost function and a numerical rank as large as possible.

It is important to highlight that the solution for the IG with the best fit (the lowest cost function) is not necessarily the least ill-conditioned one (the largest numerical rank). Thus, a trade-off should be found between these two properties.
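One possible way to resolve this trade-off is sketched below. It assumes a lexicographic rule, in which the largest numerical rank is preferred and ties are broken by the lowest cost function; this rule is only one option and is not prescribed by the procedure above. The candidate labels and values are hypothetical.

```python
def select_initial_guess(candidates):
    """candidates: list of (label, cost_function, numerical_rank) tuples.

    Returns the candidate with the largest numerical rank and, among
    those, the lowest cost function (assumed lexicographic trade-off).
    """
    best_rank = max(rank for _, _, rank in candidates)
    shortlist = [c for c in candidates if c[2] == best_rank]
    return min(shortlist, key=lambda c: c[1])

# Hypothetical results of three PE runs: (label, CF_i, r_eps_i)
runs = [("IG1", 0.8, 5), ("IG2", 0.5, 3), ("IG3", 1.1, 5)]
best_run = select_initial_guess(runs)
```

In this example IG2 has the lowest cost function but the smallest numerical rank, so IG1 is selected.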


5. Lithium-ion battery: Finding adequate

experimental data

5.1. Abstract

¹ The lack of informative experimental data and the complexity of first-principles battery models make the recovery of kinetic, transport, and thermodynamic parameters complicated. This chapter explores how different sources of experimental data affect parameter structural ill-conditioning and identifiability by using the computational framework in Chapter 4. The sensitivity method is used to analyze the estimators. Accordingly, its techniques for ill-conditioning analysis, practical identifiability diagnosis and estimator performance assessment are employed here. Moreover, dynamic sensitivity profiles are presented to assess the excitation provided by different parameters on different outputs. Monte Carlo studies are carried out to support intermediate conclusions obtained by the sensitivity method. The study is conducted on a modified version of the Doyle-Fuller-Newman model of a Lithium-ion battery. It is demonstrated that the use of typical experimental data for Lithium-ion batteries, i.e., the voltage discharge curves, only enables the identification of a small parameter subset, regardless of the number of experiments considered. Furthermore, it is shown that the inclusion of a new measurable variable (i.e., a single electrolyte concentration measurement) significantly aids identifiability and mitigates ill-conditioning. In this way, the impact of poor and informative experimental data on ill-conditioning and identifiability is illustrated.

5.2. Li-Ion Battery Modeling

The Li-Ion battery model under study is that proposed in Ref. 33 and experimentally validated in Ref. 34. The Li-Ion cell sandwich in Figure 5.1 consists of a lithiated carbon anode (LixC6), a polymer electrolyte, and a lithium-manganese-oxide-spinel cathode (LiyMn2O4). The active material in the composite electrodes is assumed to be made up of spherical particles supported on an inert material. The polymer electrolyte in the separator uses a LiPF6 salt in a non-aqueous liquid mixture of ethylene carbonate and dimethyl carbonate with a random co-polymer matrix of vinylidene fluoride and hexafluoropropylene. The lithium ions (Li+) travel through the electrolyte from one porous electrode to the other, whereas the electrons travel through an external closed circuit. The Li+ ions react and diffuse in the electrodes towards the inner regions of the metal oxide active material particles (the solid phase). The discharge process takes place when Li+ ions diffuse from the anode to the cathode.

¹ The content of this chapter is reprinted (adapted) with permission from (D. C. Lopez C., G. Wozny, A. Flores-Tlacuahuac, R. Vasquez-Medrano, and V. M. Zavala. A computational framework for identifiability and ill-conditioning analysis of lithium-ion battery models. Industrial & Engineering Chemistry Research, 55(11):3026-3042, 2016). Copyright (2016) American Chemical Society. (Publication I in Appendix A.2)

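[Figure 5.1: schematic of the cell along the axial coordinate x, with the anode (LixC6, thickness la), the separator (thickness ls) and the cathode (LiyMn2O4, thickness lc), total length L; radial coordinate r in the active particles; Li+ moving through the electrolyte and electrons (e-) through the external circuit.]

Figure 5.1.: Li-Ion cell during discharge process. Cell consists of a LixC6 negative electrode, a LiyMn2O4 positive electrode, and a separator with LiPF6 salt-based electrolyte. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).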

The governing equations presented in Refs. 33, 34 are based on the porous electrode

and concentrated electrolyte theories. These equations consist of mass transport balances

in the electrolyte including migration, diffusion, and reaction; Ohm’s law in the electrolyte

which includes the diffusion potential and the variation of the electrolyte resistivity with

concentration; Fick’s laws in the solid active material which assumes a constant solid

diffusion coefficient; Ohm’s law in the solid electrode matrix; Butler-Volmer kinetics; and

current conservation. Radial diffusion is considered to be the transport mechanism of Li+

ions into the spherical particles in the electrodes.

The space-time model comprises a set of highly complex partial differential and algebraic equations. Strategies to simplify the model are discussed in Refs. 115, 116, 85. The subscripts a, s, and c denote the anode, separator, and cathode regions, respectively. The subscripts e and s denote the electrolyte and solid phases, respectively. Three independent variables (axial coordinate x, radial coordinate r and time t) are considered. The model includes twenty dependent variables (states), summarized in Table 5.1 and given by:

• The electrolyte concentration ce,k(x, t), electrolyte potential Φe,k(x, t), and local current density ik(x, t) in all regions k = a, s, c.

• The solid concentration cs,k(x, r, t), solid potential Φs,k(x, t), and reaction rate jn,k(x, t) in the electrodes k = a, c.

• The conductivity of the electrolyte κ0,k(x, t) in all regions k = a, s, c.

• The open-circuit potential Uk(x, t) in the electrodes k = a, c.

Table 5.1.: Variables in Li-Ion Model. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

| Variable | Description | Unit | Anode (k = a) | Separator (k = s) | Cathode (k = c) |
|---|---|---|---|---|---|
| ce,k(x, t) | Electrolyte concentration in region k | mol m−3 | X | X | X |
| Φe,k(x, t) | Electrolyte-phase potential in region k | V | X | X | X |
| ik(x, t) | Local current density in region k | A m−2 | X | X | X |
| cs,k(x, r, t) | Concentration of Li+ on the intercalation particle of electrode k | mol m−3 | X | - | X |
| Φs,k(x, t) | Solid-phase potential of electrode k | V | X | - | X |
| jn,k(x, t) | Pore wall flux of Li+ on the intercalation particle of electrode k | mol m−2 s−1 | X | - | X |
| κ0,k(x, t) | Ionic conductivity of the electrolyte in region k | S cm−1 | X | X | X |
| Uk(x, t) | Open-circuit potential of electrode k | V | X | - | X |

Table 5.2.: Estimated parameters in Li-Ion Model.

| Parameter | Description | Unit | Initial Guess θIG | True Value θ∗ |
|---|---|---|---|---|
| Ds,a | Li+ diffusion coeff. in anode solid particle | m2 s−1 | 1.50 × 10−13 | 1.00 × 10−13 |
| Ds,c | Li+ diffusion coeff. in cathode solid particle | m2 s−1 | 1.13 × 10−10 | 7.50 × 10−11 |
| D | Electrolyte salt diffusion coeff. | m2 s−1 | 5.85 × 10−14 | 3.90 × 10−14 |
| ka | Anode reaction rate const. | m2.5 mol−0.5 s−1 | 3.00 × 10−11 | 2.00 × 10−11 |
| kc | Cathode reaction rate const. | m2.5 mol−0.5 s−1 | 3.00 × 10−11 | 2.00 × 10−11 |
| p | Bruggeman coeff. | - | 2.25 | 1.5 |
| Rf | Anode film resistance | Ω m2 | 0.135 | 0.090 |
| t+ | Transport number | - | 0.363 | 0.363 |



The kinetic and transport parameters to be estimated are presented in Table 5.2. They are the Li+ diffusion coefficient in the solid particle of the anode Ds,a, the Li+ diffusion coefficient in the solid particle of the cathode Ds,c, the salt diffusion coefficient in the electrolyte D, the reaction rate constant in the anode ka, the reaction rate constant in the cathode kc, the Bruggeman coefficient p, the film resistance at the anode Rf, and the transport number t+. The design and operating variables, constants and fixed parameters are summarized in Table 5.3.

Table 5.3.: Operating and design variables, constants and fixed parameters in Li-Ion Model. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

| Variable/Parameter | Description | Unit | Anode (k = a) | Separator (k = s) | Cathode (k = c) |
|---|---|---|---|---|---|
| Operating variables | | | | | |
| I | Discharge current (1.0 C) | A m−2 | 17.5 (all regions) | | |
| T | Temperature | K | 298 (all regions) | | |
| Design variables | | | | | |
| c(0)e | Initial electrolyte concentration in region k | mol m−3 | 2000 (all regions) | | |
| c(0)s,k | Initial concentration of Li+ in electrode k | mol m−3 | 14870 | - | 3900 |
| Φe,0 | Electrolyte potential at x = 0 | V | 0 | - | - |
| Φs,c,0 | Electrode potential at x = ℓa + ℓs | V | - | - | 4.2 |
| ℓk | Thickness of region k | µm | 100 | 52 | 174 |
| ϵk | Porosity of region k | - | 0.357 | 1 | 0.444 |
| ϵf,k | Volume fraction of fillers in region k | - | 0.172 | - | 0.259 |
| Physical constants | | | | | |
| F | Faraday's constant | C mol−1 | 96487 | | |
| R | Ideal gas constant | J mol−1 K−1 | 8.314 | | |
| Kinetic and transport parameters | | | | | |
| cs,k,max | Max. concentration of Li+ in electrode k | mol m−3 | 26390 | - | 22860 |
| Rs,k | Radius of active material in electrode k | µm | 12.5 | - | 8.5 |
| σk | Electronic conductivity of electrode k | S m−1 | 100 | - | 3.8 |



The model identified here was modified in Ref. 85 to improve the computational performance. The main modifications are summarized as follows:

• The local current density ik with k = a, s, c was eliminated in the electrodes and the electrolyte by substituting Faraday's equation into Ohm's equation. With this simplification the model was reduced to seventeen dependent variables, given by ce,a(x, t), ce,s(x, t), ce,c(x, t), Φe,a(x, t), Φe,s(x, t), Φe,c(x, t), cs,a(x, r, t), cs,c(x, r, t), Φs,a(x, t), Φs,c(x, t), jn,a(x, t), jn,c(x, t), κ0,a(x, t), κ0,s(x, t), κ0,c(x, t), Ua(x, t) and Uc(x, t).

• The three PDEs representing the electrolyte-phase concentrations across the three regions (i.e., anode ce,a(x, t), separator ce,s(x, t) and cathode ce,c(x, t)) were approximated by a single PDE with the axial dimension spanning x ∈ [0, L]. The new variable was called ce(x, t). The same simplification was made for the electrolyte-phase potentials (i.e., anode Φe,a(x, t), separator Φe,s(x, t) and cathode Φe,c(x, t)). The new continuous variable was Φe(x, t). This reduced the system to thirteen dependent variables, given by ce(x, t), Φe(x, t), cs,a(x, r, t), cs,c(x, r, t), Φs,a(x, t), Φs,c(x, t), jn,a(x, t), jn,c(x, t), κ0,a(x, t), κ0,s(x, t), κ0,c(x, t), Ua(x, t) and Uc(x, t).

• The boundary conditions for the potential of both electrodes and the electrolyte were modified. Specifically, Φe(0, t) = Φe,0 and Φs,c(ℓa + ℓs, t) = Φs,c,0 were set. Two additional equations related to the integral form of Faraday's law (see the integral equations in Table 5.5) were introduced. The transformations are based on the fact that the current supplied by each portion of the anode and cathode should add up to the total current I. This modification is intended to avoid a family of solutions instead of a unique solution, which arises when a zero potential flux is imposed as boundary condition.

• L'Hôpital's rule was applied to the Li-Ion diffusion equation in the solid active material (derived from Fick's law) to avoid the indeterminate form at the sphere center (r = 0). Consequently, an additional equation was introduced to represent the concentration of Li+ ions in the solid phase of the electrodes (cs,a(x, r, t) and cs,c(x, r, t)).

• A dimensionless form of the model in the axial coordinate x and the spherical coordinate r was used. Each region was normalized according to its width. For instance, x∗ = x/ℓa and r∗ = r/Rs,a were defined in the anode.
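The effect of L'Hôpital's rule at the particle center can be made explicit with a short derivation. It is written for a generic electrode k with the dimensionless radius r∗ = r/Rs,k and assumes the symmetry condition ∂cs,k/∂r∗ = 0 at r∗ = 0; it is a sketch of the standard argument rather than a reproduction of the derivation in Ref. 85.

```latex
% Solid-phase diffusion in spherical coordinates (dimensionless radius r^*)
\frac{\partial c_{s,k}}{\partial t}
  = \frac{D_{s,k}}{R_{s,k}^{2}}
    \left( \frac{\partial^{2} c_{s,k}}{\partial r^{*2}}
         + \frac{2}{r^{*}} \frac{\partial c_{s,k}}{\partial r^{*}} \right),
  \qquad r^{*} > 0 .

% At r^* = 0 the second term is of the form 0/0, since
% \partial c_{s,k} / \partial r^* = 0 by symmetry. L'Hopital's rule gives
\lim_{r^{*} \to 0} \frac{2}{r^{*}} \frac{\partial c_{s,k}}{\partial r^{*}}
  = 2 \left. \frac{\partial^{2} c_{s,k}}{\partial r^{*2}} \right|_{r^{*}=0} ,

% so that the equation used at the center becomes
\frac{\partial c_{s,k}}{\partial t}
  = \frac{3 D_{s,k}}{R_{s,k}^{2}}
    \frac{\partial^{2} c_{s,k}}{\partial r^{*2}},
  \qquad r^{*} = 0 .
```

This is the origin of the factor 3 in the solid-phase concentration equations for r = 0 in Table 5.4.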



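Table 5.4.: Governing equations for modified Li-Ion PDAE model. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

| Region | Variable | Governing Equation | Boundary Condition I | Boundary Condition II | Initial Condition |
|---|---|---|---|---|---|
| Anode | Solid-phase potential Φs,a(x, t) | $\sigma_{eff,a} \frac{\partial^2 \Phi_{s,a}(x,t)}{\ell_a^2 \, \partial x^2} = F a_a j_{n,a}(x,t)$ | at x = 0: $\Phi_{s,a}(x,t) = 0$ | at x = ℓa: $\frac{\partial \Phi_{s,a}(x,t)}{\partial x} = 0$ | - |
| Anode | Solid-phase concentration cs,a(x, r, t) | if r = 0: $\frac{\partial c_{s,a}(x,r,t)}{\partial t} = \frac{3 D_{s,a}}{R_{s,a}^2} \frac{\partial^2 c_{s,a}(x,r,t)}{\partial r^2}$; if r > 0: $\frac{\partial c_{s,a}(x,r,t)}{\partial t} = \frac{D_{s,a}}{R_{s,a}^2} \left[ \frac{\partial^2 c_{s,a}(x,r,t)}{\partial r^2} + \frac{2}{r} \frac{\partial c_{s,a}(x,r,t)}{\partial r} \right]$ | at r = 0: $\frac{\partial c_{s,a}(x,r,t)}{\partial r} = 0$ | at r = Rs,a: $\frac{\partial c_{s,a}(x,r,t)}{\partial r} = -\frac{R_{s,a} \, j_{n,a}(x,t)}{D_{s,a}}$ | at t = 0: $c_{s,a}(x,r,0) = c_{s,a}^{(0)}$ |
| Cathode | Solid-phase potential Φs,c(x, t) | $\sigma_{eff,c} \frac{\partial^2 \Phi_{s,c}(x,t)}{\ell_c^2 \, \partial x^2} = F a_c j_{n,c}(x,t)$ | at x = ℓa + ℓs: $\Phi_{s,c}(x,t) = \Phi_{s,c,0}$ | at x = ℓa + ℓs + ℓc: $\frac{\partial \Phi_{s,c}(x,t)}{\partial x} = -\frac{\ell_c I}{\sigma_{eff,c}}$ | - |
| Cathode | Solid-phase concentration cs,c(x, r, t) | if r = 0: $\frac{\partial c_{s,c}(x,r,t)}{\partial t} = \frac{3 D_{s,c}}{R_{s,c}^2} \frac{\partial^2 c_{s,c}(x,r,t)}{\partial r^2}$; if r > 0: $\frac{\partial c_{s,c}(x,r,t)}{\partial t} = \frac{D_{s,c}}{R_{s,c}^2} \left[ \frac{\partial^2 c_{s,c}(x,r,t)}{\partial r^2} + \frac{2}{r} \frac{\partial c_{s,c}(x,r,t)}{\partial r} \right]$ | at r = 0: $\frac{\partial c_{s,c}(x,r,t)}{\partial r} = 0$ | at r = Rs,c: $\frac{\partial c_{s,c}(x,r,t)}{\partial r} = -\frac{R_{s,c} \, j_{n,c}(x,t)}{D_{s,c}}$ | at t = 0: $c_{s,c}(x,r,0) = c_{s,c}^{(0)}$ |
| Anode, Separator, Cathode | Electrolyte-phase potential Φe(x, t) | $\kappa_{eff}(x,t) \frac{\partial^2 \Phi_e(x,t)}{\ell^2 \, \partial x^2} = \frac{2 \kappa_{eff}(x,t) R T}{F} (1 - t_+) \frac{\partial^2 c_e(x,t)}{\ell^2 \, \partial x^2} - F a \, j_n(x,t)$ | at x = 0: $\Phi_e(x,t) = \Phi_{e,0}$ | at x = ℓa + ℓs + ℓc: $\frac{\partial \Phi_e(x,t)}{\partial x} = 0$ | - |
| Anode, Separator, Cathode | Electrolyte-phase concentration ce(x, t) | $\epsilon \frac{\partial c_e(x,t)}{\partial t} = D_{eff} \frac{\partial^2 c_e(x,t)}{\ell^2 \, \partial x^2} + a (1 - t_+) j_n(x,t)$ | at x = 0: $\frac{\partial c_e(x,t)}{\partial x} = 0$ | at x = ℓa + ℓs + ℓc: $\frac{\partial c_e(x,t)}{\partial x} = 0$ | at t = 0: $c_e(x,0) = c_e^{(0)}$ |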



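Table 5.5.: Auxiliary equations of modified Li-Ion PDAE model. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Region: Anode

  $\sigma_{eff,a} = \sigma_a \epsilon_a^{p}$
  $a_a = \frac{3}{R_{s,a}} (1 - \epsilon_a - \epsilon_{f,a})$
  $j_{n,a}(x,t) = 2 k_a \left( c_{s,a}(x,R_{s,a},t) \right)^{0.5} \left( c_{e,a}(x,t) \right)^{0.5} \left( c_{s,a,max} - c_{s,a}(x,R_{s,a},t) \right)^{0.5} \sinh\left( \frac{0.5 F}{R T} \left( \Phi_{s,a}(x,t) - \Phi_{e,a}(x,t) - U_a(x,t) - F j_{n,a}(x,t) R_f \right) \right)$
  $U_a(x,t) = -0.16 + 1.32 \exp\left( -3 \frac{c_{s,a}(x,R_{s,a},t)}{c_{s,a,max}} \right) + 10 \exp\left( -2000 \frac{c_{s,a}(x,R_{s,a},t)}{c_{s,a,max}} \right)$
  $\int_{0}^{\ell_a} \left( F a_a j_{n,a}(x,t) \big|_{\Phi_{s,a} = \Phi_{e,0}} \right) dx = I$

Region: Cathode

  $\sigma_{eff,c} = \sigma_c \epsilon_c^{p}$
  $a_c = \frac{3}{R_{s,c}} (1 - \epsilon_c - \epsilon_{f,c})$
  $j_{n,c}(x,t) = 2 k_c \left( c_{s,c}(x,R_{s,c},t) \right)^{0.5} \left( c_{e,c}(x,t) \right)^{0.5} \left( c_{s,c,max} - c_{s,c}(x,R_{s,c},t) \right)^{0.5} \sinh\left( \frac{0.5 F}{R T} \left( \Phi_{s,c}(x,t) - \Phi_{e,c}(x,t) - U_c(x,t) \right) \right)$
  $U_c(x,t) = 4.198 + 0.0565 \tanh\left( -14.554 \frac{c_{s,c}(x,r,t)|_{r=R_{s,c}}}{c_{s,c,max}} + 8.609 \right) - 0.0275 \left( \frac{1}{\left( 0.9984 - \frac{c_{s,c}(x,r,t)|_{r=R_{s,c}}}{c_{s,c,max}} \right)^{0.492}} - 1.901 \right) - 0.157 \exp\left( -0.0473 \left( \frac{c_{s,c}(x,r,t)|_{r=R_{s,c}}}{c_{s,c,max}} \right)^{8} \right) + 0.8102 \exp\left( -40 \left( \frac{c_{s,c}(x,r,t)|_{r=R_{s,c}}}{c_{s,c,max}} - 0.133 \right) \right)$
  $\int_{\ell_a + \ell_s}^{\ell_a + \ell_s + \ell_c} \left( F a_c j_{n,c}(x,t) \big|_{\Phi_{s,c} = \Phi_{s,c,0}} \right) dx = -I$

Region: Anode / Separator / Cathode (k = a, s, c)

  $\Phi_e(x,t) = (\Phi_{e,a}(x,t), \Phi_{e,s}(x,t), \Phi_{e,c}(x,t))$
  $c_e(x,t) = (c_{e,a}(x,t), c_{e,s}(x,t), c_{e,c}(x,t))$
  $j_n(x,t) = (j_{n,a}(x,t), 0, j_{n,c}(x,t))$
  $\kappa_{eff}(x,t) = (\kappa_{eff,a}(x,t), \kappa_{eff,s}(x,t), \kappa_{eff,c}(x,t))$
  $\kappa_{eff,k}(x,t) = \left( 1 \times 10^{2} \kappa_{0,k}(x,t) \right) \epsilon_k^{p}$
  $\kappa_{0,k}(x,t) = 1.0793 \times 10^{-4} + 6.7461 \times 10^{-3} \left( 1 \times 10^{-3} c_{e,k}(x,t) \right) - 5.2245 \times 10^{-3} \left( 1 \times 10^{-3} c_{e,k}(x,t) \right)^{2} + 1.3605 \times 10^{-3} \left( 1 \times 10^{-3} c_{e,k}(x,t) \right)^{3} - 1.1724 \times 10^{-4} \left( 1 \times 10^{-3} c_{e,k}(x,t) \right)^{4}$
  $D_{eff} = (D_{eff,a}, D_{eff,s}, D_{eff,c})$
  $D_{eff,k} = D \epsilon_k^{p}$
  $\epsilon = [\epsilon_a, \epsilon_s, \epsilon_c]$
  $l = [l_a, l_s, l_c]$
  $a = [a_a, 0, a_c]$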



The complete set of equations of the modified model is presented in Tables 5.4 and 5.5.

The PDAE system was discretized by the method of lines in the axial and radial coordinates

to obtain a set of DAEs according to Ref. 85. The consistent initial conditions for this

DAE system were obtained as follows. The algebraic equations of the discretized system and those of the most interconnected variables (i.e., jn,a(x, t), jn,c(x, t), Φs,a(x, t), Φs,c(x, t) and Φe(x, t)) were decoupled from the whole model at t = 0. Here, the electrolyte concentration functions for κ0,k(ce(x, t)) with k = a, s, c and Uk(ce,k(x, t)) with k = a, c were also included. These equations were solved by assuming all the concentrations equal to their initial values ($c_e(x,0) = c_e^{(0)}$, $c_{s,a}(x,r,0) = c_{s,a}^{(0)}$ and $c_{s,c}(x,r,0) = c_{s,c}^{(0)}$). With this initial solution for the pore wall flux of Li+ ions (i.e., $j_{n,a}(x,0) = j_{n,a}^{(0)}$, $j_{n,c}(x,0) = j_{n,c}^{(0)}$) and the potentials ($\Phi_{s,a}(x,0) = \Phi_{s,a}^{(0)}$, $\Phi_{s,c}(x,0) = \Phi_{s,c}^{(0)}$ and $\Phi_e(x,0) = \Phi_e^{(0)}$), the discretized PDEs for the concentrations (ce(x, t), cs,a(x, r, t), and cs,c(x, r, t)) were finally solved.
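The two-stage initialization described above can be illustrated on a toy semi-explicit DAE. This is a minimal sketch, not the Li-Ion model: the functions `f` and `g`, the Newton solver and the explicit Euler loop are all hypothetical stand-ins, chosen only to show the idea of first solving the algebraic equations with the differential state frozen at its initial value, and then integrating.

```python
import math

def g(x, z):
    # Toy algebraic constraint 0 = g(x, z); a hypothetical stand-in for the
    # discretized potential and kinetics equations (monotone in z).
    return z + 0.5 * math.sinh(z) - x

def dg_dz(x, z):
    # Derivative of g with respect to the algebraic variable z.
    return 1.0 + 0.5 * math.cosh(z)

def f(x, z):
    # Toy differential equation dx/dt = f(x, z).
    return -z

def solve_algebraic(x, z_guess, tol=1e-12, max_iter=50):
    """Newton iteration on z with the differential state x held fixed."""
    z = z_guess
    for _ in range(max_iter):
        step = g(x, z) / dg_dz(x, z)
        z -= step
        if abs(step) < tol:
            return z
    raise RuntimeError("Newton iteration did not converge")

def integrate(x0, t_end, n_steps):
    """Explicit Euler on x; z is re-solved from g after every step."""
    h = t_end / n_steps
    x = x0
    z = solve_algebraic(x0, 0.0)      # stage 1: consistent z(0) for fixed x(0)
    for _ in range(n_steps):          # stage 2: advance the differential state
        x += h * f(x, z)
        z = solve_algebraic(x, z)     # warm start from the previous z
    return x, z

x0 = 1.0
z0 = solve_algebraic(x0, z_guess=0.0)
x_end, z_end = integrate(x0, t_end=1.0, n_steps=1000)
```

A production DAE solver would use an implicit scheme; explicit Euler is used here only to keep the sketch short.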

5.3. Results and Discussion

For the modified Li-Ion model the differential and algebraic state variable

vectors defined in Eq. 2.1 are given by x = (ce, cs,a, cs,c) and z =

(Φe, Φs,a, Φs,c, jn,a, jn,c, κ0,a, κ0,s, κ0,c, Ua, Uc), respectively (states are assumed to be discretized). The parameter vector is θ = (D, Ds,a, Ds,c, ka, kc, p, Rf, t+). The cell voltage Vcell(t) and the electrolyte concentration in the separator core ce(ℓa + ℓs/2, t) are considered as the predicted response variables, i.e., y = (Vcell(t), ce(ℓa + ℓs/2, t)). The cell voltage Vcell(t) is computed from the solid-phase potential at the right-hand end of the cathode, Φs,c(ℓa + ℓs + ℓc, t), and at the left-hand end of the anode, Φs,a(0, t), as

Vcell(t) = Φs,c(ℓa + ℓs + ℓc, t)− Φs,a(0, t) (5.1)

Constant discharge current rates I (galvanostatic process) are assumed as the inputs or controls, u = I. The change of Vcell(t) for a discharge rate I is known as the voltage discharge curve. The cell voltage Vcell(t) and the electrolyte concentration in the separator core ce(ℓa + ℓs/2, t) are assumed to be measured at 100 equally spaced time points, Y m = (Vcell(tk), ce(ℓa + ℓs/2, tk)) for tk ∈ T with k = 1, …, 100. For both measured variables a standard deviation of 1% is employed.

In order to investigate the effect of experimental data collection on the ill-conditioning and identifiability issues of Li-Ion battery models, three different cases are defined. These cases have two main focuses, namely the collection of several experiments with the same output information (Cases 1 and 2) and the addition of new output information (Case 3) in the parameter estimation. Cases 1 and 2 study whether the information provided by discharge curves is sufficient to reliably estimate the key parameters of interest, whereas Case 3 demonstrates that the use of new experimental information, e.g., the electrolyte concentration, substantially improves identifiability. The three cases can be summarized as follows:



• Case 1 (single discharge curve): one experiment that uses only discharge curve information is considered, with I1 set as the standard discharge rate. The only observable variable is assumed to be Vcell(·).

• Case 2 (multiple discharge curves): the effect of progressively adding experiments with discharge curve information (Ii, i = 2, …, 6) is considered. The only observable variable is assumed to be Vcell(·).

• Case 3 (discharge curves and electrolyte concentration profile): the effect of progressively adding experiments with discharge curve information (Ii, i = 1, …, 4) and of including the electrolyte concentration in the middle of the separator ce(ℓa + ℓs/2, ·) as an observable variable is considered. The observable variables are then Vcell(·) and ce(·)|x=ℓa+ℓs/2.

Case 1 is analyzed in detail to discuss the different techniques of the framework in

Chapter 4, while Cases 2 and 3 are presented in summarized form. The discharge curve at I1 is shown in Figure 5.2. The sensitivity and Monte Carlo methods of Section 4.2 are applied to analyze the parameters. The results of Case 1 (base case), Case 2 and Case 3 are presented in Sections 5.3.1, 5.3.2, and 5.3.3, respectively. It is important to point out that the framework outlined in Chapter 4 can readily be used to determine whether other sources of experimental information (including the type of measurements, the measurement error, the sampling frequency, etc.) actually improve the conditioning and identifiability of any battery model.

Table 5.2 displays the true parameter vector θ∗ taken from Ref. 34 and the initial guess vector θIG. In all cases, the experimental data Y m are generated in silico by perturbing the model solution Y(u, θ∗), evaluated at u and θ∗, with measurement errors drawn from a normal distribution with mean zero and variance σy² = 1 × 10⁻⁴. To improve numerical stability and conditioning, the outputs in Y m and the parameters are normalized.
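The data generation step can be sketched as follows. The function `model_output` is a hypothetical normalized response used only as a placeholder for the PDAE solution Y(u, θ∗); the seed and the parameter value are likewise illustrative. Only the noise model (zero mean, variance 1 × 10⁻⁴, 100 equally spaced points) follows the text.

```python
import math
import random

def model_output(t, theta):
    # Hypothetical normalized response; NOT the Li-Ion PDAE model.
    return 1.0 - theta * t

def generate_data(theta_true, times, sigma_y, seed=0):
    """Perturb the model solution with zero-mean Gaussian measurement errors."""
    rng = random.Random(seed)                      # fixed seed for reproducibility
    y_model = [model_output(t, theta_true) for t in times]
    y_meas = [y + rng.gauss(0.0, sigma_y) for y in y_model]
    return y_model, y_meas

sigma_y = math.sqrt(1e-4)                          # variance 1e-4, i.e. std 1e-2
times = [k / 100 for k in range(1, 101)]           # 100 equally spaced time points
y_model, y_meas = generate_data(0.3, times, sigma_y)

# Empirical check of the noise level on this single realization.
residuals = [m - y for m, y in zip(y_meas, y_model)]
sample_var = sum(r * r for r in residuals) / len(residuals)
```

Because the outputs are normalized, a standard deviation of 1 × 10⁻² corresponds to the 1% measurement error quoted for both measured variables.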

Figure 5.2 displays the discharge curves for Cases 1 and 2. The discharge rates I are expressed relative to the base current C = 17.5 A/m², with I1 = 1C (base discharge), I2 = 2C, I3 = 3C, I4 = 4C (fast discharge), and I5 = 0.5C and I6 = 0.1C (slow discharge). A cut-off voltage of 2.8 V is assumed.

5.3.1. Case 1: Single Discharge Curve.

Sensitivity Method

The results of the sensitivity method discussed in Section 4.2.1 are summarized in Table

5.6. The estimates θ̂ and their variances σ²θj (diagonal elements of C computed from Eq. 4.3) are also presented. The estimator accuracy is measured in terms of the squared bias β(Θ)² computed from Eq. 2.29 and the MSE computed from Eq. 2.30. The model predictions at θ̂ are shown in Figure 5.2.



[Figure 5.2: model fitting, three panels of cell voltage Vcell (V) versus time (s), titled "Standard discharge rate", "Fast discharge rates" and "Slow discharge rates", with curves labeled I1 to I6.]

Figure 5.2.: Discharge curves for Case 1 (base rate I1 = 1C) and Case 2 simultaneously considering fast (I2 = 2C, I3 = 3C, I4 = 4C), and slow rates (I5 = 0.5C and I6 = 0.1C). Markers are experimental data and solid lines are model predictions after parameter estimation at estimator θ̂. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Table 5.6.: Case 1 for Sensitivity method. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Columns: Pars; True θ∗; Estimated θ̂; Estimator Performance: Variance σ²θ (Precision), Bias β(Θ)² and MSE (Accuracy); Identifiability analysis (Parameter Ranking): Variance Method, SVD Method, QRP Method. A star marks the parameters selected as identifiable.

Pars   True θ∗      Estimated θ̂   Variance σ²θ   Bias β(Θ)²    MSE           Variance   SVD    QRP
Ds,a   6.67×10⁻¹    2.54×10⁻¹     3.56×10⁻²      1.70×10⁻¹     2.06×10⁻¹     (1)∗       (1)    (2)∗
Ds,c   6.67×10⁻¹    9.13×10⁻¹     5.79×10²       6.07×10⁻²     5.79×10²      (7)        (7)    (7)
D      6.67×10⁻¹    1.32×10⁰      1.48×10¹       4.25×10⁻¹     1.52×10¹      (5)        (5)    (5)
ka     6.67×10⁻¹    3.95×10³      4.40×10¹⁵      1.56×10⁷      4.40×10¹⁵     (8)        (8)    (8)
kc     6.67×10⁻¹    7.40×10⁻¹     3.84×10¹       5.45×10⁻³     3.85×10¹      (6)        (6)    (6)
p      6.67×10⁻¹    7.93×10⁻¹     1.55×10⁰       1.59×10⁻²     1.56×10⁰      (2)        (2)    (1)∗
Rf     6.67×10⁻¹    5.98×10⁻¹     6.13×10⁰       4.65×10⁻³     6.13×10⁰      (3)        (4)    (4)
t+     1.00×10⁰     2.92×10⁻¹     9.83×10⁰       5.01×10⁻¹     1.03×10¹      (4)        (3)    (3)∗

Identifiable Subset Dimension: Variance Method 1, SVD Method 0, QRP Method 3.



Estimator Analysis. Despite the good fit exhibited in Figure 5.2, the estimated parameters have large variances if only one discharge curve (at I1) is used as experimental data. The most precise parameter is the Li+ ion diffusion coefficient in the solid particle of the anode, Ds,a, with a variance of 3.56 × 10⁻², and the least precise is the reaction rate constant in the anode, ka, with a variance of 4.40 × 10¹⁵. The precision of each parameter in terms of the length of its confidence interval (Eq. 2.28) is shown in Table 5.7. The lengths are computed as percentages relative to θ∗.

Table 5.7.: Case 1, 2, and 3: Confidence interval lengths for Sensitivity and Monte Carlo methods. Lengths are expressed as percentages relative to the true parameter. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

        Sensitivity Method                     Monte Carlo Method
        Case 1      Case 2      Case 2         Case 1      Case 2      Case 2      Case 3      Case 3
Pars    SC1         SC4         SC6            SC1         SC4         SC6         SC1         SC4
Ds,a    1.1×10²%    5.3×10¹%    6.1×10¹%       6.0×10³%    1.9×10²%    1.6×10²%    4.6×10²%    5.8×10¹%
Ds,c    1.4×10⁴%    5.5×10²%    3.6×10²%       8.7×10⁵%    1.3×10⁵%    9.3×10³%    2.4×10⁵%    1.4×10²%
D       2.3×10³%    2.3×10²%    2.2×10²%       8.4×10⁴%    1.7×10²%    8.2×10¹%    6.0×10¹%    2.4×10¹%
ka      3.9×10¹⁰%   2.2×10²%    1.5×10²%       1.4×10⁶%    1.4×10⁶%    9.3×10³%    4.4×10⁶%    8.7×10⁵%
kc      3.7×10³%    9.1×10²%    9.0×10²%       6.3×10⁵%    5.3×10²%    5.1×10²%    1.1×10²%    8.4×10¹%
p       7.4×10²%    8.3×10¹%    7.1×10¹%       1.4×10²%    4.9×10¹%    3.7×10¹%    2.2×10¹%    8.0×10⁰%
Rf      1.5×10³%    6.5×10¹%    6.4×10¹%       1.4×10²%    3.7×10¹%    2.9×10¹%    1.4×10²%    1.2×10¹%
t+      1.2×10³%    5.8×10¹%    4.8×10¹%       2.4×10²%    4.5×10¹%    3.8×10¹%    4.2×10¹%    1.0×10¹%

The estimator accuracy in terms of its bias is now quantified. The film resistance at the anode Rf presents a squared bias of 4.65 × 10⁻³, which is equivalent to a relative bias (with respect to θ∗) of 10%. The parameter ka exhibits the largest squared bias of 1.56 × 10⁷, equivalent to a relative bias of 5.92 × 10⁵%. The overall performance metrics of this parameter estimator are 4.40 × 10¹⁵ for precision (related to Tr[C]) and 1.56 × 10⁷ for bias (as the squared norm of β(Θ)). These results show that the estimator for Case 1 is highly unstable.
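The relative biases quoted above follow directly from the squared biases in Table 5.6. The short check below reproduces them; it assumes that the relative bias is the square root of the squared bias divided by the true (normalized) parameter value, and that Eq. 2.30 is the usual decomposition MSE = variance + squared bias, which the rows of Table 5.6 are consistent with. The function names are illustrative.

```python
import math

def relative_bias_percent(squared_bias, theta_true):
    """Relative bias in percent, from a squared bias and the true value."""
    return 100.0 * math.sqrt(squared_bias) / theta_true

def mse(variance, squared_bias):
    """Mean squared error as variance plus squared bias (assumed form of Eq. 2.30)."""
    return variance + squared_bias

theta_true = 6.67e-1                                  # normalized true value (Table 5.6)

rb_Rf = relative_bias_percent(4.65e-3, theta_true)    # about 10 %
rb_ka = relative_bias_percent(1.56e7, theta_true)     # about 5.92e5 %
mse_Dsa = mse(3.56e-2, 1.70e-1)                       # about 2.06e-1

print(f"Rf relative bias: {rb_Rf:.1f} %")
print(f"ka relative bias: {rb_ka:.3g} %")
print(f"Ds,a MSE: {mse_Dsa:.3g}")
```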

Ill-Conditioning Analysis. Structural issues are now explored by applying the procedures of Section 4.4.1. The singular values of S at the parameter estimate θ̂ vary from ς1 = 9.1087 × 10¹ to ς8 = 1.5074 × 10⁻⁸. The spectrum of the singular values (SVs) is the solid black line with markers in Figure 5.3 (left panel). The condition number and the collinearity index are κ = 6.0428 × 10⁹ and γ = 6.6341 × 10⁷, respectively.

On the left-hand side of Figure 5.3 the lower bounds with respect to the condition number and the collinearity index, ϵκ and ϵγ, respectively, are presented for Case 1. These values are computed using the predefined empirical upper bounds κmax = 1000 [46, 79, 78] and γmax = 15σy [17, 79, 78]. The bound γmax is scaled by the measurement standard deviation σy because the scaled sensitivity matrix S is obtained by premultiplying the unscaled sensitivity matrix by Σy⁻¹. Consequently, the lower bound used to select the well-conditioned singular values is ϵ = ϵγ = 6.67 × 10⁰. For Case 1, only three singular values pass this test, which implies that S has a numerical rank of three (rϵ = 3) with five ill-conditioned singular values.

[Figure 5.3: singular value spectra, singular value ςi (logarithmic axis, 10⁻³ to 10³) versus index ς1 to ς8, with the lower bounds ϵκ and ϵγ separating well-conditioned from ill-conditioned ςi. Left panel: Case 1-SC1. Right panel: Case 1-SC1 and Case 2-SC2 to Case 2-SC6, with ϵ = ϵγ.]

Figure 5.3.: Singular value spectra. Left panel is Case 1 (single discharge curve) and right panel is Case 2 (multiple discharge curves). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
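The ill-conditioning metrics and thresholds of Case 1 can be reproduced from the two reported singular values. The sketch below assumes the definitions κ = ς1/ς8 and γ = 1/ς8, and the lower bounds ϵκ = ς1/κmax and ϵγ = 1/γmax; these are inferred from the reported numbers rather than quoted from Section 4.4.1. The intermediate singular values are not given in the text, so the spectrum used to illustrate the rank count is hypothetical apart from its first and last entries.

```python
def conditioning_metrics(s_max, s_min):
    """Condition number and collinearity index from the extreme singular values."""
    kappa = s_max / s_min
    gamma = 1.0 / s_min
    return kappa, gamma

def lower_bounds(s_max, kappa_max, gamma_max):
    """Smallest singular value still accepted under each empirical bound."""
    eps_kappa = s_max / kappa_max
    eps_gamma = 1.0 / gamma_max
    return eps_kappa, eps_gamma

def numerical_rank(singular_values, eps):
    """Number of singular values that are at least eps."""
    return sum(1 for s in singular_values if s >= eps)

s1, s8 = 9.1087e1, 1.5074e-8          # reported for Case 1
sigma_y = 1e-2                        # measurement standard deviation

kappa, gamma = conditioning_metrics(s1, s8)
eps_kappa, eps_gamma = lower_bounds(s1, kappa_max=1000.0, gamma_max=15.0 * sigma_y)
eps = max(eps_kappa, eps_gamma)       # the more restrictive bound is used

# Hypothetical spectrum: only the first and last entries come from the text.
spectrum = [s1, 3.0e1, 8.0e0, 2.0e0, 5.0e-1, 1.0e-2, 1.0e-5, s8]
rank = numerical_rank(spectrum, eps)
```

With these inputs the collinearity bound ϵγ is the binding one, which matches the choice ϵ = ϵγ made in the text.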

Identifiability Analysis. The results of applying the three identifiability analysis methods (variance, SVD, and QRP) described in Section 4.4.2 are presented here. The identifiability ranking obtained with each method is shown in Table 5.6. The numbers in parentheses indicate the position of each parameter in the ranking and the stars indicate the identifiable parameters according to each method.

• Variance Method: under this method the most identifiable parameter is Ds,a (the smallest variance, σ²Ds,a = 3.56 × 10⁻²), and the least identifiable parameter is ka (the largest variance, σ²ka = 4.40 × 10¹⁵). The variance threshold ρ = 1.5 × 10⁻¹ according to Ref. 108 is used, which yields only one identifiable parameter.

• SVD Method: under this method a similar ranking is found. Figure 5.4 displays the contribution of each singular value to the variance of each parameter. The strong influence of the ill-conditioned singular values (i.e., ςi for i = 4, …, 8) on the variance should be highlighted. These are responsible for the large variances observed in Table 5.6. It is also seen that the last two parameters in the identifiability ranking (Ds,c and ka) are fully influenced by the smallest singular values ς7 and ς8, respectively (proportions of 100%). The most identifiable parameter Ds,a exhibits the smallest impact from the ill-conditioned singular values. The proportion, however, is still significant (79.5%). In fact, if the proportion threshold πmax = 50% according to Ref. 11 is employed to select the identifiable parameters, Ds,a cannot be classified as an identifiable parameter. Accordingly, all parameters are considered unidentifiable, and Table 5.6 therefore reports a parameter subset dimension equal to zero. From Figure 5.4 it also becomes evident that the smallest singular values ςi for i = 4, …, 8 simultaneously affect many parameters. This is an indication of linear dependence.

• QR Method: under this method the parameter with the largest effect on the outputs is the Bruggeman coefficient p. Consequently, this parameter is placed first in the ranking of Table 5.6. The reaction rate constant in the anode ka is the last in the ranking. This means that, after all orthogonal projections, this parameter has the smallest effect on the measured variables. In order to select the number of identifiable parameters, rϵ = 3 (the rank of S obtained in the previous ill-conditioning analysis) is used as the identifiable parameter subset dimension. Under this threshold, parameters p, Ds,a and t+ are obtained as identifiable.
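The ranking by successive orthogonal projections in the QR bullet can be sketched with a small pivoted Gram-Schmidt routine. This is an illustrative stand-in for a library QR factorization with column pivoting, not the implementation used in the thesis. The 3 x 3 matrix, the parameter labels and the subset dimension `r_eps = 2` are hypothetical.

```python
import math

def column_norm(col):
    return math.sqrt(sum(v * v for v in col))

def qrp_ranking(S):
    """Rank the columns of S (given as a list of rows) by pivoted projections.

    At each step the column with the largest remaining norm is selected, and
    its direction is projected out of all columns not yet selected.
    """
    n_rows, n_cols = len(S), len(S[0])
    cols = [[S[i][j] for i in range(n_rows)] for j in range(n_cols)]
    remaining = list(range(n_cols))
    ranking, norms = [], []
    while remaining:
        j_best = max(remaining, key=lambda j: column_norm(cols[j]))
        norm = column_norm(cols[j_best])
        ranking.append(j_best)
        norms.append(norm)
        remaining.remove(j_best)
        if norm > 0.0:
            q = [v / norm for v in cols[j_best]]
            for j in remaining:
                proj = sum(qi * vi for qi, vi in zip(q, cols[j]))
                cols[j] = [vi - proj * qi for vi, qi in zip(cols[j], q)]
    return ranking, norms

# Hypothetical scaled sensitivity matrix: the third column is almost a
# linear combination of the first two.
names = ["theta_1", "theta_2", "theta_3"]
S = [[3.0, 0.0, 1.5],
     [0.0, 2.0, 1.0],
     [0.0, 0.0, 0.01]]

ranking, norms = qrp_ranking(S)
r_eps = 2                                   # hypothetical numerical rank
identifiable = [names[j] for j in ranking[:r_eps]]
```

The third parameter ends up last because, once the directions of the first two columns are removed, almost nothing of its sensitivity remains. This is the same argument used above for ka.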

[Figure 5.4: stacked bars of the variance-decomposition proportion (0% to 100%) for each parameter, showing the contributions of the singular values ς1 to ς8 (well-conditioned and ill-conditioned) and the threshold πmax. Parameters and identifiability positions: Ds,a (1), Ds,c (7), D (5), ka (8), kc (6), p (2), Rf (4), t+ (3).]

Figure 5.4.: Case 1: Variance decomposition for SVD identifiability method of Section 4.4.2. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
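The variance decomposition behind Figure 5.4 can be sketched as follows. The formula is an assumption here, since Section 4.4.2 is not reproduced in this chapter: it is the standard decomposition in which the variance of parameter j is proportional to the sum over i of v_ji²/ς_i², where V holds the right singular vectors of S. The two-parameter V and the singular values are hypothetical.

```python
import math

def variance_proportions(V, singular_values):
    """Share of each parameter's variance attributed to each singular value.

    V[j][i] is the component of parameter j in the i-th right singular vector.
    """
    props = []
    for row in V:
        terms = [(v / s) ** 2 for v, s in zip(row, singular_values)]
        total = sum(terms)
        props.append([t / total for t in terms])
    return props

def identifiable(props, ill_conditioned, pi_max=0.5):
    """A parameter passes if the ill-conditioned share does not exceed pi_max."""
    flags = []
    for row in props:
        share = sum(row[i] for i in ill_conditioned)
        flags.append(share <= pi_max)
    return flags

# Hypothetical example: V is a rotation by 5 degrees, so it is orthogonal.
angle = math.radians(5.0)
V = [[math.cos(angle), -math.sin(angle)],
     [math.sin(angle),  math.cos(angle)]]
singular_values = [10.0, 0.5]          # the second one plays the ill-conditioned role

props = variance_proportions(V, singular_values)
flags = identifiable(props, ill_conditioned=[1])
```

Even the better-aligned first parameter receives roughly three quarters of its variance from the small singular value in this example, so neither parameter passes the 50% threshold. This mirrors the situation reported for Ds,a in Case 1.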

Several conclusions can be drawn from the previous results. Firstly, the three methods indicate that the whole constant parameter vector cannot be identified using one discharge curve. This is because the discharge signal u = I1 does not excite the system sufficiently, which manifests itself as ill-conditioning of the sensitivity matrix S. This is reinforced by Figure 5.5, where the sensitivity time profiles of the cell voltage Vcell with respect to the different parameters are presented. As can be seen, the measured variable Vcell is not significantly sensitive to several parameters (several time profiles are flat and close to zero). Thus, insensitive parameters (e.g., ka, Ds,c and kc) are found which can take any value in a broad range without affecting the output. It is important to highlight that the estimation of a set of constant parameters is considered here; however, the same analysis may be applied to the estimation of time-varying parameters, for instance to analyze capacity fade [104]. In that case, the same methods could be used in each cycle to address similar issues associated with their estimation.



[Figure 5.5: eight panels showing the sensitivities dVcell/dDs,a, dVcell/dDs,c, dVcell/dD, dVcell/dka, dVcell/dkc, dVcell/dp, dVcell/dRf and dVcell/dt+ versus time (s) on a logarithmic time axis.]

Figure 5.5.: Sensitivity time profiles of cell voltage with respect to parameters dVcell(t)/dθ at nominal discharge rate I1 for Case 1. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Secondly, it can also be concluded that parameter variability and identifiability issues are closely related to the ill-conditioning of the sensitivity matrix S. Accordingly, any improvement in the conditioning of S will have a beneficial impact on parameter variances, confidence intervals and identifiability. This can be achieved by providing alternative discharge signals that excite the system more effectively.

Monte Carlo Method

The sensitivity method provides several indications of poor identifiability. In this section, the Monte Carlo method described in Section 4.2.2 is employed to validate these observations. To obtain the data sets Y mj, the measurement variance σy² = 1 × 10⁻⁴ and L = 200 replications are used. The results are summarized in Table 5.8, which presents the empirical mean E[Θ] of Eq. 4.5 and the parameter variances σ²θkk, given by the diagonal elements of the approximate matrix C. The accuracy in terms of the squared bias β(Θ)² and the MSE is also displayed. The marginal probability density functions (pdfs) of the parameters are shown in Figure 5.6.
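The Monte Carlo procedure can be sketched on a toy problem. The one-parameter linear model and its closed-form least-squares estimator are hypothetical substitutes for the PDAE model and the nonlinear parameter estimation; only the noise variance (1 × 10⁻⁴), the number of replications (L = 200) and the reported quantities (mean, variance, squared bias, MSE) follow the text.

```python
import random
import statistics

def model(t, theta):
    # Hypothetical linear response; NOT the Li-Ion PDAE model.
    return theta * t

def estimate(times, y_meas):
    """Closed-form least-squares estimate for the model y = theta * t."""
    return sum(t * y for t, y in zip(times, y_meas)) / sum(t * t for t in times)

def monte_carlo(theta_true, times, sigma_y, n_rep, seed=1):
    """Re-estimate theta on n_rep noisy data sets and summarize the estimator."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rep):
        y_meas = [model(t, theta_true) + rng.gauss(0.0, sigma_y) for t in times]
        estimates.append(estimate(times, y_meas))
    mean = statistics.mean(estimates)
    variance = statistics.variance(estimates)       # sample variance
    sq_bias = (mean - theta_true) ** 2
    return {"mean": mean, "variance": variance,
            "sq_bias": sq_bias, "mse": variance + sq_bias}

times = [k / 100 for k in range(1, 101)]             # 100 equally spaced points
result = monte_carlo(theta_true=0.667, times=times, sigma_y=1e-2, n_rep=200)
```

For this well-conditioned toy problem the empirical variance is close to the linear prediction σy² divided by the sum of t². In Case 1 the two disagree by orders of magnitude, which is what the comparison between Tables 5.6 and 5.8 exposes.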

Estimator Analysis. From Table 5.8 it can be observed that the variances do not match those obtained with the sensitivity method presented in Table 5.6. This provides evidence that the variances obtained from the Fisher-information matrix are poorly approximated. From Table 5.8 it can also be seen that parameters Ds,a, Ds,c, D, ka and kc have large variances while parameters p, Rf and t+ have small ones. This becomes more evident when observing the relative confidence interval lengths presented in Table 5.7. According to the Monte Carlo method, the most precise parameters are the film resistance at the anode Rf (σ²Rf = 5.55 × 10⁻²) and the Bruggeman coefficient p (σ²p = 5.77 × 10⁻²). The relative lengths of the confidence intervals for Rf and p, however, are quite large (140% and 143%, respectively). The most uncertain parameter is ka with a variance of σ²ka = 5.42 × 10⁶ and a relative confidence interval length of 1.4 × 10⁶%.

Table 5.8.: Case 1: Summary of results for Monte Carlo. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Columns: Parameter θ; True θ∗; Mean θ; Estimator Performance: Variance σ²θ (Precision), Bias β(Θ)² and MSE (Accuracy); Identifiability Analysis (Parameter Ranking): Variance Method. A star marks the parameters selected as identifiable.

Parameter   True θ∗      Mean θ       Variance σ²θ   Bias β(Θ)²    MSE          Variance Method
Ds,a        6.67×10⁻¹    2.14×10⁰     1.01×10²       2.16×10⁰      1.03×10²     (4)
Ds,c        6.67×10⁻¹    2.24×10²     2.12×10⁶       5.00×10⁴      2.17×10⁶     (7)
D           6.67×10⁻¹    1.24×10¹     1.98×10⁴       1.38×10²      1.99×10⁴     (5)
ka          6.67×10⁻¹    6.91×10²     5.42×10⁶       4.76×10⁵      4.40×10⁶     (8)
kc          6.67×10⁻¹    1.04×10²     1.13×10⁶       1.07×10⁴      1.14×10⁶     (6)
p           6.67×10⁻¹    6.97×10⁻¹    5.77×10⁻²      9.48×10⁻⁴     5.86×10⁻²    (2)∗
Rf          6.67×10⁻¹    6.30×10⁻¹    5.55×10⁻²      1.37×10⁻³     5.69×10⁻²    (1)∗
t+          1.00×10⁰     8.92×10⁻¹    3.74×10⁻¹      1.17×10⁻²     3.86×10⁻¹    (3)

Performance metric: Variance 8.68×10⁶, Bias 5.37×10⁵, MSE 9.22×10⁶.
Identifiable Subset Dimension: 2.

In terms of estimator accuracy, it is observed that the mean E[Θ] (Eq. 4.5) presents a

deviation from θ∗ for parameters Ds,a, Ds,c, D and kc larger than that exhibited by the

parameter estimate θ obtained with the sensitivity method and presented in Table 5.6.

This is a reflection of the instability of the parameters. Parameter ka is the most biased

parameter obtained by Monte Carlo. The most precise parameters p, Rf and t+ are also

the least biased parameters. This ranking is very close to that obtained with the variance,

SVD, and QRP methods of the sensitivity setting.

From Figure 5.6 it is clearly seen that only parameters p, Rf and t+ are identifiable and that their distributions are close to normal. These results confirm that the parameter estimator obtained in Case 1 is unstable.

Identifiability Analysis. The variance method of Section 4.4.2 is now applied to analyze identifiability under Monte Carlo. Under the variance threshold ρ = 1.5 × 10⁻¹, two identifiable parameters, Rf and p, are found, which means an identifiable parameter subset dimension equal to two, as indicated in Table 5.8. From this table it is also seen that the most identifiable parameter is Rf with variance σ²Rf = 5.55 × 10⁻² and the least identifiable parameter is ka with variance σ²ka = 5.42 × 10⁶. This is partially consistent with the identifiability results of the sensitivity method.

5.3.2. Case 2: Multiple Discharge Curves.

From Case 1 it can be concluded that only a very small parameter subset is identifiable. In particular, our analysis indicates that parameters Rf, p and t+ are the only identifiable parameters. Their variances, however, are large. The effect of incorporating additional discharge curves is now analyzed. New curves obtained at different discharge rates (I2 = 2I1, I3 = 3I1, I4 = 4I1, I5 = 0.5I1 and I6 = 0.1I1) are progressively added. The first scenario (denoted as SC1) uses signal u1 = I1 and data Y m1 = V¹cell(tk) (this corresponds to Case 1). Scenario SC2 uses the signal vector u2 = (I1, I2) and data Y m2 = (V¹cell(tk), V²cell(tk)), and so on until scenario SC6 with signal vector u6 = (I1, I2, …, I6) and data Y m6 = (V¹cell(tk), V²cell(tk), …, V⁶cell(tk)).

Figure 5.6.: Case 1: Marginal pdfs obtained from Monte Carlo. Solid-black lines and filled regions represent the normal and the non-parametric distributions of each estimator, respectively. Parameters with a star are nominated as identifiable. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Sensitivity Method.

For each scenario SC1, …, SC6 an estimation is conducted in order to obtain the estimates θξ, the model prediction vector Yξ and its corresponding scaled sensitivity matrix Sξ. The estimator performance is then analyzed for each scenario based on its average estimated covariance matrix Cξ, confidence intervals, and bias β(θξ).
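The effect of adding experiments can be illustrated by stacking sensitivity matrices row-wise, which is one natural way to build Sξ from the individual experiments (an assumption here, since the construction is defined in Chapter 4). Stacking rows adds a positive semidefinite term to SᵀS, so no singular value can decrease. The two-parameter matrices below are hypothetical, and the closed-form expression is valid only for two columns.

```python
import math

def information_matrix(S):
    """Entries (a, b, d) of the symmetric 2x2 matrix S^T S for a two-column S."""
    a = sum(r[0] * r[0] for r in S)
    b = sum(r[0] * r[1] for r in S)
    d = sum(r[1] * r[1] for r in S)
    return a, b, d

def singular_values_2col(S):
    """Singular values of a two-column matrix via the eigenvalues of S^T S."""
    a, b, d = information_matrix(S)
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return math.sqrt(mean + radius), math.sqrt(max(mean - radius, 0.0))

# Hypothetical scaled sensitivities (rows: time points, columns: two parameters).
S_exp1 = [[1.0, 0.99], [2.0, 1.98], [3.0, 3.00]]   # nearly collinear columns
S_exp2 = [[1.0, -0.5], [0.5, 1.5], [0.2, 2.0]]     # a differently shaped response

s_max1, s_min1 = singular_values_2col(S_exp1)             # one experiment
s_max2, s_min2 = singular_values_2col(S_exp1 + S_exp2)    # rows of both experiments

kappa1 = s_max1 / s_min1
kappa2 = s_max2 / s_min2
```

In this toy example the second experiment raises the smallest singular value by two orders of magnitude and reduces the condition number accordingly. An experiment whose sensitivities are nearly proportional to those already available would lift the spectrum far less.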

Estimator Analysis. Table 5.7 shows the confidence interval lengths in relative terms. It is clearly seen that the addition of experimental information reduces the confidence interval lengths. The main reduction is observed when four experiments (SC4) are used. Interestingly, considering two additional experiments (SC6) only provides a slight improvement over SC4. Moreover, despite the reduction in parameter variance over Case 1, the parameters still have large uncertainties. For instance, for scenario SC6, D, Ds,c and kc have interval lengths of 216%, 365% and 902%, respectively.



Considering the large parameter variability, it can be concluded that information from discharge curves does not seem sufficient to completely identify the parameter vector. These results also seem to indicate that the slow discharge rates I5 and I6 are not informative. This last observation is corroborated by analyzing the sensitivity profiles for different rates. Figure 5.7 presents the sensitivity profiles of the cell voltage Vcell with respect to the parameters for the discharge rates I1 (standard discharge), I4 (fast discharge), and I6 (slow discharge). Therein it is observed that the slow discharge rate I6 provides significantly less excitation than I1 and I4. In addition, it is interesting to observe that the fast discharge rate I4 induces richer dynamic behavior (this is particularly evident from the profiles of the Bruggeman coefficient and of the diffusion coefficients).

[Figure 5.7: eight panels showing the sensitivities dVcell/dDs,a, dVcell/dDs,c, dVcell/dD, dVcell/dka, dVcell/dkc, dVcell/dp, dVcell/dRf and dVcell/dt+ versus time (s) on a logarithmic time axis, each with curves for I6, I1 and I4.]

Figure 5.7.: Sensitivity time profiles of cell voltage with respect to parameters dVcell(t)/dθ at slow I6, nominal I1, and fast I4 discharge rates for scenario Case 2-SC6. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Ill-Conditioning Analysis. The changes in ill-conditioning as discharge curves are added are now evaluated. The results are presented in the right panel of Figure 5.3, where the lift in the singular value spectrum with each scenario can be observed. This lift is accompanied by a reduction in the spectrum slope (related to the condition number κ) and an increase in the smallest singular value ς8 (related to the collinearity index γ). In particular, the condition numbers for SC4 and SC6 are 6.668 × 10² and 6.024 × 10², respectively, whereas the collinearity indexes are 1.564 × 10⁰ and 1.395 × 10⁰. The spectra and the ill-conditioning metrics (κ and γ) demonstrate a significant improvement in conditioning from SC1 to SC4 but only a slight improvement of SC6 with respect to SC4. In Figure 5.3 it is also seen that the number of well-conditioned singular values is smaller than the length of the parameter vector. In other words, matrix S has a rank of only five (rϵ = 5).



Identifiability Analysis. Table 5.9 presents the subset dimension of the identifiable parameters after applying the three methods of Section 4.4.2. Here, the same thresholds as in Section 5.3.1 for Case 1 are used for each identifiability method. The variance method selects two and six parameters as identifiable for scenarios SC2 and SC6, respectively. This is the result of the progressive reduction of the parameter variance. For SC4 and SC6 the identifiable parameters are Ds,a, kc, p, Rf and t+. Parameters Ds,c and ka remain unidentifiable in all scenarios. The QR method selects only five parameters as identifiable for scenario SC6, while the SVD method selects only one.

Table 5.9.: Case 2: Summary of results for Sensitivity and Monte Carlo methods. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Columns: SCξ; New Iξ; Input action uξ; Experimental data Y mξ; Identifiable Subset Dimension for the Sensitivity Method (Variance Method, SVD Method, QR Method) and for Monte Carlo (Variance Method).

SCξ   New Iξ          Input action uξ            Experimental data Y mξ                              Variance   SVD   QR   Monte Carlo Variance
1     1.0 C = 17.5    [I1]                       [V¹cell]                                            1          0     3    2
2     2.0 C = 35.0    [I1, I2]                   [V¹cell, V²cell]                                    2          1     4    3
3     3.0 C = 52.5    [I1, I2, I3]               [V¹cell, V²cell, V³cell]                            3          1     5    4
4     4.0 C = 70.0    [I1, I2, I3, I4]           [V¹cell, V²cell, V³cell, V⁴cell]                    5          1     5    5
5     0.5 C = 8.75    [I1, I2, I3, I4, I5]       [V¹cell, V²cell, V³cell, V⁴cell, V⁵cell]            6          1     5    5
6     0.1 C = 1.75    [I1, I2, I3, I4, I5, I6]   [V¹cell, V²cell, V³cell, V⁴cell, V⁵cell, V⁶cell]    6          1     5    5

Monte Carlo Method.

The variances obtained with Monte Carlo again differ by several orders of magnitude from those estimated with the sensitivity method. This is illustrated in Table 5.7. It can also be observed that, because conditioning improves as experimental information is added, the qualitative behavior of both methods becomes similar. From Table 5.7 it is confirmed that the most precise parameters predicted by Monte Carlo are p, Rf and t+, and a total of five parameters are considered identifiable for SC4, SC5, and SC6. This is consistent with the variance and QRP methods under the sensitivity setting. This again suggests that the sensitivity method can qualitatively diagnose variance behavior. Parameters Ds,c, ka and kc retain large variances and are unidentifiable.

Figure 5.8 presents the marginal probability density functions (pdfs) for scenarios SC4 and SC6. By comparing these pdfs with the pdfs of SC1 (Case 1) in Figure 5.6, an improvement in the stability of the estimates is clearly seen. From Figure 5.8 it is observed that there is no noticeable difference in the pdfs of scenarios SC4 and SC6 for the most precise parameters p, Rf and t+. This indicates that, even if the spectrum of singular values does not improve from SC4 to SC6 (as suggested by the sensitivity method), there is additional information provided by the discharge rates I5 and I6. This information, however, is still insufficient to determine the rest of the parameters (particularly ka, kc, and Ds,c), which still present large variances.



Figure 5.8.: Case 2: Marginal pdfs for parameters obtained with Monte Carlo analysis for scenarios Case 2-SC4 (left) and Case 2-SC6 (right). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

5.3.3. Case 3: Discharge curves and electrolyte concentration profile.

In Case 3 two observable outputs are used: the cell voltage Vcell(t) and the electrolyte concentration in the middle of the separator ce(x, t)|x=ℓa+ℓs/2. Scenarios SC1, SC2, SC3, and SC4 are considered here; they progressively add new experimental information, but this time each experiment measures both outputs. Accordingly, SC1 and SC4 are obtained by collecting the experiments at u1 = I1 and u4 = (I1, . . . , I4), respectively. For consistency, the same discharge rates as in Case 1 and Case 2 are used. Scenarios SC5 and SC6 are not considered here because the inclusion of slow discharge rates has not been shown to improve identifiability. Moreover, these scenarios are computationally highly demanding.

Figure 5.9 displays the model fit for both variables after solving the parameter estimation problem for SC1. The nonlinear response of the concentration profile should be highlighted. Figure 5.10 shows the singular value spectra for all considered scenarios. A considerable lift in the spectrum is found even when only one experiment is used (Case 3-SC1 compared to Case 1). The form of the new spectrum defines five singular values as well-conditioned, compared to three in Case 1. A small singular value of ς8 = 9.6113 × 10^−9, however, is still observed for Case 3-SC1. This small singular value makes the condition number large (κ = 6.2796 × 10^10) and the collinearity index large as well (γ = 1.0404 × 10^8). When more experiments are added (Case 3-SC2 to Case 3-SC4), the conditioning improves, with an extra lift in the spectra and larger values for ς8. In Case 3-SC4, seven well-conditioned singular values (compared to the five in Case 2-SC4)



[Plot: electrolyte concentration (mol/m3) and cell voltage (V) versus time (s).]

Figure 5.9.: Case 3: Voltage and electrolyte concentration profile at separator for scenario Case 3-SC1. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

are observed. The effect of adding electrolyte concentration information is thus highly

beneficial from an ill-conditioning stand-point.
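The three conditioning indicators quoted above for Case 3-SC1 can be cross-checked against each other. The short sketch below assumes the usual definitions κ = ς_max/ς_min and γ = 1/ς_min, which are not restated in this section; under these assumptions the reported collinearity index follows directly from ς8, and the condition number implies a largest singular value of roughly 6 × 10^2.

```python
# Values reported in the text for Case 3-SC1.
sigma_min = 9.6113e-9        # smallest singular value (sigma_8)
kappa = 6.2796e10            # condition number
gamma_reported = 1.0404e8    # collinearity index

# Assumed definitions (not restated in this section):
#   kappa = sigma_max / sigma_min
#   gamma = 1 / sigma_min
gamma_from_sigma = 1.0 / sigma_min
sigma_max_implied = kappa * sigma_min

print(f"gamma computed from sigma_min: {gamma_from_sigma:.4e}")
print(f"sigma_max implied by kappa:    {sigma_max_implied:.2f}")
```

If the assumed definitions hold, the agreement between the computed and the reported collinearity index indicates that both indicators are driven by the single smallest singular value.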

[Plot: singular values ς1 to ς8 on a logarithmic vertical axis for Case 1-SC1, Case 3-SC1, Case 3-SC2, Case 3-SC3 and Case 3-SC4.]

Figure 5.10.: Case 3: Spectrum of singular values under different scenarios. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Figures 5.11 and 5.12 further illustrate that the improvement in the ill-conditioning is associated with the information supplied by the new observable variable. Comparing the figures shows that the cell voltage is excited in similar ways for currents I1 and I4, while this is not the case for the electrolyte concentration. In particular, the electrolyte concentration presents rich and different dynamic responses at fast and slow rates, which aids the identification of parameters. From Figure 5.12 it is also observed that the electrolyte concentration at the separator core ce(ℓa + ℓs/2, t) is highly excited by parameters D, p and t+. Parameters Ds,c and kc also provide more excitation than in Cases 1 and 2 (recall that Ds,c and kc remain unidentifiable in Case 2).



[Plot: eight panels showing dVcell/dDs,a, dVcell/dDs,c, dVcell/dD, dVcell/dka, dVcell/dkc, dVcell/dp, dVcell/dRf and dVcell/dt+ versus time (s, logarithmic axis), each with curves for I1 and I4.]

Figure 5.11.: Sensitivity time profiles of cell voltage with respect to parameters dVcell(t)/dθ at nominal I1 and fast I4 discharge rates for scenario Case 3-SC4. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

[Plot: eight panels showing dce/dDs,a, dce/dDs,c, dce/dD, dce/dka, dce/dkc, dce/dp, dce/dRf and dce/dt+ versus time (s, logarithmic axis), each with curves for I1 and I4.]

Figure 5.12.: Sensitivity time profiles of electrolyte concentration in the separator with respect to parameters dce(ℓa + ℓs/2, t)/dθ at nominal I1 and fast I4 discharge rates for scenario Case 3-SC4. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).



The observations for Case 3 are further validated by using Monte Carlo. Confidence interval lengths are presented in Table 5.7, and the pdfs of each parameter for scenarios SC1 and SC4 are depicted in Figure 5.13. An important reduction in the confidence intervals is observed compared to Case 2, which is also presented in Table 5.7. In particular, parameters Ds,c and kc have large variances and are unidentifiable in Case 2-SC4, while they become identifiable in Case 3-SC4. These results confirm the observations obtained from the sensitivity analysis presented in Figures 5.11 and 5.12. It can also be concluded that voltage and electrolyte concentration at a single location are sufficient to reliably identify seven of the eight parameters (about 90%).

An intriguing finding of this study is that the anode reaction constant ka remains highly unstable despite the addition of electrolyte concentration information. This is particularly evident from the marginal pdfs presented in Figure 5.13. It is also confirmed by the spectrum analysis, which gives only seven well-conditioned singular values. From the sensitivity profiles it can be seen that this parameter indeed excites the output variables. This, however, does not seem sufficient to reliably estimate the parameter. This situation can be explained by the observations made by the authors of Ref. 103, who note that very strong variations (orders of magnitude) of the anode reaction constant are needed to have an impact on the voltage curve. Our results indicate that, at its nominal value, this parameter has little influence on the discharge curve. It is possible, however, that this insensitivity disappears around another nominal point. This issue will be explored in future work.

Finally, it should be noted that any change in the number, type or quality of experimental information could be analyzed by using the techniques outlined in this study. In this way, experimental data can be selected with the aim of increasing the number of identifiable parameters. That is, a change in experimental information should be taken into consideration (by experimenters and modelers) to increase the quality of the model parameters only if it gives evidence of improvements in ill-conditioning and identifiability.

5.3.4. Computational Issues

The PDAE system is implemented in MATLAB and discretized with the method of lines according to Ref. 85, using 11 points in the axial direction for each region and 11 points in the radial direction. The discretization routines dss010 and dss044 are used in MATLAB Release 2013a (The MathWorks Inc., Natick, Massachusetts, United States). The routine dss010 computes a tenth-order finite difference approximation of a first-order derivative, whereas the routine dss044 computes a fourth-order approximation of a second-order derivative. In contrast to the solution reported in Ref. 85, the resulting DAE system containing 561 equations is here solved by using the integrator IDAS from SUNDIALS, which provides forward and adjoint sensitivity analysis capabilities [55].



Figure 5.13.: Case 3: Marginal pdfs for scenarios Case 3-SC1 (left) and Case 3-SC4 (right). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

The parameter estimation problems are solved using the nonlinear least-squares routine lsqnonlin of MATLAB (trust-region-reflective algorithm). All results were obtained on an Intel(R) Core(TM) i7-4770K CPU running at 3.50 GHz with 32.0 GB of available RAM.
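To illustrate the kind of spatial approximation used in the method of lines, the following Python sketch applies the standard five-point, fourth-order central-difference stencil for a second derivative on a uniform grid of 11 points. It is not the dss044 routine itself: the function name, the test profile and the restriction to interior points are assumptions made for illustration, and library routines typically add one-sided stencils at the boundaries.

```python
def second_derivative_4th_order(f, h):
    """Fourth-order central-difference estimate of f'' at interior grid points.

    f : list of samples on a uniform grid with spacing h
    Returns values for indices 2 .. len(f) - 3; the two points next to each
    boundary are skipped in this sketch.
    """
    n = len(f)
    if n < 5:
        raise ValueError("need at least 5 grid points")
    out = []
    for i in range(2, n - 2):
        stencil = (-f[i - 2] + 16.0 * f[i - 1] - 30.0 * f[i]
                   + 16.0 * f[i + 1] - f[i + 2])
        out.append(stencil / (12.0 * h * h))
    return out


# 11 uniformly spaced points on [0, 1] (the unit interval is an assumption).
n_points = 11
h = 1.0 / (n_points - 1)
x = [i * h for i in range(n_points)]

# Hypothetical test profile u = x^4 with exact second derivative 12 x^2.
u = [xi ** 4 for xi in x]
d2u = second_derivative_4th_order(u, h)
exact = [12.0 * xi ** 2 for xi in x[2:-2]]
max_error = max(abs(a - b) for a, b in zip(d2u, exact))
print(f"maximum interior error: {max_error:.2e}")
```

The stencil is exact for polynomials up to degree five, so the error printed for this test profile reflects floating-point rounding only.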

Table 5.10.: Computational results for simulation and parameter estimation problems. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Simulation: Discharge Rate | Simulation: Time [seconds] | Estimation: Instance | Estimation: Time [seconds] | Estimation: Iterations
1.0 C | 6.1 | Case 2-SC1 | 476.6 | 73
2.0 C | 5.4 | Case 2-SC2 | 711.8 | 54
3.0 C | 4.3 | Case 2-SC3 | 520.6 | 31
4.0 C | 3.6 | Case 2-SC4 | 343.6 | 18
0.5 C | 6.6 | Case 2-SC6 | 3751.8 | 32
0.1 C | 48.1 | Case 3-SC1 | 388.8 | 70
 | | Case 3-SC2 | 664.0 | 57
 | | Case 3-SC4 | 915.2 | 47

The computational results are summarized in Table 5.10. The average simulation time (including the time to compute the consistent initial solution) was in the range of 3-7 seconds, except for the slowest discharge rate (0.1 C), which required 48 seconds. This is because the slow discharge rate requires a long integration time to reach the cut-off voltage. The number of iterations required to solve the parameter estimation problems decreases as experimental information is added, because the problem becomes better conditioned and the optimization algorithm can more easily identify a solution. The only exception is Case 2-SC6, which requires more iterations than Case 2-SC4 (32 versus 18). This can be attributed to the lack of information introduced by scenario SC6 at slow discharge rates. For the largest common scenario, SC4, the solution time for Case 3 (915.2 seconds) is longer than that for Case 2 (343.6 seconds) because Case 3 requires the computation of additional sensitivity information. A Monte Carlo procedure with 200 replications for Case 3-SC4 currently requires around two days to complete. These replications, however, can be performed in parallel, which could potentially reduce the time to about fifteen minutes. In addition, simulations for estimation problems with multiple experiments can also be performed in parallel; for Case 3-SC4 this could reduce the time to about four minutes. Parallel parameter estimation approaches have been proposed in Refs. 37 and 135. The use of reduced models, such as those proposed in Ref. 103, will also be investigated.
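The parallel run times quoted above are consistent with a simple back-of-the-envelope estimate. The sketch below assumes ideal scaling, with one worker per Monte Carlo replication and one worker per experiment, and ignores communication overhead; the variable names are illustrative only.

```python
# Figures quoted in the text and in Table 5.10.
mc_total_days = 2.0       # sequential Monte Carlo run for Case 3-SC4
mc_replications = 200
estimation_time_s = 915.2 # estimation time for Case 3-SC4
n_experiments = 4         # SC4 combines the experiments at I1, ..., I4

# Assumption: perfect scaling, one worker per replication / per experiment.
mc_parallel_min = mc_total_days * 24 * 60 / mc_replications
estimation_parallel_min = estimation_time_s / n_experiments / 60

print(f"Monte Carlo, fully parallel: {mc_parallel_min:.1f} min")
print(f"Estimation, fully parallel:  {estimation_parallel_min:.1f} min")
```

Under these idealized assumptions the estimates are close to the fifteen and four minutes stated in the text; real speed-ups would be lower because the slowest replication or experiment determines the wall-clock time.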

5.4. Conclusions and Future Work

The computational framework in Chapter 4, which includes sensitivity and Monte Carlo methods to evaluate the quality of parameter estimates, detect ill-conditioning issues and diagnose identifiability problems, was applied to the lithium battery model. It was demonstrated that sensitivity methods could qualitatively detect unidentifiable parameters using only structural information of the sensitivity matrix. This provides an advantage over the more rigorous but also more expensive Monte Carlo method. The framework could easily be used to determine whether other sources of experimental information indeed improve the conditioning and identifiability of any battery model.

In this case study the analysis indicated that cell voltage profile information collected

from constant discharge experiments only enabled the estimation of a small parameter

subset. The incorporation of electrolyte concentration profiles at a single axial point

was sufficient to estimate seven of eight parameters: the Li+ diffusion coefficient in the

solid particle of anode, the Li+ diffusion coefficient in the solid particle of cathode, the

salt diffusion coefficient in the electrolyte, the reaction rate constant in the cathode, the

Bruggeman coefficient, the film resistance at the anode, and the transport number. The

only unidentifiable parameter was the reaction rate constant in the anode.

As part of future work, the framework will be used to investigate the impact of other sources of experimental information (including type of measurements, measurement error, sampling frequency, etc.) on identifiability and ill-conditioning. It is highlighted that the methods described here might also be used to address similar issues associated with the estimation of not only constant but also time-varying parameters. Moreover, operating strategies other than constant discharge current, such as constant power mode, will also be analyzed. An implementation tailored to high-performance computing architectures will be developed to accelerate analysis times. Finally, the benefit of regularizing the parameter estimation as a direct way to deal with the ill-conditioning of the sensitivity matrix will be evaluated.


6. Bioethanol: Identifying an over-parameterized model with large parameter correlations

6.1. Abstract

¹ Nonlinear models of biological reaction networks are usually over-parameterized, with large correlations among parameters, which complicates structure and parameter identification. Hence, the related inverse problems for parameter determination are mathematically ill-posed and numerically difficult to solve. This chapter is aimed at the parameterization and validation of a highly nonlinear process model for the Simultaneous Saccharification and Fermentation (SSF) process for producing ethanol from lignocellulosic waste materials. To do so, several components of the computational framework outlined in Chapter 4 are addressed: tracking of an adequate initial guess, iterative parameter estimation, ill-conditioning analysis, local identifiability analysis, implementation of the regularization technique called parameter subset selection (SsS), and estimator precision evaluation. Model selection and reduction are also carried out, although only at an incipient level. The ill-conditioning analysis and the selection of identifiable parameters are based on the analysis of the sensitivity matrix by rank-revealing factorization methods. In this way, the parameter search space is reduced to a reasonable subset that can be reliably and efficiently estimated from the available measurements. The successful application of the iterative regularized parameter estimation with ill-conditioning and identifiability analysis to the SSF process yields a relatively large reduction in the identified parameter space. It is shown by cross-validation that, using the practically identifiable parameters (despite the reduction of the search space), the model is still able to properly predict the experimental data. Moreover, it is shown that the model is easily and efficiently adapted to new process conditions by solving reduced and well-conditioned problems.

¹ The content of this chapter is reprinted (adapted) with permission from D. C. Lopez C., T. Barz, M. Penuela, A. Villegas, S. Ochoa, and G. Wozny. Model-based identifiable parameter determination applied to a simultaneous saccharification and fermentation process model for bio-ethanol production. Biotechnology Progress, 29(4):1064-1082, 2013. Copyright (2013) John Wiley and Sons. (Publication II in Appendix A.2)


6.2. Bio-ethanol from cane bagasse by SSF process

Bio-ethanol has been an excellent candidate for replacing traditional fossil fuels with alternative sources of liquid fuels. However, bio-ethanol made from sugar and starch sources, such as corn or sugar cane, has not been competitive with traditional petroleum-based fuels because of relatively high costs and the complicated situation regarding food security. Therefore, alternative raw materials, for instance (ligno)cellulosic waste materials such as crop residues, grass, wood chips, wheat straw or bagasse, which can act as an abundant and cheap source of fermentable sugars, have become increasingly interesting [26, 36, 76, 95].

Experimental data were taken from the SSF process using sugarcane bagasse as biomass

[95]. The SSF process combines enzymatic hydrolysis with ethanol fermentation to keep

the concentration of glucose low. Among the various cellulose bioconversion schemes

this process is considered to be the most promising regarding its efficiency [26, 35]. If

fermentation occurs simultaneously with saccharification, glucose produced during the

saccharification will be rapidly converted into ethanol, reducing the inhibition caused by

high sugar concentration, and therefore the hydrolysis rate can be accelerated [92]. In

comparison with the process where these two stages are sequential, the SSF method enables

attainment of higher (up to 40%) yields of ethanol by removing end-product inhibition, as

well as by eliminating the need for separate reactors for saccharification and fermentation

[76].

Other advantages of the SSF process are an easier operation, a shorter fermentation time, a reduced risk of contamination with undesired microorganisms (due to the high temperature of the process, the presence of ethanol in the reaction medium, and the anaerobic conditions), and a lower equipment requirement than the sequential process, since no hydrolysis reactors are needed [76, 23]. In spite of these clear advantages, the drawback of SSF is that the optimal conditions for hydrolysis and fermentation differ, which makes the control and optimization of process parameters difficult [23].

Besides, ethanol itself and some toxic substances arising from the pretreatment of the lignocellulose inhibit the action of the fermenting microorganisms as well as the cellulase activity [76]. In addition, some compounds (e.g., proteolytic enzymes) that are released on cell lysis or secreted by a particular strain can degrade the cellulase, affecting the microorganism-enzyme compatibility. On the whole, several process parameters must be optimized: substrate concentration, enzyme-to-substrate ratio, dosage of the active components (β-glucosidase) in the enzymatic mixture, and yeast concentration [76].

In the experimental case study, the parameters of a model for bio-ethanol production from sugarcane bagasse in an SSF process were determined.

6.2.1. Experiments

Table 6.1 shows the five experiments E1, E2, E3, E4, and E5 considered in this case study. Experiments E1, E2, E3 and E4 were reported in Ref. 95 and carried out in the Bioprocess Development Laboratory of the School of Chemistry of the Federal University of Rio de Janeiro. The experimental conditions of E1-E4 were defined in Ref. 95 by using the analysis of variance (ANOVA) method in factorial designs (e.g. 2^3 and 3^4 designs) and subsequent optimizations with response surface analysis. Those experimental designs were intended a) to reveal the significant factors in the enzymatic hydrolysis (analyzed factors: temperature, pH, enzymatic load and cellulignin content), and b) to investigate the responses of the SSF process under manipulation of the corresponding three main factors (i.e. cellulignin content, pre-hydrolysis time and initial yeast). Experiment E5 was carried out in the Bioprocesses Laboratory of the Engineering Faculty of the University of Antioquia, Colombia; this experiment was designed using the optimal experimental conditions identified in Ref. 95 for cellulignin content and pre-hydrolysis time, but changing the values of all remaining input factors of the SSF process in Table 6.1.

Table 6.1.: Experimental Conditions for Bio-Ethanol Production from Sugarcane Bagasse in the SSF Process. (Table taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

No. | Experiment | Cellulignin dry solid content (w/w) | Pre-hydrolysis time (h) | Enzymatic load*: GC-220 (FPU/g) | Enzymatic load*: β-glucosidase (IU/g) | Initial yeast, CX0 (g/L) | Experiment duration (h)
1 | E1 | 30% | 12 | 26 | 17 | 6 | 33
2 | E2 | 20% | 8 | 26 | 27.3 | 6 | 30
3 | E3 | 20% | 8 | 26 | - | 6 | 33
4 | E4 | 20% | 12 | 30 | - | 2 | 33
5 | E5 | 30% | 12 | 35 | - | 3 | 33

* With respect to cellulignin dry mass (g)

Each experiment was repeated twice and a very good repeatability was obtained for

all experiments. Experimental data E2 and E3 (E2 & E3) were used in the parameter

estimation, and experiments E1, E4, and E5 were considered for the cross-validation of

the identified model.

The common characteristics of all five experiments are listed below:

1. The lignocellulosic material (cellulignin) was a pretreated residue from sugarcane (i.e., sugarcane bagasse). The cellulignin was obtained by acid hydrolysis of sugarcane bagasse, from which the hemicellulosic fraction was removed. The resulting solid residue was pretreated to increase the accessibility of the cellulose to enzymes by partial removal of the lignin. The pretreatment of the lignocellulosic biomass and the measurements of enzymatic load were conducted as described in Ref. 96.

2. The initial suspension contained dry weight solid composed of the lignocellulosic

material considering a content of cellulose of 67% (w/w).



3. Before the SSF process, an enzymatic pre-hydrolysis at 50 °C was carried out to allow for the buildup of fermentable glucose.

4. Commercial enzymatic preparations of GC 220 (Genencor) and β-glucosidase (activities of 104 FPU/mL and 439 IU/mL, respectively) were used; the protein concentration was 220 mg/mL for GC 220 and 109 mg/mL for β-glucosidase. The enzymatic load in Table 6.1 is given with respect to the cellulignin dry mass in grams.

5. After the pre-hydrolysis stage, microorganisms were added at 37 °C; the microorganism was a commercially available Saccharomyces cerevisiae (Fleischmann's yeast).

6. Glucose, cellobiose, and ethanol concentrations were determined by high-performance liquid chromatography (HPLC) using a Shodex SC1011 ion exchange column for sugars (300 mm × 8 mm; Shoko Co., Tokyo) at 80 °C as the stationary phase and degassed Milli-Q (Molsheim, France) water as the mobile phase at a flow rate of 0.6 mL/min [95]. The standard deviations of the cellobiose, glucose, and ethanol concentrations in the HPLC were 0.022 g/L, 0.016 g/L, and 0.020 g/L, respectively.

6.2.2. Modeling

The SSF process is described by a generic dynamic model taken from Ref. 35, which is a compilation of models of different complexity presented in Refs. 36, 99, 98, 97. For parameterizing these models, several experimental substrate-enzyme-microorganism systems have been used in the literature, depending on the goal of the study. For instance, the systems Avicel-Cellubrix²-Saccharomyces cerevisiae [35] and waste paper-Econase³-Saccharomyces cerevisiae [97] were used for parameterizing both the hydrolysis and the fermentation processes, whereas the systems Avicel-Cellubrix⁴-Saccharomyces cerevisiae and wheat straw-Cellubrix⁵-Saccharomyces cerevisiae were used only for the hydrolysis stage [36], and the system glucose-Brettanomyces custersii only for the fermentation stage [99].

The generic dynamic model considers the four most influential factors for the kinetics of the SSF process, namely, the cellulosic substrate concentration, the cellulase and β-glucosidase enzyme system, the substrate-enzyme interaction, and the enzyme-yeast interaction. The simplified reaction mechanisms are presented in Figure 6.1, where cellulose is simultaneously hydrolyzed to cellobiose (υ1, production rate of cellobiose from cellulose by cellulase) and glucose (υ3, production rate of glucose from cellulose by cellulase), cellobiose is then converted to glucose (υ2, production rate of glucose from cellobiose by β-glucosidase), and glucose is catabolized to ethanol, biomass, and carbon dioxide by the fermentative microorganism. Yeast growth and glucose consumption rates are expressed by υ4 (i.e., biomass production rate) and υ5 (i.e., substrate consumption rate), respectively. The hydrolysis of cellulose by cellulase is a reaction that takes place on the surface of the insoluble substrate (heterogeneous catalysis), whereas the hydrolysis of cellobiose by β-glucosidase is carried out in the aqueous phase (homogeneous catalysis).

² Enzyme complex of cellulase and β-glucosidase of Novozymes Corp., Denmark.
³ Enzyme complex of cellulase and β-glucosidase of Enzyme Development Corp., New York, NY.
⁴ Enzyme complex of cellulase and β-glucosidase of Novozymes Corp., Denmark.
⁵ Enzyme complex of cellulase and β-glucosidase of Novozymes Corp., Denmark.

Figure 6.1.: Simplified reaction mechanisms in SSF processes [36]. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

The dynamic behavior of the cellulose (Cc), cellobiose (Ccb), glucose (CG), biomass (CX), and ethanol (CEtOH) concentrations during a batch SSF process is described by the mass balances in Eqs. 6.1-6.5. The change in enzyme concentration (CE) [36] is presented in Eq. 6.6.

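\[ \frac{\partial C_c}{\partial t} = -\left[\upsilon_1 + \upsilon_3\right] \tag{6.1} \]

\[ \frac{\partial C_{cb}}{\partial t} = 1.056\,\upsilon_1 - \upsilon_2 \tag{6.2} \]

\[ \frac{\partial C_G}{\partial t} = 1.053\,\upsilon_2 - 1.111\,\upsilon_3 + \upsilon_5 \tag{6.3} \]

\[ \frac{\partial C_X}{\partial t} = \upsilon_4 \tag{6.4} \]

\[ C_{EtOH} = -0.511\left[1.111\,(C_c - C_{c0}) + 1.053\,(C_{cb} - C_{cb0}) + (C_G - C_{G0}) + (C_X - C_{X0})\right] \tag{6.5} \]

\[ \frac{\partial C_E}{\partial t} = -K_D C_E \tag{6.6} \]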
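As a small illustration of the algebraic ethanol balance in Eq. 6.5, the following Python sketch evaluates it for a hypothetical state. It also compares the numerical coefficients with mass ratios computed from standard molar masses; that the coefficients originate from these ratios is an assumption made here, since the source does not state their derivation. All function and variable names are illustrative.

```python
def ethanol_concentration(c_c, c_cb, c_g, c_x, c_c0, c_cb0, c_g0, c_x0):
    """Ethanol concentration (g/L) from the algebraic balance of Eq. 6.5."""
    return -0.511 * (1.111 * (c_c - c_c0)
                     + 1.053 * (c_cb - c_cb0)
                     + (c_g - c_g0)
                     + (c_x - c_x0))


# Standard molar masses in g/mol (not taken from the source).
M_GLUCOSE = 180.16
M_CELLOBIOSE = 342.30
M_ANHYDROGLUCOSE = 162.14  # one glucose unit within the cellulose chain
M_ETHANOL = 46.07

# Mass ratios assumed to underlie the coefficients 1.056, 1.053, 1.111, 0.511.
ratios = {
    "cellulose to cellobiose": M_CELLOBIOSE / (2 * M_ANHYDROGLUCOSE),
    "cellobiose to glucose": 2 * M_GLUCOSE / M_CELLOBIOSE,
    "cellulose to glucose": M_GLUCOSE / M_ANHYDROGLUCOSE,
    "glucose to ethanol": 2 * M_ETHANOL / M_GLUCOSE,
}

# Hypothetical state: 10 g/L of glucose consumed, everything else unchanged.
example = ethanol_concentration(100.0, 5.0, 20.0, 6.0,
                                100.0, 5.0, 30.0, 6.0)

for name, value in ratios.items():
    print(f"{name}: {value:.4f}")
print(f"ethanol for the example state: {example:.3f} g/L")
```

Writing the ethanol concentration as an algebraic relation rather than a differential equation means that the model combines differential and algebraic equations, which matches its later description as a DAE system.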

The kinetics of the hydrolysis model (υ1, υ2, and υ3, expressed in g L^−1 h^−1) are presented in Eqs. 6.7-6.9, in which kmax,i with i = 1, . . . , 3 is the maximum specific rate at full saturation of the substrate with enzyme in the respective reaction. The active amount of enzyme for reactions with cellulose as substrate (υ1 and υ3) is assumed to be determined by enzyme adsorption onto the cellulose substrate according to the principles of heterogeneous catalysis (Langmuir adsorption constant KL). Inhibition of cellulase and β-glucosidase by glucose is accounted for through K1,G and K2,G. For all three reactions υ1, υ2, and υ3 in Figure 6.1, the zero-order rate constant is given as a function of temperature (activation energy Ea); in addition, all enzyme activity is assumed to be subject to thermal inactivation (KD). Moreover, inhibition of cellulase by ethanol is assumed to affect the rates of reactions υ1 and υ3 through the inhibition constant K1,EtOH [35]. The thermal inactivation constant KD follows an Arrhenius-type relationship, $K_D(T) = A_D\, e^{-\Delta H / T}$. Reductions in glucose production caused by changes in the nature of the cellulose substrate during the SSF process are considered using a recalcitrance constant Krec [35].

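\[ \upsilon_1 = \left[k_{\max,1} \cdot \frac{C_E}{K_L + C_E}\right] \cdot C_c \left[\frac{K_{1,G}}{K_{1,G} + C_G}\right] \cdot \left[e^{-K_D(T)\cdot t}\right] \cdot \left[\frac{e^{-E_a/(RT)}}{e^{-E_a/(RT_{ref})}}\right] \cdot \left[\frac{K_{1,EtOH}}{K_{1,EtOH} + C_{EtOH}}\right] \cdot \left[e^{-K_{rec}\cdot\left(1 - \frac{C_c}{C_{c0}}\right)}\right] \tag{6.7} \]

\[ \upsilon_2 = \left[k_{\max,2} \cdot e_g \cdot e_T\right] \cdot \left[\frac{C_{cb}}{K_m\left(1 + \frac{C_G}{K_{2,G}}\right) + C_{cb}}\right] \cdot \left[e^{-K_D(T)\,t}\right] \cdot \left[\frac{e^{-E_a/(RT)}}{e^{-E_a/(RT_{ref})}}\right] \tag{6.8} \]

\[ \upsilon_3 = \left[k_{\max,3} \cdot \frac{C_E}{K_L + C_E}\right] \cdot C_c \left[\frac{K_{1,G}}{K_{1,G} + C_G}\right] \cdot \left[e^{-K_D(T)\cdot t}\right] \cdot \left[\frac{e^{-E_a/(RT)}}{e^{-E_a/(RT_{ref})}}\right] \cdot \left[\frac{K_{1,EtOH}}{K_{1,EtOH} + C_{EtOH}}\right] \cdot \left[e^{-K_{rec}\cdot\left(1 - \frac{C_c}{C_{c0}}\right)}\right] \tag{6.9} \]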
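The multiplicative structure of Eqs. 6.7 and 6.9 can be made explicit in code. The Python sketch below evaluates one cellulose hydrolysis rate as a product of its bracketed factors; the function name, the argument names, the use of R = 8.314 J mol^-1 K^-1 (which presumes Ea in J/mol) and all numerical values are assumptions for illustration, and the thermal inactivation constant K_D(T) is passed in as a precomputed number.

```python
import math

R_GAS = 8.314  # J mol^-1 K^-1; assumes the activation energy is given in J/mol


def cellulose_hydrolysis_rate(k_max, c_e, c_c, c_g, c_etoh, c_c0, t, temp,
                              k_l, k_1g, k_1etoh, k_rec, k_d, e_a, t_ref):
    """Rate v1 or v3 (Eqs. 6.7 and 6.9) as a product of its bracketed factors."""
    adsorption = k_max * c_e / (k_l + c_e)        # Langmuir-type saturation
    glucose_inhibition = k_1g / (k_1g + c_g)
    thermal_inactivation = math.exp(-k_d * t)     # k_d stands for K_D(T)
    temperature = (math.exp(-e_a / (R_GAS * temp))
                   / math.exp(-e_a / (R_GAS * t_ref)))
    ethanol_inhibition = k_1etoh / (k_1etoh + c_etoh)
    recalcitrance = math.exp(-k_rec * (1.0 - c_c / c_c0))
    return (adsorption * c_c * glucose_inhibition * thermal_inactivation
            * temperature * ethanol_inhibition * recalcitrance)


# Hypothetical parameter values, chosen only to exercise the function.
params = dict(k_max=0.1, k_l=1.0, k_1g=5.0, k_1etoh=40.0, k_rec=0.5,
              k_d=0.01, e_a=5.0e4, t_ref=310.15, c_c0=100.0)

# Reference state: t = 0, T = T_ref, no glucose, no ethanol, C_c = C_c0.
rate_reference = cellulose_hydrolysis_rate(
    c_e=1.0, c_c=100.0, c_g=0.0, c_etoh=0.0, t=0.0, temp=310.15, **params)

# Same state, but with glucose at C_G = K_1,G: the rate is halved.
rate_inhibited = cellulose_hydrolysis_rate(
    c_e=1.0, c_c=100.0, c_g=5.0, c_etoh=0.0, t=0.0, temp=310.15, **params)

print(rate_reference, rate_inhibited)
```

At the reference state all inhibition, inactivation and temperature factors equal one, so the rate reduces to the saturation term multiplied by the cellulose concentration. This makes it easy to see which factor is responsible for a given reduction in the rate.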

For modeling glucose consumption and biomass formation, standard Monod kinetics expanded to include ethanol inhibition on yeast [35] were assumed. Yeast growth rate (υ4) and substrate consumption rate (υ5) are described in Eqs. 6.10 and 6.11.

\[ \upsilon_4 = \frac{\mu_{\max} \cdot C_G}{K_G + C_G} \cdot C_X \cdot \left[\frac{K_{iy,EtOH}}{K_{iy,EtOH} + C_{EtOH}}\right] \tag{6.10} \]

\[ \upsilon_5 = -\frac{\upsilon_4}{Y_{xg}} - C_X \cdot m_s \tag{6.11} \]
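Eqs. 6.10 and 6.11 translate directly into two small functions. The sketch below is a minimal Python transcription; the function names and the numerical values in the example are hypothetical and are not parameter estimates from this work.

```python
def yeast_growth_rate(c_g, c_x, c_etoh, mu_max, k_g, k_iy_etoh):
    """Biomass production rate v4 (Eq. 6.10): Monod term times ethanol inhibition."""
    monod = mu_max * c_g / (k_g + c_g)
    inhibition = k_iy_etoh / (k_iy_etoh + c_etoh)
    return monod * c_x * inhibition


def glucose_consumption_rate(v4, c_x, y_xg, m_s):
    """Substrate consumption rate v5 (Eq. 6.11); negative when glucose is consumed."""
    return -v4 / y_xg - c_x * m_s


# Hypothetical values: glucose at the half-saturation level, no ethanol present.
v4 = yeast_growth_rate(c_g=2.0, c_x=6.0, c_etoh=0.0,
                       mu_max=0.4, k_g=2.0, k_iy_etoh=50.0)
v5 = glucose_consumption_rate(v4, c_x=6.0, y_xg=0.1, m_s=0.01)

print(f"v4 = {v4:.3f} g/(L h), v5 = {v5:.3f} g/(L h)")
```

With glucose at C_G = K_G the Monod term equals half of µmax, so the growth rate is µmax·C_X/2 in the absence of ethanol. The second term of υ5 keeps glucose consumption nonzero even when growth stops, which is how the maintenance coefficient ms enters the balance.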

The model in Eqs. 6.1-6.11 considers neither the inhibition of cellulases by compounds released on cell lysis or secreted by a particular strain, nor the inhibition by components of the enzyme preparation that might reduce microbial viability and lead to cell lysis [76]. Moreover, the inhibition of the fermenting microorganisms and of the cellulase activity by toxic substances arising from the pretreatment of the lignocellulose [76] is also not included in the model.


6.3. Results and discussion


All computations were carried out in MATLAB® 2008. The solver ode15s was used for integration of the process model, and nonlinear regression was performed using the function lsqnonlin with the Levenberg-Marquardt least-squares minimization algorithm. The parameter set of the DAE system in Eqs. 6.1-6.6 analyzed in this chapter is given in Table 6.2, where the parameter set dimension is Nθ = 14. An analysis of the concentration measurements showed that the errors for all three analyzed components (cellobiose, glucose, and ethanol) were in the same range, with no correlations between measurements. Thus, the covariance matrix of experimental errors Cy was set to a diagonal matrix based on the standard deviations reported in Section 6.2.1. The measured variables in the experiments conducted in Ref. 95 were the cellobiose Ccb, glucose CG, and ethanol CEtOH concentrations, such that the experimental data vector was Y^m = (Ccb, CG, CEtOH)^T with Ny = 3. Experiments E2 and E3, described in Section 6.2.1 and summarized in Table 6.1, were considered simultaneously in the parameter estimation. With 15 sampling points for each measured variable in E2 and 18 points in E3, the total number of sampling points for the experimental set, from now on named E2&E3, is Nm = 3 × (15 + 18) = 99. Because of the differences in magnitude among the parameters involved in this problem, the i, j-th element of the sensitivity matrix S was normalized as follows

\[ s_{ij} = \frac{\partial y_i}{\partial \theta_j} \, \frac{\max\left(|\theta_j|,\ \theta_{trsh}\right)}{\max\left(|y_i|,\ y_{trsh}\right)}, \tag{6.12} \]

where θtrsh and ytrsh are user-defined thresholds, e.g., chosen as a function of the square root of the machine precision, √ϵmach. The normalized sensitivity matrix S was then used to perform the ill-conditioning and local identifiability analysis. The methodology used to determine and analyze model parameters was the sensitivity method (see Section 4.2.1). The stages "Experimentation", "Modeling" and "Parameter estimation" of the consolidated computation framework depicted in Figure 4.1 were executed. In this case study the optimal experimental design was not conducted. The results for these stages of the algorithm are discussed in the following. It should be pointed out that for the local identifiability analysis only the QR method (Section 4.4.2) is considered, whilst the SsS approach is applied as the regularization technique.
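The normalization in Eq. 6.12 can be sketched in a few lines of Python. The function below scales each raw sensitivity by the parameter magnitude and divides by the output magnitude, with thresholds guarding against values near zero. The function name, the list-of-lists layout and the default threshold of √ϵmach for both quantities are assumptions for illustration; the numbers in the example are hypothetical.

```python
import sys


def normalize_sensitivities(jac, y, theta, y_trsh=None, theta_trsh=None):
    """Scale raw sensitivities dy_i/dtheta_j following Eq. 6.12.

    jac   : list of rows, jac[i][j] = dy_i / dtheta_j
    y     : model outputs at the sampling points (one per row)
    theta : parameter values (one per column)
    Thresholds default to sqrt(machine epsilon); this default is an assumption.
    """
    default = sys.float_info.epsilon ** 0.5
    y_trsh = default if y_trsh is None else y_trsh
    theta_trsh = default if theta_trsh is None else theta_trsh
    scaled = []
    for i, row in enumerate(jac):
        y_scale = max(abs(y[i]), y_trsh)
        scaled.append([row[j] * max(abs(theta[j]), theta_trsh) / y_scale
                       for j in range(len(theta))])
    return scaled


# Hypothetical numbers: two outputs, two parameters of very different magnitude.
jac = [[2.0, 0.0], [0.5, 1.0e3]]
y = [4.0, 0.0]          # the second output is zero, so y_trsh is used instead
theta = [2.0, 1.0e-3]
S = normalize_sensitivities(jac, y, theta)

for row in S:
    print(row)
```

The thresholds only matter for entries whose output or parameter is close to zero. With a very small ytrsh such entries can dominate the scaled matrix, so the choice of threshold affects the subsequent conditioning analysis.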

6.3.1. Model selection

In the first step of the "Modeling" stage in Figure 4.1, an appropriate model for the SSF process was selected. To this end, several model candidates were proposed. These candidates originated from the formulation of the SSF models proposed in Refs. [36, 35]. Additional changes to these preliminary candidates were analyzed. For each candidate structure, a parameter estimation problem was solved using the same experimental data Y^m of all experiments described in Table 6.1. The best candidate was selected based on the model fitting performance, i.e., the minimum value of the cost function in Eq. 2.5b.

The model in Eqs. 6.1-6.11 was subject to three main changes. The first two changes affected the production rates with cellulose as substrate (i.e., υ1 and υ3), whereas the last change affected the production rate with cellobiose as substrate (i.e., υ2). A summary of these changes is given in the following:

• Change 1: the factor Cc was removed from Eqs. 6.7 and 6.9. Moreover, the new lumped parameters k∗max,1 and k∗max,3 in Eq. 6.13, with units g L−1 h−1, were used for the production rates υ1 and υ3. This change may be supported by assuming that the hydrolytic system is independent of the substrate concentration, because all experiments were conducted with higher initial substrate concentrations (Cc0 taking values of 133.3 and 200 g/L, see Table 6.1) than those in Ref. [36] (i.e., Cc0 = 40 g/L). Accordingly, it is assumed that an excess amount of cellulignin was available for the enzymes during the whole experiment.

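\[
\upsilon_i = \left[ k^*_{max,i} \cdot \frac{C_E}{K_L + C_E} \right] \cdot \left[ \frac{K_{1,G}}{K_{1,G} + C_G} \right] \cdot \left[ e^{-K_D(T) \cdot t} \right] \cdot \left[ \frac{e^{-E_a/(R T)}}{e^{-E_a/(R T_{ref})}} \right] \cdot \left[ \frac{K_{1,EtOH}}{K_{1,EtOH} + C_{EtOH}} \right] \cdot \left[ e^{-K_{rec} \cdot \left( 1 - \frac{C_c}{C_{c0}} \right)} \right], \quad i = 1, 3 \tag{6.13}
\]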

• Change 2: the term describing the substrate recalcitrance effect through the parameter Krec was removed from Eqs. 6.7 and 6.9. This term had previously been considered in Ref. [35] to explain some reductions in the glucose production observed in its experimental data. The reason for the deletion was that a decrease in glucose production by this effect was not expected in the current system, because the cellulose substrate used in each experiment of Table 6.1 was highly pretreated. The pretreatment physically transformed the structure of the sugarcane bagasse (see Ref. [95]), making it more susceptible to the enzymatic action and thus reducing the possibility of this event taking place. Values of Krec close to zero, obtained by solving initial parameter estimation problems (results not shown here), confirmed this assumption. With this additional change, Eq. 6.13 takes the following form:

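\[
\upsilon_i = \left[ k^*_{max,i} \cdot \frac{C_E}{K_L + C_E} \right] \cdot \left[ \frac{K_{1,G}}{K_{1,G} + C_G} \right] \cdot \left[ e^{-K_D(T) \cdot t} \right] \cdot \left[ \frac{e^{-E_a/(R T)}}{e^{-E_a/(R T_{ref})}} \right] \cdot \left[ \frac{K_{1,EtOH}}{K_{1,EtOH} + C_{EtOH}} \right], \quad i = 1, 3. \tag{6.14}
\]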

• Change 3: the ethanol inhibition proposed by Ref. [99] was here included by using

the constant K2,EtOH , which affects the rate of the reaction of cellobiose to glucose



(i.e., υ2) in Eq. 6.8. The new formulation is shown in Eq. 6.15:

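\[
\upsilon_2 = \left[ k_{max,2} \cdot e_g \cdot e_T \right] \cdot \left[ \frac{C_{cb}}{K_m \left( 1 + \frac{C_G}{K_{2,G}} \right) + C_{cb}} \right] \cdot \left[ e^{-K_D(T) \cdot t} \right] \cdot \left[ \frac{e^{-E_a/(R T)}}{e^{-E_a/(R T_{ref})}} \right] \cdot \left[ \frac{K_{2,EtOH}}{K_{2,EtOH} + C_{EtOH}} \right] \tag{6.15}
\]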
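To make the structure of the modified rate expressions in Eqs. 6.14 and 6.15 easier to follow, the following Python sketch evaluates them factor by factor. It is an illustration only: all function and argument names are hypothetical, the thermal inactivation constant is passed in as an already evaluated number `K_D`, and the gas constant value assumes that T and T_ref are given in kelvin and Ea in J mol−1.

```python
import math

R_GAS = 8.314  # J mol^-1 K^-1 (assumed units: T in K, Ea in J mol^-1)

def temperature_factor(Ea, T, T_ref):
    """Ratio exp(-Ea/(R*T)) / exp(-Ea/(R*T_ref)) shared by Eqs. 6.14 and 6.15."""
    return math.exp(-Ea / (R_GAS * T)) / math.exp(-Ea / (R_GAS * T_ref))

def rate_cellulose(k_max_star, C_E, C_G, C_EtOH, t, T, T_ref,
                   K_L, K_1G, K_1EtOH, K_D, Ea):
    """Rate with cellulose as substrate (i = 1 or 3), structured like Eq. 6.14."""
    return (k_max_star * C_E / (K_L + C_E)         # enzyme saturation
            * K_1G / (K_1G + C_G)                  # glucose inhibition
            * math.exp(-K_D * t)                   # thermal inactivation
            * temperature_factor(Ea, T, T_ref)     # temperature dependence
            * K_1EtOH / (K_1EtOH + C_EtOH))        # ethanol inhibition

def rate_cellobiose(k_max2, e_g, e_T, C_cb, C_G, C_EtOH, t, T, T_ref,
                    K_m, K_2G, K_2EtOH, K_D, Ea):
    """Rate of cellobiose to glucose, structured like Eq. 6.15."""
    return (k_max2 * e_g * e_T
            * C_cb / (K_m * (1.0 + C_G / K_2G) + C_cb)  # competitive glucose inhibition
            * math.exp(-K_D * t)
            * temperature_factor(Ea, T, T_ref)
            * K_2EtOH / (K_2EtOH + C_EtOH))             # ethanol inhibition (Change 3)

# Toy check (hypothetical numbers): at t = 0, T = T_ref and without products,
# only the saturation terms remain.
v1 = rate_cellulose(10.0, C_E=100.0, C_G=0.0, C_EtOH=0.0, t=0.0, T=310.0, T_ref=310.0,
                    K_L=100.0, K_1G=5.0, K_1EtOH=7.0, K_D=1e-3, Ea=2.98e4)
v2 = rate_cellobiose(1.0, e_g=2.0, e_T=3.0, C_cb=10.0, C_G=0.0, C_EtOH=0.0, t=0.0,
                     T=310.0, T_ref=310.0, K_m=10.0, K_2G=2.0, K_2EtOH=20.0,
                     K_D=1e-3, Ea=2.98e4)
```

Writing each bracket of the equations as a separate factor keeps the three model changes visible in the code: the missing Cc factor, the missing recalcitrance term, and the added ethanol inhibition in the cellobiose rate.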

The finally selected model including the three changes described above is composed

of Eqs. 6.1-6.6, 6.10, 6.11, 6.14, and 6.15. Figure 6.2 shows the experimental data of

E1 and the adjusted predictions of the original model proposed by Ref. [35] (shown

in Eqs. 6.1 - 6.11) and the adjusted predictions of the finally selected model with cost

function of 33.1 and 18.5, respectively.

[Figure 6.2: two panels, concentration (g/L) versus time (h); legend: Cellobiose, Glucose, Ethanol, Ccb Model, CG Model, CEtOH Model.]

Figure 6.2.: Model fitting to experimental data E1 by using (left panel) the original model proposed in Ref. [36] and (right panel) the finally selected model after the model selection step. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

The parameter values for the thermal inactivation

constant KD(T ) = AD · e∆H/T as well as the activation energy Ea in Eqs. 6.14 and 6.15

of the final model were maintained constant taking into account that these parameters

had limited impact on the solution according to a previous sensitivity analysis. Reference values were taken from Ref. [35], i.e., ∆H = 1.48 × 10^5 J mol−1, AD = 3.64 × 10^18 h−1, and Ea = 2.98 × 10^4 J mol−1. On the other hand, the initial enzyme concentration CE0 at t = 0 and the values of eT and eg depended on the enzyme dosage of the specific experiment. For both experiments E2 and E3 the value of CE0 was 5200 FPU/L, whereas the values of eT were 5.35 g/L and 4.58 g/L, and the values of eg were 2,900 U/g and 1,800 U/g for experiments E2 and E3, respectively.



6.3.2. Parameter initial guess selection

There are different options for defining an initial parameter guess (see Section 4.6.2) for

the initialization of the computational framework in Figure 4.1. Here, the first attempt

was to use literature values taken from Ref. [35] and [97] summarized in Table 6.2 as

IGDrissen and IGPhilippidis, respectively.

Table 6.2.: Initial guesses taken from literature and generated by MBLHD. (Table taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

No.  Parameter   Unit          IGDrissen   IGPhilippidis   IG4     IG22
                               θ0          θ0              θ0      θ0
1    kmax,1      h−1           0.081       0.0827          3.49    21.50
2    kmax,2      g U−1 h−1     0.0108      0.00406         4.42    0.92
3    kmax,3      h−1           0.058       0.0834          11.5    8.5
4    ms          h−1           0.02        0               0.884   0.183
5    Yxg         g g−1         0.11        0.113           0.050   0.750
6    µmax        h−1           0.25        0.19            0.582   2.083
7    KG          g L−1         0.0252      0.000037        150     683
8    KL          FPU L−1       18.2        544.89          984     917
9    K1,G        g L−1         6.3         53.16           275     75
10   K1,EtOH     g L−1         95          50.35           15.0    45.0
11   Kiy,EtOH    g L−1         50          50              48.3    81.7
12   Km          g L−1         1.92        10.56           425     358
13   K2,G        g L−1         0.54        0.62            58.2    491.8
14   K2,EtOH*    g L−1         -           -               225     108

The solutions of the PE problem in Eq. 2.5

simultaneously considering experimental data from E2 and E3 and starting from the initial values IGDrissen and IGPhilippidis are shown in Figure 6.3. It can be seen that extremely low values for Ccb, CG, and CEtOH compared with the measured concentrations were calculated after successful termination of the parameter estimation algorithm. The worst result was obtained for IGPhilippidis, with cost function values of CFIGDrissen = 191.6 and CFIGPhilippidis = 290.5 (see Table 6.2). The poor fitting might be attributed to remarkable differences between the substrate, enzymatic load, and operating conditions used in the literature experiments and those of experiments E2 and E3. Moreover, the parameters of the literature models differ from those of the selected model because of the dissimilarity in the model structure (e.g., kmax,1 and kmax,3 in Table 6.2 are different from k∗max,1 and k∗max,3 in Eq. 6.13); consequently, their true values are far away from those considered in the literature [35, 97]. The second strategy for selecting the parameter initial guess was the data collection plan MBLHD (see Section 2.8.1). This method allows sampling different parameter guesses within a prescribed parameter range. From the parameter range and by using MBLHD(θ), 30 initial guesses (i.e., NIG = 30) were generated and their corresponding parameter estimation problems were solved. In order to

select the most appropriate initial guess, the procedure in Section 4.6.2 was applied.

[Figure 6.3: two panels, concentration (g/L) versus time (h); legend: Cellobiose, Glucose, Ethanol, Ccb (Drissen), CG (Drissen), CEtOH (Drissen), Ccb (Philippidis), CG (Philippidis), CEtOH (Philippidis).]

Figure 6.3.: Fitting of experimental data using the model selected in step 1: (left panel) results for E2 and (right panel) results for E3. Different initial guesses, IGDrissen [35] and IGPhilippidis [97], were used for the solution of the parameter estimation problems where measured data from E2 and E3 was considered simultaneously. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

The cost function CF of the parameter estimation was the model fitting measure, whereas the

numerical rank rϵ (see Section A.5.11) of the sensitivity matrix was the ill-conditioning measure. The best initial guess was the one with minimum CF and maximum rϵ.

The obtained CF values as well as the corresponding numerical ranks rϵ are shown in Figure 6.4. It should be noted that initial guesses for which no convergence could be reached (i.e., IG8, IG11, and IG27) are not shown in Figure 6.4. Additionally, the dashed line indicates a user-defined upper bound of the cost function (“CF Bound = 50”); all IGi with a CF value below this bound generated an appropriate fitting to the experimental data. In terms of fitting, the best initial guess was IG4 with the lowest CF value, i.e., CF4 = 27.0; however, it did not have a large numerical rank, i.e., rϵ = 8. On the other hand, in terms of ill-conditioning, an initial guess is adequate if it promotes a matrix as well-conditioned as possible, which in this case study means a system with a large numerical rank rϵ. The largest numerical rank among the evaluated initial guesses was rϵ = 12, achieved by IG22. IG22 also had an acceptable fitting performance (CF22 = 40.1 ≤ CF Bound). With the requirements of fitting and ill-conditioning fulfilled, IG22 was selected as the appropriate initial guess θIG to continue the parameter determination. Notice that IG3 could also have been selected as an adequate initial guess. In summary, the selection of an appropriate initial guess for nonlinear models aims at an acceptable model fitting to the data combined with the highest possible degree of well-posedness.
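The selection rule described above (acceptable cost function first, then the largest numerical rank) can be sketched in a few lines of Python. This is a simplified reading of the procedure, not the implementation of Section 4.6.2: the function name, the data layout, and the tie-breaking rule are assumptions, and the entry `IG_x` is a made-up guess added only to show the effect of the CF bound. The values for IG4 and IG22 are the ones quoted in the text.

```python
def select_initial_guess(results, cf_bound):
    """Pick the initial guess with the largest numerical rank among those
    that converged and whose cost function does not exceed cf_bound.

    results: dict mapping a guess name to (cost_function, numerical_rank),
             or to None if the parameter estimation did not converge.
    """
    admissible = {
        name: outcome
        for name, outcome in results.items()
        if outcome is not None and outcome[0] <= cf_bound
    }
    if not admissible:
        raise ValueError("no converged initial guess satisfies the CF bound")
    # Prefer the largest rank; break ties with the smallest cost function (assumption).
    return max(admissible, key=lambda name: (admissible[name][1], -admissible[name][0]))

results = {
    "IG4": (27.0, 8),     # lowest CF, moderate rank (values from the text)
    "IG22": (40.1, 12),   # acceptable CF, largest rank (values from the text)
    "IG8": None,          # no convergence
    "IG_x": (120.0, 13),  # hypothetical guess that violates the CF bound
}
best = select_initial_guess(results, cf_bound=50.0)
```

With these inputs the rule reproduces the choice of IG22 made in the case study.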

[Figure 6.4: cost function (CF, logarithmic axis) and numerical rank (rϵ) plotted against the initial guess; legend: CF, CF Bound, Rank; the points IG4 and IG22 are labeled.]

Figure 6.4.: Cost function (CF) of parameter estimation and numerical rank (rϵ) of the sensitivity matrix obtained for 30 different initial guesses generated by MBLHD considering data from E2 and E3. The maximum acceptable cost function value to accept an initial guess is denominated “CF Bound”. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

6.3.3. Iterative parameter estimation with structural analysis

In this case study an iterative procedure for estimating parameters including structural analysis, i.e., ill-conditioning and identifiability analysis, was performed. The ill-conditioning diagnosis was conducted based on the sensitivity method (see Section 4.4.1). Accordingly, the condition number κ, the collinearity index γ, and the numerical rank rϵ of the sensitivity matrix were the ill-conditioning measures (see Section 3.2.1). For the identifiability analysis, the local QR method detailed in Section 4.4.2 was implemented. The parameter estimation was then regularized by using the SsS approach (see Section 3.3.1). Integrating the structural analysis into the iterative cycle of parameter estimation makes it possible to find those parameters which are actually identifiable in the system according to their linear independence, thereby reducing the dimension of the parameter vector. The new reduced problem is then well-conditioned and its solution is more accurate than that of the initial problem including all parameters and their correlations. Accordingly, when SsS regularization is applied, unidentifiable parameters associated with the ill-conditioned singular values are fixed at prior estimates (referred to as “inactive” parameters) and reduced-order problems are considered for the determination of the remaining “active” parameters.
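As a rough illustration of the three ill-conditioning measures named above, the following Python sketch computes them from a sensitivity matrix with a singular value decomposition. The definitions used here are common textbook choices and may differ in detail from those of Section 3.2.1: κ is taken as the ratio of the largest to the smallest singular value, γ as the reciprocal of the smallest singular value of the column-normalized matrix, and rϵ as the number of singular values above a relative tolerance. The function name and the default tolerance are assumptions.

```python
import numpy as np

def ill_conditioning_measures(S, rank_tol=1e-8):
    """Return (condition number, collinearity index, numerical rank) of S.

    Assumes that S has no all-zero column.
    """
    S = np.asarray(S, dtype=float)
    sv = np.linalg.svd(S, compute_uv=False)          # singular values, descending
    kappa = np.inf if sv[-1] == 0.0 else sv[0] / sv[-1]
    rank = int(np.sum(sv > rank_tol * sv[0]))        # relative threshold (assumption)
    S_unit = S / np.linalg.norm(S, axis=0)           # columns scaled to unit length
    sv_unit = np.linalg.svd(S_unit, compute_uv=False)
    gamma = np.inf if sv_unit[-1] == 0.0 else 1.0 / sv_unit[-1]
    return kappa, gamma, rank

# Two toy matrices (hypothetical): independent columns versus identical columns.
S_good = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
S_bad = np.array([[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]])
kappa_good, gamma_good, rank_good = ill_conditioning_measures(S_good)
kappa_bad, gamma_bad, rank_bad = ill_conditioning_measures(S_bad)
```

For the matrix with identical columns, both κ and γ become very large and the numerical rank drops to one, which is the situation that the SsS regularization addresses by fixing parameters.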

Figure 6.5 shows the results of 6 iterations of parameter estimation and structural analysis. Four parameters (i.e., µmax, kmax,2, Km, K1,G) were finally identified in the 6th iteration using data from experiments E2 and E3 (see footnote 6). The iterative process stopped because the ill-conditioning measures κ and γ in the current iteration k were smaller than their thresholds (κmax = 1000 and γmax = 15) and the rank of the current iteration rk equaled the rank of the immediately preceding iteration rk−1. In Figure 6.5, the following results are given for each iteration k = 1, ..., 6: cost function value (CFk), condition number (κk), collinearity index (γk), rank of the sensitivity matrix (rk), the current estimated parameter vector ($\theta^{(r_{k-1})}$), its relative standard deviation (%σθ), and its variance (σ²θ). The last row for the k-th iteration in Figure 6.5 contains the newly ordered parameter vector $\theta = ((\theta^{(r_k)})^T, (\theta^{(N_\theta - r_k)})^T)^T$, such that the most identifiable parameter is located at the first position (shaded cells in Figure 6.5 contain the elements of the identifiable parameter vector $(\theta^{(r_k)})^T$) and the least identifiable parameter is at the last position (the elements of $(\theta^{(N_\theta - r_k)})^T$ include the currently nonidentifiable parameters $\theta^{(r_{k-1} - r_k)}$ and those found nonidentifiable in previous iterations).

Footnote 6: It has to be noted that, in order to overcome local solutions, the identifiable parameter subset of the reduced PE problems was initialized using its corresponding elements from θ0 = IG22.
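The idea of ordering parameters by identifiability through a QR factorization with column pivoting can be sketched as follows. The code is not the QR method of Section 4.4.2; it is a small greedy Gram-Schmidt variant that produces the same pivot order as column-pivoted QR in exact arithmetic, written with hypothetical names and an assumed relative tolerance. A production implementation would more likely rely on a Householder-based routine.

```python
import numpy as np

def pivoted_column_order(S, tol=1e-10):
    """Greedy column pivoting on the sensitivity matrix S.

    Returns (selected, rejected): column indices in order of selection, and the
    indices whose remaining independent part is negligible (candidates for fixing).
    """
    R = np.array(S, dtype=float, copy=True)
    remaining = list(range(R.shape[1]))
    selected = []
    first_norm = None
    while remaining:
        norms = [np.linalg.norm(R[:, j]) for j in remaining]
        best = int(np.argmax(norms))
        j, norm_j = remaining[best], norms[best]
        if first_norm is None:
            first_norm = norm_j
        # Stop when the largest residual column is negligible (assumed criterion).
        if first_norm == 0.0 or norm_j <= tol * first_norm:
            break
        q = R[:, j] / norm_j
        selected.append(j)
        remaining.pop(best)
        for m in remaining:
            # Remove the direction already explained by the selected column.
            R[:, m] -= q * (q @ R[:, m])
    return selected, remaining

# Toy matrix (hypothetical): column 1 is twice column 0, column 2 is independent.
S_toy = np.array([[3.0, 6.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [0.0, 0.0, 0.0]])
selected, rejected = pivoted_column_order(S_toy)
```

In the toy example the column with the largest norm is picked first, its collinear partner ends up in the rejected set, and the independent column is kept, which mirrors how highly sensitive but correlated parameters can end up nonidentifiable.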

[Figure 6.5 (tabular figure): only the per-iteration summary values and the final parameter vector are reproduced here; the individual parameter estimates, variances (σ²), and relative standard deviations (%σ) of each iteration are not reproduced.]

Iteration k   CFk    κk      γk     rk
1             40.1   17811   81.3   12
2             28.1   22245   23.7   8
3             27.9   1315    2.6    7
4             27.9   6593    3.4    5
5             27.9   2176    0.9    4
6             28.0   620     0.4    4

Final parameter vector after k = 6 (identifiable parameters first): µmax = 1.74, kmax,2 = 4.56E-03, Km = 8.85, K1,G = 3.75, k∗max,3 = 35.7, K1,EtOH = 6.97, KL = 1000, K2,G = 2.34, Kiy,EtOH = 28.0, k∗max,1 = 52.8, Yxg = 4.73E-02, K2,EtOH = 23.5, KG = 883, ms = 2.29E-03.

Figure 6.5.: Results of parameter estimation with identifiability analysis. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

[Figure 6.6: two panels, concentration (g/L) versus time (h); legend: Cellobiose, Glucose, Ethanol, Ccb (k=1), CG (k=1), CEtOH (k=1).]

Figure 6.6.: Experimental vs. predicted concentrations using the parameter vector θ calculated in iteration k = 1, 4, 6 using experimental data of E2 and E3. Results for E2 (left panel) and results for E3 (right panel). (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

Figure 6.6 shows the measured data for Ccb, CG, and CEtOH of experiments E2 and E3 and the corresponding predicted concentrations using the parameter vector calculated in iteration k = 1 (original parameter vector with 14 estimated elements), in iteration k = 4 (7 estimated elements in the reordered parameter vector), and in iteration k = 6 (4 estimated elements in the reordered parameter vector).

Figure 6.7 depicts the ranking of the parameters according to their sensitivity measure δj (see Eq. 2.17). The highest parameter sensitivity measure found in k = 1 (i.e., the maximum value of δj over the columns of the sensitivity matrix S(r0)) corresponds to Km, and the parameter with the lowest sensitivity measure is ms. It can be seen that the ranking based on sensitivity measures does not give the same results as the ranking according to identifiability (shaded cells in Figure 6.7), e.g., in k = 1, the parameter KG had a medium sensitivity measure but was nonidentifiable. The QR method for selecting the identifiable parameters (Section 4.4.2) considered parameters with high sensitivity measures first for the construction of the identifiable parameter subset. It can be seen that for all iterations k = 1 to k = 6, the parameters Km, kmax,2, µmax, and K1,G, which were the identifiable parameters in iteration k = 6, were in the first positions of the sensitivity ranking, which demonstrates that the QR method preferentially selects the most uncorrelated parameters with high sensitivity. However, some parameters which also strongly affected the predicted response variables, e.g., k∗max,3 in Figure 6.7, were not identifiable due to correlations with identifiable parameters (i.e., µmax, kmax,2,

Km, and K1,G).

Sensitivity   k=1        k=2        k=3       k=4       k=5       k=6
ranking       (r=12)     (r=8)      (r=7)     (r=5)     (r=4)     (r=4)
1             Km         Km         Km        kmax,2    Km        Km
2             k*max,1    kmax,2     k*max,3   k*max,3   kmax,2    kmax,2
3             k*max,3    k*max,3    kmax,2    Km        k*max,3   K1,G
4             K1,G       K1,G       K2,G      K1,G      K1,G      µmax
5             kmax,2     KL         K1,G      µmax      µmax
6             µmax       k*max,1    K1,EtOH   K1,EtOH
7             KG         K2,G       KL        KL
8             Kiy,EtOH   µmax       µmax
9             K2,G       Yxg
10            KL         K1,EtOH
11            K1,EtOH    K2,EtOH
12            Yxg        Kiy,EtOH
13            K2,EtOH
14            ms

[The cell shading that marks the identifiable parameter subset in the original figure is not reproduced.]

Figure 6.7.: Ranking of parameters according to sensitivity (the most sensitive parameter above in position 1). The identifiable parameter subset in every iteration k is marked by shaded cells. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

In the first iteration (k = 1) with r0 = Nθ = 14, after applying the PE and the structural analysis, the sensitivity matrix rank of the original problem was

r1 = 12, with CF1 = 40.1, κ1 = 17811, and γ1 = 81.3 (see Figure 6.5). The large ill-conditioning measures κ1 and γ1 confirmed that the parameter estimation problem was ill-conditioned, despite the good fitting of the experimental data (see dash-dot line in Figure 6.6). Strong correlations among the parameters in θ(r0) were then expected. The local identifiability analysis yielded a newly ordered active parameter vector $(\tilde{\theta}^{(r_1)})^T = (K_{1,G}, K_m, k^*_{max,1}, k^*_{max,3}, K_L, Y_{xg}, K_{iy,EtOH}, K_{1,EtOH}, k_{max,2}, K_{2,G}, \mu_{max}, K_{2,EtOH})^T = (\theta^{(r_1)})^T$ and the nonactive vector $(\tilde{\theta}^{(r_0 - r_1)})^T = (K_G, m_s)^T = (\theta^{(N_\theta - r_1)})^T$; thus, the complete parameter vector for the next iteration k = 2 reads $\theta = ((\theta^{(r_1)})^T, (\theta^{(N_\theta - r_1)})^T)^T$ (see Figure 6.5, where the shaded cells refer to the identifiable parameter subset $\tilde{\theta}^{(r_k)}$ in iteration k). In the second iteration (k = 2), the two unidentifiable parameters obtained from iteration k = 1 were removed from the parameter estimation problem by fixing them to the values found in the previous iteration (KG = 883 and ms = 2.29 × 10^−3).

The reduced parameter estimation problem was solved for the previously identifiable vector $\theta^{(r_1)} = \tilde{\theta}^{(r_1)} = (k^*_{max,1}, k_{max,2}, k^*_{max,3}, Y_{xg}, \mu_{max}, K_L, K_{1,G}, K_{1,EtOH}, K_{iy,EtOH}, K_m, K_{2,G}, K_{2,EtOH})^T$ and the identifiability analysis was performed again in the second iteration. The results were r2 = 8, CF2 = 28.1, κ2 = 22245, and γ2 = 23.7. The reduction in γ2 indicated that this reduced problem was better conditioned than the problem in k = 1, but four parameters were still not identifiable, as indicated by r2 = 8 and by κ2 ≫ κmax and γ2 > γmax. The new identifiable parameter vector reads $(\tilde{\theta}^{(r_2)})^T = (K_m, k^*_{max,3}, k_{max,2}, K_{2,G}, K_{1,G}, K_{1,EtOH}, \mu_{max}, K_L)^T = (\theta^{(r_2)})^T$, and the nonactive parameters to be fixed are stored in $(\tilde{\theta}^{(r_1 - r_2)})^T = (K_{iy,EtOH}, k^*_{max,1}, Y_{xg}, K_{2,EtOH})^T$. The total vector of nonactive parameters to be held constant at previous estimates in the next iteration k = 3 is stored in $\theta^{(N_\theta - r_2)} = (K_{iy,EtOH}, k^*_{max,1}, Y_{xg}, K_{2,EtOH}, K_G, m_s)^T$. The vector of active parameters in k = 3 is $\theta^{(r_2)} = \tilde{\theta}^{(r_2)}$.

The iterative parameter estimation stopped at k = 6, with κ and γ being smaller than their defined thresholds (κ6 < κmax and γ6 < γmax). Consequently, the active parameter subset of this reduced parameter estimation problem did not need a further reduction, and thus all its estimated parameters $(\theta^{(r_5)})^T = (\mu_{max}, k_{max,2}, K_m, K_{1,G})^T$ were identifiable with r6 = 4.
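The stopping rule of the iterative procedure (both ill-conditioning measures below their thresholds and no further rank reduction) can be written as a small predicate. The sketch below is illustrative only: the function name and data layout are assumptions, while the thresholds and the values for k = 1 and k = 2 are the ones quoted in the text.

```python
def should_stop(kappa, gamma, rank, previous_rank, kappa_max=1000.0, gamma_max=15.0):
    """True if the reduced problem is considered well-conditioned and the
    rank did not change with respect to the previous iteration."""
    return kappa < kappa_max and gamma < gamma_max and rank == previous_rank

# Values reported in the text for the first two iterations (r0 = 14).
history = [
    {"k": 1, "kappa": 17811.0, "gamma": 81.3, "rank": 12, "previous_rank": 14},
    {"k": 2, "kappa": 22245.0, "gamma": 23.7, "rank": 8, "previous_rank": 12},
]
decisions = [
    should_stop(h["kappa"], h["gamma"], h["rank"], h["previous_rank"])
    for h in history
]
```

Both iterations fail the test, so the procedure continues with a reduced parameter vector, in line with the description above.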

6.3.4. Estimator performance assessment

A statistical analysis of the estimated parameter vector $\tilde{\theta}^{(r_k)}$ obtained in each iteration k was carried out. The reliability tests of parameter estimates of Section 2.5.3, namely the 95% confidence interval and the hypothesis test with a significance level of α = 0.05, were applied here. The results are shown in Figure 6.8, where DoF is the number of degrees of freedom, rk is the current parameter vector dimension in the k-th iteration, Nm is the number of experimental data points, TH0 is the Student t-value for each j-th parameter ($\tilde{\theta}^{(r_k)}_j \in \tilde{\theta}^{(r_k)}$), and L and U are the lower and upper confidence limits described by Eq. 2.28, such that $L \le \theta^{(r_k)}_j \le U$.

Shaded cells in Figure 6.8 indicate significant parameter values in the k-th iteration, which were statistically accepted to be different from zero (alternative hypothesis H1 in Eq. 2.31b) with a failure probability of 5%. The low number of significant parameters in the initial iterations demonstrates how parameter correlations influence the results of this hypothesis test. For strong correlations, it is not possible to indicate accurately which parameters are not significant. Therefore, parameters with low TH0-values at the beginning, where correlations are present, should not be interpreted as negligible for the model, whereas those with higher TH0-values should definitely be considered statistically important.
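A minimal Python sketch of the two reliability tests is given below. Since Eqs. 2.28 and 2.31b are not reproduced in this section, the sketch assumes the usual symmetric interval (estimate ± critical t-value × standard deviation) and a two-sided test; the function name is hypothetical, the critical value of about 1.985 corresponds to a two-sided 95% level with 95 degrees of freedom, and the estimate and standard deviation are illustrative numbers chosen to be roughly consistent with the values reported for µmax in the last iteration.

```python
def t_value_and_interval(estimate, std_dev, t_crit):
    """Return (t-value, lower limit, upper limit, significant?) for one parameter.

    t_crit is the two-sided critical Student t-value for the chosen
    significance level and degrees of freedom (supplied by the caller).
    """
    t_value = estimate / std_dev
    half_width = t_crit * std_dev
    lower, upper = estimate - half_width, estimate + half_width
    significant = abs(t_value) > t_crit  # equivalent to: interval excludes zero
    return t_value, lower, upper, significant

T_CRIT_95_DOF95 = 1.985  # approximate two-sided 95% value for 95 degrees of freedom

# Illustrative numbers (assumed), roughly matching mu_max at k = 6.
t_val, lower, upper, significant = t_value_and_interval(1.745, 0.39, T_CRIT_95_DOF95)
```

A parameter is flagged as significant exactly when its confidence interval does not contain zero, which is the relationship used in the discussion of Figure 6.8.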

[Figure 6.8 (tabular figure): for each iteration k = 1, ..., 6 the figure lists the degrees of freedom (DoF), the dimension rk, Nm = 99, α = 0.05 and, for each estimated parameter, the t-value (tH0), the estimate, and the lower (L) and upper (U) confidence limits. The individual cell values are not reproduced here.]

Figure 6.8.: Estimator performance assessment: parameter statistical significance (tj-value) and 95% confidence intervals (L ≤ θ(rk) ≤ U). Shaded cells indicate parameters with statistical significance of 95%. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).


All components of the last identifiable parameter vector θ(r5), obtained in iteration k = 6, showed a statistical significance of 95%, which also means that their 95% confidence intervals do not include zero (0.97 ≤ µmax ≤ 2.52, 2.03 × 10^−3 ≤ kmax,2 ≤ 7.09 × 10^−3, 8.84 ≤ Km ≤ 8.86, 1.50 ≤ K1,G ≤ 6.01 in Figure 6.8). Therefore, they actually have a strong effect on the predicted variables and should remain in the model.

Moreover, comparing the results of the initial problem (k = 1, Nθ = 14) with those of the last reduced PE problem (k = 6, r6 = 4) reveals enormous improvements in parameter accuracy, demonstrated by reductions in the relative standard deviations and narrower confidence intervals. For instance, the relative standard deviation of the most uncertain parameter µmax improved from 537% to 22% (see Figure 6.5), and the corresponding confidence interval narrowed from −99.7 ≤ µmax ≤ 120 to 0.97 ≤ µmax ≤ 2.52 (see Figure 6.8).

6.3.5. Validation of the identified model

The estimated parameter vector θ with four identifiable and ten unidentifiable parameters, shown at the end of Figure 6.5, properly fits the data of two experimental data sets (i.e., E2 and E3 in Table 6.1). These data sets had different experimental conditions, such as the enzymatic load due to the presence of β-glucosidase and the substrate pretreatment, but also common experimental conditions, such as the pre-hydrolysis time, the initial cellulose concentration CC0, and the initial yeast concentration CX0. Accordingly, the fact that ten parameters of θ were fixed might be explained not only by parameter correlations (nonidentifiability) and weak effects on predicted variables (low sensitivities), but also by the aforementioned common conditions of both experiments. The different and shared conditions of E2 and E3 are reflected in the experimental data used for parameter determination. Parameters related to the shared conditions, which are not influenced by the differences, are nonidentifiable. In contrast, the four identifiable parameters of θ are related to the variability of the predicted data for E2 and E3. However, the range of experimental conditions for which the model is valid is certainly limited, and structural errors can be partially compensated by adjusting the parameter values. Therefore, the validation process tests whether the identified model is able to predict new experimental conditions and whether the selection of identifiable and unidentifiable parameter subsets is also valid for these conditions. The model validation was carried out using experiments E1, E4, and E5 in Table 6.1; it included:

1. cross-validation in order to test the validity of model predictions regarding new

operating conditions and,

2. validation of the identifiable parameter subset where the ability to adjust the model

predictions to different operating conditions (here E1-E5 in Table 6.1) is tested by

re-estimating the identifiable parameter values only.



In the former analysis (cross-validation), model predictions based on the identifiable estimated parameter vector θ after the sixth iteration (see Figure 6.5) were calculated for new experimental conditions which were not considered for parameter determination:

• Cellulose initial concentration CC0 (E1 and E5),

• Pre-hydrolysis time (E1, E4, and E5),

• GC-220 and β-glucosidase enzyme load (E1, E4, and E5), and

• Initial yeast concentration CX0 (E4 and E5).

These predictions were then compared with experimental data and the quality of fitting

was analyzed.

The latter analysis evaluated the fitting of model predictions to different experimental conditions when solving a parameter estimation problem in which only the identifiable parameter subset (i.e., θ(r6)) was re-estimated. By doing so, the capacity of the active parameters to describe the variability in measured data caused by different experimental conditions was assessed.
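Re-estimating only the identifiable subset while keeping the remaining parameters at their previous estimates amounts to wrapping the full objective function. The Python sketch below shows one way to do this; it is not the implementation used in this work. The helper name, the dictionary-based parameter handling, and the toy objective are assumptions, whereas the fixed values KG = 883 and ms = 2.29 × 10^−3 are the ones reported above.

```python
def make_reduced_objective(objective, theta_full, active_names):
    """Wrap `objective` so that only the active parameters are free.

    objective:    callable taking a dict {parameter name: value}
    theta_full:   dict with the current estimates of all parameters
    active_names: names of the parameters to be re-estimated
    """
    fixed = dict(theta_full)  # copy, so the caller's estimates stay untouched

    def reduced(active_values):
        if len(active_values) != len(active_names):
            raise ValueError("one value per active parameter is required")
        theta = dict(fixed)
        theta.update(zip(active_names, active_values))
        return objective(theta)

    return reduced

# Toy objective (hypothetical): distance of two parameters from target values.
def toy_objective(theta):
    return (theta["mu_max"] - 2.0) ** 2 + (theta["K_G"] - 883.0) ** 2

theta_full = {"mu_max": 1.74, "k_max2": 4.56e-3, "K_m": 8.85, "K_1G": 3.75,
              "K_G": 883.0, "m_s": 2.29e-3}
active = ["mu_max", "k_max2", "K_m", "K_1G"]
reduced = make_reduced_objective(toy_objective, theta_full, active)
value = reduced([2.0, 0.005, 9.0, 4.0])
```

The wrapped function can be handed to any optimizer that works on a plain list of four values, while the inactive parameters never leave their fixed estimates.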

Cross-Validation

The validity of the model predictions based on the identifiable estimated parameter vector θ (see last row in Figure 6.5) was tested using experiments E1, E4, and E5 of Table 6.1. For each experimental condition, SSF process simulations using θ were carried out. The predicted variables are shown as solid lines in the left, middle, and right panels of Figure 6.9, respectively. It can be seen that the identified model predicted the behavior of the new experimental conditions reasonably well, with an appropriate fitting for E5 (cost function of 22.6) and a moderate fitting for E1 and E4 (cost functions of 56.4 and 57.1, respectively). During the pre-hydrolysis stage, the model overestimated the cellobiose (Ccb-PE E2&E3) and glucose (CG-PE E2&E3) production for E1 and E5, whereas it underestimated the same concentrations for E4. During the fermentation stage, the model underestimated the glucose production (CG-PE E2&E3) and overestimated the ethanol production (CEtOH-PE E2&E3) for E1 and E4.

Assessment of the identifiable parameter subset

The glucose (CG) and ethanol (CEtOH) model predictions for E1 and E4 in Figure 6.9 demonstrated that the behavior of new experimental conditions could not be predicted properly by the identified model, whereas very acceptable predictions were obtained for E5. Therefore, a re-estimation of the active (identifiable) parameter subset of θ was performed. The identifiable subset, i.e., the vector $(\theta^{(r_6 = 4)})^T = (\mu_{max}, k_{max,2}, K_m, K_{1,G})^T$, was newly estimated for E1, E4, and E5 individually, such that the parameter dimension of the corresponding parameter estimation was Nθ = 4.

Figure 6.9.: Cross-validation using parameter vectors obtained from parameter estimation with E2&E3: experimental vs. predicted concentrations for E1 (left panel), for E4 (middle panel), and for E5 (right panel). Axes: Concentration (g/L) vs. Time (h). Legend: Cellobiose, Glucose, Ethanol (experimental); Ccb-PE E2&E3, CG-PE E2&E3, CEtOH-PE E2&E3 (predicted). (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

Figure 6.10.: Validation of the identifiable parameter subset using parameter vectors obtained after solving parameter estimation problems with Nθ = 4 and Nθ = 14: experimental vs. predicted concentrations for E1 (left panel), for E4 (middle panel), and for E5 (right panel). Axes: Concentration (g/L) vs. Time (h). Legend: Cellobiose, Glucose, Ethanol (experimental); Ccb-PE Np=4, CG-PE Np=4, CEtOH-PE Np=4; Ccb-PE Np=14, CG-PE Np=14, CEtOH-PE Np=14 (predicted). (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

The fitted model predictions for E1, E4, and E5 with Nθ = 4 are shown as solid lines in the left, middle, and right panels of Figure 6.10, respectively. Comparing the results for E1 and E4 in the left and middle panels of Figure 6.10 with the cross-validation results in the left and middle panels of Figure 6.9, large improvements can be observed, whereas the fit for E5 in the right panel of Figure 6.10 remained almost the same (compare with the right panel of Figure 6.9). These results are reflected by the cost functions for E1, E4, and E5 of 25.4, 42.3, and 20.7, which correspond to reductions with respect to the cross-validation residuals of 55%, 26%, and 8%, respectively. Finally, all parameters were re-estimated (Nθ = 14) for E1, E4, and E5 individually. The dashed lines in Figure 6.10 show the model predictions for Nθ = 14, which fit the data better than those obtained with Nθ = 4. Their cost functions were 9.5, 35.9, and 9.3, corresponding to reductions with respect to the cross-validation residuals of 83%, 37%, and 59% for E1, E4, and E5, respectively.

Comparing the results of the parameter estimations with Nθ = 14 to those with Nθ = 4, only limited improvements in the data fit are visible. These improvements have to be weighed against a large increase in computation time due to the increased problem dimension. Additionally, the problems with Nθ = 14 are highly ill-conditioned, with condition numbers κ of 1.08 × 10^6, 3.10 × 10^5, and 1.17 × 10^7 for E1, E4, and E5, respectively (similar to the first iteration step with k = 1 of the initial problem in Figure 6.5). Table 6.3 contains the parameter vectors estimated for E1, E4, and E5: the columns "E2&E3" display the estimated parameter vector θ, the columns "Nθ = 4" display the parameter vector θ when the four identifiable parameters (elements of θ^(rϵ)) are re-estimated, and the columns "Nθ = 14" display the parameter vector θ when all parameters are re-estimated. Underlined values refer to estimated parameters.
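The percentage reductions quoted above can be retraced from the cost-function row of Table 6.3. The following is a minimal sketch in Python (not the software used in this work); the dictionary layout and the function name `reduction_percent` are illustrative choices, and only the numerical values are taken from the table.

```python
# Cost-function values copied from Table 6.3:
# cross-validation (column "E2&E3"), re-estimation with N_theta = 4 and N_theta = 14.
costs = {
    "E1": {"cross_validation": 56.4, "n4": 25.4, "n14": 9.5},
    "E4": {"cross_validation": 57.1, "n4": 42.3, "n14": 35.9},
    "E5": {"cross_validation": 22.6, "n4": 20.7, "n14": 9.3},
}


def reduction_percent(reference, value):
    """Relative reduction of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


for name, c in costs.items():
    r4 = reduction_percent(c["cross_validation"], c["n4"])
    r14 = reduction_percent(c["cross_validation"], c["n14"])
    print(f"{name}: N_theta=4 -> {r4:.0f}%, N_theta=14 -> {r14:.0f}%")
```

Rounded to whole percentages, the computed reductions agree with the values stated in the text for both Nθ = 4 and Nθ = 14.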



Table 6.3.: Estimated parameter vector using E1, E4, and E5 experimental data after solution of different PE problems. Column labeled "E2&E3" contains the estimated parameters after finishing the iterative parameter estimation with structural analysis (see Figure 6.5). (Table taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

                 E1                                  E4                                  E5
No. Parameter    E2&E3       Nθ=4        Nθ=14       E2&E3       Nθ=4        Nθ=14       E2&E3       Nθ=4        Nθ=14
1   k*max,1      52.84       52.84       3.00        52.84       52.84       58.64       52.84       52.84       4.68
2   kmax,2       4.56×10^-3  1.37×10^-2  2.11×10^-1  4.56×10^-3  7.81×10^-3  3.65×10^-1  4.56×10^-3  7.03×10^-3  2.78×10^-1
3   k*max,3      35.7        35.7        4.4         35.7        35.7        64.2        35.7        35.7        4.9
4   ms           2.29×10^-3  2.29×10^-3  7.79×10^-4  2.29×10^-3  2.29×10^-3  4.64×10^-1  2.29×10^-3  2.29×10^-3  2.10×10^-5
5   Yxg          0.047       0.047       0.124       0.047       0.047       0.079       0.047       0.047       0.250
6   µmax         1.74        0.92        2.47        1.74        2.29        2.58        1.74        1.58        7.88
7   KG           883         883         743         883         883         863         883         883         813
8   KL           1000        1000        167         1000        1000        293         1000        1000        501
9   K1,G         3.75        3.60        220.56      3.75        3.92        1.97        3.75        3.69        109
10  K1,EtOH      6.97        6.97        20.4        6.97        6.97        1.54        6.97        6.97        46.3
11  Kiy,EtOH     28.0        28.0        13.8        28.0        28.0        13.3        28.0        28.0        6.7
12  Km           8.85        25.80       962         8.85        15.5        823         8.85        14.0        875
13  K2,G         2.34        2.34        4.63        2.34        2.34        1.66        2.34        2.34        4.02
14  K2,EtOH      23.5        23.5        569.9       23.5        23.5        21.2        23.5        23.5        110
Cost function    56.4        25.4        9.5         57.1        42.3        35.9        22.6        20.7        9.3

6.3.6. Discussion of the results

The model used in this chapter describes the dynamic behavior of cellulose, glucose, cellobiose, ethanol, and biomass. The cellulose hydrolysis model involves parameters and constants related to the nature and dosage of the enzyme (e.g., eT, KD), the enzyme-cellulosic substrate interaction (k*max,1, kmax,2, k*max,3, KL, Km), the enzyme-glucose interaction (K1,G, K2,G), and the enzyme-ethanol interaction (K1,EtOH, K2,EtOH). Moreover, the glucose fermentation model takes into account parameters describing cell growth (µmax, KG), substrate consumption (Kiy,EtOH), and enzyme-yeast interaction (Yxg, ms). Thus, differences between the parameter values obtained here and those presented by other authors may be explained both by the presence of local minima due to the nonlinearity of the model and by variations in the structural features of the pretreated sugarcane bagasse, variations in the enzymatic mixture (i.e., the addition of pure β-glucosidase to the enzymatic complex GC-220), and variations in the performance of Saccharomyces cerevisiae under the specific conditions of the hydrolysis used in this case study. The accuracy of the estimated parameters depends not only on the frequency and accuracy of the measurements but also on the selection of the measured process variables. Two key variables of the hydrolysis stage were measured (i.e., glucose and cellobiose concentrations), whereas only one key variable of the fermentation stage was measured (i.e., ethanol concentration). According to the experimental fits, the enzymatic hydrolysis behavior of the experimental data is described better by the model than the fermentative behavior. Consequently, in the fermentation stage, the residuals are larger in the time region in which the microorganisms were added. This may be explained by the limited experimental information on the fermentation process: there is no measurement of the biomass concentration, which would contribute significantly to the parameter identifiability and might be considered as an additional response variable in future work. Therefore, the fermentation-related parameters were not identifiable and could not be estimated properly.

Finally, besides the limited experimental information mentioned above, a more detailed modeling of the fermentative and hydrolytic processes (e.g., including mass transfer phenomena and additional inhibition factors not explored here) should be considered in order to describe the real behavior of the process variables more accurately. In this application, the fact that ten parameters were fixed might be explained not only by parameter correlations (nonidentifiability) and weak effects on the predicted variables (low sensitivities) but also by the common conditions of both experiments (E2 and E3) displayed in Table 6.1. In contrast, the four identifiable parameters are related to the variability of the predicted data for E2 and E3 and can be used to improve the fit of the model predictions to new experimental conditions, reflecting the high sensitivity and independence of this selected subset.

The identified model was the result of a model selection step in which different literature sources and two different experimental data sets were analyzed. The prediction capacity of this model for new experimental conditions was assessed in Section 6.3.5. Differences between some of the model output predictions and the measured variables were evident in Figure 6.9. These deviations were expected, taking into account that the experimental conditions of the new experiments E1, E4, and E5 lay in regions not covered by the experiments used in the parameterization (see E2 and E3 in Table 6.1). However, proper fits were also shown in Figure 6.9, indicating that the identified model could still predict new experimental conditions well (e.g., the fit of experiment E5). In Section 6.3.5, only limited improvements of the data fit were obtained when parameter estimation problems with Nθ = 14 were solved instead of problems with Nθ = 4. Totally different parameter vectors, a large increase in computation times, and ill-conditioned matrices were the features of each PE with Nθ = 14. According to this evaluation, the four identifiable parameters had the ability to adapt to new experimental conditions; therefore, only they should be re-estimated for a new experiment, keeping the other (unidentifiable) parameters fixed at their previous estimates.


6.4. Summary and Conclusions


In this chapter, an SSF process model for bio-ethanol production from sugarcane bagasse was identified and critically evaluated. A systematic identification methodology was presented, including an iterative parameter estimation with structural analysis (i.e., ill-conditioning and identifiability) to determine the kinetic parameters of interest. The strategy to estimate and analyze the parameters was based on the sensitivity method. Model structure, the selection of parameter initial guesses, and the evaluation of estimator performance from a statistical point of view were also investigated.

SSF models proposed in literature by Refs. [35] and [97] are generic formulations for

the hydrolysis and fermentation process of any cellulose-enzyme-microorganism system.

However, the stated parameter values were identified for a specific cellulosic substrate,

enzymatic complex preparation, and microorganisms. Moreover, different phenomena were

studied independently in order to estimate individual parameters or parameter subsets.

The quality and concentration of the specific cellulosic substrate, enzyme dosage, and

mode of substrate-enzyme interactions play a dominant role for the performance of the

SSF process. Thus, it was necessary to evaluate the adequacy of literature models and

to adapt them (neglecting or adding terms) to the specific characteristics of the precise

cellulosic substrate-enzyme-microorganisms system (i.e., pretreated sugarcane bagasse /

GC-220 and β-glucosidase / Saccharomyces cerevisiae system).

The application of the framework in Chapter 4 proved to work efficiently in detecting and dealing with a strongly over-parameterized model. After successful termination, the number of considered parameters was reduced to a relatively small subset of the original parameter space in order to regularize the ill-posed problem. Thus, the most influential parameters for the selected operating conditions were identified and their uncertainty was significantly decreased.

The assessment of the identifiability of the original parameter vector (with fourteen elements) revealed that a maximum of four model parameters were identifiable (i.e., µmax, kmax,2, Km, K1,G) from the data used (experiments E2 and E3). The uncertainties in the identified parameters, expressed by their relative standard deviations, were decreased from 537% to 22%, from 319% to 28%, from 35% to 0.07%, and from 41% to 30% for µmax, kmax,2, Km, and K1,G, respectively. These large reductions were also reflected in smaller confidence intervals compared to the original parameter vector (e.g., a reduction of the confidence interval from −99.7 ≤ µmax ≤ 120 to 0.97 ≤ µmax ≤ 2.52).

Due to differences between the cellulosic substrate-enzyme-microorganism system used in the literature and the one used here, the use of literature parameters as initial guesses for the parameter estimation was not successful. Instead, a data collection plan for sampling different parameter guesses within a prescribed parameter range was applied (i.e., MBLHD).

It has been shown by cross-validation that the identified model is able to predict different operating conditions well. The identified model properly fitted two different experimental data sets (E2 and E3) with some shared experimental conditions. A cross-validation using three new experimental data sets (E1, E4, and E5) conducted in different bioprocess laboratories (Federal University of Rio de Janeiro, Brazil, and University of Antioquia, Colombia) was also carried out. The identified model proved to be able to predict the behavior under new experimental conditions with different initial cellulose concentration, pre-hydrolysis time, GC-220 and β-glucosidase enzyme loads, and initial yeast concentration. The identifiable subset captured the data variability for the new experimental conditions. Moreover, it has been shown that the identifiable parameter subspace should also be used to adapt the model to different experimental conditions. The corresponding parameter estimation problems could be solved efficiently, as they were of reduced order and well-conditioned, while giving nearly the same prediction error as if all parameters were considered.


7. More cases from bioprocessing: the effect of ill-posed parameter estimation on optimal experimental design

7.1. Abstract

¹ The previous Chapters 5 and 6 have already considered parameter estimation for ill-posed problems. They have also explained the detection and mitigation of the associated structural problems, i.e., ill-conditioning and lack of identifiability, by using several techniques. This mitigation has been achieved by increasing the quantity and quality of the experimental information and by a proper selection of the parameter initial guess. However, they have neither dealt with numerical strategies to stabilize the solution nor analyzed the effect of those strategies on optimal experimental design. To address this, this chapter discusses different numerical approaches for handling nonlinear ill-posed problems within the offline parameter estimation and optimal experimental design framework of Chapter 4.

Literature models of increasing complexity for three bioprocesses, namely a semi-continuous (fed-batch) fermentation [2], a reconstruction of biochemical networks [71], and a biological system for water treatment [16, 65], are used to illustrate the effects of ill-posed problems on nonlinear parameter estimation and optimal experimental design. The combination of the ill-conditioning analysis by the sensitivity method (see Section 4.4.1) and the local identifiability analysis by the SVD method (see Section 4.4.2) to diagnose identifiable parameters is detailed.

The case studies are classified as either rank-deficient or of ill-determined rank [52, 78]. A link is established between the measures of the presented analysis and the common alphabetic design criteria in OED, and information for the solution of ill-posed OED problems is derived. It is important to point out that this analysis in the context of OED had not been available in the literature until the publication of the peer-reviewed article in Ref. [78] (Publication III), on which this chapter is based. Moreover, regularization techniques for handling nonlinear ill-posed PE problems are discussed from the point of view of singular value analysis, namely orthogonal decomposition based techniques (i.e., SsS and TSVD) [19, 45, 52, 79, 83, 131] and the Tikhonov strategy [4, 48, 49, 52, 63, 118]. These techniques are then applied to ill-posed OED problems. To avoid the computationally demanding bi-level optimization approach addressed by the authors of Refs. [49, 59], only the variance contribution of the biased estimator is considered [3, 38].

¹ The content of this chapter is reprinted (adapted) with permission from (D. C. Lopez C., T. Barz, S. Korkel, and G. Wozny. Nonlinear ill-posed problem analysis in model-based parameter estimation and experimental design. Computers & Chemical Engineering, 77:24-42, 2015). Copyright (2015) Elsevier. (Publication III in Appendix A.2)

Finally, two new sections are included in this chapter whose content has not been published or used to prepare a new peer-reviewed article. This content is marked here as "(New)". The first new section (Section 7.3.5) shows the evolution of ill-conditioning after adding new experimental information; the benefit of informative experimental data for ill-conditioning is demonstrated. The second new section (Section 7.3.6) presents Monte Carlo studies carried out to test the efficacy of optimal designs computed from ill-posed and from regularized parameter estimations. The efficacy is measured in terms of estimator performance and the new states of ill-conditioning and identifiability of the parameter estimation.

7.2. Applications

The effect of ill-posed parameter estimations on optimal experimental designs is studied

for three different case studies E1, E2, E3. These applications are derived from dynamic

systems from bioprocessing, being E1: a semi continuous (fed-batch) fermentation reactor

[2]; E2: a biochemical growth model in a reactor [71]; and E3: a sequencing batch reactor

for water treatment [65]. The degree of complexity of each example increases according

to its number of parameters and its state of ill-posedness of the PE from E1 to E3. The

starting information (i.e., model, experimental design uIG, parameter initial guess θIG and

true parameters θ∗) is taken from literature when available or defined in this study. For

instance, convergence tests on the PE are used to select the values of θIG. A brief problem

overview for each case study is given in Table 7.1. Model details for E1, E2 and E3 are

described in Refs. [2], [71] and [65], respectively.

Table 7.1.: Problem description for case studies E1, E2 and E3. (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).

Case studies:
  Example E1 (variants E1, E1a, E1b, E1c): Fed batch fermentation (see Asprey and Macchietto, 2002)
  Example E2: Biochemical network (see Kremling et al., 2004)
  Example E3: Activated Sludge Model (ASM3) (see Kaelin et al., 2009)

State variables, x ∈ R^Nx:
  E1, E1a, E1b, E1c: (x1, x2)
  E2: (V, B, S, M1, M2, M3)
  E3: (SO, SS, SNH, SNO2, SNO3, SN2, SALK, SI, XI, XH, XS, XSTO, XAOB, XNOB, XTSS)

Measured state variables, y ∈ R^Ny:
  E1: (x1)
  E1a, E1b, E1c: (x1, x2)
  E2: (B, S, M1, M2, M3)
  E3: (SO, SS, SNH, SNO2, SNO3, SALK, XSTO)

Data sampling time grid, t ∈ R^Nm:
  E1: (2:2:20)
  E1a: (2:2:20)
  E1b: (0.5:0.5:20)
  E1c: (0.25:0.25:20)
  E2: (2:2:60)
  E3: (0.002:0.002:0.2) (a)

Initial conditions, x(t0) = x0:
  E1: (5.5, 0.1)
  E2: (1.0, 0.1008, 1.9134, 0.0620, 0.0079, 0.0749)
  E3: (0, 1000, 20, 20, 20, 0, 5, 0, 0, 200, 200, 50, 20, 20, 1616.7)

Input action variables, u ∈ R^Nu:
  E1: (u1, u2)
  E2: (qin, qout, cin)
  E3: u1

Input action time grid, t ∈ R^Nmu:
  E1: (2 14 20)
  E2: (20 30 60)
  E3: (0.02:0.02:0.2) (a)

Input action initial design, ∈ R^(Nu·Nmu):
  E1: u = (0.12, 0.12, 0.12), u = (15, 15, 15)
  E2: u = (0.25 0.35 0.35), u = (0.25 0.35 0.35), u = (2.0 2.0 0.5)
  E3: u = (0.05, 1.0, 0.05, 0.3, 0.5, 1.0, 0.7, 0.2, 0.05, 1.0)

Parameters, θ ∈ R^Np:
  E1: θi, i = 1, …, 4
  E2: θi, i = 1, …, 10: (r1max, Ks, K2, KM1, r3max, KM2, Ksynmax, KIB, Yxs, KIA). In contrast to Kremling, 2004, where some parameter values are assumed to be exactly known to overcome identifiability issues, in this work all parameters are considered as unknown.
  E3: θi, i = 1, …, 44: (iNSS, iNXI, iNXS, iNBM, fXI, YHO2, YHNO3, YHNO2, YSTOO2, YSTONO3, YSTONO2, YAOB, YNOB, KH, ksto, µH, µAOB, µNOB, bHO2, bSTOO2, bAOB, bNOB, ηHNO3, ηHNO2, ηHendNO3, ηHendNO2, ηNend, KX, KHO2, KHO2inh, KHSS, KHNH4, KHNO3, KHNO2, KHALK, KHSTO, KAOBO2, KNOBO2, KAOBNH4, KNOBNO2, KNALK, T, Kla20, SOSAT)

Parameter initial guess, θ ∈ R^Np:
  E1: (0.5 0.5 0.5 0.5)
  E2: (7200, 0.133119, 1667700, 3.66, 900000, 3, 0.00246, 0.0498, 0.0000211, 300) (a)
  E3: (0.012, 0.016, 0.012, 0.028, 0.08, 0.32, 0.26, 0.26, 0.32, 0.28, 0.28, 0.072, 0.024, 3.6, 4.8, 1.2, 0.36, 0.26, 0.12, 0.12, 0.06, 0.088, 0.06, 0.06, 0.1, 0.14, 0.04, 0.4, 0.08, 0.08, 4.0, 0.004, 0.2, 0.2, 0.04, 0.04, 0.32, 0.32, 0.056, 0.112, 0.2, 8.0, 400, 41853) (a)

Differential equation system:
  E1: DAE converted to ODE
  E2: DAE converted to ODE
  E3: DAE converted to ODE

(a) Defined by authors



What follows is a numerical analysis of the different nonlinear discrete ill-posed problems

E1, E2, E3 referenced in Section 7.2. The link between identifiability problems and ill-

conditioning is established and general problems concerning the solution of OED for ill-

posed PE using conventional alphabetic design criteria are discussed. The application of

different regularization techniques and the solution of the corresponding modified OED

problems are examined.

[Flow diagram. Inputs: case study (E1 | E2 | E3), regularization technique Reg (None | SsS | TSVD | Tikh), design criterion crit (A | D | E), and starting information (model, θ, initial design). At the initial design, the sensitivity matrix and the parameter covariance matrix are evaluated in original and regularized form, together with the ill-conditioning analysis by the sensitivity method and the identifiability analysis by the QR method. The OED problem (arg min of Ψ over u) is then solved, and the same original and regularized matrices are evaluated at the optimal design.]

Figure 7.1.: Main procedure and nomenclature of Section 7.3. (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).

The optimal design ucrit is obtained from the solution of the OED problem (see Eq. 2.35) for different design criteria crit = A, D, E according to Eqs. 2.36-2.38 and different regularization techniques Reg = None, SsS, TSVD, Tikh (see Table 3.1). The regularization is carried out for the initial design uIG, and the SVs of the regularized sensitivity matrix SReg(uIG) are computed. They are then compared with the SVs obtained for the solution of the regularized OED, i.e., those of the regularized sensitivity matrix SReg(ucrit) evaluated at the optimal design ucrit. The ill-conditioning analysis is conducted using the sensitivity method of Section 4.4.1, whereas the identifiability diagnosis is performed using the three techniques described in Section 4.4.2, namely the variance, SVD, and QR methods.
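As a rough illustration of how the singular values enter the alphabetic criteria, the sketch below evaluates A-, D- and E-type measures directly from a singular value spectrum. It assumes the common textbook setting in which the parameter covariance matrix is approximated by C = (SᵀS)⁻¹ with unit measurement variance, so that the eigenvalues of C are 1/ςi²; Eqs. 2.36-2.38 are not reproduced here and may use a different scaling. The function name `alphabetic_criteria` and the two intermediate singular values are hypothetical; only the largest and smallest values correspond to the E1 spectrum reported in Section 7.3.2.

```python
import math


def alphabetic_criteria(singular_values):
    """A-, D- and E-type criteria of C = (S^T S)^-1 from the singular values of S.

    Assumption: unit measurement variance, so eig(C) = 1 / sigma_i**2.
    """
    eig_c = [1.0 / s**2 for s in singular_values]  # eigenvalues of C
    a_crit = sum(eig_c)        # trace(C)
    d_crit = math.prod(eig_c)  # det(C)
    e_crit = max(eig_c)        # largest eigenvalue of C
    return a_crit, d_crit, e_crit


# First and last entries follow the E1 spectrum; the middle two are hypothetical.
svs = [12.407, 2.0, 0.5, 0.031912]
a_crit, d_crit, e_crit = alphabetic_criteria(svs)

# Share of the A-criterion that is caused by the smallest singular value alone.
share_of_smallest = e_crit / a_crit
print(f"A = {a_crit:.4g}, D = {d_crit:.4g}, E = {e_crit:.4g}")
print(f"share of smallest SV in A: {share_of_smallest:.3f}")
```

Under these assumptions the smallest singular value dominates both the A- and the E-type measure, which is one way to see why the criteria become unreliable when S is ill-conditioned.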

The remainder of this chapter is organized according to the flow diagram depicted in Figure 7.1. In Section 7.3.2, the original state of ill-conditioning and identifiability for each case study E1, E2, E3 at the initial design uIG is established using the sensitivity matrix S(uIG). In Section 7.3.3, the new state of conditioning and identifiability after the application of OED without regularization is shown at the optimal design ucrit using S(ucrit). In Section 7.3.4, the application of the different regularization methods is discussed.

Two new unpublished sections are included at the end of this chapter. Section 7.3.5 addresses the effect of experimental data on the ill-conditioning of the sensitivity matrix, whereas Section 7.3.6 presents the results of Monte Carlo studies that illustrate the effect of using ill-posed and regularized optimal designs on new parameter estimations.


All numerical computations are performed on an Intel(R) Core(TM)2 (CPU 6600 @ 2.40 GHz) computer with 4 GB RAM. Parallel programming is not used. Model and parameter sensitivity equations are integrated by CVODES from SUNDIALS (Hindmarsh et al., 2005). Parameter estimation and optimal experimental design problems are solved using MATLAB Release 2013a (The MathWorks Inc., Natick, Massachusetts, United States). Parameter estimations are solved with the "lsqnonlin" (nonlinear least squares) function using the trust-region-reflective algorithm, whereas regularized estimations by Tikhonov² and OED problems are solved with the "fmincon" (constrained nonlinear multivariable) function using the interior-point algorithm. Singular values are computed with the "svd" function, and the eigenvalue system is solved with the "eig" function.

7.3.2. Ill-conditioning and identifiability diagnosis

This section summarizes results from the ill-conditioning analysis for case studies E1, E2

and E3 at the initial design uIG using the sensitivity matrix S(uIG). The classification of

the ill-conditioning and the selection of singular values generating a well-posed problem

are described. The maximum values in Table 7.2 are used in order to determine the ϵ-

threshold and to define the numerical rank rϵ as explained in Section 4.4.1. Notice that

γmax(S) = γmax(S) · σ and κmax(S) = κmax(S). Moreover, in this section for the sake of

simplicity κ(S) and γ(S) are referred to as κ and γ, respectively.

E1 - Fed Batch Fermentation

Ill-conditioning analysis. Figure 7.2a contains the singular value spectrum of S(uIG), spanning from ς4 = 3.1912 × 10^-2 to ς1 = 1.2407 × 10^1. The condition number and the collinearity index are κ = 3.8879 × 10^2 and γ = 3.1376 × 10^1, respectively, and this problem is categorized as rank-deficient. Note that only the collinearity criterion is violated, namely γ > γmax, because the singular value ς4 is relatively close to zero; thus S is ill-conditioned. The effect of including new experimental information on ill-conditioning is further analyzed in Section 7.3.5.

² With the regularized sensitivity matrix STikh = [S; λ2L], i.e., S stacked on top of λ2L.

Table 7.2.: Thresholds for condition number (κmax) and collinearity index (γmax). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).

Problem   κmax   γmax
E1        1000   10
E2        1000   15
E3        1000   15

The ϵ-threshold is defined by γmax, i.e., ϵ = ϵγ according to Eq. 3.3. As can be seen in Figure 7.2a, ς4 does not satisfy this threshold (ς4 < ϵ); therefore, the numerical rank is rϵ = 3.

Identifiability analysis. When the identifiability is analyzed by the variance method of Section 4.4.2, the conclusion is that all parameters should be classified as unidentifiable because they have inflated variances, which should be analyzed carefully. The variances of all parameters are large and can be seen in Table A.1 of Appendix A.3. For instance, the variance of the most uncertain parameter θ2 is 74 times the variance of the most precise parameter θ4, i.e., var(θ2) = 74 var(θ4).

According to the variance decomposition in Table A.1 of Appendix A.3, obtained by applying the SVD method of Section 4.4.2, the ill-conditioned singular value ς4 contributes more than 73% to the variance of all parameters. This contribution exceeds the predefined threshold (πmax = 50%); consequently, this method also suggests that θ is practically unidentifiable, with all parameters classified as unidentifiable.
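The variance-decomposition check can be sketched as follows, assuming the usual definition from collinearity diagnostics: with the SVD S = UΣVᵀ, the variance of θj is proportional to the sum over i of vji²/ςi², and the proportion attributed to singular value i is that term divided by the sum. The definition in Section 4.4.2 is not reproduced here and may differ in detail. The matrix V, the singular values and the function name `variance_decomposition` are hypothetical illustration data, not results from Table A.1.

```python
import math


def variance_decomposition(V, singular_values):
    """Return pi[j][i]: share of var(theta_j) attributed to singular value i.

    V holds the right singular vectors of S as a list of rows
    (rows: parameters, columns: singular directions).
    """
    proportions = []
    for row in V:
        phi = [v**2 / s**2 for v, s in zip(row, singular_values)]
        total = sum(phi)
        proportions.append([p / total for p in phi])
    return proportions


# Hypothetical two-parameter example: V is a rotation by 30 degrees.
angle = math.radians(30.0)
V = [[math.cos(angle), -math.sin(angle)],
     [math.sin(angle), math.cos(angle)]]
svs = [10.0, 0.05]  # hypothetical: one large and one small singular value
pi_max = 0.5        # threshold for the proportion, as used in the text

pi = variance_decomposition(V, svs)
# Parameters whose variance is dominated by the smallest singular value.
flagged = [j for j, row in enumerate(pi) if row[-1] > pi_max]
print(flagged)
```

In this hypothetical example both parameters load on the small singular value, so both are flagged, which mirrors the situation described for E1 where a single ill-conditioned singular value dominates the variance of all parameters.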

By using the results of the ill-conditioning analysis and after applying the QR method

for identifiability in Section 4.4.2, only the parameter θ2 is unidentifiable and the model

is then considered practically unidentifiable.

E2 - Biochemical network

Ill-conditioning analysis. Figure 7.2b contains the singular value spectrum of S(uIG), spanning from ς10 = 2.0383 × 10^-6 to ς1 = 6.4266 × 10^1, with κ = 3.1529 × 10^7 and γ = 4.9061 × 10^5. With these large ill-conditioning and collinearity measures (see Section 3.2.1), the matrix S(uIG) is categorized as ill-conditioned (κ > κmax and γ > γmax), and thus this problem is a rank-deficient ill-posed problem. Notice that there is a well-determined gap in the SVs of S(uIG) between large (ς1-ς7) and small (ς8-ς10) singular values, see Figure 7.2b. In this case, γmax is the limiting condition for the ϵ-threshold, i.e., ϵ = ϵγ (see Eq. 3.3), and only the first seven singular values are well-conditioned because ςi ≥ ϵ for i = 1, ..., 7.

Figure 7.2.: Singular value spectrum (SVs) of the sensitivity matrix evaluated at the initial design S(uIG) for problem (a) E1, (b) E2 and (c) E3. Each singular value less than the ϵ-threshold is considered ill-conditioned. Axes: singular value ςi (logarithmic scale) vs. singular value index i; the thresholds ϵκ (lower bound κ) and ϵγ (lower bound γ) are marked. (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
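The diagnosis reported for E2 can be retraced from the two extreme singular values. The sketch below assumes κ = ς1/ςmin and γ = 1/ςmin, which reproduces the values quoted for E2 up to rounding. The threshold formulas ϵκ = ς1/κmax and ϵγ = 1/γmax are assumptions made for illustration (Eq. 3.3 is not reproduced here), and all intermediate singular values are hypothetical, chosen only to mimic the gap between ς7 and ς8. The function name `diagnose` is likewise illustrative.

```python
def diagnose(singular_values, kappa_max, gamma_max):
    """Condition number, collinearity index, epsilon-threshold and numerical rank."""
    s_max, s_min = max(singular_values), min(singular_values)
    kappa = s_max / s_min            # condition number
    gamma = 1.0 / s_min              # collinearity index (assumed definition)
    eps_kappa = s_max / kappa_max    # lower bound implied by kappa_max (assumption)
    eps_gamma = 1.0 / gamma_max      # lower bound implied by gamma_max (assumption)
    eps = max(eps_kappa, eps_gamma)  # the stricter bound is the limiting one
    rank = sum(1 for s in singular_values if s >= eps)
    return kappa, gamma, eps, rank


# Only the first and last values are taken from the text; the rest are hypothetical.
svs_e2 = [6.4266e1, 3.0e1, 1.0e1, 5.0, 2.0, 1.0, 5.0e-1,
          1.0e-3, 1.0e-5, 2.0383e-6]

kappa, gamma, eps, rank = diagnose(svs_e2, kappa_max=1000, gamma_max=15)
print(f"kappa = {kappa:.4e}, gamma = {gamma:.4e}, eps = {eps:.4f}, rank = {rank}")
```

With the thresholds of Table 7.2 for E2, the bound derived from γmax is slightly larger than the one derived from κmax under these assumptions, which is consistent with the statement that γmax is the limiting condition.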

Identifiability analysis. Analyzing the parameter variances obtained by the variance method of Section 4.4.2, the most precise parameter is θ9 with σ_9^2 = 0.15, which is the only parameter with a variance of less than 1 in this case study, and the most imprecise parameter is θ10 with σ_10^2 = 2.4066 × 10^11 (see Table A.2 of Appendix A.3 for more details on the variances). Furthermore, other parameters also have variances larger than 2.6 × 10^7, i.e., θ5 and θ6. With these inflated variances, any parameter might be considered identifiable regardless of the value of the variance threshold ρ.


The effect of the ill-conditioned singular values can be observed more clearly with the SVD method of Section 4.4.2, which is based on the variance decomposition. Therein, the ill-conditioned singular values ς8, ..., ς10 contribute more than 50% to the variance of all parameters except θ4 and θ9. Thus, these eight parameters are preliminarily selected as practically unidentifiable.

A further analysis is performed using the QR method of Section 4.4.2, which is the basis of the SsS regularization (see Section 3.3.1). This method finds seven practically identifiable parameters because rϵ = 7 for S(uIG). The identifiable parameters are θ1, θ3, θ5, θ9, θ7, θ4, θ2. In all analyses, the model is considered practically unidentifiable.

E3 - ASM3

Ill-conditioning analysis. Figure 7.2c contains the singular value spectrum of S(uIG), spanning from ς44 = 2.8427 × 10^-16 to ς1 = 4.3461. Compared to E1 and E2, problem E3 shows the largest condition number and collinearity index, with κ = 1.5289 × 10^16 and γ = 3.5178 × 10^15. The singular values gradually decay to zero, which indicates an ill-determined rank problem. Hence, E3 has the highest severity of ill-posedness. Only the first seven singular values (i.e., ςi for i = 1, · · · , 7) correspond to a well-conditioned matrix (rϵ = 7). The selection of these well-conditioned singular values is based on ϵ = ϵγ.
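The conditioning measures quoted above can be reproduced from the extreme singular values. The sketch below assumes κ = ς1/ς_min and γ = 1/ς_min, which is consistent with the reported numbers, and a threshold of the form ϵ = max(ς1/κmax, 1/γmax). The values κmax = 1000 and γmax = 15 are assumptions made for illustration only (the thresholds of Table 7.2 are not reproduced here), and the four-value spectrum is hypothetical.

```python
def conditioning_summary(singular_values, kappa_max=1000.0, gamma_max=15.0):
    """Condition number, collinearity index, epsilon-threshold and numerical rank.
    kappa_max and gamma_max are assumed values, not taken from Table 7.2."""
    s = sorted(singular_values, reverse=True)
    kappa = s[0] / s[-1]                          # condition number
    gamma = 1.0 / s[-1]                           # collinearity index
    eps = max(s[0] / kappa_max, 1.0 / gamma_max)  # max(eps_kappa, eps_gamma)
    rank = sum(1 for v in s if v >= eps)          # well-conditioned singular values
    return kappa, gamma, eps, rank

# Extreme singular values reported for E3 at the initial design.
kappa_E3, gamma_E3, _, _ = conditioning_summary([4.3461, 2.8427e-16])
print(f"kappa = {kappa_E3:.4e}, gamma = {gamma_E3:.4e}")

# Hypothetical spectrum to illustrate the numerical rank.
_, _, eps_demo, rank_demo = conditioning_summary([10.0, 1.0, 0.1, 0.01])
print(f"eps = {eps_demo:.4f}, numerical rank = {rank_demo}")
```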

Identifiability analysis. As expected, this model is categorized as practically unidentifiable by all techniques of Section 4.4.2. The variance of the most precise parameter is σ_14^2 = 1.23 × 10^6 (see Table A.3 of Appendix A.3). All parameter variances are highly influenced (see footnote 3) by the ill-conditioned singular values and therefore none of them is reliable. Moreover, the parameter variances are inflated, masking the individual identifiability of parameters and making a preliminary parameter selection impossible. Thus, a more elaborate analysis is applied to determine the identifiable parameters according to the QR method in Section 4.4.2. With this method seven parameters, θ23, θ44, θ15, θ12, θ16, θ14 and θ8, are considered identifiable. Further effects of this severe ill-posedness on computations of OED criteria without regularization are discussed in Section 7.3.3.

As in E1 (Section 7.3.2), in E3 the presence of small singular values (defined by ϵ = ϵγ) strongly reduces the number of well-conditioned singular values and is an indication of near-linear dependencies rather than of numerical instabilities (see Section 3.2.1). Moreover, if those small singular values are considered well-conditioned, the serious consequences are overestimation of parameter variances, erroneous uncertainty quantification and improper results of the identifiability analysis.

Footnote 3: In Table A.3 of Appendix A.3, 21 parameters have individual variance-decomposition proportions greater than πmax = 0.5 corresponding to the most ill-conditioned singular values π40, · · · , π44. Nonetheless, the complete set of ill-conditioned singular values π8, · · · , π44 contributes almost totally to the variance of the 44 parameters.

7.3.3. Optimal design without regularization

In this section the results of the optimal design without regularization are discussed. For all problems E1-E3, the application of the A-, D- and E-criteria yields the expected characteristic changes in the spectrum according to Section 2.7.2. Generally, the A- and E-criteria focus on the lower part of the SVs, whereas the D-design mostly raises the top section of the spectrum. The values in Table 7.2 are used to define the ϵ-threshold.

E1 - Fed Batch Fermentation

For E1, the A- and E-designs reach almost the same optimal experimental solution and thus almost the same SVs (see Figure 7.3a), which shows a lift of the bottom section. This indicates a variance reduction for all parameters because of the increase of the smallest singular values (i.e., ς4(S(uA)) = 1.6735 × 10^-1 and ς4(S(uE)) = 1.6775 × 10^-1). For instance, the variance of the most uncertain parameter θ2 is substantially reduced with respect to the most precise parameter θ4 (i.e., σ_2^2 = 8.6 σ_4^2). The A- and E-designs also yield the spectra with the smallest collinearity index, γ(S(uA)) = 5.9756 and γ(S(uE)) = 5.9611, respectively. The ill-conditioned singular values are completely removed from the problem according to the ϵ-threshold (see in Figure 7.3a that neither the spectrum of S(uA) nor that of S(uE) is intercepted by the bound ϵ). On the contrary, for the D-design the ill-conditioning remains, with only three singular values being well-conditioned (see Figure 7.3b). Nevertheless, the lift of the SVs observed in this figure (especially in the bottom section) promotes parameter variance reduction. All parameters have at least 50% variance reduction (e.g., the most precise parameter θ1 has 88% variance reduction). Unfortunately, the most uncertain parameter θ2 retains a large variance (σ_2^2 = 43 σ_1^2) and is still considered unidentifiable.

E2 - Biochemical network

Figures 7.4a-b show the change in the SVs of the A- and E-optimal designs for E2, respectively. In Figure 7.4a it can be observed that the A-design generates a lift of the whole spectrum, although the main elevation is seen in the bottom section. This indicates a variance reduction for all parameters. For the E-design in Figure 7.4b only the bottom section of the spectrum is lifted. Indeed, the A-design yields the best results in terms of condition number and collinearity index, with κ(S(uA)) = 3.0198 × 10^6 and γ(S(uA)) = 1.7995 × 10^4, which correspond to ς10 = 5.5570 × 10^-5 and ς1 = 1.6781 × 10^2. However, those values still indicate ill-conditioning of S(uA) as κ > κmax and γ > γmax, and thus there exist unidentifiable parameters. The number of well-conditioned singular values remains seven (ς1 to ς7), the same as for the initial design. For the A-design, κmax is the limiting condition of the ϵ-threshold (see Eq. 3.3), i.e., ϵ = ϵκ. In Figure 7.4a, the A-criterion (average variance, see Eq. 2.36) is reduced from ΨA(uIG) = 2.4071 × 10^10 to ΨA(uA) = 3.2393 × 10^7. For the A-design, the most precise parameter θ1 has a variance of 2.6323 × 10^-3 whilst the most imprecise parameter θ10 has 3.2383 × 10^8. Moreover, parameters θ5 and θ6 also have enormous variances larger than 5 × 10^4. For θ3, θ4, θ5, θ6, θ7, θ8 and θ10 the ill-conditioned singular values ς8 to ς10 contribute more than 65% to their variance.

[Figure 7.3: singular values ςi (logarithmic axis) versus singular value index 1 to 4. Panel (a) shows the spectra None-IG, None-A and None-E; panel (b) shows None-IG and None-D with the ill-conditioned ςi marked. The bound ϵ = ϵγ is drawn in both panels.]

Figure 7.3.: Change in the singular value spectrum (SVs) of the sensitivity matrix for the OED without regularization. Results are shown for the sensitivity matrix at initial design S(uIG) and at optimal A-design S(uA) and E-design S(uE) for problem E1. (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).

For the E-optimal design the results are not better. This is mainly because of the (numerical) difficulty in minimizing the large value of the design criterion caused by the smallest singular values being close to zero. Note that ΨE(uIG) = 1/ς_10^2 = 2.4069 × 10^11. Accordingly, the E-criterion could not be reduced beyond ΨE(uE) = 3.1409 × 10^9, and the SVs of S(uE) varies from ς10 = 1.7843 × 10^-5 to ς1 = 6.7548 × 10^1, still indicating an ill-conditioned matrix S(uE). With this design eight well-conditioned singular values are found. After the optimal design, the most precise parameter θ1 has a variance of 1.0304 × 10^-1 whilst the most imprecise parameter θ10 has 3.1407 × 10^9 according to Eq. 3.5. Moreover, parameters θ5 and θ6 also have enormous variances larger than 1 × 10^5. For θ5, θ6 and θ10 the ill-conditioned singular values ς9 and ς10 contribute more than 99.9% of their variance. In general, there is an improvement in parameter variance because of the increase of the small singular values. However, the E-design also does not produce a well-posed problem, i.e., a problem free of identifiability problems.

[Figure 7.4: singular values ςi (logarithmic axis) versus singular value index 1 to 10. Panel (a) shows the spectra None-IG and None-A; panel (b) shows None-IG and None-E. The bound ϵ = ϵκ and the ill-conditioned ςi are marked in both panels.]

Figure 7.4.: Change in the singular value spectrum (SVs) of the sensitivity matrix for the OED without regularization. Results are shown for the sensitivity matrix at initial design S(uIG) and at optimal design for problem E2 with: (a) A-criterion S(uA), and (b) E-criterion S(uE). Note that the shown lower bound ϵ is computed for S(ucrit). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).

E3 - ASM3

The severe ill-posedness of problem E3 (see Section 7.3.2) makes a reliable computation of an OED impossible. The computation of the eigensystem of F(uIG) and C(uIG) does not give reasonable results. For F(uIG) one negative eigenvalue is obtained, whereas eight negative eigenvalues are found for C(uIG). The reasons are numerical errors (related to the machine precision) originating from matrix operations (matrix product in Eq. 2.10 and inversion in Eq. 4.3) and from the solution of the ill-conditioned matrix eigensystem. The computed negative eigenvalues are clearly spurious, since the matrices F and C are positive semi-definite.
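The loss of accuracy caused by forming the matrix product before computing the spectrum can be illustrated numerically. The sketch below uses a hypothetical ill-conditioned matrix (a Vandermonde matrix, not one of the thesis problems) and assumes an unweighted product F = S^T S. It compares the eigenvalues of F with the squared singular values of S, which coincide in exact arithmetic.

```python
import numpy as np

# Hypothetical ill-conditioned matrix: 20 x 14 Vandermonde matrix on [0, 1].
t = np.linspace(0.0, 1.0, 20)
S_demo = np.vander(t, 14, increasing=True)

sv = np.linalg.svd(S_demo, compute_uv=False)          # singular values of S
eig_F = np.linalg.eigvalsh(S_demo.T @ S_demo)[::-1]   # eigenvalues of F, descending

# Singular values are non-negative by construction. The small eigenvalues of F
# are only accurate to roughly machine precision times the largest eigenvalue,
# so they can be badly wrong in relative terms and may even come out negative.
abs_gap = np.abs(eig_F - sv ** 2)
print("smallest squared singular value:", sv[-1] ** 2)
print("smallest eigenvalue of F:       ", eig_F[-1])
print("number of negative eigenvalues: ", int(np.sum(eig_F < 0.0)))
```

Working with the singular values of S avoids squaring the condition number, which is one reason to prefer them when the matrix is close to singular.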

Still, while a D-criterion value cannot be computed because of an error in the applied algorithm, A- and E-criterion values can be obtained. To illustrate the magnitude of ill-conditioned OED criteria, the singular values ςi for i = 1, · · · , Nθ of the sensitivity matrix are used to compute the A- and E-criteria according to Eqs. 2.36 and 2.38, respectively. Those values read ΨA(uIG) = 2.8 × 10^29 and ΨE(uIG) = 1.2 × 10^31, respectively. It has to be noted that the same criteria have values of ΨA(uIG) = 1.8 × 10^16 and ΨE(uIG) = 8.8 × 10^17 when the eigenvalues ςi for i = 1, · · · , Nθ of C(uIG) are taken into consideration. In general, criteria computed from the singular values of S are by definition more reliable. They also capture better the high singularity of the matrix C (e.g., presence of singular values near the machine precision, i.e., 10 × 10^-16). It is important to note that under these circumstances a gradient-based minimization is impossible. Any results that may be computed are not reliable, and it is not advisable to draw conclusions about the identifiability of parameters for this kind of severely ill-conditioned problem.
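The magnitudes of the criteria quoted in this section can be checked from the singular values alone. The sketch assumes ΨA = (1/Nθ) Σ 1/ς_i^2 and ΨE = 1/ς_min^2, without measurement-variance weighting. These forms are consistent with the reported values but are not copied from Eqs. 2.36 and 2.38, and the function names are hypothetical.

```python
def a_criterion(singular_values):
    """Average variance: mean of 1/s_i^2 over all singular values (assumed form)."""
    return sum(1.0 / v ** 2 for v in singular_values) / len(singular_values)

def e_criterion(singular_values):
    """Largest variance direction: 1/s_min^2 (assumed form)."""
    return 1.0 / min(singular_values) ** 2

# Smallest singular values reported at the initial design.
psi_E_E2 = e_criterion([2.0383e-6])    # problem E2
psi_E_E3 = e_criterion([2.8427e-16])   # problem E3
print(f"Psi_E(E2) = {psi_E_E2:.4e}")
print(f"Psi_E(E3) = {psi_E_E3:.1e}")

# Hypothetical three-value spectrum: the A-criterion is dominated by the smallest value.
psi_A_demo = a_criterion([10.0, 1.0, 1e-3])
print(f"Psi_A(demo) = {psi_A_demo:.4e}")
```

Because both criteria scale with the inverse square of the smallest singular values, a singular value near machine precision pushes them to the order of 10^31.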

To summarize, for all studied problems the ill-conditioning of S cannot be overcome by solving the optimal design problem without regularization. Although the ill-conditioning is sometimes improved (reduction of the parameter variance), in none of the problems is this sufficient to guarantee a stable PE and practical identifiability.

7.3.4. Optimal design with regularization

In this section the solution of the regularized OED problem is discussed. The common feature of all regularization techniques is the generation of a new, better-conditioned sensitivity matrix at the initial design, SReg(uIG) with Reg = SsS, TSVD, Tikh, i.e., the ill-conditioning is partially or completely removed depending on the selected value of the regularization parameter. Details about the effect of the regularization parameter λ in Tikhonov will be given in Section 7.3.4. The difference between the original SVs of the sensitivity matrix without regularization, i.e., SNone(uIG), and that of the new regularized sensitivity matrix SReg(uIG) is shown. Notice that ill-conditioned singular values are removed by either parameter space reduction (Reg=SsS), singular value truncation (Reg=TSVD) or addition of a well-conditioned matrix (Reg=Tikh). For the determination of the well-conditioned singular values of S (via the numerical rank rϵ, see Section 2.2.1) the thresholds in Table 7.2 are used.

Subset Selection (Reg=SsS)

For the SsS (see Section 3.3.1) the following features are observed:

1. For all studied examples E1-E3, at the initial design uIG, only a subset of the parameter space (determined by the rank of S(uIG)) is identifiable. The new reduced problems (i.e., regularized problems) promote a well-conditioned sensitivity matrix SSsS(uIG), which means that all singular values in its spectrum are well-conditioned. Thus, even for the strongly ill-conditioned problem E3, the OED could be successfully computed for all considered design criteria.

For problems E2 and E3, the SVs of the regularized matrices SSsS(uIG) and SSsS(ucrit) are shown in Figures 7.5a and 7.5b, respectively. For completely linearly independent parameters, each singular value of S would match exactly one parameter. As this is not the case, each singular value is related to several parameters. Accordingly, the SVs of the regularized matrix SSsS(uIG) is not identical to the SVs of the original matrix S(uIG). Moreover, it can be seen in Figures 7.5a and 7.5b that the singular values of the reduced matrix SSsS are equal to or smaller than those of the original matrix S (interlacing inequality for singular values [117, 24]). Nevertheless, the tendency (for rank-deficient problems) to conserve the largest singular values associated with the well-conditioned parameters is evident (see Figure 7.5a for Problem E2).

[Figure 7.5: singular values ςi (logarithmic axis) versus singular value index. Panel (a), index 1 to 10, shows the spectra SsS-IG, SsS-A, None-IG-Org and SsS-A-Org; panel (b), index 1 to 20 (a section of the complete SVs from ς1 to ς44), shows SsS-IG, SsS-E and None-IG. The bounds ϵ = ϵκ and ϵ = ϵγ are marked.]

Figure 7.5.: Singular value spectrum (SVs) of the original and the regularized sensitivity matrix after applying Reg=SsS, S and SSsS, respectively. Results are shown for the initial uIG and optimal ucrit experimental designs for problem: (a) E2 where ucrit = uA, and (b) E3 where ucrit = uE. The solid-black curve shows the SVs of the original sensitivity matrix without regularization at initial design S(uIG). The black-cross and gray-cross markers show the SVs of the regularized (reduced) matrix at initial and optimal designs SSsS(uIG) and SSsS(ucrit), respectively. Note that the lower bound ϵκ is computed for SSsS(ucrit). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).

2. The regularized OED improves only the active parameter subset and thus, only the

associated singular values. In Figure 7.5 it can be seen how the reduced SVs is lifted

(shifted to higher values) from SSsS(uIG) to SSsS(ucrit).

Despite the improvements in the conditioning of the reduced matrix SSsS(ucrit), the ill-conditioning of the original matrix S(ucrit) is not always improved by the optimal design. For example, for problem E1 (data not shown here), all optimal designs generate SVs with a higher condition number and collinearity index. For the D-design the following values are obtained: κ(S(uD)) = 8.4 × 10^2 > κ(S(uIG)) and γ(S(uD)) = 3.6 × 10^1 > γ(S(uIG)).

For example E2, at the A-optimal design the original matrix S(uA) remains ill-conditioned with γ(S(uA)) = 1.9 × 10^4 > γmax. Nevertheless, the increase in ς1 and ς10 of S(uA) (Figure 7.5a) produces a reduced condition number κ(S(uA)) = 3.2 × 10^6 < κ(S(uIG)) and also smaller parameter variances, e.g., σ_5^2 = 5.19 × 10^4, σ_6^2 = 5.26 × 10^4 and σ_10^2 = 3.2383 × 10^8. However, κ(S(uA)) > κmax still holds and the rank of the problem is still seven. Accordingly, the ill-conditioning is improved but the problem remains ill-posed.

3. In some cases the A- and E-optimal designs could improve the problem rank and attract new parameters to the identifiable region.

This is shown for problem E3 with the E-design in Figure 7.5b, where the problem rank improves from 7 to 9. The improvement of the SVs of SSsS (from uIG to uE) moves ς8 and ς9 into the feasible region defined by ϵ = ϵγ. The condition number and collinearity index are slightly reduced: κ(S(uE)) = 1.4365 × 10^16 < κ(S(uIG)) = 1.5289 × 10^16 and γ(S(uE)) = 2.3338 × 10^15 < γ(S(uIG)) = 3.5178 × 10^15. However, the matrix S(uE) retains its severe ill-conditioning. The same increase in the dimension of the active parameter subset is observed for the A-optimal design.
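The interlacing property invoked for the subset selection can be checked numerically. The sketch below uses a hypothetical random matrix and an arbitrary choice of retained columns. It verifies that deleting k columns gives singular values that satisfy ς_i(S) ≥ ς_i(S_sub) ≥ ς_{i+k}(S).

```python
import numpy as np

rng = np.random.default_rng(0)
S_full = rng.standard_normal((30, 6))   # hypothetical sensitivity matrix, 6 parameters
keep = [0, 2, 5]                        # hypothetical active parameter subset
S_sub = S_full[:, keep]                 # reduced matrix: only the kept columns

s_full = np.linalg.svd(S_full, compute_uv=False)
s_sub = np.linalg.svd(S_sub, compute_uv=False)

k = S_full.shape[1] - len(keep)         # number of deleted columns
tol = 1e-12
upper_ok = bool(np.all(s_sub <= s_full[:len(keep)] + tol))   # s_i(S_sub) <= s_i(S)
lower_ok = bool(np.all(s_sub >= s_full[k:] - tol))           # s_i(S_sub) >= s_{i+k}(S)
print("full spectrum:   ", s_full)
print("reduced spectrum:", s_sub)
print("interlacing holds:", upper_ok and lower_ok)
```

The upper bound explains why the singular values of the reduced matrix never exceed the corresponding values of the original matrix.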

Truncated Singular Value Decomposition (Reg=TSVD)

For the TSVD (see Section 3.3.2) the following features are observed:

1. In contrast to the indirect manipulation of the SVs in SsS, in TSVD the SVs is directly manipulated (truncated) and the remaining singular values do not change. The truncated spectrum is then used to construct (approximate) a new matrix STSVD.

2. The TSVD produces a well-conditioned but rank-deficient matrix STSVD(uIG). The covariance matrix CTSVD(uIG) (see Table 3.1) is then also rank-deficient and thus the D-optimal criterion cannot be computed. Hence, only A- and E-designs are studied for TSVD.

3. In Figures 7.6a-b, the SVs of the A- and E-designs for problems E2 and E3, respectively, are depicted. From the comparison of Figures 7.5 and 7.6 it can be observed that very similar results are obtained for the TSVD- and SsS-based optimal designs. This is true for the problem rank (number of identifiable parameters/singular values) as well as for the influence of the different design criteria (see also Figure 2.2). However, both the computed design criterion and the optimal design vector differ in their values according to the regularization technique applied. This is certainly due to the differences in the number of considered parameters (whole parameter space for TSVD and reduced space for SsS) and the differences in the values of the initial SVs (compare Figures 7.5 and 7.6). It should also be noted that the design criteria in Eqs. 2.36-2.38 are scaled differently, with Nθ and rϵ for TSVD and SsS, respectively.
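The truncation described for the TSVD can be sketched as follows. The matrix and the retained rank r are hypothetical and the function name is invented for illustration. The retained singular values are unchanged and the approximated matrix is rank-deficient, so a determinant-based criterion cannot be evaluated on it.

```python
import numpy as np

def truncate_svd(S, r):
    """Rebuild S from its r largest singular values; the rest are set to zero."""
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    s_trunc = s.copy()
    s_trunc[r:] = 0.0                  # direct manipulation of the spectrum
    S_r = (U * s_trunc) @ Vt           # U @ diag(s_trunc) @ Vt
    return S_r, s

rng = np.random.default_rng(2)
S_demo = rng.standard_normal((10, 4))  # hypothetical sensitivity matrix
S_r, s_all = truncate_svd(S_demo, r=2)
s_after = np.linalg.svd(S_r, compute_uv=False)

print("original spectrum: ", s_all)
print("truncated spectrum:", s_after)
print("rank of truncated matrix:", np.linalg.matrix_rank(S_r))
```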

[Figure 7.6: singular values ςi (logarithmic axis) versus singular value index. Panel (a), index 1 to 10, shows the spectra None-IG-Org and TSVD-A-Org; panel (b), index 1 to 20 (a section of the complete SVs from ς1 to ς44), shows None-IG and TSVD-E. The bounds ϵ = ϵκ and ϵ = ϵγ are marked.]

Figure 7.6.: Singular value spectrum (SVs) of the original and the regularized sensitivity matrix after applying Reg=TSVD, S and STSVD, respectively. Results are shown at initial uIG and optimal ucrit experimental designs, respectively, for problem: (a) E2 where ucrit = uA, and (b) E3 where ucrit = uE. The solid-black curve shows the SVs of the original sensitivity matrix without regularization at initial design S(uIG). The black-cross and gray-cross markers show the SVs of the regularized (approximated) matrix at the initial and optimized designs STSVD(uIG) and STSVD(ucrit), respectively. Note that the lower bound ϵκ is computed for STSVD(ucrit). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).

Tikhonov Regularization (Reg=Tikh)

In this section the effect of weak and strong regularization, controlled by the regularization parameter λ (see Eq. 3.10) in Tikhonov regularization (see Section 3.3.3), is addressed. The weak regularization is defined by λ_1 = 0.001 and the strong regularization by λ_2 = 0.1. The new features of the regularized sensitivity matrix STikh (see footnote 4) are also described. After applying this regularization, the following features are observed:

1. This regularization yields an increase (lift) of the SVs of the matrix S(uIG) at the initial design, maintaining the original parameter space dimension and the number of singular values. For the stronger regularization parameter (i.e., λ_2) the transformation of the SVs is also stronger (compare Figures 7.7a and b). The lift of the SVs transforms the bottom section of the SVs by fixing the small singular values to λ^2. Accordingly, singular values smaller than λ^2 are replaced by approximately λ^2, i.e., ςi ≈ λ^2 for all i with ςi < λ^2. However, the larger singular values may also be shifted.

Footnote 4: STikh = [S; λ^2 L], i.e., the matrix S stacked on top of λ^2 L.

For Problem E2 with λ_1, the effect of the regularization is very small and the SVs of STikh,λ1 remains almost identical to that shown in Figure 7.2b. This is to be expected, taking into account that the smallest singular value of S, ς10 = 2.0383 × 10^-6, is larger than λ_1^2 = 1.0 × 10^-6. Nevertheless, a reduction of 10% in the condition number and collinearity index is achieved. On the other hand, choosing a stronger regularization with λ_2, the last three singular values of S are increased to values around λ_2^2 = 1.0 × 10^-2 in STikh,λ2(uIG) (see Figure 7.8). Therefore, κ(STikh,λ2(uIG)) and γ(STikh,λ2(uIG)) are drastically decreased to 6.4 × 10^3 and 1.0 × 10^2, respectively. Despite this improvement, the number of well-conditioned singular values is seven (because ςi ≈ λ_2^2 < ϵ = 6.67 × 10^-2 for i = 8, · · · , 10), demonstrating that the problem remains ill-posed.

For Problem E3 with λ_1, the smallest nine singular values are increased (i.e., ςi ≈ λ_1^2 for i = 36, · · · , 44; see Figure 7.7a). However, the regularization is not strong enough, and the matrix CTikh,λ1 still has three negative eigenvalues. For the stronger regularization with λ_2, the smallest thirty-five singular values are increased (i.e., ςi ≈ λ_2^2 for i = 10, · · · , 44; see Figure 7.7b). The new regularized matrix CTikh,λ2 has positive eigenvalues, and the new condition number and collinearity index are κ(STikh,λ2(uIG)) = 4.3461 × 10^2 and γ(STikh,λ2(uIG)) = 1.0 × 10^2. Again, the improvement in conditioning is not enough to turn the problem into a well-conditioned one (the same seven singular values are well-conditioned); however, the transformation is notable. It must be noted that values of λ^2 larger than ϵ, which completely remove the ill-conditioning, might be selected. Nonetheless, those values strongly affect the fit, producing large residuals. Thus they are not considered in this study.

2. For problems E1 and E2, A- and D-optimal designs are successfully computed, see e.g. Figure 7.8. Moreover, the influences of the different criteria are as described in Section 2.5, see e.g. the results for E2 for the A-design in Figure 7.8a.

For Problem E3 with λ_1, the D-design could not be computed as its criterion is infinite at uIG. In contrast, for the stronger regularization with λ_2 the D-design could be computed and the criterion is reduced from ΨD(uIG) = 2.2 × 10^3 to ΨD(uD) = 1.6 × 10^3.

3. Generally, the computed E-designs are not always reliable, especially for strong regularization with λ_2. Here the smallest singular values of STikh(ucrit) are repeatedly corrected in each optimization iteration (i.e., ςi ≈ λ^2 for all i with ςi < λ^2). Accordingly, if the optimal design does not promote a SVs with a smallest singular value ς_Nθ larger than λ^2, this singular value is always approximated by λ^2. This situation generates an almost constant E-criterion value, equal to 1/(λ^2)^2, and makes the optimization unreliable.

[Figure 7.7: singular values ςi (logarithmic axis, 1E-16 to 1E+02) versus singular value index 1 to 44 in two panels (a) and (b). Each panel shows the spectra None-IG and STikh, the bound ϵ = ϵγ and the well-conditioned ςi.]

Figure 7.7.: Singular value spectrum (SVs) of the original and the regularized sensitivity matrix after applying Reg=Tikh, S and STikh, respectively, for problem E3 with: (a) λ_1 = 0.001 (weak regularization), and (b) λ_2 = 0.1 (strong regularization). The solid-black curve shows the SVs of the original sensitivity matrix without regularization at the initial design S(uIG). The black-cross markers show the SVs of the regularized matrix at the initial design STikh(uIG). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
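The lift of the small singular values under Tikhonov regularization, and the resulting floor on the E-criterion, can be reproduced with a small numerical sketch. It assumes the stacked form [S; λ^2 L] with L taken as the identity matrix, and it uses a hypothetical diagonal matrix with prescribed singular values. Neither choice is taken from Section 3.3.3.

```python
import numpy as np

def tikhonov_singular_values(S, lam):
    """Singular values of the stacked matrix [S; lam^2 * L], with L = I assumed."""
    n_theta = S.shape[1]
    S_tikh = np.vstack([S, lam ** 2 * np.eye(n_theta)])
    return np.linalg.svd(S_tikh, compute_uv=False)

# Hypothetical matrix with prescribed singular values.
s_orig = np.array([10.0, 1.0, 1e-3, 1e-6])
S_demo = np.diag(s_orig)

lam = 0.1                                   # strong regularization, lam^2 = 1e-2
s_reg = tikhonov_singular_values(S_demo, lam)

# With L = I the regularized values are sqrt(s_i^2 + lam^4): values below lam^2
# are lifted to about lam^2, larger values are almost unchanged.
expected = np.sqrt(s_orig ** 2 + lam ** 4)
e_floor = 1.0 / s_reg.min() ** 2            # stays close to 1 / (lam^2)^2
print("regularized spectrum:", s_reg)
print("E-criterion:", e_floor)
```

As long as the design does not raise the smallest singular value above λ^2, the E-criterion in this sketch barely moves, which matches the almost constant value described above.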

7.3.5. Influence of the available measurement information on ill-posedness (New)

This section summarizes unpublished results about the influence of the available measurement information on the conditioning of the sensitivity matrix (as a measure of the ill-posedness of the PE) for problem E1. To do so, a simple adaptation of E1 that adds more experimental data is considered. In the following, the original sensitivity matrix S is considered as the base case. The problem formulation is changed and three different experimental data sets are considered, i.e., E1a, E1b and E1c (see Table 7.1). Firstly, the state variable x2 is considered as an additional measured variable, so that the new experimental data vector reads y^m = (x1, x2)^T. Secondly, the sampling times are changed to (2 : 2 : 20), (0.5 : 0.5 : 20) and (0.25 : 0.25 : 20) to increase the number of experimental points, with a total number of experimental points Ny · Nm · Ne equal to 20, 80 and 160 for problems E1a (case reported in Asprey and Macchietto, 2000), E1b and E1c, respectively.

The SVs of S evaluated at uIG for problems E1, E1a, E1b and E1c are shown in Figure 7.9. It can be seen that with more measurement information the SVs reaches larger values and, in particular, the singular value ς_Nθ is shifted to higher values. The ϵ-threshold for E1 and E1a is ϵ = max{ϵκ, ϵγ} = ϵγ, whereas for E1b and E1c the ϵ-threshold is ϵ = ϵκ. Note that ϵκ is a function of ς1. Therefore it changes for each spectrum, whilst ϵγ is constant for all cases. The SVs for E1a, E1b and E1c lie above the bounds ϵγ and ϵκ, which indicates well-conditioned problems with reduced condition number κ(S) and collinearity index γ(S). As a result, E1a, E1b and E1c are well-posed. It can also be observed in Figure 7.9 that the influence of the additional measured variable x2 in E1a-c is larger than the influence of an increase in sampling points, because it changes the slope (condition number) of the spectrum of E1a. Nevertheless, the addition of more experimental points (although only generating a slight parallel lift of the spectra) also reduces the parameter variance of problems E1b and E1c.

[Figure 7.8: singular values ςi (logarithmic axis) versus singular value index 1 to 10 in two panels. One panel shows the spectra None-IG, Tikh_2_lambda0.1-IG and Tikh_2_lambda0.1-A, the other None-IG, Tikh_2_lambda0.1-IG and Tikh_2_lambda0.1-E. The bounds ϵ = ϵγ and ϵ = ϵκ are marked.]

Figure 7.8.: Singular value spectrum (SVs) of the original and the regularized sensitivity matrix after applying Reg=Tikh, S and STikh, respectively. Results are shown at initial and optimal experimental design, uIG and ucrit, respectively, for problem E2 with: (a) ucrit = uA and (b) ucrit = uE. The solid-black curve shows the SVs of the original sensitivity matrix without regularization at the initial design S(uIG). The black-cross and gray-cross markers show the SVs of the regularized matrix at initial design STikh(uIG) and at optimal design STikh(ucrit), respectively. Note that the lower bound ϵκ is computed for the SVs of STikh(ucrit). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).

[Figure 7.9: singular values ςi (logarithmic axis) versus singular value index 1 to 4 for the spectra E1, E1a, E1b and E1c, with the bounds ϵγ, ϵκ|E1, ϵκ|E1a and ϵκ|E1c marked.]

Figure 7.9.: Comparison between the singular value spectrum (SVs) of the sensitivity matrix S at the initial design uIG for example E1 with different experimental data sets. E1a-c are adaptations of E1 including x1 and x2 as measurable variables and increasing the number of experimental points to 20, 80 and 160, respectively.
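The two effects of additional measurement information on the spectrum can be illustrated with a hypothetical random matrix. Appending rows for a new measured variable can never decrease any singular value and generally changes the shape of the spectrum. Repeating the same rows only scales all singular values by a common factor, which corresponds to a parallel lift on a logarithmic axis and leaves the condition number unchanged. All matrices below are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
S_base = rng.standard_normal((10, 4))    # hypothetical base sensitivity matrix
S_extra = rng.standard_normal((10, 4))   # hypothetical rows of an added measured variable

s_base = np.linalg.svd(S_base, compute_uv=False)

# Case 1: additional measured variable (new, different rows).
s_aug = np.linalg.svd(np.vstack([S_base, S_extra]), compute_uv=False)
never_decreases = bool(np.all(s_aug >= s_base - 1e-12))

# Case 2: the same information sampled twice (duplicated rows).
s_dup = np.linalg.svd(np.vstack([S_base, S_base]), compute_uv=False)
kappa_base = s_base[0] / s_base[-1]
kappa_dup = s_dup[0] / s_dup[-1]

print("base spectrum:      ", s_base)
print("with extra variable:", s_aug)
print("with duplicate rows:", s_dup)
print("condition numbers (base, duplicated):", kappa_base, kappa_dup)
```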

[Figure 7.10: singular values ςi (logarithmic axis) versus singular value index 1 to 4 for the spectra E1c-IG, E1c-A, E1c-E and E1c-D, with the bounds ϵγ, ϵκ|E1c-IG, ϵκ|E1c-A/E and ϵκ|E1c-D marked.]

Figure 7.10.: Comparison between the singular value spectrum (SVs) of the sensitivity matrix S(ucrit) at the optimal design ucrit with crit=A,D,E for the well-posed case E1c (no regularization). Spectrum labeled S is evaluated at the initial design uIG.

Finally, the A-, D- and E-optimal designs for E1 and its adaptations E1a-c are computed. Note that the singular value spectra in Figure 7.9 correspond to the initial guesses of the corresponding OED problems. In Figure 7.10, the spectra are shown for S(ucrit) after performing the A-, D- and E-optimal designs for the adaptation E1c. For all adaptations E1a-c, the A-, D- and E-optimal designs promote well-conditioned sensitivity matrices S(ucrit) with reduced collinearity indices, i.e., γ(S(ucrit)) < γ(S), despite an increase in the condition number, which does not exceed κmax, i.e., κ(S) < κ(S(ucrit)) < κmax. On the contrary, the ill-conditioned case E1 retained the ill-conditioning of the sensitivity matrix after conducting the D-design (data not shown here).

Moreover, for all evaluated problems E1, E1a, E1b and E1c, similar effects were observed regarding the SVs of S(ucrit) when applying the different design criteria. In accordance with Figure 2.2, the A- and E-designs lifted the bottom section of the SVs. However, the top section was also elevated for E1a-c. Generally, the A- and E-designs produced almost identical SVs (see, for instance, Figure 7.10 for E1c). They also yielded the SVs with the smallest collinearity index. In all D-optimal designs the largest increase was observed for the top section of the SVs (see, for instance, Figure 7.10 for E1c). The D-optimal SVs had the largest condition number κ (without exceeding κmax). For instance, the biggest increase in κ, 148%, was computed for the D-optimal design for E1c.

7.3.6. Monte Carlo study (New)

This section is intended to demonstrate numerically the weaknesses of optimal designs derived from ill-posed parameter estimations. All results shown here are obtained by applying the Monte Carlo method (second version) described in Section 4.2.2 to problem E2. The sets of synthetic data are sampled from N(Y^m, C_y), the number of replications is L = 5000 and the measurement error is assumed to have σ_y^2 = 0.01. With these results the estimator performance in terms of precision (Section 4.3.1) and accuracy (Section 4.3.2) as well as ill-conditioning by Monte Carlo (Section 4.4.1) are presented. Moreover, the identifiability analysis is conducted by using the variance method in Section 4.4.2.
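The principle of such a Monte Carlo study can be sketched with a toy problem. The example below is not the procedure of Section 4.2.2: it uses a hypothetical linear model solved by ordinary least squares instead of the nonlinear estimation for E2, only 500 replications, and invented design matrices. It shows how repeated estimation from perturbed data exposes the spread of the estimates for an ill-conditioned design.

```python
import numpy as np

def monte_carlo_estimates(X, theta_true, sigma2_y, n_rep, seed=0):
    """Re-estimate theta n_rep times from data perturbed with Gaussian noise
    of variance sigma2_y; returns an array of shape (n_rep, n_theta)."""
    rng = np.random.default_rng(seed)
    y_clean = X @ theta_true
    estimates = np.empty((n_rep, theta_true.size))
    for k in range(n_rep):
        noise = rng.normal(0.0, np.sqrt(sigma2_y), size=y_clean.shape)
        estimates[k] = np.linalg.lstsq(X, y_clean + noise, rcond=None)[0]
    return estimates

t = np.linspace(0.0, 1.0, 20)
theta_true = np.array([1.0, 2.0])                 # hypothetical true parameters
X_good = np.column_stack([np.ones_like(t), t])    # well-conditioned toy design
X_bad = np.column_stack([t, t + 0.01 * t ** 2])   # nearly collinear toy design

est_good = monte_carlo_estimates(X_good, theta_true, 0.01, 500)
est_bad = monte_carlo_estimates(X_bad, theta_true, 0.01, 500)
spread_good = est_good.std(axis=0)
spread_bad = est_bad.std(axis=0)
print("standard deviation, well-conditioned design:", spread_good)
print("standard deviation, ill-conditioned design: ", spread_bad)
```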

Study of the initial design

This section statistically investigates the behavior of the estimator computed from parameter estimations that use perturbed experimental data generated under the experimental conditions of the initial design without optimization, u = uIG. Figure 7.11a summarizes the statistical results of the (normalized) parameter estimates in the form of box plots. The cost function (residual) norm and the true parameter values are also shown therein.

Estimator analysis The performance of the estimator Θj of the parameter θj for j =
1, ..., Nθ is analyzed as follows. In this scenario the estimators of the parameters θ1, θ2, θ7
and θ9 are quite precise, with relative standard deviations⁵ below 8%, and accurate, with
relative biases⁶ below 1% (see the comparatively short box plots in Figure 7.11a). However,
the box plots of parameters θ5, θ6, θ8 and θ10 are comparatively tall, with mean values
(solid black points in Figure 7.11a) far away from the true values. This means that those
parameters are the most uncertain in the system. Concretely, in terms of accuracy θ5, θ6, θ8
and θ10 have large relative biases of 60%, 61%, 283% and 2777%, respectively, whereas in
terms of precision the same parameters have relative standard deviations of 99%, 99%,
903% and 197%, respectively. Finally, in order to consider the effects of variance and bias
of the estimator Θ simultaneously, the empirical MSE is computed. The MSE is
$5.75 \times 10^{4}$, with a variance contribution of 85% and a bias contribution of 15%.
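The quantities used in this analysis can be computed from the replicated estimates as in the following sketch. It is a hedged illustration: the function name `estimator_summary` and the two-replicate sample are hypothetical, the relative standard deviation is taken with respect to the sample mean and the relative bias with respect to the true value (as in the footnotes), and the MSE is assumed to be the sum over all parameters of variance plus squared bias, which may differ from the normalization used in Section 4.3.

```python
import statistics


def estimator_summary(samples, theta_true, max_rel_std=20.0):
    """Relative std/bias per parameter and the variance/bias split of the MSE.

    samples: list of replicated estimates, each a sequence of length N_theta.
    max_rel_std: acceptable relative standard deviation in percent (20% in the text).
    """
    rel_std, rel_bias = [], []
    variance_sum, bias_sq_sum = 0.0, 0.0
    for j, true_j in enumerate(theta_true):
        column = [s[j] for s in samples]
        mean_j = statistics.fmean(column)
        var_j = statistics.pvariance(column, mu=mean_j)  # population variance
        bias_j = mean_j - true_j
        rel_std.append(100.0 * var_j ** 0.5 / abs(mean_j))
        rel_bias.append(100.0 * abs(bias_j) / abs(true_j))
        variance_sum += var_j
        bias_sq_sum += bias_j ** 2
    mse = variance_sum + bias_sq_sum  # assumed definition: total variance + total squared bias
    return {
        "rel_std": rel_std,
        "rel_bias": rel_bias,
        "mse": mse,
        "variance_share": 100.0 * variance_sum / mse,
        "bias_share": 100.0 * bias_sq_sum / mse,
        "unidentifiable": [j for j, r in enumerate(rel_std) if r > max_rel_std],
    }


# Hypothetical two-parameter example with two replicates
samples = [(1.0, 10.0), (3.0, 14.0)]
summary = estimator_summary(samples, theta_true=(2.0, 10.0))
```

In this small example the first parameter is unbiased but imprecise (relative standard deviation of 50%), while the second is precise enough for the 20% limit but biased by 20%, so the MSE splits into variance and bias contributions of similar size.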

Ill-conditioning and identifiability diagnosis Using the results in Figure 7.11a and Section
7.3.6, it is easy to conclude that the parameter estimation of E2 at uIG is unstable due

⁵ with respect to the mean of the parameter distribution
⁶ with respect to the true value



[Figure 7.11, panels (a) and (b): box plots on a logarithmic axis of parameter values (ticks from 10^-2 to 10^4); horizontal axis entries r1max, Ks, K2, KM1, r3max, KM2, Ksynmax, KIB, Yxs, KIA and ResNorm, labelled θ1 to θ10 and CF Norm; markers for mean value and true value.]

Figure 7.11.: Monte Carlo problem E2: Box plots of normalized parameter estimates and corresponding cost function norm obtained at (a) initial design uIG and (b) E-optimal design uE without regularization.

to the large variability in the estimators of some parameters, i.e., θ5, θ6, θ8 and θ10. This
confirms the findings obtained with the sensitivity method in Section 7.3.2. However,
although these four parameters already reflect the ill-conditioning (being unidentifiable),
they do not account for all of it. Other parameters such as θ3 and θ4 also have large
relative standard deviations of 57% and 60%, respectively. If a maximum acceptable
parameter variability of 20% is considered, those parameters are also unidentifiable. In
total, six parameters are practically unidentifiable and the model is therefore unidentifiable.
It is important to point out that the sensitivity method for ill-conditioning and the QR
method for identifiability give good qualitative results compared to the Monte Carlo
studies, but due to the effect of inflated parameter variance their quantitative results are
not reliable.

Study of the optimal design without regularization

This section statistically investigates the behavior of the estimator computed from parameter
estimations using perturbed experimental data generated with the experimental conditions
of the E-optimal design uE without regularization, u = uE. Figure 7.11b summarizes the
statistical results of the (normalized) parameter estimates in the form of box plots. The
cost function (residual) norm and the true parameter values are also shown there.



Estimator analysis The experimental conditions in the E-optimal design promote estimates
with reduced parameter variance for the parameters most related to the ill-conditioning,
θ5, θ6, θ8 and θ10, and not only for the most imprecise parameter θ10, see Figure 7.11b.
That is expected, as explained in Section 2.7.2. The relative standard deviations of
θ5, θ6, θ8 and θ10 are reduced to 47%, 46%, 39% and 69%, respectively. This seems an
appropriate behavior for an optimal design. However, the cost is a loss of precision in
parameters such as θ1, θ2, θ7 and θ9, which had less variability before conducting the
optimal design (see Section 7.3.6). The new relative standard deviations of θ1, θ2, θ7 and
θ9 are 12%, 41%, 14% and 15%, respectively. Moreover, the bias of these parameters is
larger than before, reaching up to 7% (compared to a maximum of 1% with uIG). The most
affected parameter in this design is θ4, whose relative standard deviation increased to
187% and whose accuracy deteriorated to a relative bias of 106%.

On the other hand, the MSE of this scenario is $2.61 \times 10^{3}$, with a variance
contribution of 40% and a bias contribution of 60%. This MSE reduction is mainly generated
by the parameter variance control of the most uncertain parameters θ8 and θ10, which also
improves their accuracy. However, it is important to note that the large variability of θ8
and θ10 at uIG is caused by their low sensitivities. This means that those parameters did
not have enough influence on the measured outputs at the initial design. This situation
does not change with the E-optimal design, and those parameters are again ranked as the
least sensitive ones. It does not make sense to increase the precision of parameters which
by definition do not have enough effect in the model at the expense of those which can
really be identified with the available data. This is a consequence of conducting optimal
experimental designs under ill-posedness.

Ill-conditioning and identifiability diagnosis According to the Monte Carlo results in Figure
7.11b, this scenario still preserves the ill-conditioning, although it is significantly improved
after the E-optimal design. As mentioned above, the parameter variability has been
reduced; however, the reduction is not sufficient to classify the problem as identifiable
(all relative standard deviations should be less than 20%). The parameters which do not
fulfill the maximum acceptable parameter variability of 20% are θ3, θ4, θ5, θ6, θ8 and θ10.
Consequently, they are categorized as unstable and also practically unidentifiable.

Study of the optimal design with regularization

In this section the regularized A-optimal design computed by using subset selection
(Reg=SsS) in OED (results are shown in Figure 7.5a) is implemented in two Monte Carlo
studies:

• Study I: Monte Carlo with parameter estimations without regularization, i.e.,
Reg=None, but with the experimental conditions of the A-optimal design obtained with Reg=SsS.

• Study II: Monte Carlo with parameter estimations with regularization, i.e., Reg=SsS,
and with the experimental conditions of the A-optimal design obtained with Reg=SsS.

Figure 7.12 summarizes the results of the (normalized) parameter estimates and the cost
function (residual) norm in the form of box plots. The true parameter values are also
shown. It will be shown that optimal designs which do not completely overcome the
ill-conditioning issues of the original problem do not perform well in parameter estimations
without regularization. However, if the parameter estimations are again regularized, the
parameter variance control imposed by fixing the ill-conditioned parameters does not
affect the most independent parameters of the system. Special attention should be paid to
parameters showing correlated behavior, at least to monitor their new variance, which
should not be larger than in the scenario with the initial design.

[Figure 7.12, panels (a) and (b): box plots on a logarithmic axis of parameter values (ticks from 10^-2 to 10^2); horizontal axis entries θ1 to θ10 and ResNorm (CF Norm); markers for mean value and true value.]

Figure 7.12.: Monte Carlo problem E2: Box plots of normalized parameter estimates and corresponding cost function norm obtained at regularized A-optimal design uA with Reg=SsS by solving parameter estimations with (a) Reg=None (Study I) and (b) Reg=SsS (Study II).

Estimator analysis Study I yields estimators with large parameter variance and bias.
This time six parameters (θ3, θ4, θ5, θ6, θ8 and θ10) have a relative standard deviation
larger than 85%, whereas at uIG, i.e., with the experimental conditions without
optimization, only four parameters exceeded this percentage. The only parameters
with a slight reduction in variance are θ8 and θ10, with relative standard deviations
of 752% and 86%, respectively. Regarding bias, the only parameter showing an improvement
is θ10, with a new relative bias of 1870% (which is not promising). The other parameters
keep an accuracy similar to the case with the initial design, although θ8 is more inaccurate
(relative bias of 516%). The new MSE is $3.12 \times 10^{4}$, with a variance contribution
of 87% and a bias contribution of 13%. This overall performance measure indicates that
the new optimal design does not have any effect on the still ill-conditioned parameter
estimations. Therefore, the effort invested in conducting and implementing this kind of
optimal experimental design is meaningless if the new estimations are not numerically
stabilized by regularization.

Study II illustrates this new scenario. The results in Figure 7.12b display the effect of
the complete parameter variance control exerted by the regularization. The parameters
θ5, θ6, θ8 and θ10, which were problematic at the initial design due to their large variance
(see Section 7.3.6), are in this case the most precise parameters (except θ8, which has a
relative standard deviation of 23%) due to their fixation by regularization. Parameter θ10
is fixed at its initial guess in all replications. Nonetheless, parameters θ1 and θ2 become
more imprecise, possibly due to remaining ill-conditioning not detected by the ϵ-threshold.
In terms of accuracy, the relative bias of all parameters is reduced except for θ1 and θ2,
with 3% and 5%. The apparently positive effect of the regularization by SsS on the
parameter bias might be related to the low sensitivity and (possibly) high correlation of
the selected unidentifiable parameters. The combination of these factors makes it possible
to fix the problematic parameters without introducing an extra bias in the remaining
parameters. In other cases, a selection of the parameters to be fixed that does not consider
these aspects might lead to highly biased estimations.

Ill-conditioning and identifiability diagnosis According to the Monte Carlo results in Figure
7.12a, Study I completely preserves the ill-conditioning without any significant improvement
after using the A-optimal design. Eight parameters (θ1, θ2, θ3, θ4, θ5, θ6, θ8 and θ10)
are considered unstable and unidentifiable, exceeding the predefined relative standard
deviation limit of 20%. In contrast, the ill-conditioning in the regularized Monte Carlo
study (Study II) is less severe. However, parameters θ1, θ2, θ3, θ4 and θ8 are considered
only slightly unstable but still unidentifiable, because their relative standard deviations
lie between 23% and 41%.

7.3.7. Summary and conclusions

In model-based parameter estimation and experimental design the formulation of well-

posed problems is crucial for the numerical computation of stable and unique solutions.

Thus, it is strongly advisable to perform the relatively simple local analysis of the ill-

conditioning of the sensitivity matrix (S). Moreover, if an ill-posed problem is identified,

its problem type, either rank deficient or of ill-determined rank, and problem severity



should be assessed. Important indicators suggested in this contribution are the singular

value spectrum, condition number, and collinearity index of the sensitivity matrix.

There exists a relationship between those singular values in the SVs of S and commonly

used metrics for parameter identifiability and OED, namely the eigenvalues of F (FIM)

and the eigenvalues of the parameter covariance matrix (C). This information is partic-

ularly useful when dealing with ill-posed experimental design problems, as it is highly

recommended to base the numerical analysis and implementation on the singular

values of S. Additionally, a graphical interpretation of the influence of the alphabetic ex-

perimental design criteria applied to the SVs has been presented. It turns out that A-

and E-optimal criteria mainly improve the smallest singular values of S while D-optimal

criterion improves the largest singular values. Thus, the potential of an experimental de-

sign for improving the parameter precision can be analyzed similarly to the well-known

graphical interpretation of the influence of the alphabetic design criteria applied to the

parameter variances stored in C. However, for ill-posed problems with ill-conditioned S

and consequently ill-conditioned F and C, the direct application of the alphabetic design

criteria may lead to numerical instabilities, especially in the eigensystem of F and mean-

ingless designs for the next parameter estimation. Whereas the computation of the SVs is

numerically stable also for ill-posed problems, small singular values of S (especially those

near zero) have a large influence on the design criterion evaluated by F (FIM) and C. They

can produce huge criterion values, which then complicate or even impede an appropriate

optimization. Thus, a control of the smallest singular values of S is needed (e.g., maximiz-

ing the smallest singular values) and by this, a control of the most uncertain parameters

(those related to the largest eigenvalues of C).
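The relationship between the SVs of S and the design criteria can be illustrated numerically. The sketch below makes several assumptions that are not taken from this section: S is assumed to be already scaled by the measurement standard deviations so that F = SᵀS and C = F⁻¹, which gives eigenvalues of C equal to 1/ς²; and the A-, D- and E-criteria are taken in their common form as trace, determinant and largest eigenvalue of C (the exact definitions are those of Section 2.7.2). The function name and the example spectrum are hypothetical.

```python
import math


def design_criteria(singular_values):
    """Alphabetic criteria evaluated on C, computed only from the singular values of S."""
    # Eigenvalues of C = (S^T S)^-1 are the inverse squared singular values of S
    cov_eigs = [1.0 / s ** 2 for s in singular_values]
    return {
        "A": sum(cov_eigs),          # trace of C
        "D": math.prod(cov_eigs),    # determinant of C
        "E": max(cov_eigs),          # largest eigenvalue of C
        "cond_S": max(singular_values) / min(singular_values),
    }


# Hypothetical spectrum with one small singular value
criteria = design_criteria([10.0, 1.0, 0.01])
```

In this example the single small singular value (0.01) contributes 10^4 to the A-criterion and fully determines the E-criterion, which shows why singular values near zero dominate the criterion value.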

As a solution to this, different regularization techniques, namely SsS, TSVD and

Tikhonov, were presented together with details on their implementation. To illustrate

their application to parameter identification and OED problems, three different case stud-

ies of increasing complexity and ill-posedness were considered. The following particular

features, strengths and weaknesses of each technique were identified.

Each regularization technique implies a transformation of the original problem. Hence,

the regularized optimal design certainly improves the ill-conditioning of the regularized

problem but not necessarily the ill-conditioning of the original one. Moreover, the methods
guarantee that a solution is obtained but provide no information on whether the obtained
solution is still useful in the original context.

Particular attention should be paid to ill-determined rank problems. Because of the

nearness to zero of the singular values, a strong regularization is needed. Unfortunately,

this pushes the regularized problem away from the original problem and increases the bias

in the solution.

Each regularization technique modifies the problem to improve its ill-conditioning, which
is evidenced in the SVs of the matrix S of the regularized problem. In SsS, the problem is
transformed by excluding the unidentifiable parameters and, with them, the related
(ill-conditioned) singular values. In TSVD, the ill-conditioned singular values are replaced
by zero and the new problem only considers the singular values with large magnitude in
the original problem. Finally, the effect of Tikhonov regularization is to fix the
ill-conditioned singular values to the magnitude of the squared regularization parameter
($\lambda^2$), whilst the large singular values are not substantially modified.
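The effect of TSVD and Tikhonov regularization on a singular value spectrum can be sketched as follows. This is an illustration under stated assumptions, not the construction of the regularized sensitivity matrix used in this work: the ϵ-threshold is assumed to be relative to the largest singular value, and the Tikhonov damping is written in the common form sqrt(ς² + floor²), where `floor` is a hypothetical name for the value that the small singular values approach (the text above identifies this value with λ²). SsS is not shown because it requires the full matrix S rather than its spectrum.

```python
import math


def tsvd_spectrum(singular_values, eps):
    """Replace singular values below eps * largest singular value by zero (assumed relative threshold)."""
    cutoff = eps * max(singular_values)
    return [s if s >= cutoff else 0.0 for s in singular_values]


def tikhonov_spectrum(singular_values, floor):
    """Damp the spectrum so that small singular values approach `floor`."""
    return [math.hypot(s, floor) for s in singular_values]


def condition_number(singular_values):
    """Ratio of largest to smallest nonzero singular value."""
    nonzero = [s for s in singular_values if s > 0.0]
    return max(nonzero) / min(nonzero)


# Hypothetical ill-conditioned spectrum
spectrum = [10.0, 1.0, 1.0e-3, 1.0e-6]
truncated = tsvd_spectrum(spectrum, eps=1.0e-2)
damped = tikhonov_spectrum(spectrum, floor=0.1)
```

Both variants leave the two large singular values essentially unchanged. The condition number of the hypothetical spectrum drops from 10^7 to 10 after truncation and to about 100 after damping, at the price of solving a modified problem.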

The selection of the regularization parameter, i.e., ϵ-threshold for SsS and TSVD, and

the scalar parameter λ in Tikhonov, is problem dependent and should be done carefully.

Firstly, the ill-conditioning of the problem should be improved and, secondly, the problem

should not be completely changed. For instance, in Tikhonov, the regularization parameter

λ should improve the ill-conditioning in the sensitivity matrix (which should also control

the solution norm of the PE) without greatly affecting the essence of the original problem

(controlling the norm of the residual in the PE). In that sense, the value of this parameter

(according to the construction of the regularized sensitivity matrix) should be as large as

the first singular value which is considered to cause the ill-conditioning.

When applying Tikhonov regularization, a computed E-optimal design is not viable if the
new design does not yield an SVs whose smallest singular value is larger than $\lambda^2$.
In that case, the regularization parameter continues to fix the small singular values (i.e.,
$\varsigma_{N_\theta} \approx \lambda^2$) and makes an optimization impossible (fixing the
E-criterion value to $1/(\lambda^2)^2$ in each iteration).

For SsS and TSVD there exists the tendency to conserve the largest singular values which

are associated with the well-conditioned parameters. From an application point of view, the

SsS seems the most natural approach, as the regularization acts in the original parameter

space. It preserves the physical meaning of parameters and provides useful information on

the number of identifiable parameters as well as on the ranking of parameters regarding

their linear independence and sensitivity.

However, it has to be noted, that a change in the dimension of the identifiable parameter

subset (in SsS) or the number of well-conditioned singular values (in TSVD) introduces

a discontinuity in the evaluation of the design criterion. This problem is similar to the

inherent non-differentiability in the definition of the E-optimal criterion, where a possible
switching of the smallest eigenvalue introduces this behavior. In general, it is worth noting
that an increase in the SVs (promoted by the OED) does not automatically mean a
reduction in the ill-posedness of the problem. On the contrary, large values in the top
section of the SVs (without an adequate increase in the bottom section) may lead to a
higher condition number, which then gives rise to ill-estimated parameters of an
unstable PE. This behavior was observed here especially for the D-design.
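A small numerical illustration of this observation, using hypothetical spectra and the same assumptions as before (C = (SᵀS)⁻¹, D-criterion taken as the determinant of C): increasing only the largest singular value improves the D-criterion but worsens the condition number, whereas lifting the smallest singular value improves the conditioning.

```python
import math


def condition_number(singular_values):
    """Ratio of largest to smallest singular value of S."""
    return max(singular_values) / min(singular_values)


def d_criterion(singular_values):
    """Determinant of C = (S^T S)^-1, i.e. the product of 1 / sigma_i^2 (smaller is better)."""
    return math.prod(1.0 / s ** 2 for s in singular_values)


# Hypothetical spectra: initial design, a design lifting only the top, a design lifting the bottom
base = [10.0, 1.0, 0.01]
top_heavy = [100.0, 1.0, 0.01]
bottom_lifted = [10.0, 1.0, 0.1]

cond_base = condition_number(base)                 # about 1e3
cond_top_heavy = condition_number(top_heavy)       # about 1e4, worse conditioning
cond_bottom_lifted = condition_number(bottom_lifted)  # about 1e2, better conditioning
```

Both modified spectra reduce the D-criterion by the same factor of 100, so the criterion value alone does not reveal which design actually mitigates the ill-conditioning.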

Note that no improvement in identifiability is expected from introducing regularization.
It only makes it possible to reliably estimate the most identifiable parameters. To improve
identifiability, the experimental data should be increased in quantity and quality. New
observable variables could be included, and the frequency and precision of the current
measurements could also be improved.


8. Chromatography system: Overcoming

deficiency of scarce experimental data in

online estimation

8.1. Abstract

¹ The application of OED techniques has proved to be an important strategy for model

selection and parameter precision improvement. Once the model structure is defined (e.g.,

by modeler heuristic, using model discrimination based on OED, etc.), the next task is to

recover the model parameters from the available experimental data by using parameter

estimation techniques. When the first parameter vector is estimated and its estimator

performance is assessed, the next interest is to improve or refine its quality. The quality is

called accuracy if the estimator is unbiased, otherwise it is called precision. The sequential

redesign of experiments aims at reducing the uncertainty of the estimated vector and

in nonlinear models it compensates uncertainties arising, for instance from parameter

initial guesses. In the context of dynamic processes the potential of exploiting the new

experimental information as soon as it is available has recently attracted the attention of

different research groups. In that sense, the online redesign of experiments for improving

parameter precision plays an important role. Several applications of the online redesign

to nonlinear differential equations can be found in literature [39, 43, 62, 136, 102, 107].

However, all listed applications have been only theoretically implemented. Only very few

reports on the demonstration with experimental implementation have been published so

far, i.e., parameter identification of a chromatography system [8], a temperature controlled

tank [133] and a dynamic model of the Baxter Research Robot from Rethink Robotics [128].

In those experiment-based studies, the need to handle the arising ill-conditioning and
identifiability problems has been detected. Usually it is assumed that well-conditioned
PE and OED problems are solved at each time instant and that all parameters are
therefore identifiable. This is not true, especially at the beginning of the experiment when
the data is scarce. The consequences of ill-conditioned PE and OED are discussed in
Section 3.2.1. They generate practical problems such as instabilities and poor convergence
in the parameter estimation, and ineffective and/or meaningless experiment designs
[8, 78]. Therefore, adequate actions need to be taken to ensure the robustness and
reliability of the algorithm for changing

¹ The content of this chapter is reprinted (adapted) with permission from (T. Barz, D. C. Lopez C., M. N. Cruz B., S. Korkel, and S. F. Walter. Real-time adaptive input design for the determination of competitive adsorption isotherms in liquid chromatography. Computers & Chemical Engineering, 94:104-116, 2016). Copyright (2016) Elsevier. (Publication IV in Appendix A.2).


experimental information.

In the available literature on the adaptive redesign of experiments, ill-posed problems
have not been directly addressed. Instead, preliminary studies were carried out to test for
parameter identifiability and, if necessary, reduce the number of parameters to be
estimated, in order to guarantee parameter identifiability for all scenarios [21, 62]. Only in
Barz et al. 2013 [8] and Yakut et al. [133, 132] was the ill-posedness of the PE and the
corresponding OED explicitly mitigated by the application of the identifiable parameter
subset selection (SsS). It should be noted that an experimental study considering
structural analysis of the parameter estimation and numerical regularization was missing
in the literature until the publication of the peer-reviewed article in Ref. [8], to which the
author of this thesis contributed. As a new experimental implementation and an extension
of the findings of this aforementioned work, this chapter presents the online redesign of
experiments handling ill-conditioning and identifiability issues, applied to a more complex
liquid chromatography system to determine competitive adsorption isotherm parameters.
The solution of regularized PE and OED problems as displayed in the consolidated
framework of Figure 4.1 is considered. Furthermore, the Tikhonov regularization is tested
for the first time in the context of online redesign of experiments. Special attention is paid
to handling the scarceness of experimental data and therefore the parameter instability at
the beginning of the experiment. Although some results correspond to the actual
experimental implementation, other theoretical aspects about the effect of several
regularization techniques in the online experimental redesign and the selection of the
corresponding regularization parameters are also investigated.

It is important to point out that this chapter also contains new information which has
not been published or used to prepare a new peer-reviewed article. This content is
marked here as "(New)".

8.2. HPLC chromatography process

Chromatography is a mass transfer process involving adsorption. It is a technique in

analytical chemistry used for mixture separation. The principle in chromatography is

to separate a sample into its constituent parts because of the difference in the relative

affinities of different molecules for the mobile phase and the stationary phase used in the

separation. A sample is placed on a stationary phase, which is either a solid or a liquid,

and then the mobile phase, a gas or a liquid, is allowed to pass through the system. The

components of the sample will be separated based on their varying physical and chemical

properties, imparting different affinities for the two phases. Depending on the affinities,

one component will migrate through the column faster than the other. Because molecules

of the same compound will generally move in groups, the compounds are separated into

distinct bands within the column. When the mobile phase is a liquid the chromatography

is called liquid chromatography. The mobile phase, also called eluent, should be selected



based on its polarity relative to the sample and the stationary phase. Most applications

are reported from the pharmaceutical industry for the separation of fine chemicals at the

preparative scale, e.g. enantiomers, proteins and peptides [47].

High Performance Liquid Chromatography (HPLC) was developed as a method to

solve some of the shortcomings of standard liquid chromatography, e.g., slow separation

time and size of the column packing. It relies on pumps to pass a pressurized liquid

solvent containing the sample mixture through a column filled with a solid adsorbent

material. The use of high pressures in a narrow column allowed for a more effective

separation to be accomplished in much less time than was required for previous forms of

liquid chromatography. The typical hardware employed in HPLC includes a sampler, a

pump, a column and a detector. It has the capability for feedback control, including the

pumps, the mixing of concentration gradients as well as online concentration analysis.

Recently, the model-based optimization, monitoring and control of chromatography
processes has gained attention, especially for the simulated moving bed (SMB) process, see
e.g. [91, 119]. Central to the modeling of these processes are the thermodynamics of the
phase equilibrium as well as dispersion and mass transfer phenomena. The online
estimation of the corresponding parameters in a protein separation case study was already
the subject of recent research co-authored by the author of this thesis [8].

8.2.1. Experiments

Experimental set-up

All online experiments were carried out in the HPLC system located in the laboratory of
the chair of process dynamics and operation of the Technische Universität Berlin in Germany.
The HPLC system is shown in Fig. 8.1. Its specific components are a Smartline Manager
5050 with Low Pressure Gradient module operated in combination with a Smartline
Pump 1050 (both from Knauer GmbH, Berlin, Germany), a Vertex Plus Column and the
Smartline ultraviolet (UV) Detector. The Vertex Plus Column 125 x 4.6 mm Eurospher
II 100-5 C18, with a length of 125 mm, an inner diameter of 4.6 mm and a total porosity
of 0.68 [67], is a preparative chromatography column for reversed phase applications in
the analytical as well as the preparative range. The Smartline UV Detector 2500 is equipped
with a deuterium lamp and operated at a wavelength of 284 nm; its analog output signal is
scaled to a maximum value of 1 AU. All instruments and the column are from KNAUER,
Wissenschaftliche Geräte GmbH, Berlin, Germany. A schematic flow sheet is given in Fig.
8.2.

The Smartline Manager and Pump constitute a computer-controlled gradient system
which can deliver step concentration gradients from four feeds A, B, C and D. Stepwise
concentration changes are delivered to the column at a constant flow rate. The
concentration values (hereafter also referred to as input actions) are generated in the MATLAB
programming environment (MathWorks, Natick, MA). At the column outlet, the UV
Detector continuously measures the sum of all concentrations. These concentrations are
processed in MATLAB.

Figure 8.1.: Experimental set-up of the chromatography system. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

[Flow sheet labels: feed A, feed B, feed C, feed D; Smartline Manager and Pump; static mixer; Vertex Plus Column; Smartline UV Detector; product; computer with MATLAB programming environment; signals: input actions (flow ratio, constant flow) and measurement data (concentration).]

Figure 8.2.: Schematic flow sheet of the chromatography system in Fig. 8.1. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

The communication between the Smartline Pump and Manager and MATLAB is established
over a standard RS-232 serial port using the MATLAB Instrument Control Toolbox. The
continuous measurement from the Smartline UV Detector is first processed in the Freelance
Controller AC 700F using the detector's analog output signal and the Analog Input/Output
Module AX 722F (ABB, Zurich, Switzerland). The communication between the Freelance
controller and the MATLAB software is established using the Open Process Control (OPC)
software interface.

Measurement standard deviations are σy,i = 0.1 K, with each i representing an individual
diagonal element of the measurement standard deviation matrix in Eq. 2.44.



Operation

Two different feeding strategies (FS-1 and FS-2) have been considered, see Table 8.1. They
differ in the way the feeds A, B, C, D (see Figures 8.1 and 8.2) were prepared. For both
FS-1 and FS-2, feed A contains eluent only. For FS-1, feed B contains eluent and a
mixture of all three benzoates. For FS-2, the benzoates are individually prepared in feeds
B, C and D. Moreover, in FS-1 the ratio of the three species concentrations is kept
constant. In FS-2 the three species concentrations are individually defined; thus, the
number of degrees of freedom is three times higher than in FS-1.

Table 8.1.: Different feeding strategies (FS) for the chromatography system. Abbreviations: ethyB = ethyl benzoate, propB = propyl benzoate, butyB = butyl benzoate. (Table taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

feeding    feed      channel       concentrations [mol/l]                 sum concentration
strategy   channel   (ratio %)     eluent   ethyB    propB    butyB       range [mol/l]

FS-1       A         0 ... 100     yes      -        -        -           0.0 ... 0.636
           B         0 ... 100     yes      0.212    0.212    0.212
           C         -             -        -        -        -
           D         -             -        -        -        -

FS-2       A         0 ... 100     yes      -        -        -           0.0 ... 0.636
           B         0 ... 33.3    yes      0.636    -        -
           C         0 ... 33.3    yes      -        0.636    -
           D         0 ... 33.3    yes      -        -        0.636
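The two feeding strategies in Table 8.1 can be compared with a short sketch that converts channel ratios into feed concentrations. It assumes ideal mixing, i.e. each species concentration is the ratio-weighted sum of the channel stock concentrations, with feed A (pure eluent) making up the remaining flow. The names `STOCK`, `MAX_RATIO` and `column_feed` are hypothetical and do not refer to the MATLAB implementation.

```python
SPECIES = ("ethyB", "propB", "butyB")

# Stock concentrations [mol/l] per channel, taken from Table 8.1 (feed A is pure eluent)
STOCK = {
    "FS-1": {"B": (0.212, 0.212, 0.212)},
    "FS-2": {"B": (0.636, 0.0, 0.0), "C": (0.0, 0.636, 0.0), "D": (0.0, 0.0, 0.636)},
}
# Upper bound of the channel ratio (fraction of the total flow) from Table 8.1
MAX_RATIO = {"FS-1": 1.0, "FS-2": 0.333}


def column_feed(strategy, ratios):
    """Feed concentrations [mol/l] for given channel ratios, assuming ideal mixing."""
    stock = STOCK[strategy]
    for channel, ratio in ratios.items():
        if channel not in stock:
            raise ValueError(f"channel {channel} is not used in {strategy}")
        if not 0.0 <= ratio <= MAX_RATIO[strategy]:
            raise ValueError(f"ratio of channel {channel} is out of range")
    if sum(ratios.values()) > 1.0:
        raise ValueError("channel ratios must not exceed 100% in total")
    conc = [0.0, 0.0, 0.0]
    for channel, ratio in ratios.items():
        for k, c_stock in enumerate(stock[channel]):
            conc[k] += ratio * c_stock
    return dict(zip(SPECIES, conc))


# FS-1: one degree of freedom, all species change together
fs1_max = column_feed("FS-1", {"B": 1.0})
# FS-2: three degrees of freedom, species can be set individually
fs2_max = column_feed("FS-2", {"B": 0.333, "C": 0.333, "D": 0.333})
fs2_single = column_feed("FS-2", {"B": 0.2})
```

With the maximal ratios both strategies reach roughly the same sum concentration of 0.636 mol/l, while only FS-2 can feed a single benzoate without the other two.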

It has to be noted that in FS-2 the maximal individual feed concentration delivered
by the manager and pump (i.e. the concentration of ethyB, propB, butyB) was restricted to
0.212 mol/L. Accordingly, the maximal sum concentration is 0.636 mol/L, the same value
as for FS-1. This value has been found to be critical for the miscibility of the benzoates in
the eluent. The specific chemicals used in this case study are:

• eluent: prepared from 80% methanol CH4O (CAS registry number: 67-56-1) and
20% demineralized water H2O,

• ethyB: ethyl benzoate C9H10O2 (CAS registry number: 93-89-0),

• propB: propyl benzoate C10H12O2 (CAS registry number: 2315-68-6),

• butyB: butyl benzoate C11H14O2 (CAS registry number: 136-60-7).

All online experiments were carried out at a constant temperature of 23 °C and a feed flow
rate of 1.5 mL/min.

8.2.2. Modeling

The HPLC process model comprises three units, namely the manager and the pump, the
chromatography column and the UV sensor. These units and the variables for the PE and
OED problems are shown in Fig. 8.3. The desired concentration $c_i^{\mathrm{feed}}$ is
computed by solving the OED problem online on the personal computer (see 8.3). The
manager and the pump are used to realize this concentration. The output concentration of
the consolidated manager and pump unit is referred to as $c_i^{\mathrm{in}}$, which is the
inlet concentration to the chromatography column unit. The model of each unit is given in
the next sections.

[Block diagram: Manager & Pump, Chromatography column; the variable and parameter annotations of the blocks are not recoverable.]

Figure 8.3.: Units of the process model, input/output variables and unknown model parameters. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

Manager and pump

The experimental data used for the identification of the model for the manager and pump
is shown in Fig. 8.4. A PT2T0 step response model (see the transfer function in the Laplace
domain in Eq. 8.1) was selected to describe the dynamic response of the chromatography
column inlet concentration $c_i^{\mathrm{in}}$ to step changes in the feed concentrations
$c_i^{\mathrm{feed}}$.

$$G_i(s) = \frac{K \, e^{-T_0 s}}{(1 + T_1 s)(1 + T_2 s)}, \qquad \forall\, i \in \{\mathrm{ethyB}, \mathrm{propB}, \mathrm{butyB}\} \tag{8.1}$$

The same coefficients have been assumed for all components $i$. The dead time
$e^{-T_0 s}$ has been approximated by a PTn element of the form
$G_i(s) = 1/(1 + T_n s)^n$.

Implementing the model of Eq. 8.1 in the time domain yielded a sparse system of 180 ordinary differential equations of the form:

$$
\frac{d c_{P,i,j}}{dt} = \frac{c_{P,i,j-1} - c_{P,i,j}}{\tau_j}, \qquad \forall\, i \in \{\mathrm{ethyB}, \mathrm{propB}, \mathrm{butyB\}};\; j \in \{1, \dots, N_n\} \tag{8.2}
$$

where $c_{P,i,j}$ for $j = 0$ is the feed concentration to the manager and pump unit, i.e. $c_{P,i,0} = c_i^{\mathrm{feed}}$, and $c_{P,i,j}$ for $j = N_n$ is the outlet concentration of the unit, which is the inlet concentration of the chromatography column, i.e. $c_{P,i,N_n} = c_i^{\mathrm{in}}$.

Figure 8.4.: Responses of the manager and pump outlet concentrations (equal to the column inlet concentrations $c_i^{\mathrm{in}}$) for arbitrary steps in the feed concentration $c_i^{\mathrm{feed}}$. Flow is kept constant at 1.5 mL/min. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

For a volumetric flow of 1.5 mL/min the following parameter values were used: $\tau_j = 4.72551 \times 10^{-3}$ min for $j = 1, \dots, N_n - 2$; $\tau_{N_n-1} = 1.0481 \times 10^{-1}$ min; $\tau_{N_n} = 1.0635 \times 10^{-3}$ min; and $N_n = 60$.
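The chain of first-order lags in Eq. 8.2 can be sketched for a single component as follows. This is a minimal illustration, not the thesis implementation: it uses a fixed-step explicit Euler scheme instead of CVODES, a unit feed step, and the hypothetical function name `simulate_delay_chain`. Only the time constants are taken from the text.

```python
# Time constants in minutes, as quoted in the text (N_n = 60 stages).
TAUS = [4.72551e-3] * 58 + [1.0481e-1, 1.0635e-3]


def simulate_delay_chain(c_feed, taus, t_end, dt=2.0e-4):
    """Explicit-Euler integration of dc_j/dt = (c_{j-1} - c_j) / tau_j.

    Returns a list of (time, outlet concentration) pairs. The step size dt
    must stay well below twice the smallest time constant for stability.
    """
    c = [0.0] * len(taus)              # all stages start at zero concentration
    history = []
    n_steps = int(round(t_end / dt))
    for step in range(n_steps + 1):
        history.append((step * dt, c[-1]))
        upstream = c_feed              # c_{P,i,0} = feed concentration
        new_c = []
        for c_j, tau_j in zip(c, taus):
            new_c.append(c_j + dt * (upstream - c_j) / tau_j)
            upstream = c_j             # old value of stage j feeds stage j + 1
        c = new_c
    return history


# Response of the outlet (column inlet) to a unit step in the feed.
history = simulate_delay_chain(c_feed=1.0, taus=TAUS, t_end=2.0)
```

The outlet stays close to zero during the initial delay and then approaches the feed value, which is the qualitative behavior of a dead time followed by a second-order lag.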

Chromatography column

The equilibrium dispersive model is well established in liquid chromatography [77, 47]. It yields convection-diffusion partial differential equations (PDEs) with dominant convective terms. The corresponding differential mass balance for each component $i \in \{\mathrm{ethyB}, \mathrm{propB}, \mathrm{butyB}\}$ reads:

$$
\frac{\partial c_i(t,z)}{\partial t} + \frac{1-\varepsilon}{\varepsilon}\, \frac{\partial q_i(c)}{\partial t} + u\, \frac{\partial c_i(t,z)}{\partial z} = D_{ax,i}\, \frac{\partial^2 c_i(t,z)}{\partial z^2} \tag{8.3}
$$

where $c$ and $q$ are the concentrations in the liquid and solid phase, respectively, $u$ is the interstitial velocity of the liquid phase in the column, $z$ is the axial spatial coordinate and $\varepsilon$ is the total porosity. $D_{ax,i}$ is the axial dispersion coefficient, which is assumed to be equal for all components. Equilibrium between the liquid and solid phase is assumed and described by the nonlinear competitive Langmuir isotherm:

$$
q_i(c) = \frac{H_i\, c_i(t,z)}{1 + \sum_{j} K_j\, c_j(t,z)}, \qquad j \in \{\mathrm{ethyB}, \mathrm{propB}, \mathrm{butyB}\} \tag{8.4}
$$

with $H_i$ being the Henry constants and $K_i$ thermodynamic coefficients. The following initial conditions at $t = 0$ assume a clean column at the beginning of each experiment:

$$
c_i(0, z) = 0 \tag{8.5}
$$

The Danckwerts inflow and outflow boundary conditions are given as:

$$
D_{ax,i} \left.\frac{\partial c_i}{\partial z}\right|_{z=0} = u \left( c_i\big|_{z=0} - c_i^{\mathrm{in}} \right), \qquad D_{ax,i} \left.\frac{\partial c_i}{\partial z}\right|_{z=L} = 0 \tag{8.6}
$$

where $L$ is the column length.
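To make the competitive coupling in the Langmuir isotherm of Eq. 8.4 concrete, the following sketch evaluates the solid-phase loadings for a single component and for a mixture. The coefficient values and the function name `langmuir_loading` are hypothetical and chosen only for illustration; they are not the estimates of this case study.

```python
def langmuir_loading(c, H, K):
    """Competitive Langmuir isotherm: q_i = H_i c_i / (1 + sum_j K_j c_j)."""
    denominator = 1.0 + sum(K_j * c_j for K_j, c_j in zip(K, c))
    return [H_i * c_i / denominator for H_i, c_i in zip(H, c)]


# Illustrative (hypothetical) coefficients for three components.
H = [2.0, 3.0, 5.0]
K = [0.5, 1.0, 2.0]

q_single = langmuir_loading([0.1, 0.0, 0.0], H, K)   # first component alone
q_mixture = langmuir_loading([0.1, 0.1, 0.1], H, K)  # all three present
```

With the other components present, the shared denominator grows, so the loading of the first component in `q_mixture` is lower than in `q_single` at the same liquid concentration.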

The PDEs in Eq. 8.3 were discretized by a high-resolution finite volume scheme with flux limitation [61]. To capture the concentration shock fronts in the axial direction accurately, $N_{Els} = 60$ elements were used, which proved to give sound results.

For 3 components the complete equation system (Eqs. 8.1, 8.3) consists of $N_{eq} = (N_n + N_{Els}) \cdot 3 = 360$ differential equations.

UV sensor

The detector's output signal $s^{UV}$ is given in arbitrary units (AU) and transformed to concentration values $y^m$ in mol/L using the calibration curve in Eq. 8.7.

$$
y^m = a \exp\left[-\left(\frac{\tilde{s}^{UV} - b}{c}\right)^{2}\right] - a \exp\left[-\left(\frac{b}{c}\right)^{2}\right], \qquad \text{with } \tilde{s}^{UV} = s^{UV} + d \tag{8.7}
$$

Note that $y^m$ is the sum concentration of all components in mol/L, see also Section 8.2.1. A measurement of the individual components is therefore not possible, and an average set of calibration parameters $a, b, c, d$ was used in Eq. 8.7, with $a = 8.0917 \times 10^{-1}$, $b = 1.2277$, $c = 5.5536 \times 10^{-1}$, $d = 3.47 \times 10^{-2}$. The errors of the output signal are assumed to be Gaussian and uncorrelated with a standard deviation $\sigma = 0.01$ mol/L.
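The calibration curve of Eq. 8.7 can be written as a small function. This sketch assumes the reading in which the offset $d$ is added to the raw signal before the Gaussian-shaped term is evaluated; the function name `uv_to_concentration` is hypothetical, while the parameter values are those quoted above.

```python
import math

# Average calibration parameters quoted in the text.
A, B, C, D = 8.0917e-1, 1.2277, 5.5536e-1, 3.47e-2


def uv_to_concentration(s_uv):
    """Map a UV signal in AU to a sum concentration in mol/L (Eq. 8.7)."""
    s_shifted = s_uv + D                      # offset-corrected signal
    bell = A * math.exp(-((s_shifted - B) / C) ** 2)
    offset = A * math.exp(-(B / C) ** 2)      # makes the curve pass through zero
    return bell - offset


y_low = uv_to_concentration(0.5)
y_high = uv_to_concentration(1.0)
```

The second exponential term is a constant that forces a zero concentration when the shifted signal is zero. Below the peak at a shifted signal of $b$ the curve increases monotonically with the signal.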

8.3. Results

8.3.1. Assignment of variables

The variables of the discretized HPLC model described in Section 8.2.2 are assigned to the general problem formulation in Eq. 2.1 as follows.

The predicted response variable was:

$$
y := \sum_i c_i\big|_{z=L}, \qquad i \in \{\mathrm{ethyB}, \mathrm{propB}, \mathrm{butyB}\} \tag{8.8}
$$

In Eq. 8.8, $y$ is the sum of all individual concentrations at the column outlet. It corresponds to the measured concentrations $y^m$ in Eq. 8.7.

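The unknown model parameters $\theta$ belong to the column model in Section 8.2.2 and are taken from Eq. 8.4:

$$
\theta := [H_{\mathrm{ethyB}}, K_{\mathrm{ethyB}}, H_{\mathrm{propB}}, K_{\mathrm{propB}}, H_{\mathrm{butyB}}, K_{\mathrm{butyB}}]^T \tag{8.9}
$$

The experiment design variables (input actions) $u$ are the desired feed concentrations, which are delivered by the manager and pump, see Eq. 8.1:

$$
u := \left[c^{\mathrm{feed}}_{\mathrm{ethyB}}, c^{\mathrm{feed}}_{\mathrm{propB}}, c^{\mathrm{feed}}_{\mathrm{butyB}}\right]^T \tag{8.10}
$$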

8.3.2. Parameters of the time horizon schemes

The parameters of the finite time horizon schemes (see Section 2.7.5) used for the experimental implementation of the online redesign of experiments were:

• measurement sampling time: $t_k - t_{k-1} = 5$ s,

• prediction horizon: $t_{k+h} - t_k = 10$ min,

• control grid: 20 s,

• control horizon: 3 min.

The control horizon was introduced to reduce the number of future input actions and thereby the computation time.
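The effect of the control horizon on the problem size can be estimated with a short calculation. It assumes piecewise-constant inputs on the control grid, one value per component and grid interval, with the inputs held constant after the control horizon; this parameterization is an assumption and is not stated in this section.

```latex
\begin{aligned}
N_u^{\text{control horizon}} &= 3 \cdot \frac{3\ \text{min}}{20\ \text{s}} = 3 \cdot 9 = 27, \\
N_u^{\text{prediction horizon}} &= 3 \cdot \frac{10\ \text{min}}{20\ \text{s}} = 3 \cdot 30 = 90, \\
N_{\text{samples}} &= \frac{10\ \text{min}}{5\ \text{s}} = 120 .
\end{aligned}
```

Under these assumptions the OED problem has 27 instead of 90 decision variables, while each prediction still covers 120 sampling instants.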


All computations were performed on an Intel(R) Core(TM)2 (CPU 6600 @ 2.40 GHz) computer with 4 GB RAM. Parallel programming was not used. The PE and OED problems were solved by single shooting using the MATLAB Optimization Toolbox solvers lsqnonlin (trust-region-reflective) and fmincon (sqp), respectively. The maximum number of iterations of both solvers was restricted to one in order to limit the computational effort, and the solvers were restarted at each sampling time point using the last results as initial values.

Model and parameter sensitivity equations were integrated using CVODES with a sparse direct solver from the SundialsTB toolbox [55]. The computed parameter sensitivities were used to accurately calculate the gradients of the PE problem and the Fisher Information Matrix (FIM). The gradients of the OED problem were approximated by finite differences. The Jacobian and directional derivatives of the model equations were generated using Tapenade [54].
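The idea of performing a single solver iteration per sampling instant, warm-started from the previous estimate, can be illustrated on a toy problem. The sketch below is not the thesis implementation: it replaces lsqnonlin by a plain Gauss-Newton step, uses a hypothetical one-parameter model $y = e^{-\theta t}$ with noise-free synthetic data, and all names are illustrative.

```python
import math


def model(theta, t):
    """Hypothetical scalar model used only for this illustration."""
    return math.exp(-theta * t)


def gauss_newton_step(theta, times, y_meas):
    """One Gauss-Newton iteration on the least-squares cost over all data so far."""
    numerator = 0.0
    denominator = 0.0
    for t, y in zip(times, y_meas):
        prediction = model(theta, t)
        sensitivity = -t * prediction          # d(model)/d(theta)
        numerator += sensitivity * (y - prediction)
        denominator += sensitivity * sensitivity
    return theta + numerator / denominator


theta_true = 0.5
theta = 1.0                                    # deliberately poor initial guess
times, y_meas, trajectory = [], [], []
for k in range(1, 41):
    t_k = 0.5 * k
    times.append(t_k)
    y_meas.append(model(theta_true, t_k))      # new (noise-free) measurement
    theta = gauss_newton_step(theta, times, y_meas)   # one iteration, warm start
    trajectory.append(theta)
```

Because each step starts from the previous estimate, the iterations are effectively spread over the sampling instants, and the estimate approaches the true value as data accumulate.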



8.3.4. Online Base Case: PE without regularization (New)

This section summarizes unpublished results on the uncertainty of the parameter estimates and on the stability and ill-conditioning of parameter estimation without regularization in the course of the online experimentation. Hereinafter, online parameter estimation without regularization (Reg=None) is referred to as the Base Case.

The quantity and quality of the experimental data are decisive for a successful parameter estimation. Success means achieving a stable parameter estimation and then computing a meaningful and accurate parameter estimate. In online experimentation, the scarcity of informative data, especially at the beginning of the run, imposes additional constraints on the online parameter estimation. In nonlinear estimation these limitations produce ill-conditioned matrices (sensitivity matrix and Fisher information matrix), which lead to numerical issues such as instability and convergence problems. Moreover, if online (or offline) adaptive input design is also implemented, ill-conditioning promotes large parameter variability (see Section 3.2.1), i.e. unreasonably large parameter estimates and variances. The optimal experimental design (see Section 2.7) based on this parameter information is thus unreliable.

In the following, the results of online parameter estimation with online experimental data are shown. Special attention is paid to characterizing the relationship between large parameter variability, parameter estimation instability, and an ill-conditioned sensitivity matrix. Furthermore, it is demonstrated that even when a parameter estimation converges, its final result (the estimated parameter vector) might be meaningless (poor fit) and inaccurate (large bias).

[Plot: concentration [mol/l] versus time $t_k$ [min], 0 to 30 min; series $c_{exp}$ and $c_{sim}$.]

Figure 8.5.: Online Base Case (Reg=None): Model fitting using the parameter estimate $\theta$ at $t_k = 30$ (shown in Table 8.2) for the online parameter estimation without regularization. The markers show the measured sum outlet concentrations whereas the solid line shows the corresponding simulated concentrations.

Figure 8.5 displays the system dynamics predicted by the HPLC process model (see Section 8.2.2) using the parameter vector $\theta$ estimated after 30 min of online PE/experimentation. The model fit in the figure suggests a low-quality parameter estimate, because the error between the predictions (solid line) and the experimental data (markers) is large. However, this is not definitive evidence of parameter estimation deficiencies. Therefore, the estimator was evaluated according to the estimator analysis in Section 2.5. The relative bias was used as an accuracy metric (see Section 2.5.2). It was computed with respect to the 'true' parameter values, which were obtained from offline estimations and correspond to the best available estimate with the smallest cost function (residual) value.

Table 8.2.: Parameter estimates for the online parameter estimation using different regularization strategies. 'True' values refers to the best estimates obtained from offline estimation without iteration limits.

Parameter                        Regularization   θ1       θ2       θ3       θ4       θ5       θ6
Initial guess                    -                1.0000   1.0000   1.0000   1.0000   1.0000   1.0000
Online estimation @ t = 30 min   NONE             3.0150   9.7200   3.0070   0.9165   0.1226   0.1811
Online estimation @ t = 30 min   SsS              5.9797   4.0984   2.9081   0.6427   0.4727   0.4712
Online estimation @ t = 30 min   Tikh             5.9817   4.0969   2.9077   0.6432   0.4737   0.4683
'True' values                    -                5.9950   4.0995   2.9082   0.6396   0.4685   0.4819
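The accuracy comparison behind Table 8.2 can be reproduced with a few lines. The sketch assumes that the percentage relative bias of a parameter is $100 \cdot |\hat{\theta} - \theta^{*}| / |\theta^{*}|$; the exact definition is given in Section 2.5.2 and may differ in detail. The numbers are those of Table 8.2, and the function name is hypothetical.

```python
# 'True' values and online estimates at t = 30 min, copied from Table 8.2.
TRUE = [5.9950, 4.0995, 2.9082, 0.6396, 0.4685, 0.4819]
ESTIMATES = {
    "None": [3.0150, 9.7200, 3.0070, 0.9165, 0.1226, 0.1811],
    "SsS": [5.9797, 4.0984, 2.9081, 0.6427, 0.4727, 0.4712],
    "Tikh": [5.9817, 4.0969, 2.9077, 0.6432, 0.4737, 0.4683],
}


def relative_bias_percent(estimate, reference):
    """Element-wise percentage relative bias (assumed definition)."""
    return [100.0 * abs(e - r) / abs(r) for e, r in zip(estimate, reference)]


# Average over the six parameters for each regularization strategy.
mean_bias = {
    name: sum(relative_bias_percent(values, TRUE)) / len(TRUE)
    for name, values in ESTIMATES.items()
}
```

Under this definition the unregularized estimate has a mean relative bias of roughly 62%, whereas both regularized estimates stay below 1%.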

Table 8.2 shows the corresponding values. The percentage relative bias shown in Figure 8.6 was calculated for each element of the parameter vector $\theta_k$ estimated without regularization at each time instant $t_k$. Several points can be highlighted. First, the parameter estimation was indeed unstable, especially while the experimental information was insufficient. At the beginning of the experiment the limited number of measurements generated ill-conditioned sensitivity matrices and large step directions $\upsilon_k$ (see Eq. 2.6). The parameters therefore jumped to large values and were totally inaccurate (i.e., large relative bias in Figure 8.6). For almost two thirds of the experiment (until around 20 min) the estimates remained meaningless, with large bias. However, as more data were collected the estimates were refined, and they slowly reached an acceptable convergence from t = 20 min on.

At this point it is important to note that the convergence reached by the online parameter estimation implied neither accurate final parameters, as seen in Figure 8.6, nor a good model fit. This was to be expected given the ill-conditioning and the duration of the instability of the online parameter estimation. In order to analyze the ill-conditioning during the online experiment, the singular value spectrum $SVs_{t_k}$ of each sensitivity matrix $S_k^-$ at instant $k$ is shown in Figure 8.7. The figure clearly shows the upward evolution of the singular values (SVs) from the beginning to the end of the experiment. Note that the addition of new data lifted the spectrum at each instant $t_k$, and from around $t_k = 20$ min, when the experimental data were sufficiently informative, the SVs exceeded a typical threshold (i.e., ϵ = 6.7 for κmax = 1000 and γmax = 15) that assures a well-conditioned problem (see Section 3.2.1). Although this apparent "overcoming" of the ill-conditioning in the late stage of the experiment encouraged convergence, it did not remove the large uncertainty of the previously unstable parameter estimates. This can presumably be attributed to residual ill-conditioning that is not revealed by the magnitude of the considered regularization parameter ϵ. Consequently, not only the need to regularize the PE and OED problems became evident, but also the requirement of a tuned regularization parameter. Regularization is intended here to remove the ill-conditioning, controlling the instability and the parameter variability, and thus to smooth the way to an accurate (or at least precise) final parameter vector.

[Plot: relative bias (%) on a logarithmic axis from 0.01% to 10,000% versus time $t_k$ [min], 0 to 30 min; one curve per parameter θ1 to θ6; reference levels at 1% and 100%; annotation "Stabilization start".]

Figure 8.6.: Online Base Case (Reg=None): Relative Bias (%) of each estimated parameter (w.r.t. its corresponding true parameter value in Table 8.2) computed at each sampling time after solving parameter estimations without regularization.

Figure 8.7.: Online Base Case (Reg=None): Singular value spectrum ($SVs_{t_k}$) of each sensitivity matrix $S_k^-$ computed at each sampling time $t_k$ after solving parameter estimations without regularization. The horizontal solid plane labeled ϵ = 6.7 (see Eq. 3.3 with γmax = 15 and κmax = 1000) is the typical threshold used to select the ill-conditioned singular values in the ill-conditioning analysis of the sensitivity method in Section 4.4.1.


8.3.5. Online Regularized Case: PE with regularization (New)

The usefulness of different regularization strategies was studied in order to increase the robustness of the online parameter estimation against wrong initial parameter estimates and scarce measurement information. The aim was to control the instability of the parameter estimation caused by ill-conditioned sensitivity matrices at the beginning of the experiment, and thus to provide a smooth transition from the initial parameter guess to the final estimate at the end of the experiment. Under regularization, the bias of the parameter estimates should ideally decrease monotonically, with fast and accurate convergence to the true parameter values. Note that deviations from the true parameter values can be caused not only by erroneous initial parameter guesses but also by the ill-conditioning of the estimation, which leads to instability. Such deviations in the course of the parameter estimation also affect the experiment redesign, possibly yielding ineffective or meaningless designs. This makes an effective implementation of the selected regularization technique crucial. To this end, the choice of an appropriate regularization parameter, namely the ϵ-threshold for SsS (see Section 3.3.1) and λ for Tikhonov (see Section 3.3.3), plays an important role. Therefore, the effects and the selection of the regularization parameters for SsS and Tikhonov regularization are treated in the following (as new, unpublished results of this thesis). Moreover, the performance of each regularization (using the previously tuned regularization parameter) in the online context is discussed.

Regularization parameter selection

This section summarizes unpublished results on the selection of an appropriate regularization parameter for the stabilization of ill-posed parameter estimations. Because the ill-posedness analyzed in this thesis is caused by ill-conditioned matrices, the regularization parameters are tuned according to the severity of the ill-conditioning in the case study.

Subset selection (Reg=SsS) In regularization by parameter subset selection (see Section 3.3.1), the regularization parameter is the ϵ-threshold used to select the ill-conditioned singular values (see Section 4.4.1) and to compute the number of identifiable parameters (see Section 4.4.2).

As mentioned in Section 3.2.1, the ϵ-threshold can be computed by Eq. 3.3 as a function of κmax and γmax. In this case study the condition numbers from tk = 7 min on were smaller than κmax = 1000; therefore γmax was effectively the regularization parameter. Accordingly, nine values of γmax, i.e., 1000, 500, 200, 50, 20, 15, 1, 0.2, 0.1, were tested. For each γmax an online parameter estimation was run "in silico". The results are shown in Figure 8.8 in terms of the percentage relative bias with respect to the true values in Table 8.2, where each point of a curve corresponds to the average over the elements of θk estimated at each instant tk. Notice that ϵ and γmax are inversely related according to Eq. 3.3; thus small values of ϵ (large values of γmax) yield weak regularization, whereas large values of ϵ yield strong regularization.

[Plot: relative bias (%) on a logarithmic axis from 0.1% to 10000% versus time tk [min], 0 to 30 min. Legend, ordered from weak to strong regularization: None; γmax = 1000, ϵ = 0.1; γmax = 500, ϵ = 0.2; γmax = 200, ϵ = 0.5; γmax = 50, ϵ = 2; γmax = 20, ϵ = 5; γmax = 15, ϵ = 6.7; γmax = 1, ϵ = 100; γmax = 0.2, ϵ = 500.]

Figure 8.8.: Online Case SsS (Reg=SsS): Relative Bias (%) of the estimated parameter vector θk (w.r.t. the true parameter vector in Table 8.2) computed at each sampling time tk with k = 1, ..., Nm after solving parameter estimations by using subset selection as regularization for several values of the regularization parameter ϵ-threshold. Regularization parameters with the best performance in terms of the accuracy of the estimated parameter vector at the end of the experiment t = 30 min are enclosed in the red box. "None" refers to parameter estimations without regularization (Reg=None). Note that the ϵ-threshold is always defined by γmax according to Eq. 3.3.
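The inverse relationship can be cross-checked against the (γmax, ϵ) pairs given in the legends of Figures 8.8 and 8.9. This is only an observation on the listed values, for the regime in which γmax determines the threshold; Eq. 3.3 remains the authoritative definition.

```latex
\begin{aligned}
\epsilon \cdot \gamma_{max} &\approx 100: \\
0.1 \cdot 1000 &= 0.2 \cdot 500 = 0.5 \cdot 200 = 2 \cdot 50 = 5 \cdot 20 = 100, \\
6.7 \cdot 15 &\approx 100, \\
100 \cdot 1 &= 500 \cdot 0.2 = 1000 \cdot 0.1 = 100 .
\end{aligned}
```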

Figure 8.8 shows that regularization parameters leading to relatively weak regularization, i.e., ϵ = 0.2, 0.5, 2, generated the most accurate final parameter vector (at tk = 30 min), with a relative bias of around 1%. A further conclusion of this comparison is that there exists a well-defined interval of the regularization parameter within which the online estimation is robust and accurate. Values of ϵ outside this interval produced either severely unstable estimations (for weak regularization, e.g., ϵ = 0.1) or parameter estimations without degrees of freedom (DoF) (for strong regularization, e.g., ϵ = 500, 1000). In the former case all or many of the parameters are considered identifiable and are estimated, so that the ill-conditioning problems remain. In the latter case all parameters are considered unidentifiable and are fixed to their initial guess; accordingly, their relative bias is around 100%.

A more compact view of the overall performance of each regularization parameter is given in Figure 8.9, where each point represents the time average of one curve in Figure 8.8. The contribution of the whole experiment is thus condensed into a single number. According to this measure, ϵ = 2 performed best and is selected as the regularization parameter for Reg=SsS. Nevertheless, there clearly exists an interval of adequate ϵ, which could lie between 0.5 and 2.

[Plot: mean relative bias (%) on a logarithmic axis from 10% to 1000% versus γmax = 1000, 500, 200, 50, 20, 15, 1, 0.2, 0.1, with the corresponding thresholds ϵ = 0.1, 0.2, 0.5, 2, 5, 6.7, 100, 500, 1000, ordered from weak to strong regularization.]

Figure 8.9.: Online Case SsS (Reg=SsS): Mean Relative Bias (%) for the whole experiment duration of the Nm estimated parameter vectors θk with k = 1, ..., Nm (w.r.t. the true parameter vector in Table 8.2) obtained after solving the Nm parameter estimations by using subset selection as regularization for several values of the regularization parameter ϵ-threshold. Regularization parameters with the best global performance in terms of the accuracy of the estimated parameter vectors during the whole experiment are enclosed in the red box.

A special remark is in order regarding the feasible values of ϵ. They are bounded by the largest and smallest singular values of the sensitivity matrix, i.e., $\varsigma_1$ and $\varsigma_{N_\theta}$, respectively. If $\epsilon > \varsigma_1$, the estimation in Eq. 3.8 does not take place, because all parameters are considered unidentifiable and are fixed. If, on the other hand, $\epsilon < \varsigma_{N_\theta}$, all parameters are assumed identifiable and the estimation runs as if no regularization were applied.
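The two limiting cases can be illustrated by counting the singular values that pass the threshold. The sketch below is a simplification of the procedure in Sections 4.4.1 and 4.4.2: it only counts singular values and does not select which parameters are estimated. The spectrum, the use of `>=` at the threshold and the function name are assumptions made for illustration.

```python
def count_identifiable(singular_values, eps):
    """Number of singular values at or above the eps-threshold."""
    return sum(1 for value in singular_values if value >= eps)


# Hypothetical singular value spectrum of a sensitivity matrix (6 parameters).
spectrum = [300.0, 40.0, 8.0, 1.5, 0.4, 0.05]

n_weak = count_identifiable(spectrum, 0.01)      # eps below the smallest SV
n_moderate = count_identifiable(spectrum, 2.0)   # eps inside the spectrum
n_strong = count_identifiable(spectrum, 500.0)   # eps above the largest SV
```

For this spectrum a threshold below the smallest singular value keeps all six parameters free, a threshold above the largest one fixes all of them, and an intermediate threshold leaves only the well-conditioned directions to be estimated.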

This can be seen more clearly in Figure 8.10, which displays the singular value spectra $SVs_{t_k}$ of the sensitivity matrices $S_k^-$ computed at selected instants tk = 3, 5, 10, 15, 20, 25, 30 min spanning the whole experiment. The figure also includes three values of the ϵ-threshold of different magnitude, ϵ = 0.1, 2, 500. The large value ϵ = 500 did not cut any of the spectra, which meant that every singular value $\varsigma_i$ was considered ill-conditioned according to the procedure outlined in Section 4.4.1. The regularized PE with this regularization parameter fixed all parameters to their initial guess; thus the percentage relative bias in Figure 8.8 remained at around 100% throughout. This effect is called strong regularization. The opposite case occurred for the small value ϵ = 0.1. Here the majority of the singular values (at times all $\varsigma_i$) were above the threshold, yielding a weak regularization in which all parameters were assumed identifiable. Consequently, there was no control of the ill-conditioning; the parameter estimation remained unstable (for almost the first half of the experiment) and never recovered, so that the parameters still had a large bias at the end of the experiment (see Figure 8.8). The third value corresponds to the best regularization parameter found, i.e., ϵ = 2. In this scenario a strong regularization is applied to the PE at the beginning of the experiment (e.g., at tk = 3, 5 min), a moderate one while the information is still insufficient (e.g., at tk = 10, 15 min), and no regularization is active once the problem no longer has small singular values. In online redesign of experiments, these would be the characteristics of a suitable regularization parameter.

[Plot: singular value $\varsigma_i$ on a logarithmic axis from $10^{-4}$ to $10^{4}$ versus singular value index i = 1, ..., 6; spectra for tk = 3, 5, 10, 15, 20, 25, 30; horizontal thresholds ϵ = 0.1 (weak regularization), ϵ = 2 and ϵ = 500 (strong regularization).]

Figure 8.10.: Online Case SsS (Reg=SsS): Singular value spectrum $SVs_{t_k}$ of the sensitivity matrix $S_k^-$ computed at each sampling time tk = 3, 5, 10, 15, 20, 25, 30 after solving parameter estimations by using subset selection as regularization for several values of the regularization parameter, i.e., ϵ = 0.1, 2, 500. Large values of ϵ produce strong regularization of the PE; otherwise the regularization is weak.

Tikhonov regularization (Reg=Tikh) In Tikhonov regularization, explained in Section 3.3.3, the regularization parameter is the scalar λ that weights the penalization term in the PE cost function of Eq. 3.10. Tikhonov regularization can incorporate a priori information on the model parameters, namely known parameter values in the predefined vector $\theta_R$ and parameter variances in the matrix L or in λ. Here, the a priori information on the parameter variances was introduced through λ as follows:

$$
\lambda = \frac{1}{\sigma_\theta}, \tag{8.11}
$$

where $\sigma_\theta$ denotes the parameter variance known beforehand. If the current parameter vector is considered precise, $\sigma_\theta$ is small, λ is large, and the regularized parameter estimation in Eq. 3.10 is strongly penalized. This is called strong regularization. Conversely, if the current parameter vector is not reliable, $\sigma_\theta$ is large, λ is small, and the penalization term may be negligible compared with the least-squares criterion. This is a weak or even non-existent regularization. If individual parameter variances are known beforehand, they may also be included in the diagonal elements of L (which is not the case here).

Nine a priori values were tested, i.e., $\sigma_\theta$ = 0.001, 0.002, 0.01, 0.02, 0.15, 0.2, 1, 2, 10. For each $\sigma_\theta$ (or λ) an online parameter estimation was run "in silico". The results are shown in Figure 8.11 in terms of the percentage relative bias with respect to the true values in Table 8.2, where each point of a curve corresponds to the average over the elements of θk estimated at each instant tk.

[Plot: relative bias (%) on a logarithmic axis from 0.1% to 10000% versus time [min], 0 to 30 min. Legend, ordered from weak to strong regularization: None; σθ = 10, λ = 0.1; σθ = 2, λ = 0.5; σθ = 1, λ = 1; σθ = 0.2, λ = 5; σθ = 0.15, λ = 6.7; σθ = 0.02, λ = 50; σθ = 0.01, λ = 100; σθ = 0.002, λ = 500; σθ = 0.001, λ = 1000.]

Figure 8.11.: Online Case Tikh (Reg=Tikh) for λ regularization parameter selection: Relative Bias (%) of the estimated parameter vector θk (w.r.t. the true parameter vector in Table 8.2) computed at each sampling time tk with k = 1, ..., Nm after solving parameter estimations by using Tikhonov regularization (Reg=Tikh) for several values of the regularization parameter λ in the course of the online experiment. Regularization parameters with the best performance in terms of the accuracy of the estimated parameter vector at the end of the experiment t = 30 min are enclosed in the red box. "None" refers to parameter estimations without regularization (Reg=None). Note that λ is here defined by σθ according to Eq. 8.11.

The same analysis as for Reg=SsS was applied here. Figure 8.11 shows that regularization parameters leading to moderate regularization, i.e., λ = 1, 5, 6.7, 50, generated the most accurate final parameter vector (at tk = 30 min), with a relative bias of around 1%. For this regularization a wider interval of the regularization parameter within which the online estimation was robust and accurate was found, i.e., 1 ≤ λ ≤ 50. Values of λ outside this interval produced either severely unstable estimations (for weak regularization, e.g., λ = 0.1, 0.5) or parameter estimations without DoF (for strong regularization, e.g., λ = 100, 500, 1000).

A more compact view of the overall performance of each regularization parameter is given in Figure 8.12, where each point represents the time average of one curve in Figure 8.11. The contribution of the whole experiment is thus condensed into a single number. According to this measure, λ = 50 (equivalent to σθ = 0.02) performed best and is selected as the regularization parameter for Reg=Tikh. Nevertheless, there clearly exists an interval of adequate λ, which could lie between 5 and 50.

[Plot: mean relative bias (%) on a logarithmic axis from 10% to 1000% versus the a priori value σθ = 10, 2, 1, 0.2, 0.15, 0.02, 0.01, 0.002, 0.001, with the corresponding λ = 0.1, 0.5, 1, 5, 6.7, 50, 100, 500, 1000, ordered from weak to strong regularization.]

Figure 8.12.: Online Case Tikh (Reg=Tikh) for λ regularization parameter selection: Mean Relative Bias (%) for the whole experiment duration of the Nm estimated parameter vectors θk with k = 1, ..., Nm (w.r.t. the true parameter vector in Table 8.2) obtained after solving Nm parameter estimations by using Tikhonov regularization for several values of the regularization parameter λ. Regularization parameters with the best global performance in terms of the accuracy of the estimated parameter vectors during the whole experiment are enclosed in the red box. "None" refers to parameter estimations without regularization (Reg=None). Note that λ is defined by σθ according to Eq. 8.11.

It is important to point out that Tikhonov regularization overcomes the ill-conditioning by replacing the small singular values (those smaller than the regularization parameter) by approximately the value of the regularization parameter [78]. In doing so, the dimension of the problem remains unchanged, but the regularized sensitivity matrix is numerically transformed into a well-conditioned matrix. Although this transformation is most pronounced at the lower end of the spectrum, i.e. for the ill-conditioned SVs, it also affects the large singular values, as shown in Figure 8.13. For instance, comparing $SVs^{Tikh}_{t_k=30}$ with $SVs_{t_k=30}$ shows that the regularized spectrum has no singular value smaller than 50 and that the whole spectrum is lifted at least slightly with respect to the original.
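One common way to see this lifting of the spectrum is through the augmented sensitivity matrix formed by stacking S on top of λ·I, whose singular values are $\sqrt{\varsigma_i^2 + \lambda^2}$. The sketch below assumes this form with L equal to the identity; whether the regularized spectra in Figure 8.13 were computed exactly this way is an assumption. The spectrum values and the function name are hypothetical.

```python
import math


def tikhonov_singular_values(singular_values, lam):
    """Singular values of the augmented matrix [S; lam * I] (assuming L = I)."""
    return [math.sqrt(value * value + lam * lam) for value in singular_values]


# Hypothetical original spectrum and the regularization parameter lambda = 50.
original = [300.0, 120.0, 40.0, 8.0, 1.5, 0.4]
regularized = tikhonov_singular_values(original, 50.0)
```

Singular values far below λ end up close to λ, while those far above λ are only slightly increased, so no regularized singular value falls below λ.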

A final remark similar to the one made on the selection of the regularization parameter for SsS in Section 8.3.5 applies here. A large λ regularizes the parameter estimation strongly, because many (or possibly all) singular values may lie below this value. In that case all of them are replaced by approximately λ, the problem becomes biased and the parameter estimates remain fixed at the initial guess, as in Reg=SsS. This is to be expected, considering that subset selection is the extreme case of Tikhonov regularization in which the a priori parameter information is deemed reliable, i.e. has a small parameter variance. A small λ, on the other hand, has the same effect as a small ϵ in Reg=SsS. Consider the Tikhonov cost function in Eq. 3.10: if λ is small, the weight of the penalization term is also small and the original least-squares term drives the estimation, with all its structural problems. The regularization is then weak or even non-existent.

[Plot: singular value $\varsigma_i$ on a linear axis from 0 to 350 versus singular value index i = 1, ..., 6; regularized spectra $SVs^{Tikh}_{t_k}$ and original spectra $SVs_{t_k}$ for tk = 5, 10, 15, 20, 25, 30; horizontal line at λ = 50.]

Figure 8.13.: Online Case Tikh (Reg=Tikh) for λ regularization parameter selection: Singular value spectra $SVs^{Tikh}_{t_k}$ and $SVs_{t_k}$ of the regularized and original sensitivity matrices $S_k^{-,Tikh}$ and $S_k^-$ computed at each sampling time tk = 5, 10, 15, 20, 25, 30 after solving parameter estimations by using Tikhonov as regularization for regularization parameter λ = 50. Notice that singular values $\varsigma_i \le \lambda$ are approximated by values around λ.

8.3.6. Online Redesign of Experiments

The results of the optimal input design are presented for FS-1 and FS-2 in Figs. 8.14 and 8.15, respectively.

The initial parameter guesses were set to one, which meant relatively large deviations from the true parameter values (compare the values in Table 8.2). It can be seen that during the first 10 to 15 min of the experiment run, relatively large deviations from the finally estimated values existed (see Figs. 8.14e and 8.15e). Moreover, the available measurements did not provide sufficient information for the identifiability of the whole parameter space (see Figs. 8.14d and 8.15d).

However, the Tikhonov regularization was able to realize a relatively smooth transition to the final parameter values, indicating robustness and stable convergence. The same applies to the repeated solution of the ED problem, in which only the identifiable parameter subset was considered in the problem formulation.

Figure 8.14.: D-optimal adaptive input design for feeding strategy FS-1. Subfigures a, b show the measured sum and predicted individual outlet concentrations. Subfigure c shows the input design, i.e. the inlet concentrations. Subfigure d shows the results of the identifiability analysis for the parameters θ1, ..., θ6. If a parameter was identifiable and selected by the subset selection (SsS) algorithm, this parameter was active and its activity was indicated by a dot. If a dot was missing, the parameter was not active and not identifiable. Subfigure e shows the course of the parameter estimates. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

8.3.7. Validation of the parameter estimates by Frontal Analysis

The presented determination of adsorption isotherm parameters by parameter estimation is usually referred to as the inverse or indirect method. In the inverse method (see Section 3.1), model predictions are fitted to experimental data in order to determine the parameters of a preselected isotherm model. A well-established alternative is the so-called Frontal Analysis (FA), a direct chromatographic method for the determination of adsorption isotherms [47]. In the following, FA is used to validate the estimated Langmuir isotherm model parameters.

In contrast to the inverse method, FA is model independent, i.e. the shape of the determined isotherm is not predefined by a selected isotherm model structure. However, for the determination of multi-component adsorption isotherms (especially for more than two components), FA also requires a large amount of experimental data. To limit the corresponding laboratory work, the validation is carried out for single-component adsorption only, i.e. the competitive isotherm behavior is not considered.

Figure 8.15.: D-optimal adaptive input design for feeding strategy FS-2. Subfigures a, b show the measured sum and predicted individual outlet concentrations. Subfigure c shows the input design, i.e. the inlet concentrations. Subfigure d shows the results of the identifiability analysis for the parameters θ1, ..., θ6. If a parameter was identifiable and selected by the subset selection (SsS) algorithm, this parameter was active and its activity was indicated by a dot. If a dot was missing, the parameter was not active and not identifiable. Subfigure e shows the course of the parameter estimates. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

In FA, a solution of the studied component, at a known, constant concentration, is

percolated through the column. Successive step changes of increasing concentration are

performed at the column inlet and the breakthrough curves (transient concentration pro-

files) at the column outlet are determined. For each new inlet concentration the mass of

the solute adsorbed at equilibrium is determined from the integral of the breakthrough

curve [47]. In doing so, the adsorption isotherms are directly determined from the available

measurement data, a mass balance equation and geometric data of the column.
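The mass balance behind one FA step can be sketched as follows. This is a minimal illustration, not the evaluation procedure used in the thesis: the function name, the argument names, and the use of a total porosity `eps_t` to split the column volume into liquid hold-up and adsorbent are assumptions made here, and the breakthrough curve is synthetic.

```python
import numpy as np

def fa_loading_increment(t, c_out, c_prev, c_new, flow, v_col, eps_t):
    """Increase of the adsorbed loading for one inlet step c_prev -> c_new.

    Assumed mass balance between two equilibrium states:
      flow * integral(c_new - c_out) dt
        = eps_t * v_col * (c_new - c_prev)        (liquid hold-up)
        + (1 - eps_t) * v_col * (q_new - q_prev)  (adsorbed phase)
    The result is referred to the adsorbent volume (1 - eps_t) * v_col.
    """
    t = np.asarray(t, dtype=float)
    deficit = c_new - np.asarray(c_out, dtype=float)
    # Trapezoidal rule for the area between inlet level and breakthrough curve
    area = np.sum(0.5 * (deficit[1:] + deficit[:-1]) * np.diff(t))
    holdup = eps_t * v_col * (c_new - c_prev)
    return (flow * area - holdup) / ((1.0 - eps_t) * v_col)

# Synthetic breakthrough curve: the outlet ramps from c_prev to c_new
# between t = 4 and t = 6 (arbitrary consistent units).
t = np.linspace(0.0, 10.0, 1001)
c_out = np.interp(t, [0.0, 4.0, 6.0, 10.0], [0.0, 0.0, 2.0, 2.0])
dq = fa_loading_increment(t, c_out, c_prev=0.0, c_new=2.0,
                          flow=1.0, v_col=2.0, eps_t=0.5)
```

Summing such increments over the successive concentration steps gives one equilibrium point per inlet concentration, which is how the FA points of an isotherm are assembled under these assumptions.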

The results of the validation are shown in Fig. 8.16 for both feeding strategies, FS-1 and FS-2. It can be seen that butyl benzoate shows the strongest adsorption, while propyl and methyl benzoate are adsorbed more weakly. Moreover, the model predictions for the parameter estimates obtained using D-optimal designs are close to the equilibrium points calculated from FA. Larger deviations exist for the parameter estimates obtained from heuristic designs. These results are confirmed by the accuracy of the estimated parameter values, i.e. their standard deviations, for the different input designs in Table 8.3.

Figure 8.16.: Validation of the parameter estimates for single component adsorption. Adsorption isotherms obtained by FA are shown by calculated equilibrium points. Predicted adsorption isotherms using the Langmuir model are shown by lines. Predictions are made using parameter estimates from D-optimal designs and standard input designs (uniform and pulse). (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

8.3.8. Optimal input designs vs standard input designs

In the following, the optimal input designs are compared to standard input designs in

terms of the achieved parameter precision (see Section 2.5.1). The standard input designs

were generated for FS-1 and FS-2 using the same control grid and concentration range as

for the optimal input design. The following standard designs were considered:

• a sum of sinusoids (sine),

• rectangular pulses (pulse),

• uniformly distributed random signals (uniform).

Table 8.3 shows the results for the D-optimal criterion and the corresponding standard

deviations of the individual parameter values.

As expected, compared to most standard input designs, the D-optimal experiments can significantly improve the parameter precision. Moreover, the individual feeding of concentrations (FS-2) provides more informative measurement data than the feeding with constant and equal concentration ratio. Comparing the results for the D-optimal design for FS-2 with the widely used standard pulse experiments for FS-1, a reduction by a factor of five in the D-optimal criterion is achieved, i.e. from 0.0005713 to 0.0001100. Figures 8.17 and 8.18 show the results for different input designs for FS-2.

Table 8.3.: Parameter accuracy for experimental data obtained from different input designs and feeding strategies FS-1 and FS-2. Input designs marked by a star were realized experimentally, all others are 'in silico' experiments. (Table taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

Input design           D-optimal    Standard deviation
                       criterion    θ1      θ2      θ3      θ4      θ5      θ6
FS-1: sine             0.0009295    0.0732  0.0539  0.0272  0.0911  0.1088  0.0516
FS-1: pulse (a)*       0.0005713    0.0463  0.0266  0.0349  0.0650  0.0647  0.0804
FS-1: pulse (b)*       0.0005221    0.0342  0.0457  0.0262  0.0785  0.0648  0.0636
FS-1: uniform (a)*     0.0003479    0.0431  0.0316  0.0174  0.0470  0.0585  0.0349
FS-1: uniform (b)*     0.0003476    0.0313  0.0171  0.0426  0.0589  0.0330  0.0491
FS-1: D-optimal (a)*   0.0001732    0.0283  0.0217  0.0130  0.0285  0.0396  0.0264
FS-1: D-optimal (b)*   0.0001638    0.0133  0.0245  0.0186  0.0235  0.0323  0.0370
FS-2: sine             0.0003988    0.0496  0.0340  0.0228  0.0363  0.0353  0.0274
FS-2: uniform*         0.0002389    0.0182  0.0283  0.0356  0.0203  0.0269  0.0301
FS-2: D-optimal*       0.0001100    0.0119  0.0183  0.0238  0.0134  0.0156  0.0187
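As a quick check of the quoted factor, the ratio of the tabulated D-criterion values can be written out. The symbol \Phi_D is introduced here only as shorthand for the "D-optimal criterion" column of Table 8.3, and the superscripts refer to the two FS-1 pulse rows:

```latex
\frac{\Phi_D(\text{FS-1: pulse}^{a})}{\Phi_D(\text{FS-2: D-optimal})}
  = \frac{5.713\times 10^{-4}}{1.100\times 10^{-4}} \approx 5.2,
\qquad
\frac{\Phi_D(\text{FS-1: pulse}^{b})}{\Phi_D(\text{FS-2: D-optimal})}
  = \frac{5.221\times 10^{-4}}{1.100\times 10^{-4}} \approx 4.7 .
```

Both pulse experiments therefore give a reduction of roughly a factor of five.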

Figure 8.17.: Input design (subfigure c) and outlet concentrations (subfigure a, b) for feeding strategy FS-2 and a standard input design generated by a sum of sinusoids; 'in silico' experiment. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

8.4. Conclusions

This chapter presented the numerical results of the experimental realization of an adaptive optimal input design for parameter determination in liquid chromatography. Competitive Langmuir isotherm parameters of a three component mixture were identified by fitting model predictions to measured concentrations at the column outlet for a given experiment.

Figure 8.18.: Input design (subfigure c) and outlet concentrations (subfigure a, b) for feeding strategy FS-2 and a standard input design generated by uniform sampling; real experiment. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

The accurate and precise determination of the parameters was ensured by treating instabilities (ill-conditioning issues) through regularized online parameter estimation and optimal experimental design. The quality of the estimated parameters was validated by comparing the predicted isotherms with results from Frontal Analysis (FA).

In liquid chromatography, the optimal adaptive experimental input design proved to be significantly more efficient in terms of experimental effort than other established methods, see e.g. Frontal Analysis [77]. This is especially true if mixtures with more than two components need to be analyzed. Moreover, for the presented case study, all adsorption parameters could be identified using a non-selective UV concentration detector which only provided the sum concentration of all components. This is a big advantage over conventional established methods, which require individual concentration measurements that are normally not available, see e.g. [77].

It was also discussed that numerical regularization is crucial in order to stabilize the parameter estimation, i.e. to minimize the variations in the parameter estimates during the online estimation, and to compute meaningful experiment design criteria. This is especially true at the beginning of an experiment, where measurement information is scarce and the sensitivity matrix is ill-conditioned. However, careful tuning of the corresponding regularization parameters is required. To this end, some guidelines for selecting the regularization parameters for subset selection and Tikhonov regularization were given. Although the regularization parameters are application-dependent, it was shown here that the analysis of the singular value spectrum (specifically the largest and smallest singular values) assists in selecting these parameters. Moreover, the effect of weak and strong regularization in the context of online parameter determination using subset selection and Tikhonov regularization was addressed. It was also shown that there is not just a single regularization parameter value that adequately removes the ill-conditioning, but rather an interval in which suitable values can be found.

In this case study, Tikhonov regularization outperformed the regularization based on parameter subset selection in terms of smoothness, parameter stability and convergence during the online estimation. That is because Tikhonov regularization is based on a smooth weighting between the initial parameter guess and the best available estimate (result from the last iteration), as well as between identifiable and non-identifiable parameters. In contrast, subset selection completely excludes non-identifiable parameters from the problem, and thus these parameters cannot be improved. Accordingly, if the initial guess of these non-identifiable parameters is far away from the true value and also has a large parameter variance, it may contribute more strongly to a meaningless experimental redesign for the remaining active parameters (although for D-design this is less crucial [78]). However, parameter selection methods such as subset selection are attractive from a practical viewpoint, as they provide useful information for the monitoring of adaptive design strategies, such as the importance of individual parameters and the correlations between them.

It is important to highlight that the application of regularization only alleviates the ill-conditioning of the estimation numerically. Parameter identifiability itself is only improved by the optimization-based experimental redesign (optimal experimental design), the collection of new informative experimental data and the proper selection of the initial guess, provided the model structure is reliable. Nevertheless, without regularization the optimal experimental design could yield inefficient designs, which also affects the identifiability.


9. Summary and Outlook

9.1. Summary

The development of high-quality and validated models of process systems is justified not only by the research goal of increasing the knowledge about the process, but mainly by the economic benefits generated by model-based product and process design, simulation, control and optimization. Physical, chemical or biological laws are used to build the much desired mechanistic models, whose task is to represent the investigated process. These laws largely determine the model structure, which invariably contains unknown parameters to be determined.

To identify these models, a parameter estimation problem is solved using, e.g., maximum likelihood or least-squares estimation. Parameter estimation is an inverse problem in which unknown parameters have to be inferred from measurement data. A lack of informative experimental data, highly noisy and correlated measurements, inadequate initial guesses, and correlated and insensitive parameters, among others, are common challenges that hamper accurate model identification and lead to ill-conditioning and identifiability issues. When that happens, the estimation becomes unstable, possibly with convergence and numerical problems. In that case, the problem is said to be ill-posed due to the presence of ill-conditioned matrices.

Although ill-conditioning analysis and identifiability diagnosis have been separately and widely discussed in the literature, they had not yet been considered as required components of the same framework for process model development. Their application had rather been reserved for special cases, situations or complicated models. However, it was shown throughout the thesis that ill-posed problems arising from ill-conditioned matrices, rather than being special cases, can be a normal situation at any stage of model development.

In this thesis, a consolidated computational framework to systematically evaluate ill-posedness in model-based parameter estimation and experimental design was proposed and successfully tested. It included ill-conditioning and identifiability analysis as well as the implementation of numerical regularization. The computational framework also considered two major paradigms for evaluating the quality of parameter estimates, namely the Sensitivity and Monte Carlo methods. The framework may be used as a whole or in parts, to work on either parameter estimation or optimal experimental design for a previously selected model structure. It is also conceived to treat adaptive designs and online parameterizations. Moreover, it should be noted that the framework might be applied even to well-posed problems. In fact, it can be applied to general problems regardless of their ill-posedness state. The framework could easily be used to determine whether other sources of experimental information (including type of measurements, measurement error, sampling frequency, etc.) actually improve the conditioning and identifiability of a process model. This was shown in the case study of the Lithium battery.

The application of the framework proved to work efficiently in detecting and dealing with strongly over-parameterized models (e.g., the case study of Bioethanol). After

successful termination, the number of considered parameters was reduced to a relatively

small subset of the original parameter space in order to regularize the ill-posed problem.

Thus, the most influencing parameters for selected operating conditions were identified

and their uncertainty was significantly decreased.

Ill-conditioning analysis to diagnose identifiability problems. Throughout this thesis, the necessity of formulating well-posed problems for the numerical computation of stable and unique solutions in model-based parameter estimation and experimental design was presented. Thus, it is strongly advisable to perform the relatively simple local analysis of the ill-conditioning of the sensitivity matrix. Moreover, if an ill-posed problem is identified, its ill-posedness type (either rank deficient or of ill-determined rank) and severity should be assessed. Important indicators suggested in this thesis are the singular value spectrum, the condition number, and the collinearity index of the sensitivity matrix.
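The three indicators can be computed from a given sensitivity matrix in a few lines. The following is a minimal sketch under common definitions, which are assumptions here and may differ in scaling from those used in the thesis: the condition number is the ratio of the largest to the smallest singular value, and the collinearity index is the reciprocal of the smallest singular value of the sensitivity matrix with columns normalized to unit length. The function name is hypothetical, and all columns are assumed to be non-zero.

```python
import numpy as np

def ill_conditioning_indicators(S):
    """Return (singular values, condition number, collinearity index) of S."""
    S = np.asarray(S, dtype=float)
    sv = np.linalg.svd(S, compute_uv=False)          # sorted in descending order
    cond = sv[0] / sv[-1] if sv[-1] > 0 else np.inf
    # Normalize each column to unit Euclidean length (assumes non-zero columns)
    S_norm = S / np.linalg.norm(S, axis=0)
    sv_norm = np.linalg.svd(S_norm, compute_uv=False)
    collinearity = 1.0 / sv_norm[-1] if sv_norm[-1] > 0 else np.inf
    return sv, cond, collinearity

# Two orthogonal sensitivity columns: no collinearity, mild scaling difference
sv, cond, gamma = ill_conditioning_indicators([[3.0, 0.0],
                                               [0.0, 4.0],
                                               [0.0, 0.0]])
```

A large condition number with a small collinearity index points to badly scaled but independent parameters, whereas a large collinearity index indicates nearly linearly dependent sensitivity columns.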

In this thesis, the connection between the eigenvalues of the Fisher-information matrix and the parameter covariance matrix and the singular values of the sensitivity matrix was exploited to identify the presence and sources of ill-conditioning. The rationale was that, if certain parameters were unidentifiable, some columns of the sensitivity matrix were linearly dependent, which implied that the Fisher and parameter covariance matrices were singular or nearly singular [31, 60, 78, 120]. Columns of the sensitivity matrix that are nearly linearly dependent are a typical source of poor estimator performance [12, 78]. Thus, computing the numerical rank of the sensitivity matrix indicates whether the sensitivity matrix is ill-conditioned. By using orthogonal decompositions (i.e., SVD and QRP decomposition), the influence of ill-conditioned singular values on parameter variances can be analyzed. These procedures ultimately result in parameter rankings that are used to determine unidentifiable parameters under different sources of experimental information. These rankings can be used to provide recommendations for regularization and design of priors [78].
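The connection can be written compactly. The following sketch assumes that the measurement weighting is already absorbed into the sensitivity matrix S (unit measurement variance) and that the covariance is approximated by the inverse Fisher-information matrix; ς_k denotes the singular values of S and v_{ik} the entries of the right singular vectors:

```latex
\begin{aligned}
S &= U\,\mathrm{diag}(\varsigma_1,\dots,\varsigma_{N_\theta})\,V^{T}, \\
F &= S^{T}S = V\,\mathrm{diag}(\varsigma_1^{2},\dots,\varsigma_{N_\theta}^{2})\,V^{T}, \\
C &\approx F^{-1} = V\,\mathrm{diag}(\varsigma_1^{-2},\dots,\varsigma_{N_\theta}^{-2})\,V^{T}, \\
\operatorname{Var}(\theta_i) &\approx \sum_{k=1}^{N_\theta} \frac{v_{ik}^{2}}{\varsigma_k^{2}} .
\end{aligned}
```

Under these assumptions the eigenvalues of F are the squared singular values of S, those of C are their reciprocals, and a single small ς_k inflates the variance of every parameter with a non-negligible v_{ik}.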

Sensitivity method vs Monte Carlo method. It was demonstrated here that, in the presence of ill-posedness, local identifiability methods based on sensitivities (i.e., SVD, variance and QR methods) can only detect the presence of identifiability problems, as well as the number and identity of the unidentifiable parameters. This provides an advantage over the more rigorous but also more expensive Monte Carlo method. However, the computed variances, and therefore the confidence intervals/regions, are not reliable in the sensitivity method because the variances are inflated by the presence of ill-conditioned singular values of the sensitivity matrix. For a better quantification of the parameter uncertainty, the Monte Carlo method is highly recommended.
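The mechanics of the two paradigms can be contrasted on a toy problem. The sketch below is an illustration only: the straight-line model, the noise level and all function names are assumptions, and because the toy problem is linear and well-conditioned both methods agree. The discrepancies discussed above arise for ill-conditioned nonlinear problems, which this example does not reproduce.

```python
import numpy as np

def sensitivity_std(S, sigma):
    """Standard deviations from the linearized covariance sigma^2 (S^T S)^-1."""
    cov = sigma**2 * np.linalg.inv(S.T @ S)
    return np.sqrt(np.diag(cov))

def monte_carlo_std(estimate, y_clean, sigma, n_runs=2000, seed=0):
    """Standard deviations of estimates obtained from repeated noisy data sets."""
    rng = np.random.default_rng(seed)
    draws = np.array([
        estimate(y_clean + sigma * rng.standard_normal(y_clean.shape))
        for _ in range(n_runs)
    ])
    return draws.std(axis=0, ddof=1)

# Toy model y = theta_1 + theta_2 * t, so the sensitivity matrix is [1, t]
t = np.linspace(0.0, 1.0, 20)
S = np.column_stack([np.ones_like(t), t])
theta_true = np.array([1.0, 2.0])
sigma = 0.1

def fit(y):
    # Ordinary least-squares estimate for the linear toy model
    return np.linalg.lstsq(S, y, rcond=None)[0]

std_sens = sensitivity_std(S, sigma)
std_mc = monte_carlo_std(fit, S @ theta_true, sigma)
```

The Monte Carlo route needs one full estimation per noise realization, which is why it is the more expensive of the two; for a nonlinear model, `fit` would be replaced by the corresponding nonlinear estimation.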

Influence of ill-posed problems in optimal experimental design. For optimal experimental designs derived from ill-posed parameter estimations, it is highly recommended to base the numerical analysis and implementation (computation of the alphabetic criteria) on the singular values of the sensitivity matrix. The direct application of the alphabetic design criteria based on the eigensystem of the Fisher-information/covariance matrices may lead to numerical instabilities and meaningless designs for the next parameter estimation.

Although the computation of the singular values is numerically more stable than the

eigenvalues also for ill-posed problems, small singular values of the sensitivity matrix

(especially those near zero) will have a large influence on the design criteria. They can

produce huge criterion values (especially for A- and E-designs), which will then complicate

or even impede an appropriate optimization. Thus, a further action to control the smallest

singular values of the sensitivity matrix (e.g., the regularization of the sensitivity matrix),

which directly control the most uncertain parameters, is needed.
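Computing the criteria from singular values can be sketched as follows. The definitions are assumptions chosen for illustration: the criteria are taken on the covariance approximation C = (SᵀS)⁻¹ with unit measurement variance, A as its trace, D as its determinant and E as its largest eigenvalue. The normalization behind the D-criterion values reported elsewhere in the thesis is not reproduced here, and the function name is hypothetical.

```python
import numpy as np

def alphabetic_criteria(S):
    """A-, D- and E-criteria of C = (S^T S)^-1 from the singular values of S."""
    sv = np.linalg.svd(np.asarray(S, dtype=float), compute_uv=False)
    inv_sq = 1.0 / sv**2            # eigenvalues of C; requires all sv > 0
    return {
        "A": float(np.sum(inv_sq)),     # trace of C
        "D": float(np.prod(inv_sq)),    # determinant of C
        "E": float(np.max(inv_sq)),     # largest eigenvalue of C = 1 / sv_min^2
    }

well = alphabetic_criteria(np.diag([2.0, 1.0]))
ill = alphabetic_criteria(np.diag([2.0, 1e-6]))
```

In the second call the single small singular value dominates A and E almost completely, which is the effect described above and the reason why the bottom of the spectrum has to be controlled before the criteria are optimized.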

Additionally, a graphical interpretation of the influence of the alphabetic experimental

design criteria applied to the singular value spectrum was also presented. It turned out

that the A- and E-optimal criteria mainly improved (increased) the smallest singular values of the sensitivity matrix, while the D-optimal criterion improved the largest singular values.

Thus, the potential of an experimental design of improving the parameter precision and

the ill-conditioning could be similarly analyzed as the well-known graphical interpretation

of the influence of the alphabetic design criteria applied to the parameter variances.

It is important to point out that during this research it was possible to establish that any

increment in the singular value spectrum (promoted by the optimal experimental design)

does not automatically mean an improvement in the ill-posedness of the problem, despite the parameter variance reduction. On the contrary, computed optimal designs might

generate singular value spectra with large values on the top section which also may lead

to large condition numbers if an adequate increase in the bottom section is not achieved.

In that case, the next parameter estimation will be again ill-conditioned and its estimates

unstable.

Numerical regularization. In this regard, the thesis showed that each regularization technique modified the problem to improve (reduce) its ill-conditioning. This modification was directly evident in the singular value spectrum (specifically in the zone of the ill-conditioned singular values) of the sensitivity matrix of the regularized problem. In the regularization techniques treated here, the ill-conditioned singular values were controlled either by elimination or by transformation. In the identifiable subset selection, the problem was transformed by excluding the unidentifiable parameters and thereby the related ill-conditioned singular values. In truncated singular value decomposition (TSVD), the ill-conditioned singular values were replaced by zero and the new problem only considered the largest singular values of the original problem. Finally, the effect of Tikhonov regularization on the singular value spectrum was to fix the ill-conditioned singular values at the magnitude of the squared regularization parameter (λ²), whilst the large singular values were not substantially modified.
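The effect of truncation and of a Tikhonov-type lift on a singular value spectrum can be illustrated with a small sketch. Two assumptions are made here: the Tikhonov variant augments the sensitivity matrix with a scaled identity block (L = I), and `mu` is a hypothetical name for the level at which small singular values are floored. How `mu` maps to the regularization parameter depends on how the penalty term is written, which is not reproduced here.

```python
import numpy as np

def tsvd_spectrum(sv, eps):
    """TSVD: singular values below the threshold eps are replaced by zero."""
    sv = np.asarray(sv, dtype=float)
    return np.where(sv >= eps, sv, 0.0)

def tikhonov_spectrum(sv, mu):
    """Singular values of the augmented matrix [S; mu * I].

    Large values stay almost unchanged, small values are lifted to about mu.
    """
    sv = np.asarray(sv, dtype=float)
    return np.sqrt(sv**2 + mu**2)

sv = np.array([10.0, 1.0, 1e-4])       # original spectrum, one ill-conditioned value
sv_tsvd = tsvd_spectrum(sv, eps=1e-3)
sv_tik = tikhonov_spectrum(sv, mu=1e-2)

# Cross-check against an explicit augmented matrix with the same spectrum
augmented = np.vstack([np.diag(sv), 1e-2 * np.eye(3)])
sv_check = np.linalg.svd(augmented, compute_uv=False)
```

The truncated spectrum loses one dimension, whereas the lifted spectrum keeps all three singular values and only changes the bottom one noticeably.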

Subset selection and TSVD tend to conserve the largest singular values, which are associated with the well-conditioned parameters. From an application point of view, subset selection seems the most natural approach, as the regularization acts in the original parameter space. It preserves the physical meaning of the parameters and provides useful information on the number of identifiable parameters as well as on the ranking of parameters regarding their linear independence and sensitivity. This is especially attractive from a practical viewpoint, as such information, e.g. the importance of individual parameters and the linear dependence between them, supports the monitoring of adaptive design strategies.
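A ranking of this kind can be sketched with a column-pivoted orthogonalization. This is a simplified stand-in, not the subset selection algorithm of the thesis: columns are picked greedily by their remaining norm (the pivoting rule of QR with column pivoting, implemented here with Gram-Schmidt deflation), and the number of identifiable parameters is taken as the number of singular values above an absolute threshold `eps`. The function name and the use of an absolute threshold are assumptions.

```python
import numpy as np

def rank_parameters(S, eps):
    """Return (parameter order, number of identifiable parameters)."""
    R = np.array(S, dtype=float)           # working copy, deflated in place
    remaining = list(range(R.shape[1]))
    order = []
    while remaining:
        # Pick the column with the largest remaining (deflated) norm
        norms = [np.linalg.norm(R[:, j]) for j in remaining]
        k = remaining[int(np.argmax(norms))]
        order.append(k)
        remaining.remove(k)
        k_norm = np.linalg.norm(R[:, k])
        if k_norm > 0:
            q = R[:, k] / k_norm
            # Remove the chosen direction from all columns not yet ranked
            for j in remaining:
                R[:, j] -= (q @ R[:, j]) * q
    sv = np.linalg.svd(np.asarray(S, dtype=float), compute_uv=False)
    n_identifiable = int(np.sum(sv >= eps))
    return order, n_identifiable

# Column 2 is almost a multiple of column 1, so it is ranked last
S = np.array([[1.0, 0.0, 0.0],
              [0.0, 5.0, 2.5],
              [0.0, 0.0, 1e-9],
              [0.0, 0.0, 0.0]])
order, n_id = rank_parameters(S, eps=1e-6)
active = order[:n_id]                      # parameters kept in the estimation
```

The ranking stays in the original parameter space: the entries of `active` are indices of physical parameters, not of transformed directions as in TSVD.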

However, it has to be noted that a change in the dimension of the identifiable parameter

subset (in subset selection) or the number of well-conditioned singular values (in TSVD)

introduces a discontinuity in the evaluation of the design criterion. This problem is similar

to the inherent non-differentiability in the definition of the E-optimal criterion, where a

possible switching in the smallest eigenvalue introduces this behavior.

It was shown that the Tikhonov regularization yielded an increment (lift) of the singular

value spectrum of the sensitivity matrix maintaining constant the original parameter space

dimension and therefore the number of singular values. The lift of the singular value

spectrum yielded a transformation of the bottom section of the singular value spectrum

by fixing the small singular values at λ². Accordingly, singular values smaller than the respective regularization parameter are replaced by λ².

In terms of the optimal experimental design under regularization, the regularized optimal design could sometimes improve the ill-conditioning of the regularized problem, but it rarely reduced the ill-conditioning of the original one. Moreover, although the regularizations ensured that a solution was obtained, they did not provide information on whether the obtained solution was still useful in the original context. Therefore, an ill-conditioning analysis after conducting the optimal design of ill-posed problems is highly recommended.

When applying Tikhonov regularization, a computed E-optimal design was not viable if the new design did not promote a singular value spectrum with the smallest singular value larger than λ². In that case, the regularization kept fixing the small singular values at λ² (i.e., ς_Nθ ≈ λ²), which fixed the E-criterion value at 1/(λ²)² in each iteration and made an optimization impossible.

In this thesis, some guidelines for selecting the regularization parameter for subset selection and TSVD (i.e., the ϵ-threshold) and for Tikhonov regularization (i.e., λ) were given. Although these parameters are application-dependent, it was shown here that the analysis of the singular value spectrum of the sensitivity matrix can assist in their selection. For instance, the regularization parameters are bounded by the largest and smallest singular values. Regularization parameters larger than the largest singular value regularize the problem completely, leaving it without degrees of freedom, whereas values smaller than the smallest singular value result in a regularization-free problem. Therefore, the regularization parameter candidates should be taken from the interval spanned by these singular values.
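A candidate grid inside this interval can be generated directly from the singular value spectrum. The sketch below is one possible realization: the logarithmic spacing, the number of candidates and the function name are assumptions made for illustration, and zero singular values are discarded before the bounds are taken.

```python
import numpy as np

def regularization_candidates(S, n=10):
    """Log-spaced candidates between the smallest and largest singular value."""
    sv = np.linalg.svd(np.asarray(S, dtype=float), compute_uv=False)
    sv = sv[sv > 0]                    # an exactly singular direction gives no lower bound
    return np.geomspace(sv[-1], sv[0], n)

# Spectrum spanning four orders of magnitude
candidates = regularization_candidates(np.diag([100.0, 1.0, 0.01]), n=5)
```

Each candidate can then be tried in the regularized estimation, and the spectrum of the regularized problem inspected to decide which values remove the ill-conditioning without fixing all parameters.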

A further hint for selecting the regularization parameter was that it should be as large as the first singular value considered to cause the ill-conditioning. Accordingly, there was not just one nominal regularization parameter value that adequately removed the ill-conditioning, but an interval in which appropriate values could be found.

In the case of online estimation, Tikhonov regularization outperformed parameter subset selection in terms of smoothness when the parameter estimates were updated and the re-designs were adapted. This behavior was expected not only in online approaches but also in offline approaches when prior information is available. The reason is that Tikhonov regularization by definition smoothly weights a priori information (initial parameter guess) and the best available estimate (result from the last iteration) in the penalty term. In doing so, the unstable (unidentifiable) parameters are gently controlled (depending on the regularization parameter value and the L matrix) but not fixed/excluded in the parameter estimation. In contrast, subset selection completely fixed/excluded the unidentifiable parameters in the estimation, and thus these parameters could not be improved. This was especially concerning in early estimations, when large errors in the estimated parameters are usual. If the unidentifiable parameters are kept at very wrong values, this can introduce a large bias in the estimates of the identifiable parameters when parameter correlations exist. The same can occur with Tikhonov regularization; nevertheless, this technique is less sensitive to it due to the mentioned smooth weighting in its formulation.

The application of regularization only alleviated the ill-conditioning of the estimation numerically. Parameter identifiability itself (when the model structure is reliable) was only improved by the collection of sufficiently informative and precise experimental data, the proper selection of initial guesses and appropriate optimal experimental designs. Nevertheless, without regularization the available experimental information could not be satisfactorily exploited in the parameter estimation, and the optimal experimental design could yield inefficient experimental conditions, which also affects the parameter identifiability.

9.2. Outlook

Implementation of the computational framework to systematically evaluate ill-posedness

in model-based parameter estimation and experimental design in a general and visible

simulation/optimization platform such as MOSAIC (a modeling environment based on

internet standards XML and MathML developed at the chair of process dynamics and

operation, Technische Universitaet Berlin) [72]. It will expand the usability of the framework, allowing more users to independently apply the consolidated framework to their corresponding tasks of process model development.

The selection of the regularization parameter is still a heuristic task. Although some guidelines to start the search and conduct the selection were presented here, a more systematic approach that reduces the time and effort needed to obtain this parameter is still missing. A deeper study of the information given by the parameter variance-decomposition, along with the analysis of the singular value spectrum of the sensitivity matrix, could provide better insight into a suitable value of this parameter. In the case of Tikhonov regularization, the inclusion of the regularization parameter in the optimal experimental design as a new decision variable (see Ref. 4 for the linear case) is a strategy to be tested.

In some case studies of this thesis (see for instance Chapter 7), it was assumed that no a priori information on the parameters (i.e., nominal values and variances) was available for the Tikhonov regularization. For instance, the predefined vector θR was set to zero and no prior individual parameter variances were considered to be known in the estimation. Accordingly, a priori information could still be exploited in the Tikhonov scheme, as was done in the online estimation and redesign of Chapter 8. This could be done by including the best available information on the parameters in θR, the known parameter precision in the matrix L as the inverse of the parameter covariance, or both simultaneously. In that sense, the formulation of the Bayesian approach [105] for parameter estimation could also be considered.

Test the adapted experimental design criterion (i.e., Mean Squared Error, MSE) for nonlinear ill-posed problems proposed in Ref. [49], which considers both the estimator variance and its bias. This technique is computationally demanding, as it leads to a bi-level optimization problem in which, at each design iteration, the inverse problem must be solved for quite a large number of training parameter vectors (as true values) and noise realizations. The implementation of this technique requires special numerical methods and a large computation time; therefore, it is not suitable for online applications. However, it would still be interesting to compare its performance to that obtained with the alphabetic design criteria, which are based only on the parameter variance.

Employ hybrid approaches for parameter estimation and optimal experimental design by combining stochastic optimization techniques (e.g., simulated annealing and genetic algorithms) [111] and gradient-based methods (e.g., Levenberg-Marquardt and interior point algorithms). Initial guesses obtained from stochastic optimization could then be provided to the gradient-based algorithm. In doing so, the benefits of the global search of the stochastic techniques could be combined with the fast convergence of the gradient-based methods.
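The two-stage idea can be sketched on a toy estimation problem. Everything in this example is an assumption made for illustration: a plain random search stands in for the stochastic techniques named above (it is neither simulated annealing nor a genetic algorithm), the local stage is a basic damped Gauss-Newton iteration in the spirit of Levenberg-Marquardt with a finite-difference Jacobian, and the exponential decay model is hypothetical.

```python
import numpy as np

def model(theta, t):
    # Toy model y = theta_1 * exp(-theta_2 * t)
    return theta[0] * np.exp(-theta[1] * t)

def residuals(theta, t, y):
    return model(theta, t) - y

def random_search(t, y, bounds, n_samples=500, seed=0):
    """Global stage: best of n_samples uniform draws inside the bounds."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    samples = lo + (hi - lo) * rng.random((n_samples, lo.size))
    costs = [np.sum(residuals(s, t, y) ** 2) for s in samples]
    return samples[int(np.argmin(costs))]

def damped_gauss_newton(theta0, t, y, n_iter=50, damping=1e-2, h=1e-6):
    """Local stage: Levenberg-Marquardt-type steps with adaptive damping."""
    theta = np.array(theta0, dtype=float)
    for _ in range(n_iter):
        r = residuals(theta, t, y)
        # Forward-difference Jacobian of the residuals
        J = np.column_stack([(residuals(theta + h * e, t, y) - r) / h
                             for e in np.eye(theta.size)])
        step = np.linalg.solve(J.T @ J + damping * np.eye(theta.size), -J.T @ r)
        if np.sum(residuals(theta + step, t, y) ** 2) < np.sum(r ** 2):
            theta, damping = theta + step, damping * 0.5   # accept, trust more
        else:
            damping *= 10.0                                # reject, damp more
    return theta

t = np.linspace(0.0, 5.0, 30)
theta_true = np.array([2.0, 0.7])
y = model(theta_true, t)                                   # noise-free data
theta_start = random_search(t, y, bounds=[(0.1, 5.0), (0.01, 3.0)])
theta_hat = damped_gauss_newton(theta_start, t, y)
```

The global stage only needs to land in the basin of attraction of the solution; the local stage then supplies the accuracy, which is the division of labor described above.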

Apply more global techniques to determine identifiable and active parameter subspaces [32] in order to consider the output variability more reliably over the full range of parameters. Those techniques are based on an augmented sensitivity matrix built from the sensitivities at random parameter values sampled from a specific probability density.
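The construction of such an augmented matrix can be sketched as follows. This is an illustration under assumptions, not the method of Ref. [32]: sensitivities are approximated by forward differences, parameters are sampled uniformly from a box, and the toy model depends on two of its three parameters only through their sum, so that one direction of the parameter space is unidentifiable everywhere. All names are hypothetical.

```python
import numpy as np

def fd_sensitivity(model, theta, h=1e-6):
    """Forward-difference sensitivity matrix of model outputs w.r.t. theta."""
    theta = np.asarray(theta, dtype=float)
    y0 = model(theta)
    return np.column_stack([(model(theta + h * e) - y0) / h
                            for e in np.eye(theta.size)])

def augmented_sensitivity(model, sample_theta, n_samples=50, seed=0):
    """Stack local sensitivity matrices evaluated at random parameter samples."""
    rng = np.random.default_rng(seed)
    return np.vstack([fd_sensitivity(model, sample_theta(rng))
                      for _ in range(n_samples)])

t = np.linspace(0.0, 2.0, 15)

def toy_model(theta):
    # Only theta_1 and the sum theta_2 + theta_3 influence the output
    return theta[0] * np.exp(-(theta[1] + theta[2]) * t)

def sample_theta(rng):
    # Uniform sampling over an assumed parameter box
    return rng.uniform([0.5, 0.1, 0.1], [2.0, 1.0, 1.0])

S_aug = augmented_sensitivity(toy_model, sample_theta, n_samples=20)
_, sv_aug, Vt_aug = np.linalg.svd(S_aug, full_matrices=False)
inactive_direction = Vt_aug[-1]        # right singular vector of the smallest value
```

The same indicators and rankings used for a local sensitivity matrix can be applied to `S_aug`; a singular value that stays small over the whole sampled range points to a direction that is unidentifiable globally rather than only at one nominal parameter set.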


Finally, for large-scale nonlinear models, an implementation tailored to high-performance computing architectures [135] could be developed to accelerate analysis times. For those cases, the use of automatic differentiation to construct the sensitivity equations could also be tested.


A. Appendix

A.1. Own publications and presentations

A.1.1. Articles in Journals

1. T. Barz, D. C. Lopez C., M. N. Cruz B., S. Korkel, S. F. Walter. Real-time adaptive

input design for the determination of competitive adsorption isotherms in liquid

chromatography. Computers & Chemical Engineering, 94:104-116, 2016

2. T. Barz, C. Zauner, D. Lager, D. C. Lopez C., F. Hengstberger, M. N. Cruz B., K.

Marx. Experimental analysis and numerical modeling of a shell and tube heat storage

unit with phase change materials. Industrial & Engineering Chemistry Research,

55(29):8154-8164, 2016

3. D. C. Lopez C., G. Wozny, A. Flores-Tlacuahuac, R. Vasquez-Medrano, and V. M.

Zavala. A computational framework for identifiability and ill-conditioning analy-

sis of lithium-ion battery models. Industrial & Engineering Chemistry Research,

55(11):3026-3042, 2016

4. D. C. Lopez C., T. Barz, S. Korkel, and G. Wozny. Nonlinear ill-posed problem

analysis in model-based parameter estimation and experimental design. Computers

& Chemical Engineering, 77:24-42, 2015

5. D. Muller, E. Esche, D. C. Lopez C., and G. Wozny. An algorithm for the iden-

tification and estimation of relevant parameters for optimization under uncertainty.

Computers & Chemical Engineering, 71:94-103, 2014

6. N. Yakut, T. Barz, D. C. Lopez C., and G. Wozny. Online Redesign Technique for

Closed-loop System Identification. AIDIC Conference series; 11: 421-430, 2013

7. T. Barz, D. C. Lopez Cardenas, H. Arellano-Garcia, and G. Wozny. Experimental

evaluation of an approach to online redesign of experiments for parameter determi-

nation. AIChE Journal, 59:1981-1995, 2013

8. D. C. Lopez C., T. Barz, M. Penuela, A. Villegas, S. Ochoa, and G. Wozny. Model-



29(4):1064-1082, 2013


9. D. C. Lopez C., L. J. Hoyos, C. Mahecha, H. Arellano-Garcia, and G. Wozny. Opti-

mization Model of Crude Oil Distillation Units for Optimal Crude Oil Blending and

Operating Conditions. Industrial & Engineering Chemistry Research, 52(36):12993-

13005, 2013

A.1.2. Oral Presentations and Posters

(∗ presenting author) († with publication in Proceedings)

Oral Presentations

1. D. Muller∗ †, E. Esche, D. C. Lopez C., and G. Wozny. Systematic parameter

selection for optimization under uncertainty. 8th International Conference on Foun-

dations of Computer-Aided Process Design - FOCAPD 2014, Washington, USA,

2014

2. D. C. Lopez C.∗, T. Barz, and G. Wozny. Strategies for dealing with ill-posed prob-

lems in on-line optimum experiment design. AIChE Annual Meeting, San Francisco,

USA, 2013

3. N. Yakut†, D. C. Lopez C.∗, T. Barz, H. Arellano-Garcia, and G. Wozny. Online

model-based redesign of experiments for parameter estimation applied to closed-loop

controller tuning. 11th International Conference on Chemical & Process Engineering

(ICheap11), Milan-Italy, 2-5 June 2013

4. D. C. Lopez C.∗, T. Barz, and H. Arellano-Garcia. Online model-based experimental

design. AIChE Annual Meeting, Pittsburgh, USA, 2012

5. D. C. Lopez C.∗ †, T. Barz, H. Arellano-Garcia, G. Wozny, A. Villegas, and S. Ochoa.

Subset selection for improved parameter identification in a Bio-Ethanol production

process. 19th International Conference of Process Engineering and Chemical Plant

Design, Cracow, Poland, 2012

6. D. C. Lopez∗, T. Barz, H. Arellano-Garcia, and G. Wozny. An improved approach

to parameter subset selection for parameter estimation in online applications. 25th

European Conference on Operational Research (EURO-2012), Vilnius, Lithuania,

2012

Poster

1. D. C. Lopez∗, M. N. Cruz, T. Barz, and P. Neubauer. Parameter estimation,

ill-conditioning and identifiability analysis of the anaerobic digestion model No 1

(ADM1) for biogas production. Society for Industrial and Applied Mathematics

(SIAM) Annual Meeting, Boston, USA, 2016


2. E. Anane∗, C. Reitz, F. V. Ebert, M. N. Cruz, D. C. Lopez, S. Junne, and P.

Neubauer. Modelling dissolved oxygen and glucose gradients in pulse-based fed batch

culture of Escherichia coli. 4th BioProScale Symposium “Bioprocess intensification

through Process Analytical Technology (PAT) and Quality by Design (QbD)“, Berlin,

Germany, 2016

3. D. C. Lopez†, L. J. Hoyos, A. Uribe, S. M. Chaparro, H. Arellano-Garcia∗, and G.

Wozny. Improvement of Crude Oil Refinery Gross Margin using a NLP Model of

a Crude Distillation Unit System. 22nd European Symposium on Computer Aided

Process Engineering (ESCAPE-22), London, UK, 2012

A.1.3. Proceedings

1. D. Muller, E. Esche, D. C. Lopez C., and G. Wozny. Systematic parameter selection

for optimization under uncertainty. Computer Aided Chemical Engineering, 717-722,

2014

2. N. Yakut, D. C. Lopez Cardenas, T. Barz, H. Arellano-Garcia, and G. Wozny. Online

Model-Based Redesign of Experiments for Parameter Estimation Applied to Closed-

loop Controller Tuning. Chemical Engineering Transactions, 32:1195-1200, 2013

3. D. C. Lopez, T. Barz, H. Arellano-Garcia, G. Wozny, A. Villegas, and S. Ochoa.

Subset selection for improved parameter identification in a Bio-Ethanol production

process. Technical Transactions -Mechanics, 109(5):137-147, 2012

4. D. C. Lopez, L. J. Hoyos, A. Uribe, S. Chaparro, H. Arellano-Garcia, and G. Wozny.

Improvement of Crude Oil Refinery Gross Margin using a NLP Model of a Crude

Distillation Unit System. Computer Aided Chemical Engineering, 22nd European

Symposium on Computer Aided Process Engineering, 30:987-991, 2012

A.2. Own publications used for the cumulative thesis

This thesis is based on the following publications. The order is chronological. Please note

that the full text papers are only available via the publisher or via personal communication

with the author.

Publication I:

D. C. Lopez C., G. Wozny, A. Flores-Tlacuahuac, R. Vasquez-Medrano, and V. M. Zavala.

A computational framework for identifiability and ill-conditioning analysis of lithium-

ion battery models. Industrial & Engineering Chemistry Research, 55(11):3026-3042, 2016


Publication II:

D. C. Lopez C., T. Barz, M. Penuela, A. Villegas, S. Ochoa, and G. Wozny. Model-



29(4):1064-1082, 2013

Publication III:

D. C. Lopez C., T. Barz, S. Korkel, and G. Wozny. Nonlinear ill-posed problem analysis

in model-based parameter estimation and experimental design. Computers & Chemical

Engineering, 77:24-42, 2015

Publication IV:

T. Barz, D. C. Lopez C., M. N. Cruz B., S. Korkel, and S. F. Walter. Real-time

adaptive input design for the determination of competitive adsorption isotherms in liquid

chromatography. Computers & Chemical Engineering, 94:104-116, 2016

A.3. Bio-processes: Parameter variance and

variance-decomposition

This appendix summarizes the variance-decomposition, computed according to Section 3.2.2, that was used for the identifiability analysis in Section 4.4.2. The decomposition was performed before OED, i.e., it was evaluated on the sensitivity matrix S(uIG) at the initial design uIG. Double-underlined variance-decomposition proportions indicate parameters preliminarily selected as unidentifiable by the SVD method in Section 4.4.2. For Examples E1 and E2, the selection criterion is an individual variance-decomposition proportion πi greater than 0.5. For Example E3, the criterion is that the sum of the πi associated with the ill-conditioned singular values (i.e., ς8 to ς44) is greater than 0.5.
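The selection rule above can be sketched numerically. Section 3.2.2 is not reproduced in this appendix, so the sketch below assumes the usual SVD-based definition in which the (unscaled) variance of parameter k is the sum over j of v_kj²/ς_j², and the proportion π_kj is the share of the j-th term in that sum. The function name `variance_decomposition` and the small matrix `S` are hypothetical illustrations, not data from the thesis.

```python
import numpy as np

def variance_decomposition(S):
    """Return unscaled parameter variances, proportions pi[k, j], singular values."""
    # Thin SVD: S = U diag(s) V^T, singular values in descending order.
    _, s, Vt = np.linalg.svd(S, full_matrices=False)
    V = Vt.T
    # phi[k, j] = v_kj^2 / s_j^2: contribution of singular value j
    # to the k-th diagonal entry of (S^T S)^{-1}.
    phi = V**2 / s**2
    var = phi.sum(axis=1)
    pi = phi / var[:, None]          # each row sums to one
    return var, pi, s

# Hypothetical sensitivity matrix: columns 1 and 2 are nearly collinear,
# column 0 is orthogonal to both.
S = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 1.0, 1.001],
              [1.0, 0.0, 0.0]])

var, pi, s = variance_decomposition(S)

# Criterion used for E1 and E2: an individual proportion greater than 0.5,
# applied here to the smallest singular value only (an assumption).
flagged = np.where(pi[:, -1] > 0.5)[0]
print(flagged)
```

In this toy case the two nearly collinear parameters (zero-based indices 1 and 2) are flagged, while the well-determined parameter 0 is not.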


Table A.1.: Variance-decomposition: E1 - Fed Batch Fermentation. (Figure taken from publication III - Lopez et al. (2015) - reprinted from Computers & Chemical Engineering with permission from Elsevier).

Parameter   Parameter variance   Variance-decomposition proportion (π) associated with singular value

1 2,361E+01 0,0% 0,6% 7,5% 91,9%

2 7,706E+02 0,0% 0,0% 0,7% 99,3%

3 2,038E+02 0,0% 0,0% 7,9% 92,1%

4 1,047E+01 0,0% 0,2% 26,2% 73,5%

Table A.2.: Variance-decomposition: E2 - Biochemical network. (Figure taken from publication III - Lopez et al. (2015) - reprinted from Computers & Chemical Engineering with permission from Elsevier).

Parameter   Parameter variance   Variance-decomposition proportion (π) associated with singular value

1 7,41E+00 0,0% 0,0% 0,0% 0,1% 0,3% 0,4% 1,6% 7,1% 2,6% 87,9% 97,5%

2 1,26E+03 0,0% 0,0% 0,0% 0,0% 0,0% 0,5% 0,9% 7,0% 2,4% 89,2% 98,6%

3 1,33E+03 0,0% 0,0% 0,0% 0,0% 0,0% 0,2% 0,3% 0,0% 0,0% 99,5% 99,5%

4 9,67E+00 0,0% 0,0% 0,0% 0,3% 0,1% 49,6% 47,6% 1,9% 0,1% 0,4% 2,4%

5 2,67E+07 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 35,2% 64,8% 100,0%

6 2,71E+07 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 35,2% 64,8% 100,0%

7 3,06E+03 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 41,2% 1,7% 57,1% 100,0%

8 3,20E+04 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 41,1% 1,6% 57,2% 100,0%

9 1,50E-01 0,0% 0,1% 1,2% 15,6% 52,9% 0,6% 27,3% 0,0% 0,8% 1,4% 2,2%

10 2,41E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 100,0% 100,0%


Table A.3.: Variance-decomposition: E3 - ASM3. (Figure taken from publication III - Lopez et al. (2015) - reprinted from Computers & Chemical Engineering with permission from Elsevier).

Parameter   Parameter variance   Variance-decomposition proportion (π) associated with singular value

1 1,9E+09 0,0% 0,0% 0,0% 0,0% 0,0% 0,4% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 2,6% 1,5% 12,1% 33,0% 49,4% 0,7% 0,0% 100,0%
2 1,4E+14 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 2,0% 0,8% 11,6% 28,1% 56,3% 1,2% 0,0% 100,0%
3 2,5E+09 0,1% 0,0% 0,0% 0,1% 0,0% 0,4% 1,1% 0,5% 0,0% 0,4% 0,1% 2,0% 0,3% 0,8% 1,1% 8,0% 34,0% 50,6% 0,2% 0,0% 100,0%
4 5,6E+08 0,1% 0,0% 4,6% 0,9% 0,5% 0,7% 0,1% 0,0% 2,2% 1,3% 0,0% 1,3% 0,9% 8,0% 1,4% 8,0% 8,9% 3,9% 56,3% 0,0% 100,0%
5 1,3E+13 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,3% 0,4% 0,3% 3,7% 0,0% 3,1% 0,2% 6,5% 4,2% 16,5% 64,6% 0,2% 100,0%
6 8,8E+07 2,6% 0,0% 0,8% 0,2% 2,2% 0,0% 0,1% 0,1% 0,3% 6,1% 4,2% 9,8% 6,4% 11,0% 1,3% 6,2% 5,9% 32,4% 9,9% 0,0% 100,0%
7 2,1E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,9% 0,7% 2,7% 2,0% 0,7% 0,3% 0,6% 52,2% 0,0% 3,4% 36,7% 0,0% 100,0%
8 1,9E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 1,0% 0,6% 2,6% 1,8% 0,5% 0,4% 0,6% 51,0% 0,1% 2,8% 38,6% 0,1% 100,0%
9 1,4E+08 0,1% 0,0% 4,8% 0,1% 0,0% 16,2% 4,0% 0,5% 0,3% 0,0% 0,7% 2,2% 2,7% 7,7% 30,2% 15,0% 6,4% 0,0% 2,1% 0,1% 100,0%
10 2,4E+12 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 1,0% 0,6% 2,6% 1,8% 0,5% 0,3% 0,6% 52,4% 0,1% 2,8% 37,2% 0,0% 100,0%
11 2,5E+12 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,9% 0,7% 2,6% 2,0% 0,6% 0,3% 0,6% 49,8% 0,0% 3,8% 38,7% 0,0% 100,0%
12 4,6E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,3% 0,8% 11,2% 12,5% 1,0% 0,3% 0,5% 9,0% 31,6% 6,2% 26,0% 0,5% 100,0%
13 4,5E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,2% 0,8% 11,2% 12,5% 1,0% 0,3% 0,5% 9,0% 31,7% 6,2% 26,0% 0,5% 100,0%
14 1,2E+06 0,9% 0,1% 0,3% 0,4% 0,6% 0,8% 0,2% 0,1% 0,2% 2,1% 1,9% 8,1% 0,1% 1,5% 3,0% 1,2% 46,9% 5,3% 25,6% 0,2% 100,0%
15 8,0E+08 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 0,0% 0,0% 0,7% 0,2% 11,5% 43,7% 28,4% 11,4% 3,8% 0,0% 100,0%
16 1,0E+10 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 1,4% 0,4% 0,5% 8,6% 1,2% 5,1% 82,4% 0,4% 100,0%
17 4,4E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,3% 0,5% 11,4% 15,5% 2,6% 0,1% 0,2% 11,3% 23,8% 1,0% 32,5% 0,6% 100,0%
18 4,9E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,3% 0,4% 10,6% 14,0% 2,2% 0,2% 0,1% 8,7% 17,4% 1,9% 43,5% 0,6% 100,0%
19 1,3E+10 0,8% 0,1% 0,2% 0,1% 0,2% 0,7% 3,9% 0,0% 0,9% 4,5% 3,1% 15,9% 7,1% 3,6% 11,5% 39,7% 4,2% 0,5% 2,9% 0,1% 100,0%
20 4,0E+10 0,0% 0,2% 0,3% 0,1% 1,0% 0,2% 0,2% 0,0% 1,5% 0,1% 1,3% 0,3% 7,5% 21,0% 4,7% 0,0% 1,1% 0,4% 59,4% 0,6% 100,0%
21 9,6E+12 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,4% 1,5% 0,5% 0,7% 0,7% 0,8% 1,0% 23,7% 67,6% 3,2% 0,0% 100,0%
22 1,4E+13 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,2% 0,1% 0,5% 0,9% 0,1% 0,7% 16,2% 7,3% 73,6% 0,2% 100,0%
23 3,8E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,9% 0,6% 2,6% 1,9% 0,6% 0,3% 0,6% 51,7% 0,0% 3,3% 37,5% 0,0% 100,0%
24 3,6E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 1,0% 0,7% 2,7% 1,9% 0,6% 0,3% 0,6% 51,1% 0,0% 3,0% 38,1% 0,0% 100,0%
25 6,7E+10 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 5,1% 0,4% 0,0% 0,0% 0,0% 0,0% 2,1% 8,2% 6,4% 12,2% 13,2% 22,0% 29,9% 0,2% 100,0%
26 6,4E+10 0,0% 0,0% 0,0% 0,1% 0,4% 0,0% 1,7% 1,5% 0,0% 1,0% 0,6% 2,9% 5,7% 13,7% 3,4% 1,2% 18,3% 26,6% 22,6% 0,3% 100,0%
27 2,3E+14 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,3% 0,4% 1,3% 35,0% 61,4% 1,5% 0,0% 100,0%
28 7,1E+07 1,6% 0,0% 0,3% 0,1% 0,0% 0,2% 1,0% 0,2% 0,1% 0,3% 0,1% 0,0% 0,0% 0,6% 6,6% 45,8% 20,2% 16,9% 4,4% 0,0% 100,0%
29 1,8E+06 0,2% 12,3% 0,5% 5,1% 6,8% 1,2% 5,8% 2,8% 0,0% 0,7% 0,3% 8,6% 0,1% 2,8% 1,1% 2,8% 25,8% 0,2% 18,7% 0,3% 100,0%
30 3,2E+07 0,1% 4,5% 2,5% 1,5% 0,2% 6,3% 12,6% 7,7% 2,5% 2,8% 6,0% 0,0% 0,3% 5,6% 13,0% 5,6% 0,8% 8,9% 1,7% 0,3% 100,0%
31 4,5E+13 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 0,0% 0,0% 0,8% 0,1% 12,0% 42,3% 29,7% 11,2% 3,7% 0,0% 100,0%
32 8,8E+17 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 99,7% 0,3% 100,0%
33 1,7E+10 0,0% 0,0% 0,0% 1,8% 1,9% 1,9% 5,3% 0,4% 2,4% 0,5% 3,3% 13,2% 21,9% 4,6% 1,3% 32,7% 0,0% 8,6% 0,1% 0,0% 100,0%
34 2,0E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 12,7% 0,6% 5,8% 3,7% 22,5% 5,3% 0,7% 2,0% 22,6% 4,2% 1,2% 18,6% 0,0% 100,0%
35 1,2E+15 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 0,1% 0,0% 1,4% 0,1% 0,5% 97,4% 0,3% 100,0%
36 2,2E+09 0,1% 0,0% 0,0% 0,0% 0,0% 0,1% 0,0% 0,0% 0,1% 0,1% 0,3% 0,9% 1,1% 2,1% 0,0% 1,1% 0,4% 5,8% 87,3% 0,2% 100,0%
37 1,4E+07 0,0% 0,0% 0,5% 0,0% 0,0% 4,3% 26,7% 0,1% 0,2% 2,2% 6,6% 16,7% 0,6% 0,0% 3,8% 0,1% 3,9% 19,2% 0,4% 0,0% 100,0%
38 3,0E+08 0,1% 2,6% 2,1% 1,3% 0,9% 4,8% 10,4% 5,9% 3,4% 5,5% 6,0% 0,0% 0,0% 4,2% 13,3% 4,5% 1,2% 17,4% 1,1% 0,2% 100,0%
39 8,9E+14 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 16,4% 25,5% 57,9% 0,2% 100,0%
40 8,8E+13 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 0,4% 1,8% 5,6% 14,7% 0,0% 10,4% 66,7% 0,3% 100,0%
41 1,0E+13 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 1,2% 0,0% 3,7% 10,7% 1,2% 1,6% 3,4% 49,7% 23,9% 4,5% 0,0% 100,0%
42 1,2E+31 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 100,0% 100,0%
43 4,4E+29 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 100,0% 100,0%


A.4. Implication of structural properties of the sensitivity matrix

on Fisher-information matrix

This section shows how structural problems of the sensitivity matrix S are directly reflected in the Fisher-information matrix F. This is illustrated by demonstrating that a full-rank sensitivity matrix implies a nonsingular Fisher matrix. Therefore, structural problems of F may be detected by analyzing the structural properties of S.

Facts and implications:

• Let $S \in \mathbb{R}^{(N_y \cdot N_m \cdot N_e) \times N_\theta}$ have rank $N_\theta$, i.e., $r(S) = N_\theta$. As $N_y \cdot N_m \cdot N_e \geq N_\theta$, S is of full rank and has $N_\theta$ linearly independent columns (see Definition A.5.4 and its implications).

• Then the $N_\theta$ singular values of S are all positive, i.e., $\varsigma_i > 0$, $i = 1, \dots, N_\theta$ (see Definition A.5.9).

• Taking into account that F is a symmetric matrix formed by the cross-product of S (see Eq. 2.13 and Proposition A.5.3) and the relationship between the eigendecomposition and the SVD (see Proposition A.5.9), the eigenvalues of F are also all positive, i.e., $\lambda_i(F) = \varsigma_i(S)^2 > 0$, $i = 1, \dots, N_\theta$.

• Hence zero is not an eigenvalue of F, and F is nonsingular according to Theorem A.5.6.
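The chain of implications above can be checked numerically on a small example. Eq. 2.13 is not reproduced in this appendix, so the sketch assumes the unweighted cross-product F = SᵀS; the 3 × 2 matrix `S` is a hypothetical sensitivity matrix chosen only for illustration.

```python
import numpy as np

# Hypothetical full-rank sensitivity matrix (3 measurements, 2 parameters).
S = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])

# Assumption: Fisher-information matrix as the plain cross-product of S.
F = S.T @ S

sing_S = np.linalg.svd(S, compute_uv=False)   # descending singular values
eig_F = np.linalg.eigvalsh(F)[::-1]           # eigenvalues, sorted descending

print(np.linalg.matrix_rank(S))   # rank of S
print(sing_S**2, eig_F)           # squared singular values vs. eigenvalues
```

Since all squared singular values are positive, zero is not an eigenvalue of F and `np.linalg.inv(F)` exists.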

A.5. Matrices notions

This section states the notation and the most important linear algebra definitions used throughout this thesis. All matrices and vectors in this work are assumed to be real (R); however, most definitions are stated for the more general case of complex numbers (C).

A.5.1. Matrices and vectors

Definition A.1 Matrix.

A matrix is simply a rectangular array of real or complex numbers. An m × n matrix A is an array having m rows and n columns, i.e., $A \in \mathbb{K}^{m \times n}$, where $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$, with

\[
A = [a_{ij}] =
\begin{pmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{m1} & \cdots & a_{mn}
\end{pmatrix}.
\]

If m = n then A is called a square matrix of degree n.

A column matrix is usually called a vector. The sets of all n × 1 real and complex column matrices (or vectors) are denoted by Rn and Cn, respectively. The notation (a1, a2, · · · , an)T will be used to express the column matrix

\[
\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} \tag{A.1}
\]

in a more compact form. The superscript T denotes the matrix transpose, which is defined later, and a1, a2, · · · , an are scalars, i.e., real numbers or elements of R.

Definition A.2 Linear combination.

If u1, u2, u3, · · · , um are vectors in Cn and if a1, a2, a3, · · · , am are scalars, that is, elements of R, then the vector a1u1 + a2u2 + a3u3 + · · · + amum is called a linear combination of u1, u2, u3, · · · , um.

Definition A.3 Column space of a matrix.

Suppose that A is a m×n matrix with columns A1, A2, A3, · · · , An. Then the column space

of A, written C(A), is the subset of Cm containing all linear combinations of the columns

of A,

C(A) = ⟨A1, A2, A3, · · · , An⟩ (A.2)

Definition A.4 Row space of a matrix.

Suppose A is a m× n matrix. Then the row space of A, R(A), is the column space of AT ,

i.e., R(A) = C(AT ).

Definition A.5 Linear independence of vectors.

Given the vectors u1, u2, u3, · · · , un ∈ V, where V denotes a not necessarily finite-dimensional vector space over an arbitrary field F. Then u1, u2, u3, · · · , un are said to be linearly independent (or, simply, independent) if and only if the only linear combination satisfying

a1u1 + a2u2 + a3u3 + · · ·+ anun = 0 (A.3)

with a1, a2, a3, · · · , an ∈ F is the trivial combination a1 = 0, a2 = 0, a3 = 0, · · · , an = 0. If Eq. A.3 has a solution where some ai ≠ 0, then u1, u2, u3, · · · , un are said to be linearly dependent (or, simply, dependent or collinear). A finite subset S of V is likewise said to be independent if the vectors contained in S are independent. Notice that any finite set of vectors in V containing 0 is dependent.

Another way to view Definition A.5.1 is to consider the vectors u1, u2, u3, · · · , un as the columns of a matrix U. They are then said to be linearly dependent or collinear if there exists a vector A = (a1, a2, a3, · · · , an)T with ∥A∥ ≠ 0 such that UA = 0. If this equation holds approximately, the columns uj , j = 1, · · · , n are said to be nearly linearly dependent or nearly collinear [112].

Definition A.6 Orthogonal set of vectors.

Suppose that S = {u1, u2, u3, · · · , un} is a set of vectors from Cm. Then S is an orthogonal set if every pair of different vectors from S is orthogonal, that is, ⟨ui, uj⟩ = 0 whenever i ≠ j. Remember that orthogonal sets of nonzero vectors are linearly independent.

Definition A.7 Orthonormal set of vectors.

Suppose that S = {u1, u2, u3, · · · , un} is an orthogonal set of vectors such that ∥ui∥ = 1 for all 1 ≤ i ≤ n. Then the set of vectors S is an orthonormal set.

Definition A.8 Dimension.

Suppose that V is a vector space and v1, v2, v3, · · · , vt is a basis of V . Then the dimension

of V is the number of elements in the basis of V , that means dim(V ) = t. If V has no

finite bases, it is said V has infinite dimension.

A.5.2. Matrix operations

In this subsection the sum of two matrices, the matrix-scalar, the matrix-vector product,

the matrix product, along with the transpose, inverse and differentiation of a matrix are

summarized.

Let A ∈ Rm×n, B ∈ Rm×n, C ∈ Rm×n, D ∈ Rn×p, E ∈ Rm×p, F ∈ Rn×m, G ∈ Rn×n and H ∈ Rn×n be eight matrices of different dimensions, x ∈ Rn and y ∈ Rm two vectors, and α ∈ R a scalar.

Definition A.9 Matrix addition.

A matrix sum A + B of two matrices A and B is defined to be the matrix C such that C = A+B with cij = aij + bij , ∀ i = 1, · · · ,m, j = 1, · · · , n.

Definition A.10 Matrix-scalar product.

A matrix-scalar product αA of the matrix A by a real number (or scalar) α is defined to

be the matrix C such that C = αA with cij = αaij , ∀ i = 1, · · · ,m, j = 1, · · · , n.

Definition A.11 Matrix-vector product.

A matrix-vector product Ax of the matrix A by the (column) vector x is defined to be the vector y such that y = Ax with $y_i = \sum_{k=1}^{n} a_{ik} x_k$, ∀ i = 1, · · · ,m.


Definition A.12 Matrix product.

A matrix product AD of two matrices A and D is only defined when the number of columns of A equals the number of rows of D, and is the matrix E such that E = AD with $e_{ij} = \sum_{k=1}^{n} a_{ik} d_{kj}$, ∀ i = 1, · · · ,m, j = 1, · · · , p.

Definition A.13 Matrix transpose.

The transpose AT of the matrix A is defined to be the matrix F such that F = AT with fij = aji, ∀ i = 1, · · · , n, j = 1, · · · ,m.

Definition A.14 Complex conjugate of a matrix.

Suppose A ∈ Cm×n. Then the conjugate of A, written $\bar{A}$, is an m × n matrix defined by $[\bar{A}]_{ij} = \overline{a_{ij}}$, where $\overline{a_{ij}}$ is the complex conjugate of the entry $a_{ij}$ of A.

Definition A.15 Adjoint of a matrix.

Suppose A ∈ Cm×n. Then its adjoint is $A^* = (\bar{A})^T$. The matrix A∗ is obtained from A by taking the transpose and then taking the complex conjugate of each entry (i.e., negating the imaginary parts but not the real parts).

The adjoint of a matrix with only real entries is equal to the transpose of that matrix.

A.5.3. Some special matrices

Definition A.16 Symmetric matrix.

Suppose G ∈ Cn×n. Then the matrix G is symmetric if it is equal to its transpose, i.e.,

G = GT

The subspace of symmetric matrices, in the space Rn×n of all square matrices of degree

n, which are not necessarily symmetric, is here denoted as Sym(n).

When any matrix is multiplied by its own transpose, the product (the so-called cross-product) has the interesting property of being symmetric. This result is summarized in the next proposition.

Proposition A.1 The cross-product of a matrix is symmetric.

Suppose A ∈ Rm×n, then AAT and ATA are symmetric.
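As a quick numerical illustration of this proposition, the following sketch forms both cross-products of an arbitrary rectangular matrix and compares each with its transpose. The matrix `A` is a hypothetical example.

```python
import numpy as np

# Arbitrary (hypothetical) 3 x 2 real matrix.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

AtA = A.T @ A   # 2 x 2 cross-product
AAt = A @ A.T   # 3 x 3 cross-product

# Both products equal their own transpose.
print(np.array_equal(AtA, AtA.T), np.array_equal(AAt, AAt.T))
```

The two products have different sizes (n × n and m × m), but both are symmetric regardless of the entries of A.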

Definition A.17 Positive, negative definite and indefinite matrices.

Suppose A ∈ Rn×n is symmetric and let $q(x) = x^T A x$, x ∈ Rn, be its associated quadratic form. Then A is said to be positive-definite if and only if q(x) > 0 whenever x ≠ 0. Similarly, A is said to be negative-definite if and only if q(x) < 0 whenever x ≠ 0. If q takes both positive and negative values, A is said to be indefinite.



A symmetric real matrix A is called positive semi-definite if its quadratic form q satisfies

q(x) ≥ 0 for all x ∈ Rn. In the subspace Sym(n), the subsets of positive definite, positive

semi-definite and negative definite matrices are denoted as PD(n), PSD(n) and ND(n),

respectively.

The generalization of a symmetric matrix in the space of the complex numbers is the

Hermitian matrix:

Definition A.18 Hermitian matrix.

Suppose A ∈ Cn×n. Then A is Hermitian (or self-adjoint) if A = A∗.

Definition A.19 Unitary matrix.

Suppose U ∈ Cn×n such that UU∗ = In. Then it is said U is unitary.

If a matrix A has only real number entries (it is a real matrix) then the defining prop-

erty of being unitary simplifies to AAT = In. The matrix A is then called orthogonal.

Moreover, unitary matrices have easily computed inverses. They also have columns that

form orthonormal sets.

Definition A.20 Diagonal matrix.

Suppose D ∈ Cm×n such that D = (dij). Then D is called diagonal if and only if dij = 0 whenever i ≠ j.

Definition A.21 Upper-triangular matrix.

Suppose A ∈ Cn×n is a square matrix such that A = (aij). Then A is called upper-

triangular if aij = 0 whenever i > j

Definition A.22 Permutation matrix.

A permutation matrix is a matrix P ∈ Rn×n such that there are row swap matrices S1, · · · , Sk ∈ Rn×n for which P = S1 · · · Sk. (Remember that a row swap matrix is by definition an elementary matrix obtained by interchanging two rows of In.)

A.5.4. Matrix rank

Definition A.23 Suppose that A ∈ Cm×n. Then the rank of A is the dimension of the

column space of A, r(A) = dim(C(A)).

Taking into account the definition of the column space of a matrix in Definition A.5.1 and of the dimension of a vector space in Definition A.5.1, it is easy to conclude that the rank of A is the number of linearly independent column vectors in A. It should be noted that if m < n, then the maximum rank of A is m. In contrast, if m > n, then the maximum rank of A is n. If A = 0, then the rank of A is 0. If A has at least one non-zero element, its rank is at least one. When the rank of a matrix equals its maximum possible value, min(m, n), the matrix is said to be of full rank.
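The statements about rank can be illustrated with NumPy's `numpy.linalg.matrix_rank`, which estimates the rank from the singular values using a numerical tolerance. The matrices below are hypothetical examples.

```python
import numpy as np

# Second row is twice the first, so only two rows (and columns) are independent.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

rank_A = np.linalg.matrix_rank(A)
rank_zero = np.linalg.matrix_rank(np.zeros((2, 3)))   # zero matrix
rank_wide = np.linalg.matrix_rank(np.ones((2, 5)))    # one non-zero direction
rank_full = np.linalg.matrix_rank(np.eye(3))          # full rank

print(rank_A, rank_zero, rank_wide, rank_full)
```

Because the rank is decided with a tolerance on the singular values, a nearly collinear matrix may still be reported as full rank; inspecting the smallest singular value directly is then more informative.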


A.5.5. Eigenvalues and eigenvectors

Before defining the factorization of a matrix into a canonical form by means of the eigendecomposition, it is important to introduce the fundamental theory of eigenvalues and eigenvectors of a matrix and some of their properties.

Definition A.24 Eigenvalues and eigenvectors of a matrix.

Suppose A ∈ Cn×n, x ∈ Cn with x ≠ 0 and λ ∈ C. Then λ is called an eigenvalue of A if Ax = λx. The vector x is called an eigenvector corresponding to λ. The pair (λ, x) is called an eigenpair of A.

The next theorem establishes the connection between the eigenvalues of a matrix and whether or not the matrix is nonsingular.

Theorem A.1 A singular matrix has a zero eigenvalue.

Suppose A ∈ Cn×n. Then A is singular if and only if λ = 0 is an eigenvalue of A.

Eigenvalues are not scale-invariant: when a square matrix is scaled by a scalar, its eigenvalues are scaled accordingly, as stated in the next theorem.

Theorem A.2 Eigenvalues of a scaled matrix.

Suppose A ∈ Cn×n, (λ, x) an eigenpair of A and α ∈ C. Then (αλ, x) is an eigenpair of αA.

The effect of inverting and transposing a matrix on its eigenvalues is described in the

next two theorems.

Theorem A.3 Eigenvalues of the inverse of a matrix.

Suppose A ∈ Cn×n is a nonsingular matrix and (λ, x) an eigenpair of A. Then (λ−1, x) is

an eigenpair of A−1.

Theorem A.4 Eigenvalues of the transpose of a matrix.

Suppose A ∈ Cn×n and λ is an eigenvalue of A. Then λ is also an eigenvalue of AT (the corresponding eigenvectors of A and AT generally differ).
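The effect of scaling, inverting and transposing on the eigenvalues can be verified on a small example. The sketch below uses a hypothetical triangular matrix whose eigenvalues (2 and 3) can be read off the diagonal, and compares eigenvalues only.

```python
import numpy as np

# Hypothetical upper-triangular matrix with eigenvalues 2 and 3.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
alpha = 5.0

def sorted_eigenvalues(M):
    # Real parts suffice here because all eigenvalues of these matrices are real.
    return np.sort(np.linalg.eigvals(M).real)

lam = sorted_eigenvalues(A)
lam_scaled = sorted_eigenvalues(alpha * A)            # alpha * lambda
lam_inverse = sorted_eigenvalues(np.linalg.inv(A))    # 1 / lambda
lam_transpose = sorted_eigenvalues(A.T)               # same as lambda

print(lam, lam_scaled, lam_inverse, lam_transpose)
```

Sorting is needed before comparing because `numpy.linalg.eigvals` does not guarantee any particular order of the eigenvalues.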

Other interesting properties of the eigenvalues and eigenvectors arise when the matrix is Hermitian (see Definition A.5.3). In that case, the eigenvalues of the matrix are real numbers and the eigenvectors of two different eigenvalues are orthogonal. Remember that for a matrix A whose entries are all real numbers, being Hermitian is the same as being symmetric (see Definition A.5.3). This is especially important here because the Fisher-information matrix (see 2.10) and the parameter covariance matrix (see 2.22) belong to the space of symmetric matrices Sym(Nθ). The mentioned properties are justified by the following two theorems:

Theorem A.5 A Hermitian matrix has real eigenvalues.

Suppose A ∈ Cn×n is an Hermitian matrix and λ is an eigenvalue of A. Then λ ∈ R



Theorem A.6 A Hermitian matrix has orthogonal eigenvectors.

Suppose A ∈ Cn×n is an Hermitian matrix and x and y are two eigenvectors of A for

different eigenvalues. Then x and y are orthogonal vectors.

In terms of eigenvalues, the subsets PD(n), ND(n) and PSD(n) of positive definite, negative definite and positive semi-definite matrices, respectively, may be characterized as follows:

Proposition A.2 Definiteness of a symmetric matrix by using eigenvalues.

Suppose A ∈ Rn×n is a symmetric matrix. Then the associated quadratic form $q(x) = x^T A x$ is positive definite (or A ∈ PD(n)) if and only if all the eigenvalues of A are positive, and negative definite (or A ∈ ND(n)) if and only if all the eigenvalues of A are negative. Moreover, A is positive semi-definite (or A ∈ PSD(n)) if and only if every eigenvalue of A is non-negative.

In other words, a symmetric matrix can be characterized as positive definite or positive semi-definite, i.e., as belonging to PD(n) or PSD(n), respectively, by analyzing only its smallest eigenvalue, according to the next lemma:

Lemma A.1 Definiteness of a symmetric matrix by using eigenvalues 2.

Suppose A ∈ Rn×n is a symmetric matrix with smallest eigenvalue λmin(A). Then

A ∈ PD(n)⇔ λmin(A) > 0 (A.4)

A ∈ PSD(n)⇔ λmin(A) ≥ 0 (A.5)
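A minimal sketch of how this lemma can be applied numerically is given below. The function name `classify_definiteness` and the tolerance `tol` are hypothetical choices; a tolerance is needed because a computed eigenvalue that is exactly zero in theory may come out as a tiny positive or negative number.

```python
import numpy as np

def classify_definiteness(A, tol=1e-12):
    """Classify a symmetric matrix by its smallest eigenvalue."""
    lam_min = np.linalg.eigvalsh(A).min()   # eigvalsh assumes a symmetric input
    if lam_min > tol:
        return "PD"
    if lam_min >= -tol:
        return "PSD"
    return "not PSD"

print(classify_definiteness(np.eye(2)))                          # eigenvalues 1, 1
print(classify_definiteness(np.array([[1.0, 1.0], [1.0, 1.0]])))  # eigenvalues 0, 2
print(classify_definiteness(np.array([[1.0, 2.0], [2.0, 1.0]])))  # eigenvalues -1, 3
```

For a Fisher-information matrix, a "PSD" result (smallest eigenvalue numerically zero) signals that the matrix is singular.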

A.5.6. Inverse of a matrix

Definition A.25 Matrix inverse.

Let G ∈ Rn×n and H ∈ Rn×n be square matrices of degree n. Then suppose that G and

H have the property that GH = HG = In, where In is the identity matrix of degree n.

Then G is said to be the inverse of H and is denoted by H−1 (and H is the inverse of G and is denoted by G−1). A matrix with an inverse is said to be invertible.

Proposition A.3 If the square matrix G of degree n is nonsingular, there exists a square

matrix H of degree n such that HG = In which is called the left inverse of G.

Theorem A.7 Let G ∈ Rn×n. Then the following statements are equivalent:

1. G is nonsingular if and only if G has a left inverse;


2. if G has left inverse H, then G is invertible and H is the unique inverse of G (that

is, HG = In, then GH = In and so H = G−1);

3. G is nonsingular if and only if it is invertible.

The next theorem relates all equivalent statements which define what it means for a

matrix to be nonsingular. Some of those statements will be used in this thesis in order to

connect ill-conditioned matrices with identifiability and ill-posed parameter estimations.

Theorem A.8 The invertible matrix theorem.

Suppose G ∈ Rn×n. Then the following statements are equivalent:

1. G is nonsingular;

2. G is an invertible matrix;

3. the rank of G is n, i.e., G is of full-rank;

4. G is row-equivalent to the identity matrix In;

5. G is column-equivalent to the identity matrix In;

6. G has n pivot positions;

7. the equation Gx = 0 has only the trivial solution;

8. the columns of G form a linearly independent set;

9. the linear transformation x→ Gx is one-to-one;

10. the equation Gx = b has at least one solution for each b ∈ Rn;

11. the columns of G span Rn;

12. the linear transformation x→ Gx is a surjection;

13. there is a n× n matrix H such that HG = In;

14. there is a n× n matrix H such that GH = In;

15. the transpose matrix GT is invertible;

16. the columns of G form a basis for Rn;

17. the column space of G is equal to Rn, C(G) = Rn;

18. the dimension of the column space of G is n;

19. the null space of G is {0};



20. the dimension of the null space of G is 0;

21. the determinant of G is not zero, det(G) ≠ 0;

22. λ = 0 is not an eigenvalue of G;

23. the orthogonal complement of the column space of G is {0};

24. the orthogonal complement of the null space of G is Rn;

25. the row space of G is equal to Rn, R(G) = Rn;

Moreover, the following properties hold for an invertible matrix G:

• G−1 is invertible;

• (G−1)−1 = G;

• (kG)−1 = k−1G−1 for any nonzero scalar k;

• (GT )−1 = (G−1)T ;

• for any invertible G,H ∈ Rn×n, (GH)−1 = H−1G−1. More generally, if G1, · · · , Gk are invertible n-by-n matrices, then $(G_1 G_2 \cdots G_{k-1} G_k)^{-1} = G_k^{-1} G_{k-1}^{-1} \cdots G_2^{-1} G_1^{-1}$;

• det(G−1) = (det(G))−1.
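The properties of the inverse listed above can be checked numerically. The sketch below uses two hypothetical invertible 2 × 2 matrices `G` and `H` and a nonzero scalar `k`, and collects the result of each comparison in a dictionary.

```python
import numpy as np

# Hypothetical invertible matrices and a nonzero scalar.
G = np.array([[2.0, 1.0],
              [1.0, 3.0]])
H = np.array([[1.0, 2.0],
              [0.0, 1.0]])
k = 4.0
inv = np.linalg.inv

checks = {
    "inverse of the inverse": np.allclose(inv(inv(G)), G),
    "scalar multiple": np.allclose(inv(k * G), inv(G) / k),
    "transpose": np.allclose(inv(G.T), inv(G).T),
    "product (reversed order)": np.allclose(inv(G @ H), inv(H) @ inv(G)),
    "determinant": np.isclose(np.linalg.det(inv(G)), 1.0 / np.linalg.det(G)),
}
for name, ok in checks.items():
    print(name, ok)
```

Note the reversed order in the product rule: `inv(G) @ inv(H)` would in general not equal `inv(G @ H)`.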

A.5.7. Matrix decompositions

In this section several methods to factorize a matrix into a product of matrices are described. The eigendecomposition, the most widely applied decomposition for square matrices, is defined first. Then the singular value decomposition, the most accurate rank-revealing factorization, is presented. Finally, the QR-factorization with column pivoting is defined as another important matrix decomposition and as a computationally cheaper alternative for revealing the rank of a matrix.

A.5.8. Eigendecomposition

The decomposition of a matrix into matrices composed of its eigenvectors and eigenvalues

is called eigendecomposition. This decomposition is also known as matrix diagonalization.

The definition of the eigendecomposition reads as follows

Definition A.26 Eigendecomposition.

Suppose A ∈ Cn×n is a square matrix with eigenvalues λ1, λ2, · · · , λn and n corresponding linearly independent eigenvectors qi (i = 1, · · · , n). Then A can be factorized as

A = QΛQ−1 (A.6)

where Q ∈ Cn×n is the square matrix whose i-th column is the eigenvector qi of A and Λ ∈ Cn×n is the diagonal matrix whose diagonal elements are the corresponding eigenvalues, i.e., Λii = λi.
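A short numerical sketch of Eq. A.6 follows. It uses `numpy.linalg.eig`, which returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors; the matrix `A` is a hypothetical example with eigenvalues 2 and 5.

```python
import numpy as np

# Hypothetical diagonalizable matrix (trace 7, determinant 10: eigenvalues 2 and 5).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

lam, Q = np.linalg.eig(A)          # columns of Q are the eigenvectors q_i
Lambda = np.diag(lam)

# Rebuild A from its eigendecomposition, A = Q Lambda Q^{-1}.
A_rebuilt = Q @ Lambda @ np.linalg.inv(Q)
print(np.sort(lam))
print(np.allclose(A_rebuilt, A))
```

The reconstruction requires Q to be invertible, i.e., n linearly independent eigenvectors, as stated in the definition.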

A.5.9. Singular value decomposition (SVD)

This factorization is known as a proper orthogonal decomposition or rank-revealing factorization [52, 75]. As a proper orthogonal decomposition, the SVD aims at obtaining low-dimensional approximations of a high-dimensional problem. Moreover, given its intimate relationship with the matrix rank, the SVD is the most accurate factorization for revealing this property of a matrix. In addition, the SVD makes it possible to analyze the difficulties associated with the ill-conditioning of a matrix, because it is also related to the condition number.

The definition of the SVD reads as follows:

Definition A.27 The singular value decomposition (SVD).

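For any $A \in \mathbb{R}^{m\times n}$ ($m \ge n$) there are orthogonal matrices $U \in \mathbb{R}^{m\times m}$ and $V \in \mathbb{R}^{n\times n}$ such that

$$A = U \begin{pmatrix} S_v \\ 0 \end{pmatrix} V^T = \sum_{i=1}^{r} \varsigma_i u_i v_i^T \tag{A.7}$$

where $S_v \in \mathbb{R}^{n\times n}$ is the diagonal matrix $S_v = \mathrm{diag}(\varsigma_1, \varsigma_2, \dots, \varsigma_n)$ with nonnegative diagonal elements $\varsigma_1 \ge \varsigma_2 \ge \dots \ge \varsigma_r > \varsigma_{r+1} = \dots = \varsigma_n = 0$, which are the singular values of $A$. The number $r \le \min(m,n)$ is equal to the rank of $A$, and the triplet $(U, S_v, V)$ is called the singular value decomposition (SVD) of $A$. The columns $v_i$ of $V$ ($i = 1, \dots, n$) and $u_j$ of $U$ ($j = 1, \dots, m$) are the right and left singular vectors of $A$, respectively.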

The set of singular values of $A$ is here called the singular value spectrum (SVs). It is important to notice that the SVD in Eq. A.7 is not scale invariant: if $A$ is scaled by a positive scalar $\alpha$, the SVs of the scaled matrix $\alpha A$ is also scaled by $\alpha$, i.e., $\alpha A = \sum_{i=1}^{N_\theta} (\alpha\varsigma_i) u_i v_i^T$, where $\alpha\varsigma_i$ is the i-th singular value of $\alpha A$. In this thesis the SVs are plotted on semi-logarithmic axes. Therein, the effect of $\alpha$ is seen as a shift of the spectrum, such that the SVs of $\alpha A$ is a curve parallel to the SVs of $A$.
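The scaling behaviour described above can be reproduced with a short NumPy sketch. The matrix A and the factor alpha are assumptions made for this illustration; on a logarithmic axis the shift of every singular value equals log10(alpha).

```python
import numpy as np

# Illustrative full-column-rank matrix and positive scaling factor.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
alpha = 100.0

# Singular value spectra of A and of the scaled matrix alpha * A.
s = np.linalg.svd(A, compute_uv=False)
s_scaled = np.linalg.svd(alpha * A, compute_uv=False)

# Every singular value is multiplied by alpha ...
print(np.allclose(s_scaled, alpha * s))

# ... so on a log10 axis the whole spectrum is shifted by log10(alpha).
log_shift = np.log10(s_scaled) - np.log10(s)
print(log_shift)
```

Because the shift is the same for all singular values, ratios such as the largest to the smallest singular value are unaffected by the scaling.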

Finally, note that the singular value decomposition can be seen as an extension of the eigendecomposition to rectangular matrices. The relations between these two decompositions are detailed in the next proposition.

Proposition A.4 Relation between SVD and eigendecomposition.

Suppose $A \in \mathbb{R}^{m\times n}$ has the SVD $(U, S_v, V)$ described by Eq. A.7. Then the following two relations are true:

$$A^T A = V (S_v^T S_v) V^T = \sum_{i=1}^{r} \varsigma_i^2 v_i v_i^T \tag{A.8a}$$

$$A A^T = U (S_v S_v^T) U^T = \sum_{i=1}^{r} \varsigma_i^2 u_i u_i^T \tag{A.8b}$$

where the right-hand sides of these relations are the eigendecompositions of the left-hand sides.

Accordingly:

• The columns of V (right-singular vectors of A) are the eigenvectors of ATA.

• The columns of U (left-singular vectors of A) are the eigenvectors of AAT .

• The non-zero elements of $S_v$ (non-zero singular values of $A$) are the square roots of the non-zero eigenvalues of $A^T A$ or $A A^T$:

$$\varsigma_i(A) = \sqrt{\lambda_i(A^T A)} \tag{A.9a}$$

$$\varsigma_i(A) = \sqrt{\lambda_i(A A^T)} \tag{A.9b}$$
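The relations in Eqs. A.8 and A.9 can be verified numerically. The sketch below assumes NumPy and an arbitrary 3-by-2 example matrix that does not come from the thesis.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Full SVD: U is 3x3, s holds the singular values, the rows of Vt are v_i^T.
U, s, Vt = np.linalg.svd(A)

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order;
# reverse them to match the descending order of the singular values.
lam_AtA = np.linalg.eigvalsh(A.T @ A)[::-1]
lam_AAt = np.linalg.eigvalsh(A @ A.T)[::-1]

# Eq. A.9a: singular values are the square roots of the eigenvalues of A^T A.
print(np.allclose(s, np.sqrt(lam_AtA)))
# Eq. A.9b: A A^T has one additional (numerically) zero eigenvalue,
# so only its two largest eigenvalues are compared.
print(np.allclose(s, np.sqrt(lam_AAt[:2])))

# Eq. A.8a: A^T A as a sum of rank-one terms built from right singular vectors.
AtA_rebuilt = sum(s[i] ** 2 * np.outer(Vt[i], Vt[i]) for i in range(len(s)))
print(np.allclose(A.T @ A, AtA_rebuilt))
```

Forming A^T A squares the condition number, which is one reason why singular values are normally computed from A directly rather than from the eigenvalues of A^T A.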

A.5.10. QR-factorization with column pivoting (QRP)

Along with the SVD, the QRP is another important matrix decomposition. It is also a rank-revealing decomposition and a more practical (and cheaper), although less accurate, method than the SVD; it has been used especially for solving rank-deficient least-squares problems [45, 58]. In order to give a complete picture of QRP as a matrix factorization method and as a rank-revealing decomposition, the definition of a QR-factorization is established first.

Definition A.28 QR-factorization.

Every $A \in \mathbb{R}^{m\times n}$ of rank $n$ can be factored as $A = QR$, where $Q \in \mathbb{R}^{m\times n}$ has orthonormal columns and $R \in \mathbb{R}^{n\times n}$ is invertible and upper triangular.

Having defined the QR-factorization of a full-rank matrix, the next step is to consider the extended method for a singular or nearly singular matrix $A$, which provides another way to calculate the numerical rank of $A$.

Definition A.29 QR-factorization with column pivoting.

Suppose $A \in \mathbb{R}^{m\times n}$ and let $\Pi \in \mathbb{R}^{n\times n}$ be a permutation matrix such that $A\Pi = QR$ is the QR-factorization of $A\Pi$, with $Q \in \mathbb{R}^{m\times m}$ an orthogonal matrix and $R \in \mathbb{R}^{m\times n}$ an upper-triangular matrix whose diagonal elements decrease in magnitude, partitioned as

$$R = \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix},$$

where $R_{11} \in \mathbb{R}^{r\times r}$ and $R_{22}$ is (hopefully) small in norm. If, say, $\|R_{22}\|_2 = O(\mu)$, then from the fact that $\varsigma_{r+1} \le \|R_{22}\|_2$ (see Lemma A.2) it is concluded that the original matrix $A$ is guaranteed to have numerical rank at most $r$.

In order to justify QRP as a rank-revealing factorization of a matrix A, the next lemma

relating the QR-factorization and the numerical rank of A is now established.

Lemma A.2 QR-factorization and numerical rank.

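Suppose $A = QR$ is the QR-factorization of $A \in \mathbb{R}^{m\times n}$ ($m \ge n$), with

$$R = \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix} \tag{A.10}$$

and $R_{11} \in \mathbb{R}^{r\times r}$. If

$$\varsigma_{\min}(R_{11}) \gg \|R_{22}\|_2 = O(\mu), \tag{A.11}$$

where $\mu$ is the machine precision, then $A$ has numerical rank $r$.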

Finally, the rank-revealing QR-factorization with column pivoting (RRQRP) of a matrix A is defined.

Definition A.30 RRQRP factorization.

Suppose that a matrix $A \in \mathbb{R}^{m\times n}$ ($m \ge n$) has numerical rank $r$ ($< n$). If there exists a permutation matrix $\Pi \in \mathbb{R}^{n\times n}$ such that $A\Pi$ has a QR-factorization $A\Pi = QR$, with

$$R = \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix},$$

and $R_{11} \in \mathbb{R}^{r\times r}$ satisfying the inequality A.11, then the factorization $A\Pi = QR$ is called a rank-revealing QR-factorization with column pivoting of $A$.

Note that the main point in this definition is to use the column pivoting strategy to

determine the permutation matrix Π which yields a small R22.
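To make the role of the pivoting strategy concrete, the following sketch implements a thin QR-factorization with column pivoting by modified Gram-Schmidt in NumPy. It is an illustrative implementation written for this appendix, not the Householder-based routine found in numerical libraries; the function name `qr_column_pivoting`, the test matrix and the tolerance are assumptions. At each step the remaining column with the largest norm is moved to the front, which is what drives the trailing block R22 towards small entries.

```python
import numpy as np

def qr_column_pivoting(A):
    """Thin QR with column pivoting (modified Gram-Schmidt): A[:, perm] = Q @ R."""
    W = np.array(A, dtype=float)  # working copy, orthogonalized column by column
    m, n = W.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    perm = np.arange(n)
    for k in range(n):
        # Pivot: bring the remaining column with the largest norm to position k.
        p = k + int(np.argmax(np.linalg.norm(W[:, k:], axis=0)))
        W[:, [k, p]] = W[:, [p, k]]
        R[:, [k, p]] = R[:, [p, k]]
        perm[[k, p]] = perm[[p, k]]
        R[k, k] = np.linalg.norm(W[:, k])
        if R[k, k] == 0.0:
            break  # all remaining columns are exactly zero
        Q[:, k] = W[:, k] / R[k, k]
        # Remove the component along q_k from the remaining columns.
        R[k, k + 1:] = Q[:, k] @ W[:, k + 1:]
        W[:, k + 1:] -= np.outer(Q[:, k], R[k, k + 1:])
    return Q, R, perm

# Illustrative matrix whose third column is the sum of the first two (rank 2).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [0.0, 0.0, 0.0]])

Q, R, perm = qr_column_pivoting(A)
diag_R = np.abs(np.diag(R))

# Count the diagonal entries of R above an assumed tolerance.
tol = 1e-10
rank_estimate = int(np.sum(diag_R > tol))
print(perm, diag_R, rank_estimate)
```

For this matrix the last diagonal entry of R is at the level of rounding errors, so the block R22 is negligible and the estimated rank is 2. Gram-Schmidt can lose orthogonality in Q for nearly dependent columns, which is why production codes rely on Householder reflections instead.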

A.5.11. Numerical rank

The definition in A.5.4 refers to the ordinary rank (also called the exact rank), which is a discontinuous function of the elements of a matrix. In computer arithmetic, however, exact rank deficiency is rarely observed; therefore the notion of a numerical rank is introduced here. Mathematically, in terms of the singular values, a matrix has rank r (in the sense of Definition A.5.4) if and only if ςr > 0 and ςr+1 = 0. Nevertheless, in computations and in many real-world applications ςr+1 is not exactly equal to zero but close to zero, or merely much smaller than ςr. In this scenario, when a matrix has a cluster of small singular values separated from the large ones by a well-determined gap, delimited by a threshold ϵ, the number of large singular values reveals the numerical rank rϵ of the matrix.

Definition A.31 Numerical rank of a matrix.

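Suppose $A \in \mathbb{R}^{m\times n}$ ($m \ge n$). Then $A$ is said to have ϵ-numerical rank $r_\epsilon$ if

$$\varsigma_1, \varsigma_2, \dots, \varsigma_{r_\epsilon} \ge \epsilon \ge \varsigma_{r_\epsilon+1}, \dots, \varsigma_n \tag{A.12}$$

where $\varsigma_1, \dots, \varsigma_n \ge 0$ are the singular values of $A$ and $\epsilon \in \mathbb{R}$ is a scalar defining the cluster of the small singular values.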

This definition still implies that the matrix $A$ has $r_\epsilon$ linearly independent columns. Note that a matrix with a single large gap in its singular value spectrum is the ideal candidate for computing the numerical rank. When a low-rank approximation of a matrix is needed, the concept remains reasonable, but it should be applied with care.
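Definition A.31 translates directly into a count of the singular values that are not smaller than the threshold. The sketch below assumes NumPy; the helper name `numerical_rank`, the test matrix and the two thresholds are illustrative choices and not part of the thesis.

```python
import numpy as np

def numerical_rank(A, eps):
    """Return the eps-numerical rank of A (Eq. A.12) and its singular values."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s >= eps)), s

# The third column is the sum of the first two, up to a perturbation of 1e-10.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0 + 1e-10],
              [0.0, 0.0, 0.0]])

# With a threshold inside the gap, the tiny singular value is treated as zero.
r_eps, s = numerical_rank(A, eps=1e-6)
# With a threshold below the smallest singular value, all columns count.
r_tight, _ = numerical_rank(A, eps=1e-12)

print(s)
print(r_eps, r_tight)
```

The result depends on where the threshold is placed relative to the gap in the spectrum, which is why a single, clearly visible gap makes the numerical rank well determined.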

A.5.12. Condition number

The condition number measures how small perturbations in the data affect the solution, or how sensitive the solution is with respect to errors in the data. The formal definition of the condition number reads as follows.

Definition A.32 Condition number.

Suppose $A$ is a nonsingular matrix. Then the condition number is

$$\kappa(A) = \|A\|_2 \, \|A^{-1}\|_2 \tag{A.13}$$

where $\|\cdot\|_2$ is the spectral norm, that is, the matrix norm induced by the Euclidean norm of vectors.

The condition number of A is always greater than or equal to one, and it is invariant when A is multiplied by a nonzero constant [112]. If A is singular then κ(A) = ∞. In numerical analysis the condition number of a matrix A describes how accurately the solution of the system Ax = b can be approximated. If κ(A) is small the problem is well-conditioned, and if κ(A) is large the problem is ill-conditioned. Furthermore, κ(A) is also recognized as a measure of near-linear dependence (collinearity).

Another expression for the condition number is

$$\kappa(A) = \varsigma_{\max}/\varsigma_{\min}, \tag{A.14}$$

where $\varsigma_{\max}$ and $\varsigma_{\min}$ are the maximum and minimum singular values of $A$. If $A$ is a symmetric positive definite matrix then

$$\kappa(A) = \lambda_{\max}/\lambda_{\min}, \tag{A.15}$$

where $\lambda_{\max}$ and $\lambda_{\min}$ denote the largest and smallest eigenvalues of $A$.

In terms of the singular values, an ill-conditioned matrix has a large ratio of the maximum singular value ς1 to the minimum singular value ςNp, that is, a large condition number κ. This ratio measures how small the minimum singular value ςNp is relative to the maximum singular value ς1. For some matrices this smallness is judged by comparing ςNp to the standard zero [12] (if ς1 has a moderate value), whilst in other cases a large ratio may occur even though all singular values are far from zero.
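The equivalent expressions for the condition number can be compared numerically. The following NumPy sketch uses an arbitrary symmetric positive definite matrix with nearly collinear columns, assumed here for illustration, and also checks the invariance under multiplication by a nonzero constant.

```python
import numpy as np

# Symmetric positive definite matrix with nearly collinear columns.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])

# Eq. A.13: product of the spectral norms of A and of its inverse.
kappa_def = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

# Eq. A.14: ratio of the largest to the smallest singular value.
s = np.linalg.svd(A, compute_uv=False)
kappa_sv = s[0] / s[-1]

# Eq. A.15: ratio of the extreme eigenvalues (A is symmetric positive definite).
lam = np.linalg.eigvalsh(A)  # ascending order
kappa_eig = lam[-1] / lam[0]

# Invariance under multiplication by a nonzero constant.
kappa_scaled = np.linalg.cond(-7.0 * A, 2)

print(kappa_def, kappa_sv, kappa_eig, kappa_scaled)
```

All four values agree up to rounding and are of the order of 4e4, although no entry of A is small; the ill-conditioning comes from the near-linear dependence of the two columns.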

A.5.13. Collinearity index

Definition A.5.1 establishes the notions of exact and near-linear dependence of vectors. In practical cases exact linear dependence is rarely found; instead, the vectors (or columns of a matrix) are only nearly linearly dependent. An indicator of the degree of near-linear dependence of the columns of a matrix is the collinearity index [18, 17, 46].

Definition A.33 Collinearity index.

Suppose $A$ is a nonsingular matrix. Then the collinearity index is defined as

$$\gamma(A) = 1/\varsigma_{\min}, \tag{A.16}$$

where $\varsigma_{\min}$ is the smallest singular value of $A$.

This definition originates from the observation that the near collinearity of the columns of A can be measured by the minimal norm of the linear combination Aβ under the constraint ∥β∥ = 1. The smallest singular value of A equals this minimal norm ∥Aβ∥ [11, 18], and its inverse is the collinearity index.
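The link between the collinearity index and the minimal norm of Aβ can be illustrated as follows. This is a sketch under assumed inputs: the matrix A and the trial vectors are arbitrary examples, and the minimizing β is taken as the right singular vector that belongs to the smallest singular value.

```python
import numpy as np

# Illustrative nonsingular matrix with nearly collinear columns.
A = np.array([[1.0, 1.0],
              [0.0, 0.01]])

U, s, Vt = np.linalg.svd(A)
s_min = s[-1]
gamma = 1.0 / s_min  # collinearity index, Eq. A.16

# The unit vector beta that minimizes ||A beta|| is the last right singular vector.
beta_min = Vt[-1]
min_norm = np.linalg.norm(A @ beta_min)

# A few other unit vectors for comparison; none gives a smaller norm.
trial_betas = [np.array([1.0, 0.0]),
               np.array([0.0, 1.0]),
               np.array([1.0, 1.0]) / np.sqrt(2.0)]
trial_norms = [np.linalg.norm(A @ b) for b in trial_betas]

print(gamma, min_norm, trial_norms)
```

Here the minimizing combination is close to the difference of the two columns, which is exactly the combination that almost cancels when the columns are nearly parallel.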

A.5.14. Covariance Matrix

The linear relationship between the random variables x1, x2, · · · , xn is commonly measured by the covariance. If the random variables are collected in a random vector x = (x1, x2, · · · , xn)T, their covariances are stored in the covariance matrix C, which is defined as follows:

Definition A.34 Covariance matrix.

Suppose $x \in \mathbb{R}^n$ is a vector of random variables with finite variance. Then the covariance matrix

$$C := E\left[(x - E[x])(x - E[x])^T\right] \tag{A.17}$$

is a matrix whose $(i,j)$-element $c_{ij} = E\left[(x_i - E(x_i))(x_j - E(x_j))\right] = \sigma_{ij}^2$ is the covariance of the i-th random variable $x_i$ with respect to the j-th random variable $x_j$. The diagonal element $c_{jj} = E\left[(x_j - E(x_j))^2\right] = \sigma_j^2$ is the variance of the j-th random variable $x_j$.

Note that $E(x_i)$ is the expected value (or mean) of the i-th entry of the vector $x$. The covariance matrix is also known as dispersion matrix or variance-covariance matrix.

The variance of an estimator is illustrated graphically in Fig. A.1, which shows the probability distributions of the estimator Θ1 (with small variance) and of the estimator Θ2 (with large variance). Note that the distribution of each estimator Θi is centered at its expected value E[Θi], i.e., the mean of its probability distribution, which is (hopefully) close to the true parameter value θ∗. In the figure, the estimator Θ1 is more precise than Θ2 because its estimates from several sample data sets lie within the narrower probability distribution.

Figure A.1.: Probability distribution of the estimators Θ1 and Θ2

Properties

For the covariance matrix C defined by Eq. A.17 the following basic properties apply:

• C is symmetric positive semi-definite, i.e., C ∈ PSD(n) ⊂ Sym(n);

• The trace of C is nonnegative, i.e., $\mathrm{Tr}(C) = \sum_{j=1}^{n} c_{jj} \ge 0$;

• The eigenvalues of C are all real and nonnegative, and the eigenvectors that belong to distinct eigenvalues are orthogonal;

• The determinant of C is nonnegative, i.e., $\det(C) = \prod_{j=1}^{n} \lambda_j(C) \ge 0$.
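The properties above can be checked on a sample covariance matrix. The sketch below assumes NumPy and a small made-up data set (five observations of three variables); the expectation in Eq. A.17 is replaced by the usual unbiased sample estimate.

```python
import numpy as np

# Hypothetical data: rows are observations, columns are the variables x1, x2, x3.
X = np.array([[2.0, 1.0, 0.5],
              [3.0, 1.5, 0.1],
              [4.0, 2.5, 0.9],
              [5.0, 2.0, 0.4],
              [6.0, 3.5, 0.6]])

# Sample version of Eq. A.17: average of outer products of the centered data.
D = X - X.mean(axis=0)
C = D.T @ D / (X.shape[0] - 1)

# Eigenvalues of the symmetric matrix C (real by construction).
eigvals = np.linalg.eigvalsh(C)

is_symmetric = np.allclose(C, C.T)
is_psd = bool(np.all(eigvals >= -1e-12))  # nonnegative up to rounding
trace_C = np.trace(C)                      # sum of the variances
det_C = np.linalg.det(C)                   # equals the product of the eigenvalues

print(is_symmetric, is_psd, trace_C, det_C)
```

The matrix C computed this way coincides with the result of `np.cov(X, rowvar=False)`, which uses the same normalization by the number of observations minus one.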


A.5.15. Fisher-information Matrix

Before defining the Fisher-information matrix, it is important to know that the Fisher information describes the (maximum) information that may be extracted from an observable random variable y ∈ Rn to estimate the unknown parameter θ ∈ Rm of a specified distribution. The probability density function (pdf) for y given θ, which, viewed as a function of θ, is also the likelihood function for θ, is here denoted p(y; θ).

Definition A.35 Fisher-information Matrix.

Suppose $y \in \mathbb{R}^n$ is a random vector denoting the available data vector and $\theta \in \mathbb{R}^m$ is the unknown parameter vector to be determined from $y$. Also assume the probability density function $p(y;\theta)$ and an estimator of $\theta$ based on $y$. Then the Fisher-information matrix (FIM) is the covariance of the partial derivative with respect to $\theta$ of the natural logarithm of the likelihood function:

$$F = E[\Delta\Delta^T], \qquad \Delta = \frac{\partial \ln p(y;\theta)}{\partial\theta} \tag{A.18}$$

where $F \in \mathbb{R}^{m\times m}$ is a positive semi-definite symmetric matrix.

Assume that the probability density function follows a Gaussian distribution,

$$p(y;\theta) = (2\pi)^{-n/2}\,|C_y|^{-1/2} \exp\left[-\tfrac{1}{2}(y - y_m)^T C_y^{-1} (y - y_m)\right], \tag{A.19}$$

where $n$ denotes the dimension of the data vector $y$ and $y_m$ is the mean. Under this assumption the Fisher-information matrix reduces to

$$F = S^T C_y^{-1} S \tag{A.20}$$

where $S = \partial y/\partial\theta$ is the sensitivity matrix and $C_y$ is the measurement covariance matrix (measurement error).
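Eq. A.20 can be evaluated with a few lines of NumPy. The sketch below is illustrative only: the sensitivity matrix S (four measurements, two parameters) and the measurement standard deviations are made-up values, and the measurement errors are assumed to be uncorrelated so that C_y is diagonal.

```python
import numpy as np

# Hypothetical sensitivity matrix S = dy/dtheta: n = 4 measurements, m = 2 parameters.
S = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

# Assumed standard deviations of the measurement errors (uncorrelated).
sigma = np.array([0.1, 0.1, 0.2, 0.2])
C_y = np.diag(sigma ** 2)

# Eq. A.20: F = S^T C_y^{-1} S, using a linear solve instead of an explicit inverse.
F = S.T @ np.linalg.solve(C_y, S)

# F is an m-by-m symmetric positive semi-definite matrix.
eigvals_F = np.linalg.eigvalsh(F)
print(F)
print(eigvals_F)
```

Measurements with a smaller error variance receive a larger weight in F; here the first two rows of S are weighted by 100 and the last two by 25.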


Bibliography

[1] S. P. Asprey and S. Macchietto. Statistical tools for optimal dynamic model building.

Computers & Chemical Engineering, 24(2):1261–1267, 2000.

[2] S. P. Asprey and S. Macchietto. Designing robust optimal dynamic experiments.

Journal of Process Control, 12(4):545–556, 2002.

[3] Y. Bard. Nonlinear parameter estimation. Academic Press, New York, 1974.

[4] A. Bardow. Optimal experimental design of ill-posed problems: The meter approach.

Computers & Chemical Engineering, 32:115–124, 2008.

[5] A. Bardow and W. Marquardt. Incremental and simultaneous identification of reac-

tion kinetics: methods and comparison. Chemical engineering science, 59(13):2673–

2684, 2004.

[6] T. Barz, H. Arellano-Garcia, and G. Wozny. Handling uncertainty in model-

based optimal experimental design. Industrial & Engineering Chemistry Research,

49(12):5702–5713, 2010.

[7] T. Barz, S. Kuntsche, G. Wozny, and H. Arellano-Garcia. An efficient sparse ap-

proach to sensitivity generation for large-scale dynamic optimization. Computers &

Chemical Engineering, 35(10):2053–2065, 2011.

[8] T. Barz, D. C. Lopez C., H. Arellano-Garcia, and G. Wozny. Experimental evalua-

tion of an approach to online redesign of experiments for parameter determination.

AIChE Journal, 59:1981–1995, 2013.

[9] T. Barz, D. C. Lopez C., M. N. Cruz B., S. Korkel, and S. F. Walter. Real-time

adaptive input design for the determination of competitive adsorption isotherms in

liquid chromatography. Computers & Chemical Engineering, 94:104–116, 2016.

[10] D. M. Bates and D. G. Watts. Nonlinear Regression Analysis and its Applications.

Wiley, Toronto, Canada, 1988.

[11] D. A. Belsley. Conditioning diagnostics: Collinearity and weak data in regression.

John Wiley & Sons, New York, USA, 1991.

[12] D. A. Belsley, E. Kuh, and R. E. Welsch. Regression diagnostics: Identifying influ-

ential data and sources of collinearity. John Wiley & Sons, New York, USA, 1980.


[13] M. Bertero, T. A Poggio, and V. Torre. Ill-posed problems in early vision. Proceedings

of the IEEE, 76(8):869–889, 1988.

[14] E. Bingham and H. Mannila. Random projection in dimensionality reduction: ap-

plications to image and text data. In Proceedings of the seventh ACM SIGKDD

international conference on Knowledge discovery and data mining, pages 245–250.

ACM, 2001.

[15] S. Bitterlich and P. Knabner. Experimental design for outflow experiments based on

a multilevel identification method for material laws. Inverse Problems, 19:1011–30,

2003.

[16] Cruz Bournazou, H Arellano-Garcia, G Wozny, G Lyberatos, and C Kravaris. Asm3

extended for two-step nitrification–denitrification: a model reduction for sequencing

batch reactors. Journal of Chemical Technology and Biotechnology, 87(7):887–896,

2012.

[17] R. Brun, M. Kuhni, H. Siegrist, W. Gujer, and P. Reichert. Practical identifiability

of asm2d parameters - systematic selection and tuning of parameter subsets. Water

Research, 36(16):4113–4127, 2002.

[18] R. Brun, P. Reichert, and H. R. Kunsch. Practical identifiability analysis of large

environmental simulation models. Water Resources Research, 37(4):1015–1030, 2001.

[19] M. Burth, G. C. Verghese, and M. Velez-Reyes. Subset selection for improved pa-

rameter estimation in on-line identification of a synchronous generator. IEEE Trans-

actions on Power Systems, 14(1):218–225, 1999.

[20] G. Buzzi-Ferraris. New trends in building numerical programs. Computers & Chem-

ical Engineering, 35(7):1215–1225, 2011.

[21] G. Calafiore, Marina Indri, and Basilio Bona. Robot dynamic calibration: Optimal

excitation trajectories and experimental parameter estimation. Journal of robotic

systems, 18(2):55–68, 2001.

[22] E. F. Camacho and C. Bordons. Model Predictive Control (Advanced Textbooks in

Control and Signal Processing). Advanced Textbooks in Control and Signal Process-

ing. Springer, 2nd edition, 2004.

[23] C. A. Cardona and O. J. Sanchez. Fuel ethanol production: process design trends

and integration opportunities. Bioresource technology, 98(12):2415–2457, 2007.

[24] D. Carlson. Minimax and interlacing theorems for matrices. Linear Algebra and its Applications, 54:153–172, 1983.


[25] J. Carrera and S. P. Neuman. Estimation of aquifer parameters under transient

and steady state conditions: 1. maximum likelihood method incorporating prior

information. Water Resources Research, 22(2):199–210, 1986.

[26] P. Chandrakant and V. Bisaria. Simultaneous bioconversion of cellulose and hemi-

cellulose to ethanol. Critical Reviews in Biotechnology, 18:295–331, 1998.

[27] O. T. Chis, J. R. Banga, and E. Balsa-Canto. Structural identifiability of systems

biology models: a critical comparison of methods. PloS one, 6(11):e27755, 2011.

[28] Y. Chu and J. Hahn. Parameter set selection for estimation of nonlinear dynamic

systems. AIChE journal, 53(11):2858–2870, 2007.

[29] Y. Chu and J. Hahn. Parameter set selection via clustering of parameters into

pairwise indistinguishable groups of parameters. Industrial & Engineering Chemistry

Research, 48(13):6000–6009, 2008.

[30] Y. Chu and J. Hahn. Generalization of a parameter set selection procedure based on

orthogonal projections and the d-optimality criterion. AIChE Journal, 58(7):2085–

2096, 2012.

[31] C. Cobelli and J. J. DiStefano. Parameter and structural identifiability concepts

and ambiguities: a critical review and analysis. American Journal of Physiology-

Regulatory, Integrative and Comparative Physiology, 239(1):R7–R24, 1980.

[32] Paul G Constantine, Eric Dow, and Qiqi Wang. Active subspace methods in theory

and practice: applications to kriging surfaces. SIAM Journal on Scientific Comput-

ing, 36(4):A1500–A1524, 2014.

[33] M. Doyle, T. F. Fuller, and J. Newman. Modeling of galvanostatic charge and dis-

charge of the lithium/polymer/insertion cell. Journal of the Electrochemical Society,

140(6):1526–1533, 1993.

[34] M. Doyle, J. Newman, A. S. Gozdz, C. N. Schmutz, and J.M. Tarascon. Comparison

of modeling predictions with experimental data from plastic lithium ion cells. Journal

of the Electrochemical Society, 143(6):1890–1903, 1996.

[35] R. E. T. Drissen, R. H. W. Maas, J. Tramper, and H. H. Beeftink. Modelling ethanol

production from cellulose: separate hydrolysis and fermentation versus simultaneous

saccharification and fermentation. Biocatalysis and Biotransformation, 27(1):27–35,

2009.

[36] R. E. T. Drissen, R. H. W. Maas, M. J. E. C. Van Der Maarel, M. A. Kabel, H. A.

Schols, J. Tramper, and H. H. Beeftink. A generic model for glucose production

from various cellulose sources by a commercial cellulase complex. Biocatalysis and

Biotransformation, 25(6):419–429, 2007.


[37] R. Faber, P. Li, and G. Wozny. Sequential parameter estimation for large-scale

systems with multiple data sets: 1. computational framework. Ind. Eng. Chem. Res.,

42:5850–5860, 2003.

[38] J. A. Fessler. Mean and variance of implicitly defined biased estimators (such as penalized maximum likelihood): Applications to tomography. IEEE Transactions on Image Processing, 5:493–506, 1996.

[39] S. Flila, P. Dufour, and H. Hammouri. Optimal input design for on-line identification: a coupled observer-MPC approach. In International Federation of Automatic Control (IFAC) World Congress, Paper 1722, pages 11457–11462, Seoul, South Korea, July 2008.

[40] I. Ford, D. M. Titterington, and C. P. Kitsos. Recent advances in nonlinear experimental design. Technometrics, 31(1):49–60, 1989.

[41] G. Franceschini and S. Macchietto. Model-based design of experiments for parameter

precision: State of the art. Chemical Engineering Science, 63(19):4846–4872, 2007.

[42] G. Franceschini and S. Macchietto. Model-based design of experiments for parameter

precision: State of the art. Chemical Engineering Science, 63:4846–4872, 2008.

[43] F. Galvanin, M. Barolo, and F. Bezzo. Online model-based redesign of experiments

for parameter estimation in dynamic systems. Industrial & Engineering Chemistry

Research, 48:4415–4427, 2009.

[44] F. Galvanin, M. Barolo, G. Pannocchia, and F. Bezzo. Online model-based redesign

of experiments with erratic models: a disturbance estimation approach. Computers

& Chemical Engineering, 42:138–151, 2012.

[45] G. H. Golub and C. F. Van Loan. Matrix computations, volume 3. Johns Hopkins

University Press, 1996.

[46] A. Grah. Entwicklung und Anwendung modularer Software zur Simulation und Pa-

rameterschatzung in gaskatalytischen Festbettreaktoren. PhD thesis, Martin Luther

University Halle-Wittenberg, 2004.

[47] G. Guiochon, A. Felinger, D. G. Shirazi, and A. M. Katti. Fundamentals of prepara-

tive and nonlinear chromatography. Elsevier Inc., San Diego, USA, 2 edition, 2006.

[48] E. Haber, L. Horesh, and L. Tenorio. Numerical methods for experimental design of large-scale linear ill-posed inverse problems. Inverse Problems, 24:055012, 2008.

[49] E. Haber, L. Horesh, and L. Tenorio. Numerical methods for the design of large-scale nonlinear discrete ill-posed inverse problems. Inverse Problems, 26:025002, 2010.


[50] J. Hadamard. Lectures on cauchy’s problem in linear partial differential equations,

1923.

[51] J. M. Hammersley and D. C. Handscomb. Monte carlo methods, volume 1. Springer,

1964.

[52] P. C. Hansen. Rank-deficient and discrete ill-posed problems. Numerical aspects of

linear inversion. SIAM, Philadelphia, USA, 1998.

[53] P. C. Hansen. A matlab package for analysis and solution of discrete ill-posed prob-

lems. Technical report, Technical University of Denmark, www.netlib.org/numeralgo

and www.mathworks.com/matlabcentral/fileexchange, September 2007.

[54] L. Hascoet and V. Pascual. The tapenade automatic differentiation tool. ACM

Transactions on Mathematical Software, 39(3):1–43, 2013.

[55] A. C. Hindmarsh, P. N. Brown, K. E. Grant, S. L. Lee, R. Serban, D. E. Shumaker,

and C. S. Woodward. Sundials: Suite of nonlinear and differential/algebraic equation

solvers. ACM Transactions on Mathematical Software (TOMS), 31(3):363–396, 2005.

[56] A. E. Hoerl and R. W. Kennard. Ridge regression: applications to nonorthogonal

problems. Technometrics, 12(1):69–82, 1970.

[57] A. E. Hoerl and R. W. Kennard. Ridge regression: Biased estimation for nonorthog-

onal problems. Technometrics, 12:55–67, 1970.

[58] Y. P. Hong and C. T. Pan. Rank-revealing qr factorizations and the singular value

decomposition. Mathematics of Computation, 58(197):213–232, 1992.

[59] L. Horesh, E. Haber, and L. Tenorio. Optimal experimental design for the large-scale

nonlinear ill-posed problem of impedance imaging. Large-Scale Inverse Problems and

Quantification of Uncertainty, John Wiley & Sons, Ltd, pages 273–290, 2010.

[60] E. A. Jacobson. A statistical parameter estimation method using singular value

decomposition with application to Avra Valley aquifer in southern Arizona. PhD

thesis, The University of Arizona, 1985.

[61] S. Javeed, S. Qamar, A. Seidel-Morgenstern, and G. Warnecke. Efficient and ac-

curate numerical simulation of nonlinear chromatographic processes. Computers &

Chemical Engineering, 35(11):2294–2305, 2011.

[62] B. Jayasankar, B. Huang, and A. Ben-Zvi. Receding horizon experiment design

with application in sofc parameter estimation. In Mayuresh Kothare, Moses Tade,

Alain Vande Wouwer, and Ilse Smets, editors, 9th International Symposium on Dy-

namics and Control of Process Systems (DYCOPS 2010), pages 527–532, 2010.


[63] T. A. Johansen. On tikhonov regularization, bias and variance in nonlinear system

identification. Automatica, 33:441–446, 1997.

[64] SI Kabanikhin. Definitions and examples of inverse and ill-posed problems. Journal

of Inverse and Ill-Posed Problems, 16(4):317–357, 2008.

[65] D. Kaelin, R. Manser, L. Rieger, J. Eugster, K. Rottermann, and H. Siegrist. Ex-

tension of asm3 for two-step nitrification and denitrification and its calibration and

validation with batch tests and pilot scale data. Water research, 43:1680–1692, 2009.

[66] B. Kaltenbacher, A. Neubauer, and O. Scherzer. Iterative regularization methods for

nonlinear ill-posed problems, volume 6. Walter de Gruyter, 2008.

[67] KNAUER, Wissenschaftliche Gerate GmbH, Support. Personal communication

(2014-05-01).

[68] S. Korkel and H. Arellano-Garcia. Online experimental design for model valida-

tion. In Rita Maria de Brito Alves, Claudio Augusto Oller do Nascimento, and

Evaristo Chalbaud Biscaia Jr., editors, 10th International Symposium on Process

Systems Engineering - PSE 2009, volume 27, pages 1–8, Salvador-Bahia, Brazil,

2009. Elsevier.

[69] S. Korkel, I. Bauer, H. G. Bock, and J. P. Schloder. A sequential approach for

nonlinear optimum experimental design in dae systems. Scientific Computing in

Chemical Engineering II: Simulation, Image Processing, Optimization, and Control,

page 338, 1999.

[70] C. Kravaris, J. Hahn, and Y. Chu. Advances and selected recent developments in

state and parameter estimation. Computers & Chemical Engineering, 51:111–123,

2013.

[71] A. Kremling, S. Fischer, K. Gadkar, F. J. Doyle, T. Sauter, E. Bullinger, F. Allgower, and E. D. Gilles. A benchmark for methods in reverse engineering and model discrimination: problem formulation and solutions. Genome research, 14:1773–1785, 2004.

[72] S. Kuntsche, T. Barz, R. Kraus, H. Arellano-Garcia, and G. Wozny. Mosaic a

web-based modeling environment for code generation. Computers & Chemical Engi-

neering, 35(11):2257–2273, 2011.

[73] T. Lahmer. Optimal experimental design for nonlinear ill-posed problems applied

to gravity dams. Inverse Problems, 27:125005 (20pp), 2011.

[74] L. Ljung and P. E. Caines. Asymptotic normality of prediction error estimators for approximate system models. Stochastics, 3(1-4):29–46, 1980.


[75] Y. C. Liang, H. P. Lee, S. P. Lim, W. Z. Lin, K. H. Lee, and C. G. Wu. Proper

orthogonal decomposition and its applications -part i: Theory. Journal of Sound

and Vibration, 252(3):527–544, 2002.

[76] Y. Lin and S. Tanaka. Ethanol fermentation from biomass resources: current state

and prospects. Applied microbiology and biotechnology, 69(6):627–642, 2006.

[77] O. Lisec, P. Hugo, and A. Seidel-Morgenstern. Frontal analysis method to determine

competitive adsorption isotherms. Journal of Chromatography A, 908(1-2):19–34,

2001.

[78] D. C. Lopez C., T. Barz, S. Korkel, and G. Wozny. Nonlinear ill-posed problem

analysis in model-based parameter estimation and experimental design. Computers

& Chemical Engineering, 77:24–42, 2015.

[79] D. C. Lopez C., T. Barz, M. Penuela, A. Villegas, S. Ochoa, and G. Wozny. Model-

based identifiable parameter determination applied to a simultaneous saccharifica-

tion and fermentation process model for bio-ethanol production. Biotechnology

Progress, 29(4):1064–1082, 2013.

[80] D. C. Lopez C., G. Wozny, A. Flores-Tlacuahuac, R. Vasquez-Medrano, and V. M.

Zavala. A computational framework for identifiability and ill-conditioning analy-

sis of lithium-ion battery models. Industrial & Engineering Chemistry Research,

55(11):3026–3042, 2016.

[81] T. Maly and L. R. Petzold. Numerical methods and software for sensitivity analysis

of differential-algebraic systems. Applied Numerical Mathematics, 20(1):57–79, 1996.

[82] D. W. Marquardt. An algorithm for least-squares estimation of nonlinear parameters.

Journal of the Society for Industrial & Applied Mathematics, 11(2):431–441, 1963.

[83] D. W. Marquardt. Generalized inverses, ridge regression, biased linear estimation,

and nonlinear estimation. Technometrics, 12:591–612, 1970.

[84] S Marsili-Libelli, S Guerrizio, and N Checchi. Confidence regions of estimated pa-

rameters for ecological systems. Ecological Modelling, 165(2):127–146, 2003.

[85] E. Martınez-Rosas, R. Vasquez-Medrano, and A. Flores-Tlacuahuac. Modeling and

simulation of lithium-ion batteries. Computers & Chemical Engineering, 35(9):1937–

1948, 2011.

[86] M. D. McKay, R. J. Beckman, and W. J. Conover. Comparison of three methods for

selecting values of input variables in the analysis of output from a computer code.

Technometrics, 21(2):239–245, 1979.


[87] K. A. P. McLean, S. Wu, and K. B. McAuley. Mean-squared-error methods for select-

ing optimal parameter subsets for estimation. Industrial & Engineering Chemistry

Research, 51:6105–6115, 2012.

[88] R. K. Mehra. Optimal input signals for parameter estimation in dynamic systems

-survey and new results. Automatic Control, IEEE Transactions on, 19:753–768,

1974.

[89] D. C. Montgomery and G. C. Runger. Applied statistics and probability for engineers.

John Wiley & Sons, 2010.

[90] J. J. More and D. C. Sorensen. Computing a trust region step. SIAM Journal on

Scientific and Statistical Computing, 4(3):553–572, 1983.

[91] S. Natarajan and J. H. Lee. Repetitive model predictive control applied to a sim-

ulated moving bed chromatography system. Computers & Chemical Engineering,

24(2-7):1127–1133, 2000.

[92] S. Ochoa, A. Yoo, J. U. Repke, G. Wozny, and D. R. Yang. Modeling and parameter

identification of the simultaneous saccharification-fermentation process for ethanol

production. Biotechnology progress, 23(6):1454–1462, 2007.

[93] F. O'Sullivan. A statistical perspective on ill-posed inverse problems. Statistical science, pages 502–518, 1986.

[94] K. Palmer and K. L. Tsui. A minimum bias latin hypercube design. IIE Transactions,

33(9):793–808, 2001.

[95] M. Penuela V. Desenvolvimento de processo de hidrolise enzimatica e fermentacao

simultaneas para a producao de etanol a partir de bagaco de cana-de-acucar. PhD

thesis, Universidade Federal de Rio de Janeiro, 2007.

[96] M. Penuela V., J. Nascimento C., M. Bezerra, and N. Pereira Jr. Enzymatic hydrolysis optimization to ethanol production by simultaneous saccharification and fermentation. In Applied Biochemistry and Biotechnology, pages 141–153. Springer, 2007.

[97] G. P. Philippidis and C. Hatzis. Biochemical engineering analysis of critical process

factors in the biomass-to-ethanol technology. Biotechnology progress, 13(3):222–231,

1997.

[98] G. P. Philippidis, T. K. Smith, and C. E. Wyman. Study of the enzymatic hydrolysis

of cellulose for production of fuel ethanol by the simultaneous saccharification and

fermentation process. Biotechnology and Bioengineering, 41(9):846–853, 1993.


[99] G. P. Philippidis, D. D. Spindler, and C. E. Wyman. Mathematical modeling of

cellulose conversion to ethanol by the simultaneous saccharification and fermentation

process. Applied biochemistry and biotechnology, 34(1):543–556, 1992.

[100] W. H. Press. Numerical recipes 3rd edition: The art of scientific computing. Cam-

bridge university press, 2007.

[101] F. Pukelsheim. Optimal design of experiments. Wiley, New York, 1 edition, 1993.

[102] J. Qian, P. Dufour, and M. Nadri. Observer and model predictive control for on-line

parameter identification in nonlinear systems. In IFAC International Symposium

on Dynamics and Control of Process Systems (DYCOPS), pages 571–576, Mumbai,

India, December 2013.

[103] V. Ramadesigan, V. Boovaragavan, M. Arabandi, K. Chen, H. Tsukamoto, R. Braatz,

and V. Subramanian. Parameter estimation and capacity fade analysis of lithium-

ion batteries using first-principles-based efficient reformulated models. ECS Trans-

actions, 19(16):11–19, 2009.

[104] V. Ramadesigan, K. Chen, N. A. Burns, V. Boovaragavan, R. D. Braatz, and

V. R. Subramanian. Parameter estimation and capacity fade analysis of lithium-

ion batteries using reformulated models. Journal of The Electrochemical Society,

158(9):A1048–A1054, 2011.

[105] C. E. Rasmussen and C. K. I. Williams. Gaussian processes for machine learning.

MIT Press, Cambridge, USA, 2006.

[106] J. G. Reid. Structural identifiability in linear time-invariant systems. IEEE Transactions on Automatic Control, 22(2):242–246, 1977.

[107] R. Schenkendorf and M. Mangold. Online model selection approach based on unscented Kalman filtering. Journal of Process Control, 23(1):44–57, 2013.

[108] A. P. Schmidt, M. Bitzer, A. W. Imre, and L. Guzzella. Experiment-driven electro-

chemical modeling and systematic parameterization for a lithium-ion battery cell.

Journal of Power Sources, 195(15):5071–5080, 2010.

[109] J. C. Schoneberger, H. Arellano-Garcia, G. Wozny, S. Korkel, and H. Thielert. Model-based experimental analysis of a fixed-bed reactor for catalytic SO2 oxidation. Industrial & Engineering Chemistry Research, 48(11):5165–5176, 2009.

[110] D. C. Sorensen. Newton’s method with a model trust region modification. SIAM

Journal on Numerical Analysis, 19(2):409–426, 1982.

[111] J. C. Spall. Introduction to stochastic search and optimization: estimation, simulation, and control, volume 65. John Wiley & Sons, 2005.


[112] G. W. Stewart. Collinearity and least squares regression. Statistical Science, pages

68–84, 1987.

[113] J. D. Stigter, D. Vries, and K. J. Keesman. On adaptive optimal input design: a

bioreactor case study. AIChE journal, 52(9):3290–3296, 2006.

[114] P. Stoica and T. L. Marzetta. Parameter estimation problems with singular information matrices. IEEE Transactions on Signal Processing, 49(1):87–90, 2001.

[115] V. R. Subramanian, V. Boovaragavan, and V. D. Diwakar. Toward real-time simu-

lation of physics based lithium-ion battery models. Electrochemical and Solid-State

Letters, 10(11):A255–A260, 2007.

[116] V. R. Subramanian, V. Boovaragavan, V. Ramadesigan, and M. Arabandi. Math-

ematical model reformulation for lithium-ion battery simulations: Galvanostatic

boundary conditions. Journal of The Electrochemical Society, 156(4):A260–A271,

2009.

[117] R. C. Thompson. Principal submatrices IX: Interlacing inequalities for singular

values of submatrices. Linear Algebra and its Applications, 5(1):1–12, 1972.

[118] A.N. Tikhonov and V.Y. Arsenin. Solutions of ill-posed problems. V. H. Winston,

Washington, USA, 1977.

[119] A. Toumi and S. Engell. Optimization-based control of a reactive simulated moving

bed process for glucose isomerization. Chemical Engineering Science, 59(18):3777–

3792, 2004.

[120] S. Vajda, H. Rabitz, E. Walter, and Y. Lecourtier. Qualitative and quantitative

identifiability analysis of nonlinear chemical kinetic models. Chemical Engineering

Communications, 83:191–219, 1989.

[121] V. S. Vassiliadis, E. B. Canto, and J. R. Banga. Second-order sensitivities of general

dynamic systems with application to optimal control problems. Chemical Engineer-

ing Science, 54(17):3851–3860, 1999.

[122] M. Velez-Reyes. Decomposed algorithms for parameter estimation. PhD thesis, Mas-

sachusetts Institute of Technology, 1992.

[123] M. Velez-Reyes and G. C. Verghese. Subset selection in identification, and applica-

tion to speed and parameter estimation for induction machines. In Proceedings of

the 4th IEEE Conference on Control Applications, pages 991–997, Albany, 1995.

[124] E. Walter and L. Pronzato. On the identifiability and distinguishability of nonlinear

parametric models. Mathematics and Computers in Simulation, 42:125–134, 1996.


[125] E. Walter and L. Pronzato. Identification of parametric models. Springer, New York,

USA, 1997.

[126] B. A. Walther and J. L. Moore. The concepts of bias, precision and accuracy, and

their use in testing the performance of species richness estimators, with a literature

review of estimator performance. Ecography, 28(6):815–829, 2005.

[127] S. R. Weijers and P. A. Vanrolleghem. A procedure for selecting best identifiable

parameters in calibrating activated sludge model no. 1 to full-scale plant data. Water

science and technology, 36:69–79, 1997.

[128] A. D. Wilson, J. A. Schultz, A. Ansari, and T. D. Murphey. Real-time trajectory

synthesis for information maximization using sequential action control and least-

squares estimation. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS),

2015.

[129] S. Wu, K. A. P. McLean, T. J. Harris, and K. B. McAuley. Selection of optimal

parameter set using estimability analysis and MSE-based model-selection criterion.

International Journal of Advanced Mechatronic Systems, 3:188–197, 2011.

[130] W. Wu, D. L. Massart, and S. De Jong. The kernel PCA algorithms for wide data.

Part I: Theory and algorithms. Chemometrics and Intelligent Laboratory Systems,

36(2):165–172, 1997.

[131] P. Xu. Truncated SVD methods for discrete linear ill-posed problems. Geophysical

Journal International, 135:505–514, 1998.

[132] N. Yakut, T. Barz, D. C. Lopez Cardenas, H. Arellano-Garcia, and G. Wozny. Online

model-based redesign of experiments for parameter estimation applied to closed-loop

controller tuning. Chemical Engineering Transactions, 32:1195–1200, 2013.

[133] N. Yakut, T. Barz, D.C. Lopez Cardenas, and G. Wozny. Online redesign technique

for closed-loop system identification. In Selected papers of the 11th International

Conference on Chemical and Process Engineering, volume 11 of AIDIC Conference

Series, pages 421–430, Milano, Italy, 2013. AIDIC.

[134] K. Z. Yao, B. M. Shaw, B. Kou, K. B. McAuley, and D. W. Bacon. Modeling

ethylene/butene copolymerization with multi-site catalysts: parameter estimability

and experimental design. Polymer Reaction Engineering, 11:563–588, 2003.

[135] V. M. Zavala, C. D. Laird, and L. T. Biegler. Interior-point decomposition ap-

proaches for parallel solution of large-scale nonlinear parameter estimation problems.

Chemical Engineering Science, 63(19):4834–4845, 2008.


[136] Y. Zhu and B. Huang. Constrained receding-horizon experiment design and parame-

ter estimation in the presence of poor initial conditions. AIChE Journal, 57(10):2808–

2820, 2011.
