Systematic evaluation of ill-posed problems in model-based parameter estimation and
experimental design
submitted by
M.Sc.
Diana Carolina Lopez Cardenas
from Bucaramanga, Colombia
Dissertation approved by Faculty III - Process Sciences
of the Technische Universität Berlin
for the award of the academic degree
Doktor der Ingenieurwissenschaften
- Dr.-Ing. -
Doctoral committee:
Chair: Prof. Dr. Peter Neubauer
1st reviewer: Prof. Dr.-Ing. habil. Prof. h.c. Dr. h.c. Günter Wozny
2nd reviewer: Dr.-Ing. Tilman Barz
3rd reviewer: Prof. Dr. Flavio Manenti
Date of the scientific defense: 16.08.2016
Berlin 2016
Acknowledgements
To my dear adviser Prof. Günter Wozny, who gave me the incredible opportunity to work
for him and to learn what work and life in Germany are like. This decision changed my
life. I am very grateful for the trust and freedom he gave me from the beginning of my
research to develop my own ideas. Thank you, Prof. Wozny, for always having kind words
for me and for your strong support when I needed it.
To my little baby girl Juliana Lucía, the most wonderful motivation to reach for the sky,
and to my Oliver, the loveliest daddy ever, for supporting, teaching, encouraging and
loving me throughout this project.
To my lovely Mom Ana in Colombia, for always sending me her best wishes and prayers
and for taking care of all my affairs in Colombia while I have been abroad. To my lovely
aunts Ceci and Fala, because your prayers and best energy have always been with me. To
my dear aunts, uncles and cousins in Bucaramanga and Enciso: all of you are really close
to my heart. I hope my effort can be an example for the new generation and that all of
you can reach your dreams. My achievements are always dedicated to you.
To my dear Geli and Kalle, for becoming my German Mom and Dad. Their continuous and
loving support gave me the last bit of energy to finish this writing. To my sweet Omi
Gitti and Opi Manny, for including me in their beautiful family and thinking of each
small but wonderful detail to make my life nicer. Thank you for all the love and support!
To my Latin American Girls (Aglys, Caro, Dani and my Janet) who were the best
friends and company during my adaptation to the German culture.
To Tilman who has always guided and wisely advised me. Nobody could have had
a better scientific mentor and friend...thanks for your always helpful, constructive and
honest feedback on my work.
To Victor Zavala, who gave me the opportunity to get to know the other side of the force
during my internship at Argonne National Laboratory in Chicago, USA. The many technical
discussions and sharp ideas that we shared during this period were fundamental to
finishing my research.
To my dear Sandris, for being the best friend at the office, especially on the cold and
dark winter days. To Alejo, for being the best Colombian support at the DBTA. To my kind
colleagues of the old and new buildings, for making it a great place to work and for
being so kind to me since we met.
To all my "team", which is really numerous, counting family, friends and colleagues on
two continents. You have played an important role during this phase of my life, and
indeed throughout my whole life. And to those who gave me the best of their lives when
I was still in Colombia: your mark is indelible.
Last but not least, to my dear Lord. Thank you for giving me my sweet Juliana Lucía,
for letting me dream and for letting me reach my earthly goals. You are and will always
be my company, strength and hope!!!
Berlin, August 2016
Abstract
The lack of informative experimental data, the complexity of first-principles models,
over-parameterization and parameter correlations, among other factors, complicate the
recovery of kinetic, transport, and thermodynamic parameters. These issues are sources
of non-identifiability and consequently of ill-posedness. This research investigates the
features, sources, effects and treatments of ill-posedness in nonlinear parameter
estimation (PE) and optimal experimental design (OED). The connection between
identifiability problems and ill-posed problems is established. There are two main
focuses: the detection of ill-conditioning to diagnose identifiability issues, and the
application and discussion of regularization techniques. This thesis develops and tests
the idea that a detailed analysis of the singular value spectrum of the sensitivity
matrix enables the diagnosis of non-identifiability. Using this approach, the effects of
regularization in PE and OED can be predicted. Monte Carlo studies are carried out to
support the intermediate conclusions obtained from the singular value analysis. Building
on these components, this thesis proposes a computational framework to systematically
evaluate ill-posed problems in PE and OED.
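The diagnostic idea stated above can be illustrated with a minimal sketch. It is not the framework of this thesis: the function name `diagnose_identifiability`, the threshold `rel_tol` and the toy model are assumptions chosen only for illustration. The sketch computes the singular value spectrum of a sensitivity matrix, its condition number and its numerical rank, and returns the parameter directions associated with the negligible singular values.

```python
import numpy as np

def diagnose_identifiability(S, rel_tol=1e-8):
    """Inspect the singular value spectrum of a sensitivity matrix S.

    S has one row per measurement and one column per parameter.
    rel_tol is a hypothetical threshold: singular values below
    rel_tol * (largest singular value) are treated as numerically zero.
    """
    # Thin SVD: S = U @ diag(sv) @ Vt, with sv sorted in descending order.
    _, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Condition number: ratio of the largest to the smallest singular value.
    cond = sv[0] / sv[-1] if sv[-1] > 0 else np.inf
    # Numerical rank: number of singular values above the threshold.
    rank = int(np.sum(sv > rel_tol * sv[0]))
    # Rows of Vt beyond the rank span the poorly determined directions.
    weak_directions = Vt[rank:]
    return sv, cond, rank, weak_directions

# Toy model (an assumption): y(t) = (theta1 + theta2) * t. Only the sum
# theta1 + theta2 affects the output, so both sensitivity columns coincide.
t = np.linspace(0.0, 1.0, 20)
S = np.column_stack([t, t])
sv, cond, rank, weak_directions = diagnose_identifiability(S)
print("singular values:", sv)
print("numerical rank:", rank)
print("weak direction:", weak_directions[0])
```

In this toy case the numerical rank is one although there are two parameters, and the weak direction is proportional to (1, -1): the difference of the two parameters cannot be recovered from the data, whereas their sum can.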
Multiple methods are employed to assess estimator performance, to detect ill-conditioning
and non-identifiability, and to regularize parameter estimations. Techniques such as
singular value analysis, parameter variance-decomposition, orthogonal decompositions and
dynamic sensitivity profiles, among others, are used to investigate ill-conditioning and
non-identifiability. Three regularization techniques are applied: two techniques based on
orthogonal decompositions, namely Subset Selection (SsS) and Truncated Singular Value
Decomposition (TSVD), and Tikhonov regularization. In parameter estimation, two paradigms
to analyze an estimator are described. The first paradigm uses parameter-output
sensitivity information, whereas the second is conducted via Monte Carlo studies. To
illustrate the significance of the computational framework, offline applications in
lithium-ion batteries, bio-ethanol production and bioreactors are examined for several
purposes. One case in chromatographic separation is studied in the context of online
parameter estimation and redesign of experiments.
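Two of the regularization techniques named above, TSVD and Tikhonov regularization, can be sketched for a single linearized least-squares step. This is a hedged illustration rather than the implementation used in this thesis: the function names, the truncation index `k`, the regularization parameter `lam` (entering the penalty as `lam**2`) and the nearly collinear toy problem are assumptions. Subset selection is not shown.

```python
import numpy as np

def tsvd_solve(S, r, k):
    """Truncated SVD solution of min ||S x - r||: keep the k largest singular values."""
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Project the residual onto the k dominant left singular vectors.
    coeffs = (U[:, :k].T @ r) / sv[:k]
    return Vt[:k].T @ coeffs

def tikhonov_solve(S, r, lam):
    """Tikhonov solution of min ||S x - r||^2 + lam^2 ||x||^2 via SVD filter factors."""
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # The factor sv / (sv^2 + lam^2) damps directions with small singular values.
    filt = sv / (sv**2 + lam**2)
    return Vt.T @ (filt * (U.T @ r))

# Toy problem (an assumption): two nearly collinear sensitivity columns.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
S = np.column_stack([t, t + 1e-6 * t**2])
step_true = np.array([1.0, 2.0])
r = S @ step_true + 1e-3 * rng.standard_normal(t.size)

step_full = tsvd_solve(S, r, k=2)      # no truncation: ordinary least squares
step_tsvd = tsvd_solve(S, r, k=1)      # drop the smallest singular value
step_tikh = tikhonov_solve(S, r, lam=1e-3)
print("unregularized:", step_full)
print("TSVD (k=1):   ", step_tsvd)
print("Tikhonov:     ", step_tikh)
```

In this toy setting both regularized solutions recover the well-determined combination (the sum of the two components, close to 3) and leave the individual values undetermined, while the unregularized step amplifies the measurement noise along the direction of the small singular value.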
The application of the framework reveals various deficiencies in the studied cases and
demonstrates how to handle them. In the lithium-ion battery case it is demonstrated that
using voltage discharge curves alone enables the identification of only a small parameter
subset, regardless of the number of experiments considered. In the bio-ethanol production
case it is shown that parameter estimation for over-parameterized models with correlated
parameters can be treated successfully by restricting the estimation to the identifiable
parameters. For the bioreactor systems, optimal design solutions are shown to be
ineffective and/or meaningless because they are obtained from unidentifiable models.
Finally, in the online liquid chromatography case, the instabilities and poor robustness
of the parameter estimation algorithm caused by scarce experimental data at the beginning
of the experiment are treated by means of regularization techniques.
Zusammenfassung
Insufficient experimental data, the complexity of physical models, over-parameterization
and correlations among the parameters, to name only a few, complicate the estimation of
kinetic, thermodynamic and transport parameters. These problems are causes of
non-identifiability and consequently of ill-posedness. This thesis investigates the
effects, causes and remedies of ill-posed nonlinear parameter estimation (PE) and optimal
experimental design (OED). The connection between identifiability problems and ill-posed
problems is discussed. The main focus lies on two points: the detection of
ill-conditioning in order to diagnose identifiability problems, and the application and
discussion of regularization techniques. This thesis develops the idea that a thorough
analysis of the singular value spectrum of the sensitivity matrix enables the diagnosis
of non-identifiability. Using this approach, the effects of regularization in PE and OED
can be predicted. Monte Carlo tests are carried out to verify the predictions of the
singular value analysis. With these methods, this thesis creates a framework for the
systematic evaluation of ill-posed problems in PE and OED.
Various methods are used to determine the quality of the parameter estimation, to detect
ill-conditioning and non-identifiability, and to regularize. Techniques such as singular
value analysis, parameter variance decomposition, orthogonal decomposition, the evaluation
of dynamic sensitivity profiles and others are applied to investigate ill-conditioning
and non-identifiability. Three regularization techniques are used: two techniques based
on orthogonal decomposition (subset selection, SsS, and truncated singular value
decomposition, TSVD) and Tikhonov regularization. For parameter estimation, two paradigms
for its analysis are described. The first paradigm uses sensitivity information, whereas
the second paradigm is based on Monte Carlo studies. To illustrate the applicability and
significance of the methods discussed, offline applications in lithium-ion batteries,
bio-ethanol production and bioreactors are examined. In addition, an application in
chromatographic separation is examined in the context of optimal experimental design and
redesign.
The application of the proposed methodological framework reveals a variety of
deficiencies in the studied cases and discusses strategies to remedy them. In the
lithium-ion battery case it is shown that using discharge data alone enables the reliable
identification of only a small part of the required model parameters, regardless of the
number of experiments performed. In the bio-ethanol production case it is shown that
problems in the parameter estimation of over-parameterized models with correlated
parameters can be remedied by determining fewer, but essential, parameters. The optimal
experimental design solutions for bioreactors turn out to be ineffective because they
were computed for non-identifiable models. Finally, in the chromatography case,
instabilities in the parameter estimation caused by the sparse data at the beginning of
the experiment are remedied by regularization techniques.
Contents
Abstract
Zusammenfassung
List of Figures
List of Tables
Nomenclature
Co-authorship
List of publications used for this thesis
1. Introduction
1.1. Research motivation
1.2. Research scope
1.3. Research outline
2. Theoretical background I: Model-based parameter estimation and experimental design
2.1. Model development
2.2. Model formulation
2.3. Parameter estimation
2.3.1. Mathematical formulation
2.3.2. Parameter-output sensitivity matrix
Sensitivity measure
2.4. Singular value decomposition
2.4.1. SVD of the sensitivity matrix
2.4.2. SVD of the sensitivity matrix vs the eigensystem of Fisher-information matrix
2.5. Parameter estimator analysis
2.5.1. Estimator precision
Covariance matrix
Confidence interval
2.5.2. Estimator accuracy
2.5.3. Reliability tests
Hypothesis test
Confidence interval test
2.6. Identifiability
2.6.1. Qualitative identifiability
2.6.2. Quantitative identifiability
2.7. Optimal experimental design
2.7.1. OED design criteria
2.7.2. Graphical interpretation of OED design criteria
2.7.3. Observation about singular matrices in OED
2.7.4. Sequential OED
2.7.5. Online OED
Finite time horizon schemes
Online mathematical formulation of PE and OED
2.8. Initial guess sampling
2.8.1. Minimum bias Latin hypercube design (MBLHD)
3. Theoretical background II: Ill-posed problems and numerical regularization
3.1. Direct and inverse problems
3.2. Ill-posed problems
3.2.1. Ill-conditioned problems
Ill-conditioning and collinearity measures
Classification of ill-conditioned problems
Relationship between identifiability problems and ill-conditioning
Effect of ill-conditioning on parameter estimation
Effect of ill-conditioning on optimal experimental design
3.2.2. Parameter variance-decomposition
3.3. Numerical regularization for parameter estimation
3.3.1. Parameter subset selection (Reg=SsS)
3.3.2. Truncated singular value decomposition (Reg=TSVD)
3.3.3. Tikhonov regularization (Reg=Tikh)
3.3.4. Regularized parameter covariance matrix
4. Computational framework
4.1. Algorithm
4.2. Analysis paradigms
4.2.1. Sensitivity method
4.2.2. Monte Carlo method
4.3. Estimator performance assessment
4.3.1. Estimator precision
Covariance based on the Sensitivity Matrix
Covariance based on Monte Carlo
Confidence intervals
4.3.2. Estimator accuracy
4.4. Structural Analysis
4.4.1. Ill-Conditioning Analysis
Sensitivity method
Monte Carlo method
4.4.2. Identifiability diagnosis
Variance Method
SVD Method
QR Method
4.5. Regularization
4.5.1. Regularization in parameter estimation
4.5.2. Regularization in optimal experimental design
4.5.3. Selection of the regularization parameter
4.6. Other analysis
4.6.1. Parameter sensitivity analysis
4.6.2. Selection of a parameter initial guess
5. Lithium-ion battery: Finding adequate experimental data
5.1. Abstract
5.2. Li-Ion Battery Modeling
5.3. Results and Discussion
5.3.1. Case 1: Single Discharge Curve
Sensitivity Method
Monte Carlo Method
5.3.2. Case 2: Multiple Discharge Curves
Sensitivity Method
Monte Carlo Method
5.3.3. Case 3: Discharge curves and electrolyte concentration profile
5.3.4. Computational Issues
5.4. Conclusions and Future Work
6. Bioethanol: Identifying an over-parameterized model with large parameter correlations
6.1. Abstract
6.2. Bio-ethanol from cane bagasse by SSF process
6.2.1. Experiments
6.2.2. Modeling
6.3. Results and discussion
6.3.1. Model selection
6.3.2. Parameter initial guess selection
6.3.3. Iterative parameter estimation with structural analysis
6.3.4. Estimator performance assessment
6.3.5. Validation of the identified model
Cross-Validation
Assessment of the identifiable parameter subset
6.3.6. Discussion of the results
6.4. Summary and Conclusions
7. More cases from bioprocessing: the effect of ill-posed parameter estimation on optimal experimental design
7.1. Abstract
7.2. Applications
7.3. Results and Discussion
7.3.1. Computational Issues
7.3.2. Ill-conditioning and identifiability diagnosis
E1 - Fed Batch Fermentation
E2 - Biochemical network
E3 - ASM3
7.3.3. Optimal design without regularization
E1 - Fed Batch Fermentation
E2 - Biochemical network
E3 - ASM3
7.3.4. Optimal design with regularization
Subset Selection (Reg=SsS)
Truncated Singular Value Decomposition (Reg=TSVD)
Tikhonov Regularization (Reg=Tikh)
7.3.5. Influence of the available measurement information on ill-posedness (New)
7.3.6. Monte Carlo study (New)
Study of the initial design
Study of the optimal design without regularization
Study of the optimal design with regularization
7.3.7. Summary and conclusions
8. Chromatography system: Scarce experimental data in online estimation
8.1. Abstract
8.2. HPLC chromatography process
8.2.1. Experiments
Experimental set-up
Operation
8.2.2. Modeling
Manager and pump
Chromatography column
UV sensor
8.3. Results
8.3.1. Assignment of variables
8.3.2. Parameters of the time horizon schemes
8.3.3. Computational Issues
8.3.4. Online Base Case: PE without regularization (New)
8.3.5. Online Regularized Case: PE with regularization (New)
Regularization parameter selection
8.3.6. Online Redesign of Experiments
8.3.7. Validation of the parameter estimates by Frontal Analysis
8.3.8. Optimal input designs vs standard input designs
8.4. Conclusions
9. Summary and Outlook
9.1. Summary
9.2. Outlook
A. Appendix
A.1. Own publications and presentations
A.1.1. Articles in Journals
A.1.2. Oral Presentations and Posters
Oral Presentations
Poster
A.1.3. Proceedings
A.2. Own publications used for the cumulative thesis
A.3. Bio-processes: Parameter variance and variance-decomposition
A.4. Implication of structural properties of the sensitivity matrix on Fisher-information matrix
A.5. Matrices notions
A.5.1. Matrices and vectors
A.5.2. Matrix operations
A.5.3. Some special matrices
A.5.4. Matrix rank
A.5.5. Eigenvalues and eigenvectors
A.5.6. Inverse of a matrix
A.5.7. Matrix decompositions
A.5.8. Eigendecomposition
A.5.9. Singular value decomposition (SVD)
A.5.10. QR-factorization with column pivoting (QRP)
A.5.11. Numerical rank
A.5.12. Condition number
A.5.13. Collinearity index
A.5.14. Covariance Matrix
Properties
A.5.15. Fisher-information Matrix
Bibliography
List of Figures
1.1. Structure of the Thesis
2.1. Iterative work cycle of model-based experimentation for model development
2.2. Influence of alphabetic experimental design criteria on the singular value
spectrum (SVs) of the sensitivity matrix S. (Figure from publication III -
Lopez et al. (2015) - reprinted with permission from Elsevier Science)
2.3. Discretization grids and time horizons used in the online algorithm. (Figure
taken from Barz et al. (2013) [8] - reprinted with permission from AIChE
Journal)
3.1. Graphical representation of a) the direct problem and b) the inverse problem
in nonlinear modeling.
3.2. Singular value spectrum (SVs) of a rank-deficient problem. (Figure taken
from publication III - Lopez et al. (2015) - reprinted with permission from
Elsevier Science)
3.3. Singular value spectrum (SVs) of an ill-determined rank problem. (Figure
taken from publication III - Lopez et al. (2015) - reprinted with permission
from Elsevier Science)
4.1. Consolidated framework for development and experimental validation of
process models with possible ill-posed problems
4.2. Model parameters estimated and analyzed based on the sensitivity method
4.3. Model parameters estimated and analyzed based on the Monte Carlo method
4.4. Ill-conditioning analysis based on the sensitivity method.
4.5. Identifiability diagnosis techniques based on the sensitivity method.
5.1. Li-Ion cell during discharge process. Cell consists of a LixC6 negative
electrode, a LiyMn2O4 positive electrode, and a separator with LiPF6 salt-based
electrolyte. (Figure taken from publication I - Lopez et al. (2016) in Appendix
A.2 - reprinted from Industrial & Engineering Chemistry Research with permission
from American Chemical Society).
5.2. Discharge curves for Case 1 (base rate I1 = 1C) and Case 2 simultaneously
considering fast (I2 = 2C, I3 = 3C, I4 = 4C), and slow rates (I5 = 0.5C
and I6 = 0.1C). Markers are experimental data and solid lines are model
predictions after parameter estimation at estimator θ. (Figure taken from
publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial
& Engineering Chemistry Research with permission from American Chemical
Society).
5.3. Singular value spectra. Left panel is Case 1 (single discharge curve) and
right panel is Case 2 (multiple discharge curves). (Figure taken from publication
I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial &
Engineering Chemistry Research with permission from American Chemical
Society).
5.4. Case 1: Variance decomposition for SVD identifiability method of Section
4.4.2. (Figure taken from publication I - Lopez et al. (2016) in Appendix
A.2 - reprinted from Industrial & Engineering Chemistry Research with
permission from American Chemical Society).
5.5. Sensitivity time profiles of cell voltage with respect to parameters
dVcell(t)/dθ at nominal discharge rate I1 for Case 1. (Figure taken from
publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial
& Engineering Chemistry Research with permission from American Chemical
Society).
5.6. Case 1: Marginal pdfs obtained from Monte Carlo. Solid-black lines and
filled regions represent the normal and the non-parametric distributions of
each estimator, respectively. Parameters with a star are nominated as
identifiable. (Figure taken from publication I - Lopez et al. (2016) in Appendix
A.2 - reprinted from Industrial & Engineering Chemistry Research with
permission from American Chemical Society).
5.7. Sensitivity time profiles of cell voltage with respect to parameters
dVcell(t)/dθ at slow I6, nominal I1, and fast I4 discharge rates for scenario
Case 2-SC6. (Figure taken from publication I - Lopez et al. (2016) in Appendix
A.2 - reprinted from Industrial & Engineering Chemistry Research with
permission from American Chemical Society).
5.8. Case 2: Marginal pdfs for parameters obtained with Monte Carlo analysis
for scenarios Case 2-SC4 (left) and Case 2-SC6 (right). (Figure taken from
publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial
& Engineering Chemistry Research with permission from American Chemical
Society).
5.9. Case 3: Voltage and electrolyte concentration profile at separator for
scenario Case 3-SC1. (Figure taken from publication I - Lopez et al. (2016)
in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research
with permission from American Chemical Society).
5.10. Case 3: Spectrum of singular values under different scenarios. (Figure
taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted
from Industrial & Engineering Chemistry Research with permission from
American Chemical Society).
5.11. Sensitivity time profiles of cell voltage with respect to parameters
dVcell(t)/dθ at nominal I1 and fast I4 discharge rates for scenario Case
3-SC4. (Figure taken from publication I - Lopez et al. (2016) in Appendix
A.2 - reprinted from Industrial & Engineering Chemistry Research with
permission from American Chemical Society). . . . . . . . . . . . . . . . . . 87
5.12. Sensitivity time profiles of electrolyte concentration in the separator with
respect to parameters dce(ℓa+ℓs/2, t)/dθ at nominal I1 and fast I4 discharge
rates for scenario Case 3-SC4. (Figure taken from publication I - Lopez
et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering
Chemistry Research with permission from American Chemical Society). . . 87
5.13. Case 3: Marginal pdfs for scenarios Case 3-SC1 (left) and Case 3-SC4
(right). (Figure taken from publication I - Lopez et al. (2016) in Appendix
A.2 - reprinted from Industrial & Engineering Chemistry Research with
permission from American Chemical Society). . . . . . . . . . . . . . . . . . 89
6.1. Simplified reaction mechanisms in SSF processes [36]. (Figure taken from
publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from
Biotechnology Progress with permission from American Institute of Chem-
ical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.2. Model fitting to experimental data E1 by using (left panel) the original
model proposed in Ref. [36] and (right panel) using the finally selected
model after the model selection step. (Figure taken from publication II
- Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology
Progress with permission from American Institute of Chemical Engineers). . 99
6.3. Fitting of experimental data using the model selected in step 1: (left panel)
results for E2 and (right panel) results for E3. Different initial guesses,
IGDrissen [35] and IGPhilippidis [97], were used for the solution of the
parameter estimation problems where measured data from E2 and E3 were
considered simultaneously. (Figure taken from publication II - Lopez et
al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with
permission from American Institute of Chemical Engineers). . . . . . . . . . 101
6.4. Cost function (CF) of parameter estimation and numerical rank (rϵ) of
the sensitivity matrix obtained for 30 different initial guesses generated
by MBLHD considering data from E2 and E3. The maximum cost function
value at which an initial guess is accepted is denoted “CF Bound”.
(Figure taken from publication II - Lopez et al. (2013) in Appendix A.2
- reprinted from Biotechnology Progress with permission from American
Institute of Chemical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . 102
6.5. Results of parameter estimation with identifiability analysis. (Figure taken
from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from
Biotechnology Progress with permission from American Institute of Chem-
ical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.6. Experimental vs. predicted concentrations using the parameter vector θ
calculated in iteration k = 1, 4, 6 using experimental data of E2 and E3.
Results for E2 (left panel) and results for E3 (right panel). (Figure taken
from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from
Biotechnology Progress with permission from American Institute of Chem-
ical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.7. Ranking of parameters according to sensitivity (the most sensitive parameter
at the top, in position 1). The identifiable parameter subset in every iteration
k is marked by shaded cells. (Figure taken from publication II - Lopez et
al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with
permission from American Institute of Chemical Engineers). . . . . . . . . . 106
6.8. Estimator performance assessment: parameter statistical significance (tj-value)
and 95% confidence intervals (L ≤ θ(rk) ≤ U). Shaded cells indicate
parameters that are statistically significant at the 95% level. (Figure taken from
publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology
Progress with permission from American Institute of Chemical Engineers). . 108
6.9. Cross-validation using parameter vectors obtained from parameter estima-
tion with E2&E3: experimental vs. predicted concentrations for E1 (left
panel), for E4 (middle panel), and for E5 (right panel). (Figure taken
from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from
Biotechnology Progress with permission from American Institute of Chem-
ical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.10. Validation of the identifiable parameter subset using parameter vectors
obtained after solving parameter estimation problems with Nθ = 4 and
Nθ = 14: experimental vs. predicted concentrations for E1 (left panel), for
E4 (middle panel), and for E5 (right panel). (Figure taken from publication
II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology
Progress with permission from American Institute of Chemical Engineers). . 111
7.1. Main procedure and nomenclature of Section 7.3. (Figure taken from publi-
cation III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers
& Chemical Engineering with permission from Elsevier). . . . . . . . . . . . 120
7.2. Singular value spectrum (SVs) of the sensitivity matrix evaluated at the
initial design S(uIG) for problem (a) E1, (b) E2 and (c) E3. Each singular
value below the ϵ-threshold is considered ill-conditioned. (Figure taken from
publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from
Computers & Chemical Engineering with permission from Elsevier). . . . . 123
7.3. Change in the singular value spectrum (SVs) of the sensitivity matrix for the
OED without regularization. Results are shown for the sensitivity matrix at
initial design S(uIG) and at optimal A-design S(uA) and E-design S(uE)
for problem E1. (Figure taken from publication III - Lopez et al. (2015)
in Appendix A.2 - reprinted from Computers & Chemical Engineering with
permission from Elsevier). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
7.4. Change in the singular value spectrum (SVs) of the sensitivity matrix for the
OED without regularization. Results are shown for the sensitivity matrix
at initial design S(uIG) and at optimal design for problem E2 with: (a)
A-criterion S(uA), and (b) E-criterion S(uE). Note that the shown lower
bound ϵ is computed for S(ucrit). (Figure taken from publication III - Lopez
et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical
Engineering with permission from Elsevier). . . . . . . . . . . . . . . . . . . 127
7.5. Singular value spectrum (SVs) of the original and the regularized sensitivity
matrix after applying Reg=SsS, S and SSsS, respectively. Results are shown
for the initial uIG and optimal ucrit experimental designs for problem: (a)
E2 where ucrit = uA, and (b) E3 where ucrit = uE. The solid-black curve
shows the SVs of the original sensitivity matrix without regularization at
initial design S(uIG). The black-cross and gray-cross markers show the SVs
of the regularized (reduced) matrix at initial and optimal designs SSsS(uIG)
and SSsS(ucrit), respectively. Note that the lower bound ϵκ is computed for
SSsS(ucrit). (Figure taken from publication III - Lopez et al. (2015) in
Appendix A.2 - reprinted from Computers & Chemical Engineering with
permission from Elsevier). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.6. Singular value spectrum (SVs) of the original and the regularized sensitivity
matrix after applying Reg=TSVD, S and STSVD, respectively. Results
are shown at initial uIG and optimal ucrit experimental designs, respectively,
for problem: (a) E2 where ucrit = uA, and (b) E3 where ucrit = uE. The
solid-black curve shows the SVs of the original sensitivity matrix without
regularization at initial design S(uIG). The black-cross and gray-cross mark-
ers show the SVs of the regularized (approximated) matrix at the initial and
optimized designs STSVD(uIG) and STSVD(ucrit), respectively. Note that
the lower bound ϵκ is computed for STSVD(ucrit). (Figure taken from publication
III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers
& Chemical Engineering with permission from Elsevier). . . . . . . . . . . . 131
7.7. Singular value spectrum (SVs) of the original and the regularized sensitivity
matrix after applying Reg=Tikh, S and STikh, respectively, for problem
E3 with: (a) λ1 = 0.001 (weak regularization), and (b) λ2 = 0.1 (strong
regularization). The solid-black curve shows the SVs of the original sensitivity
matrix without regularization at the initial design S(uIG). The
black-cross markers show the SVs of the regularized matrix at the initial
design STikh(uIG). (Figure taken from publication III - Lopez et al. (2015)
in Appendix A.2 - reprinted from Computers & Chemical Engineering with
permission from Elsevier). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
7.8. Singular value spectrum (SVs) of the original and the regularized sensitivity
matrix after applying Reg=Tikh, S and STikh, respectively. Results are
shown at initial and optimal experimental design, uIG and ucrit, respectively,
for problem E2 with: (a) ucrit = uA and (b) ucrit = uE. The solid-black
curve shows the SVs of the original sensitivity matrix without regularization
at the initial design S(uIG). The black-cross and gray-cross markers show
the SVs of the regularized matrix at initial design STikh(uIG) and at optimal
design STikh(ucrit), respectively. Note that the lower bound ϵκ is computed
for the SVs of STikh(ucrit). (Figure taken from publication III - Lopez
et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical
Engineering with permission from Elsevier). . . . . . . . . . . . . . . . . . . 134
7.9. Comparison between the singular value spectrum (SVs) of the sensitivity
matrix S at the initial design uIG for example E1 with different experimental
data sets. E1a-c are adaptations of E1 including x1 and x2 as measurable
variables and increasing the number of experimental points to 20, 80 and
160, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
7.10. Comparison between the singular value spectrum (SVs) of the sensitivity
matrix S(ucrit) at the optimal design ucrit with crit=A,D,E for the well-
posed case E1c (no regularization). Spectrum labeled S is evaluated at the
initial design uIG. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
7.11. Monte Carlo problem E2: Box plots of normalized parameter estimates and
corresponding cost function norm obtained at (a) initial design uIG and (b)
E-optimal design uE without regularization. . . . . . . . . . . . . . . . . . 137
7.12. Monte Carlo problem E2: Box plots of normalized parameter estimates and
corresponding cost function norm obtained at regularized A-optimal design
uA with Reg=SsS by solving parameter estimations with (a) Reg=None
(Study I) and (b) Reg=SsS (Study II). . . . . . . . . . . . . . . . . . . . . . 139
8.1. Experimental setup of the chromatography system. (Figure taken from
publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers
& Chemical Engineering with permission from Elsevier) . . . . . . . 146
8.2. Schematic flow sheet of the chromatography system in Fig. 8.1. (Figure
taken from publication IV- Barz et al. (2016) in Appendix A.2 - reprinted
from Computers & Chemical Engineering with permission from Elsevier) . 146
8.3. Units of the process model, input/output variables and unknown model parameters.
(Figure taken from publication IV - Barz et al. (2016) in Appendix
A.2 - reprinted from Computers & Chemical Engineering with permission
from Elsevier) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
8.4. Responses of the manager and pump outlet concentrations (equal to the
column inlet concentrations cini) for arbitrary steps in the feed concentration
cfeedi. Flow is kept constant at 1.5 ml/min. (Figure taken from publication
IV- Barz et al. (2016) in Appendix A.2 - reprinted from Computers &
Chemical Engineering with permission from Elsevier) . . . . . . . . . . . . . 149
8.5. Online Base Case (Reg=None): Model fitting using the parameter estimate
θ at tk = 30 (shown in Table 8.2) for the online parameter estimation without
regularization. The markers show the measured sum outlet concentrations,
whereas the solid line shows the corresponding simulated concentrations. . . . 152
8.6. Online Base Case (Reg=None): Relative Bias (%) of each estimated param-
eter (w.r.t. its corresponding true parameter value in Table 8.2) computed
at each sampling time after solving parameter estimations without regular-
ization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
8.7. Online Base Case (Reg=None): Singular value spectrum (SVsk) of each sen-
sitivity matrix S−k computed at each sampling time tk after solving param-
eter estimations without regularization. The horizontal solid plane labeled
ϵ = 6.7 (see Eq. 3.3 with γmax = 15 and κmax = 1000) is the typical threshold
used to select the ill-conditioned singular values in the ill-conditioning
analysis of the sensitivity method in Section 4.4.1. . . . . . . . . . . . . . . . . . . . 154
8.8. Online Case SsS (Reg=SsS): Relative Bias (%) of the estimated parame-
ter vector θk (w.r.t. the true parameter vector in Table 8.2) computed
at each sampling time tk with k = 1, · · · , Nm after solving parameter es-
timations by using subset selection as regularization for several values of
the regularization parameter ϵ-threshold. Regularization parameters with
the best performance in terms of the accuracy of the estimated parame-
ter vector at the end of the experiment t = 30 min are enclosed in the
red box. “None” refers to parameter estimations without regularization
(Reg=None). Note that ϵ-threshold is always defined by γmax
according to Eq. 3.3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
8.9. Online Case SsS (Reg=SsS): Mean Relative Bias (%) for the whole ex-
periment duration of the Nm estimated parameter vectors θk with k =
1, · · · , Nm (w.r.t. the true parameter vector in Table 8.2) obtained after
solving the Nm parameter estimations by using subset selection as regular-
ization for several values of the regularization parameter ϵ-threshold. Reg-
ularization parameters with the best global performance in terms of the
accuracy of the estimated parameter vectors during the whole experiment
are enclosed in the red box. . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.10. Online Case SsS (Reg=SsS): Singular value spectrum SVstk of the sensitiv-
ity matrix S−k computed at each sampling time tk = 3, 5, 10, 15, 20, 25, 30
after solving parameter estimations by using subset selection as regularization
for several values of the regularization parameter, i.e., ϵ = 0.1, 2, 500.
Large values of ϵ impose a strong regularization of the PE; otherwise, the
regularization is weak. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
8.11. Online Case Tikh (Reg=Tikh) for λ regularization parameter selection:
Relative Bias (%) of the estimated parameter vector θk (w.r.t. the true
parameter vector in Table 8.2) computed at each sampling time tk with
k = 1, · · · , Nm after solving parameter estimations by using Tikhonov
regularization (Reg=Tikh) for several values of the regularization
parameter λ in the course of the online experiment. Regularization parameters
with the best performance in terms of the accuracy of the estimated
parameter vector at the end of the experiment t = 30 min are enclosed
in the red box. “None” refers to parameter estimations without
regularization (Reg=None). Note that λ is here defined by σθ according to
Eq. 8.11. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
8.12. Online Case Tikh (Reg=Tikh) for λ regularization parameter selection:
Mean Relative Bias (%) for the whole experiment duration of the Nm esti-
mated parameter vectors θk with k = 1, · · · , Nm (w.r.t. the true parameter
vector in Table 8.2) obtained after solving Nm parameter estimations by using
Tikhonov regularization for several values of the regularization
parameter λ. Regularization parameters with the best global performance
in terms of the accuracy of the estimated parameter vectors during the
whole experiment are enclosed in the red box. “None” refers to
parameter estimations without regularization (Reg=None). Note that λ is
defined by σθ according to Eq. 8.11. . . . . . . . . . . . . . . . . . . . . . . 160
8.13. Online Case Tikh (Reg=Tikh) for λ regularization parameter selection:
Singular value spectra $SVs^{Tikh}_{t_k}$ and $SVs_{t_k}$ of the regularized and original
sensitivity matrices $S^{-,Tikh}_k$ and $S^{-}_k$ computed at each sampling time
tk = 5, 10, 15, 20, 25, 30 after solving parameter estimations by using
Tikhonov regularization with regularization parameter λ = 50. Note
that singular values ςi ≤ λ are approximated by values around λ. . . . . . . 161
8.14. D-optimal adaptive input design for feeding strategy FS-1. Subfigures a,
b show the measured sum and predicted individual outlet concentrations.
Subfigure c shows the input design, i.e. the inlet concentrations. Subfigure d
shows the results of the identifiability analysis for the parameters θ1, · · · , θ6.
If a parameter was identifiable and selected by the subset selection (SsS)
algorithm, this parameter was active and its activity was indicated by a
dot. If a dot was missing, the parameter was not active and not identifiable.
Subfigure e shows the course of the parameter estimates. (Figure taken
from publication IV- Barz et al. (2016) in Appendix A.2 - reprinted from
Computers & Chemical Engineering with permission from Elsevier) . . . . 162
8.15. D-optimal adaptive input design for feeding strategy FS-2. Subfigures a,
b show the measured sum and predicted individual outlet concentrations.
Subfigure c shows the input design, i.e. the inlet concentrations. Subfigure d
shows the results of the identifiability analysis for the parameters θ1, · · · , θ6.
If a parameter was identifiable and selected by the subset selection (SsS)
algorithm, this parameter was active and its activity was indicated by a
dot. If a dot was missing, the parameter was not active and not identifiable.
Subfigure e shows the course of the parameter estimates. (Figure taken
from publication IV- Barz et al. (2016) in Appendix A.2 - reprinted from
Computers & Chemical Engineering with permission from Elsevier) . . . . 163
8.16. Validation of the parameter estimates for single component adsorption. Ad-
sorption isotherms obtained by FA are shown by calculated equilibrium
points. Predicted adsorption isotherms using the Langmuir model are
shown by lines. Predictions are made using parameter estimates from D-
optimal designs and standard input designs (uniform and pulse). (Figure
taken from publication IV- Barz et al. (2016) in Appendix A.2 - reprinted
from Computers & Chemical Engineering with permission from Elsevier) . . 164
8.17. Input design (subfigure c) and outlet concentrations (subfigure a, b) for
feeding strategy FS-2 and a standard input design generated by a sum of
sinusoids; ’in silico’ experiment. (Figure taken from publication IV- Barz
et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical
Engineering with permission from Elsevier) . . . . . . . . . . . . . . . . . . 165
8.18. Input design (subfigure c) and outlet concentrations (subfigure a, b) for
feeding strategy FS-2 and a standard input design generated by a uniform
sampling; real experiment. (Figure taken from publication IV- Barz et al.
(2016) in Appendix A.2 - reprinted from Computers & Chemical Engineer-
ing with permission from Elsevier) . . . . . . . . . . . . . . . . . . . . . . . 166
A.1. Probability distribution of the estimators Θ1 and Θ2 . . . . . . . . . . . . . 197
List of Tables
3.1. Definition of the specific derivatives used for “SsS”, “TSVD” and “Tikh” regularization;
“Reg” equal to “None” refers to the original parameter
estimation problem. (Table from publication III - Lopez et al. (2015)
- reprinted with permission from Elsevier Science) . . . . . . . . . . . . . . 46
5.1. Variables in Li-Ion Model. (Table taken from publication I - Lopez et al.
(2016) in Appendix A.2 - reprinted from Industrial & Engineering Chem-
istry Research with permission from American Chemical Society). . . . . . 69
5.2. Estimated parameters in Li-Ion Model. . . . . . . . . . . . . . . . . . . . . . 69
5.3. Operating and design variables, constants and fixed parameters in Li-Ion
Model. (Table taken from publication I - Lopez et al. (2016) in Appendix
A.2 - reprinted from Industrial & Engineering Chemistry Research with
permission from American Chemical Society). . . . . . . . . . . . . . . . . . 70
5.4. Governing equations for modified Li-Ion PDAE model. (Table taken from
publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Indus-
trial & Engineering Chemistry Research with permission from American
Chemical Society). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.5. Auxiliary equations of modified Li-Ion PDAE model. (Table taken from
publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Indus-
trial & Engineering Chemistry Research with permission from American
Chemical Society). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.6. Case 1 for Sensitivity method. (Table taken from publication I - Lopez
et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering
Chemistry Research with permission from American Chemical Society). . . 76
5.7. Case 1, 2, and 3: Confidence interval lengths for Sensitivity and Monte
Carlo methods. Lengths are expressed as percentages relative to the true
parameter. (Table taken from publication I - Lopez et al. (2016) in Ap-
pendix A.2 - reprinted from Industrial & Engineering Chemistry Research
with permission from American Chemical Society). . . . . . . . . . . . . . . 77
5.8. Case 1: Summary of results for Monte Carlo. (Table taken from publication
I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial &
Engineering Chemistry Research with permission from American Chemical
Society). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.9. Case 2: Summary of results for Sensitivity and Monte Carlo methods.
(Table taken from publication I - Lopez et al. (2016) in Appendix A.2 -
reprinted from Industrial & Engineering Chemistry Research with permis-
sion from American Chemical Society). . . . . . . . . . . . . . . . . . . . . . 84
5.10. Computational results for simulation and parameter estimation problems.
(Table taken from publication I - Lopez et al. (2016) in Appendix A.2 -
reprinted from Industrial & Engineering Chemistry Research with permis-
sion from American Chemical Society). . . . . . . . . . . . . . . . . . . . . . 89
6.1. Experimental Conditions for Bio-Ethanol Production from Sugarcane
Bagasse in the SSF Process. (Table taken from publication II - Lopez et
al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with
permission from American Institute of Chemical Engineers). . . . . . . . . . 93
6.2. Initial guesses taken from literature and generated by MBLHD. (Table taken
from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from
Biotechnology Progress with permission from American Institute of Chem-
ical Engineers). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.3. Estimated parameter vector using E1, E4, and E5 experimental data after
solution of different PE problems. Column labeled “E2&E3“ contains the
estimated parameters after finishing the iterative parameter estimation with
structural analysis (see Figure 6.5). (Table taken from publication II - Lopez
et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with
permission from American Institute of Chemical Engineers). . . . . . . . . . 113
7.1. Problem description for case studies E1, E2 and E3. (Table taken from
publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from
Computers & Chemical Engineering with permission from Elsevier). . . . . 119
7.2. Thresholds for condition number (κmax) and collinearity index (γmax). (Table
taken from publication III - Lopez et al. (2015) in Appendix A.2 -
reprinted from Computers & Chemical Engineering with permission from
Elsevier). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
8.1. Different feeding strategies (FS) for the chromatography system. Abbrevi-
ations: ethyB = ethyl benzoate, propB = propyl benzoate, butyB = butyl
benzoate. (Table taken from publication IV- Barz et al. (2016) in Appendix
A.2 - reprinted from Computers & Chemical Engineering with permission
from Elsevier) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
8.2. Parameter estimates for the online parameter estimation using different
regularization strategies. ’True values’ refers to the best estimates obtained
from offline estimation without iteration limits. . . . . . . . . . . . . . . . . 153
8.3. Parameter accuracy for experimental data obtained from different input
designs and feeding strategies FS-1 and FS-2. Input designs marked by a
star were realized experimentally; all others are ’in silico’ experiments. (Table
taken from publication IV- Barz et al. (2016) in Appendix A.2 - reprinted
from Computers & Chemical Engineering with permission from Elsevier) . . 165
A.1. Variance-decomposition: E1 - Fed Batch Fermentation. (Table taken from
publication III - Lopez et al. (2015) - reprinted from Computers & Chemical
Engineering with permission from Elsevier). . . . . . . . . . . . . . . . . . . 181
A.2. Variance-decomposition: E2 - Biochemical network. (Table taken from
publication III - Lopez et al. (2015) - reprinted from Computers & Chemical
Engineering with permission from Elsevier). . . . . . . . . . . . . . . . . . . 181
A.3. Variance-decomposition: E3 - ASM3. (Table taken from publication III -
Lopez et al. (2015) - reprinted from Computers & Chemical Engineering
with permission from Elsevier). . . . . . . . . . . . . . . . . . . . . . . . . . 182
Nomenclature
Abbreviations
ANOVA Analysis of variance
DAE Differential and algebraic equation
DoF Degrees of freedom
FIM Fisher-information matrix
IG Initial guess
LHD Latin hypercube design
LSQ Least-squares
MBLHD Minimum bias Latin hypercube design
MSE Mean square error
OED Optimal experimental design
PD Positive definite
PDAE Partial differential and algebraic equation
PE Parameter estimation
PSD Positive semi-definite
Pubs. Scientific publications
QRP QR decomposition with column pivoting
SSF Simultaneous saccharification and fermentation process
SsS Identifiable parameter subset selection
SVD Singular value decomposition
SVs Singular value spectrum
Sym Symmetric matrix
Tikh Tikhonov
TSVD Truncated singular value decomposition
Latin symbols
a Scaling factor
C Covariance matrix (which is related to parameters if having no subscript)
CF Cost function (objective function) of the optimization problem either
PE or OED
f Set of DAEs representing the process model
F Fisher-information matrix
h Set of relations between y and x
H0 Null hypothesis in the hypothesis test
H1 Alternative hypothesis in the hypothesis test
H Hessian matrix
INθ Identity matrix with dimension Nθ × Nθ
J Jacobian matrix
L Operator matrix in Tikhonov regularization
LB Lower confidence limit
Ne Number of experiments
NIG Number of IGs
Nm Number of experimental data sampling times
Nmod Number of available models
Nmu Number of input variable switching times
Nu Number of input variables
Nx Number of dependent state variables
Ny Number of measured response variables
Nθ Number of parameters
PD(Nθ) Subspace of the positive definite matrices with dimension Nθ ×Nθ
PSD(Nθ) Subspace of positive semi-definite matrices with dimension Nθ ×Nθ
Q Orthogonal matrix of QRP
r Numerical rank
R Upper triangular matrix of QRP
S Sensitivity matrix
s Component of the sensitivity matrix
Sv Rectangular diagonal matrix of SVD
Sym(Nθ) Space of symmetric matrices with dimension Nθ ×Nθ
t Independent variable time
T Standardized random variable for the confidence interval
texp Experiment duration
TH0 Test statistic for the hypothesis test
tα/2,(Ny ·Nm·Ne−Nθ) Critical value of the two-tailed Student’s t-distribution for the confidence
level α and DoF equal to (Ny · Nm · Ne − Nθ)
u Input action or experiment design / left singular vector
U Real or complex unitary matrix of SVD
UB Upper confidence limit
Var Variance metric
V Real or complex unitary matrix of SVD
v Right singular vector
x Dependent state variable
y Observed model response variable
Y Total observed model response vector
ym Measured response variable
Ym Total measured response vector
Z Residual vector of parameter estimation
Greek symbols
α Confidence level
β Bias
γ Collinearity index
δ Sensitivity measure
ϵ Singular value threshold / Measurement error
θ Unknown parameter vector
Θ Parameter estimator
κ Condition number
λ(A) Eigenvalue of matrix A
λ Tikhonov regularization parameter
ξ Experiment
π Parameter variance-decomposition proportion
Π Permutation matrix of QRP
ρ Parameter variance threshold / Step size
σ Standard deviation
σ2 Variance
σ2j |ςi Variance component of the j-th parameter associated to ςi
Σ Standard deviation matrix
ς Singular value
υ Step direction
Φ Cost function of the PE
Ψ Cost function of the OED
Additional Indices and Subscripts
IG Related to the initial guess
i, j Vector/Matrix indices
(j) Related to the j-th replication in Monte Carlo
k Iteration index
mach Related to the machine precision
trsh Related to the threshold
u Related to the input variables
x Related to the state variables
y Related to the measured variables
θ Related to the parameters
κ Related to the condition number
γ Related to the collinearity index
0 Related to the initial condition
Additional Superscripts
crit Related to alphabetic criteria after conducting OED, crit = A, D, E
LSQ Related to the weighted nonlinear least-squares
max Maximum value
min Minimum value
None Related to original problem without regularization
(Nθ−rϵ) Related to the Nθ − rϵ unidentifiable parameters
R Related to the predefined parameter vector in Tikhonov
Reg Related to the regularized problem
(rϵ) Related to the rϵ identifiable parameters
SsS Related to the regularized problem with SsS
Tikh Related to the regularized problem with Tikh
TSVD Related to the regularized problem with TSVD
∗ Related to the true value
− Related to the past intervals/time instants
+ Related to the future intervals/time instants
Related to the normalized value
Related to the point estimate or optimal solution
Related to the linear independence reordering if applied to θ or scaling
if applied to S
Related to the first derivative
Special symbols
E Set of experiments
E [·] Sample expectation
N Normal distribution
R Real numbers
C Complex numbers
T Set of time measurement points
T u Set of time switching input action points
∇ Gradient
Co-authorship
The research presented in this thesis was conducted by myself under the supervision of
Prof. Dr.-Ing. habil. Prof. h.c. Dr. h.c. Gunter Wozny of the Chair of Process Dynamics
and Operation at the Technische Universität Berlin. Research in Chapter 5 was published
in Industrial & Engineering Chemistry Research, research in Chapter 6 was published in
Biotechnology Progress, and research in Chapters 7 and 8 was published in Computers &
Chemical Engineering.
The topics in each peer-reviewed article were investigated by myself. I then drafted the
articles, which were revised by Professor Gunter Wozny, who is a co-author. Professor
Victor Zavala contributed to Chapter 5 by editing and reviewing it. Professor Mariana
Penuela contributed the experimental data in Chapter 6. Professor Silvia Ochoa
reviewed the results in Chapter 6, and M.Sc. Adriana Villegas contributed to the model
development in the same chapter. Dr. Tilman Barz collaborated in the model development,
experimental and numerical implementation, and writing of the peer-reviewed articles
presented in Chapters 7 and 8. Dr. Stefan Korkel reviewed the mathematical background
of Chapters 7 and 8. Dr. M. Nicolas Cruz B. and Dr. Sebastian Walter reviewed the
contents of Chapter 8.
List of publications used for this thesis
This thesis is based on the following publications. The complete list of publications and
conference contributions co-authored during my PhD can be found in Appendix A.1.
Publication I:
D. C. Lopez C., G. Wozny, A. Flores-Tlacuahuac, R. Vasquez-Medrano, and V. M. Zavala.
A computational framework for identifiability and ill-conditioning analysis of lithium-ion
battery models. Industrial & Engineering Chemistry Research, 55(11):3026-3042, 2016.
http://pubs.acs.org/doi/abs/10.1021/acs.iecr.5b03910
Publication II:
D. C. Lopez C., T. Barz, M. Penuela, A. Villegas, S. Ochoa, and G. Wozny. Model-
based identifiable parameter determination applied to a simultaneous saccharification
and fermentation process model for bio-ethanol production. Biotechnology Progress,
29(4):1064-1082, 2013. http://onlinelibrary.wiley.com/doi/10.1002/btpr.1753/abstract
Publication III:
D. C. Lopez C., T. Barz, S. Korkel, and G. Wozny. Nonlinear ill-posed problem analysis
in model-based parameter estimation and experimental design. Computers & Chemical
Engineering, 77:24-42, 2015. http://dx.doi.org/10.1016/j.compchemeng.2015.03.002
Publication IV:
T. Barz, D. C. Lopez C., M. N. Cruz B., S. Korkel, and S. F. Walter. Real-
time adaptive input design for the determination of competitive adsorption isotherms
in liquid chromatography. Computers & Chemical Engineering, 94:104-116, 2016.
http://dx.doi.org/10.1016/j.compchemeng.2016.07.009
1. Introduction
1.1. Research motivation
Recovering unknown parameters (e.g., kinetic, transport and thermodynamic parameters)
from available experimental data is a key task in the development of process models. Their
determination involves the numerical solution of the parameter estimation (PE) problem.
This is typically difficult because of challenges arising from the model structure
and from the experimental data. High nonlinearity, poor model fit, over-parameterization
and correlated parameters are challenges related to the model structure. A lack
of informative experimental data, correlated measured outputs and highly corrupted
experimental data are challenges arising from the experiment.
These challenges are well-accepted sources of poor identifiability or even
non-identifiability in linear and nonlinear estimation [3, 10, 41, 120, 124]. When this
happens, the estimation may be unstable and show convergence and numerical problems. In
addition, the recovered parameters may have a large variance and are then not reliable for
further use. Identifiability deficiency can be represented mathematically as an
ill-conditioned problem. A problem is considered ill-conditioned if small errors in the
data produce large errors in the solution. In that case, the problem is said to be
ill-posed due to the presence of ill-conditioning. Over the last decades, identifiability
and ill-conditioning issues have been addressed separately. However, the combined analysis
of both in nonlinear estimation and model-based experimental design has rarely been
performed in theoretical studies, and even less so in real applications.
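The definition of ill-conditioning given above can be illustrated with a minimal numerical sketch. The 2x2 matrix, the data vectors and the helper `solve_2x2` are hypothetical and chosen only for illustration; they are not taken from the case studies of this thesis.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x1 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x2 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x1, x2]

# Nearly collinear columns: the matrix is close to singular (ill-conditioned).
A = [[1.0, 1.0],
     [1.0, 1.0001]]

b_exact = [2.0, 2.0001]   # data consistent with the solution x = (1, 1)
b_noisy = [2.0, 2.0002]   # second entry perturbed by only 1e-4

x_exact = solve_2x2(A, b_exact)
x_noisy = solve_2x2(A, b_noisy)

print("solution from exact data:", x_exact)
print("solution from noisy data:", x_noisy)
```

In this toy problem a perturbation of the data of the order of 1e-4 moves the solution from approximately (1, 1) to approximately (0, 2), which is the behavior referred to as ill-conditioning.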
Discrete ill-posed problems are encountered in many fields and arise from discrete
available data or from the discretization of ill-posed problems. Examples of ill-posed
inverse problems can be found in chemical engineering [5, 4, 8, 79, 78, 80, 83], damage
monitoring [73], image processing [59] and geophysics [48, 49], to name just a few.
Especially in engineering, their sound analysis is not yet common practice, and
difficulties arising from the determination of uncertain parameters are typically not
attributed properly. Regularization techniques for the solution of ill-posed parameter
estimation problems have been developed and have found wide application in numerical
algorithms [13, 52, 63, 66]. However, a critical analysis of the ill-posedness of a studied
problem and of the possible consequences for its solution is frequently not carried out by
the user.
To further improve the precision of parameter estimates, model-based or optimal
experimental design (OED) techniques have been developed with the aim of maximizing the
information content of experimental data [41, 101]. In recent years, approaches to handle
ill-posed problems in optimal experimental design have been proposed [4, 15, 48, 49, 59, 93].
Nonetheless, due to their high complexity and computational burden they are rarely
used. Consequently, the results of common optimal designs that turn out to be ill-posed
might be doubtful. Furthermore, the behavior of the estimator in terms of
variance, bias and convergence in parameter estimations using regularized optimal designs
has not been investigated either. These matters are addressed throughout this thesis.
1.2. Research scope
The focus of this thesis is the study of the features, sources, consequences and treatments
of nonlinear ill-posed problems in the context of parameter estimation and optimal
experimental design of dynamical systems. Throughout this research, unstable estimations
and poor robustness against noise caused by ill-conditioning are the main characteristics
of the studied problems, which suffer from poor identifiability and large parameter
uncertainty.
The backbone of this research is the singular value analysis of the sensitivity matrix,
which exploits the link between ill-conditioning and identifiability. With this in mind, a
consolidated computational framework to systematically evaluate ill-posedness in model-based
parameter estimation and experimental design is proposed. It includes ill-conditioning and
identifiability analysis as well as regularization techniques. In summary, this thesis is
based on two major axes:
1. detection of ill-conditioning to diagnose identifiability issues (topics treated in
Publication I [80], Publication II [79], Publication III [78] and Publication IV)
2. application and analysis of regularization techniques in ill-posed PE and OED
problems (topics treated in Publication II [79], Publication III [78] and Publication IV).
The first major axis includes various techniques: singular value analysis of the
sensitivity matrix to detect ill-conditioning and diagnose local identifiability issues;
parameter variance decomposition to determine the effect of ill-conditioning on parameter
precision; orthogonal decompositions to rank and determine identifiable parameters; and the
study of dynamic sensitivity profiles to assess the excitation provided by different
parameters on different outputs, among other techniques. Monte Carlo studies are carried
out to support the conclusions obtained from the singular value analysis.
The second major axis discusses different numerical approaches to deal with nonlinear
ill-posed problems. Hence, regularization in PE as well as in OED is investigated. Three
regularization techniques are treated, namely parameter subset selection (SsS), truncated
singular value decomposition (TSVD, generalized inverse) and Tikhonov regularization. For
the regularized optimality criteria in OED only the variance contribution is considered
[3, 38, 78]. This approach reduces the computational effort compared to the bi-level
optimization approach proposed in [49, 59, 73]. Moreover, it allows a better
understanding of the effects of ill-posedness and of its numerical stabilization treatments
on the typical OED design criteria. The performance of the mentioned regularization
techniques in PE is assessed by Monte Carlo studies.
The above-mentioned axes are fundamental in model-based parameter estimation and
experimental design, since ill-posedness is a common problem. They can be applied
independently in both optimizations. Although some case studies show how one or both major
axes can be included in either parameter estimation or experimental design, they are
combined in the framework of Chapter 4. This general computational framework is conceived
for the development of process models using model-based experimentation under
ill-posedness. It should be noted that the framework may also be applied to well-posed
problems. In fact, it should be applied to general problems regardless of whether they are
ill-posed.
The framework considers the solution of parameter estimation and optimal experimental
design problems as its foremost computational tasks. In parameter estimation, two paradigms
to analyze an estimator are described. The first paradigm uses parameter-output sensitivity
information, whereas the second is conducted via Monte Carlo studies. Strategies to
mitigate ill-conditioning and improve model identification, such as model structure
modification, experimental data analysis, parameter initial guess selection and numerical
regularization, are illustrated in the case studies. In optimal experimental design, a link
between the singular value analysis and the common alphabetic design criteria is
established, and information for the solution of the ill-posed optimal experimental design
is derived. The implications of performing the typical alphabetic optimal experimental
design under ill-conditioning and identifiability problems are also discussed. This
situation can occur even after the identifiability issues have been detected. Nonetheless,
there is no clear understanding of the behavior of an optimal experimental design arising
from an ill-posed parameter estimation and of its actual impact on subsequent estimations.
Finally, it is also shown how the experimental design performs when the parameter
covariance matrix is approximated by a sensitivity matrix from a regularized estimation.
In order to accomplish the goal of this thesis, several contributions to scientific
journals and conferences, which were produced during this PhD research, are cited. Although
the manuscript is essentially based on four full-text peer-reviewed articles (listed in
Appendix A.2), the other contributions co-authored by the author of this thesis are
mentioned here as further examples and applications. The complete list of contributions to
journals and conferences is given in Appendix A.1. Furthermore, supplementary data,
computations and analyses that have not yet been published are provided throughout the
manuscript. The combination of published and new information is used to establish the
entire computational framework and to present an overall discussion and conclusion, which
is more than just the sum of the individual publications.
1.3. Research outline
The following chapters contain summaries and a selection of results of the publications
on which this thesis is based. The structure of the thesis is displayed in Figure 1.1. Note
that each individual peer-reviewed article in the list of Appendix A.2, which was produced
during this project, has its own methodology and case study serving the purpose of the
specific publication. This thesis, however, is conceived to unify the separate
methodologies of these peer-reviewed articles in a consolidated computational framework
presented in Chapter 4. To do so, specific chapters are required, namely the theoretical
background chapters (Chapters 2 and 3) and the computational framework chapter (Chapter 4).
Chapter 2 summarizes the mathematical formulations, definitions and techniques related to
well-established concepts of parameter estimation and optimal experimental design. Chapter
3 deals with the major concepts regarding ill-posed problems and regularization. The
framework in Chapter 4 incorporates various levels with strategies to select the parameter
initial guess, to assess the estimator performance, to detect ill-conditioning, to analyze
identifiability issues, and to regularize a parameter estimation and an optimal
experimental design. These levels are connected by the singular value analysis of the
output-sensitivity matrix, which provides an explanation of the parameter variability.
Once the computational framework has been unified, the next chapters of this thesis present
six case studies from bioprocess and chemical engineering, which were the subject of the
peer-reviewed articles. Each case study illustrates parts of the computational framework
and their application to overcome typical deficiencies or weak practices in parameter
estimation and optimal experimental design. In Chapter 5 the impact of poorly informative
experimental data on ill-conditioning and identifiability of an energy storage system,
namely a lithium-ion battery, is analyzed. In Chapter 6 the determination of the parameters
of an over-parameterized and parameter-correlated model in the context of bio-ethanol
production is investigated. In Chapter 7 the optimal experimental design for three
ill-posed bio-processes, namely a semi-continuous fermentation, a biochemical growth
reactor and water treatment, is studied. In Chapter 8 the parameters and the experimental
design for a chromatography system are optimized online. Special attention is paid to
handling the deficiency of scarce experimental data in online estimation.
Model selection and reduction as well as parameter initial guess selection are briefly
discussed in Chapter 6. The ill-conditioning analysis is formally applied in Chapters 5, 7
and 8. Practical identifiability diagnosis and estimator performance assessment are shown
in all case studies: energy storage systems in Chapter 5, bio-ethanol production in
Chapter 6, bioreactors for several purposes in Chapter 7 and liquid chromatography
separation in Chapter 8. Regularization in parameter estimation is treated in Chapters
6, 7 and 8. Moreover, the analysis of regularized optimal experimental design along with
some recommendations for selecting regularization parameters is presented in Chapters 7
and 8. Dynamic sensitivity profiles to assess the excitation provided by different
parameters on different outputs are presented in Chapter 5. Monte Carlo analyses are
conducted to support the findings obtained with the sensitivity method in Chapters 5, 7
and 8. In addition, conclusions about the respective case study and the implications of
using the framework are given at the end of each chapter. Finally, in Chapter 9 the work is
summarized and an outlook is given.
Figure 1.1.: Structure of the thesis (Chapter 1: introduction; Chapters 2 and 3: theoretical
background I and II; Chapter 4: computational framework; Chapters 5 to 8: case studies on
the lithium-ion battery (Publication I), bio-ethanol (Publication II), diverse bio-processes
(Publication III plus new information) and the chromatography system (Publication IV plus
new information))
2. Theoretical background I: Model-based parameter estimation and experimental design
This chapter summarizes the mathematical background, basic definitions and the
required notation of model-based parameter estimation and experimental design used in
the computational framework of Chapter 4 to systematically evaluate ill-posedness in
process models. It is important to point out that all information in this chapter applies
to well-posed problems. Ill-posed problems and their implications for model-based
parameter estimation are described in Chapter 3.
To support the matrix concepts mentioned in this chapter, the formal definitions,
propositions, theorems and lemmas on matrix notions given in Appendix A.5 are
referenced. Special attention is paid to the iterative refinement of the experimental design
together with a repeated update of the parameter values. As an extension of this so-called
adaptive design, the idea of an online model-based redesign of experiments is also
explained.
The chapter is organized as follows. First, the basis for developing models through
model-based experimentation is presented. Second, general aspects of the type of models
considered in this thesis are given. Third, the optimization formulation for parameter
estimation is outlined. Fourth, the components to evaluate the performance of an estimator,
i.e., precision and accuracy, are defined. Fifth, the concepts of identifiability are
summarized. Finally, the optimal experimental design, including the sequential (adaptive)
and online approaches, is described.
2.1. Model development
When a process is operated at any scale, whether in the laboratory, pilot plant or
industry, the advantages of modeling it are substantial. Benefits such as new process,
product or configuration designs, increased profit, higher product quality, energy savings,
cost and loss reduction, mitigation of environmental damage and yield maximization, among
others, result from using models in process and product design, control, operation and
optimization. In the literature there exist several approaches to model a (dynamic)
process, among them black-box models (e.g., neural networks, genetic algorithms, etc.),
first-principles models and hybrid models (which combine first-principles and empirical
models). The challenge is to figure out how to develop a suitable model (in structure and
quality of parameters) for the specific process. To do so, experimentation can be used
either to select an adequate model structure or to guarantee low uncertainty in the model
parameters, instead of directly enhancing the process itself. Figure 2.1 depicts the
so-called model-based experimentation for model development.
Figure 2.1.: Iterative work cycle of model-based experimentation for model development
(stages: Modeling; Experimentation; Parameter estimation, min. model error; Optimal
experimental design, max. discrimination criterion or min. parameter variance)
The work cycle in this figure comprises the execution of experiments to collect
measurements (Experimentation stage), the mathematical formulation of models to compute
the process states (Modeling stage), the estimation of parameters (Parameter estimation
stage) and, as required, the subsequent design of new experiments (Optimal experimental
design stage). This strategy is iterative and integrates experimental techniques,
mathematical modeling, model identification and optimal experimental design. The model
parameters are estimated by minimizing the error between the model and the available
experimental data. This estimation together with the model structure selection constitutes
the model identification task.
In Figure 2.1 two model-based experimentation work cycles are displayed. The inner
work cycle (dashed-line circle) deals with the selection of the model structure among
several model candidates, whereas the outer work cycle (solid-line circle) addresses the
improvement of the parameter precision for a fixed model structure. In order to pass from
the inner to the outer level, the best model structure has to be selected. Once this
structure is fixed, a reliable estimation of the parameters from the available data is
carried out. In both levels parameter estimation and optimal experimental design are
required. However, the optimal experimental design (which is an optimization problem) has a
different cost function (objective function) in each level. In this optimization the
process model structure is a constraint and the degrees of freedom are the experimental
conditions. For model structure selection (inner work cycle in Figure 2.1) the optimal
design is required to maximize the difference between the predictions of the model
candidates. This task is known as model discrimination. For improving the parameter
precision, in contrast, the experimental design aims to reduce a parameter variance metric
(e.g., the average parameter variance or the largest parameter variance) of a single model.
Each level terminates when some stopping criterion, for instance on predictive quality,
model fit or parameter variance, is met. Note that in the model structure selection level
the number of parameter vectors to be estimated depends on the number of model candidates,
whereas in the level for improving parameter precision there is only one.
One of the interests of this thesis is to improve the understanding of how the outer
cycle in Figure 2.1 works when applied to nonlinear first-principles models, additionally
considering the diagnosis and treatment of ill-conditioning and identifiability issues. The
basics of optimal experimental design are given in Section 2.7. Further explanations of
the outer cycle in Figure 2.1 are summarized in the adaptive or sequential model-based
design of experiments of Section 2.7.4. Finally, the extension of the outer cycle to
real-time applications is explained in Section 2.7.5.
2.2. Model formulation
Dynamic processes are frequently described by mechanistic models, which may be formulated
using systems of ordinary differential equations (ODEs), differential and algebraic
equations (DAEs) or partial differential and algebraic equations (PDAEs). A PDAE system
is usually discretized in space (for instance, in radial and axial coordinates) to obtain
a set of DAEs.
The mechanistic model is represented here by a system of DAEs of the general implicit
form in Eq. 2.1. This representation covers ODE models, when there is no algebraic
part in the formulation, as well as PDAE models after discretization:
$$0 = f(t, x, \dot{x}, u, \theta) \tag{2.1a}$$
$$x(t_0) = x_0(u_0, \theta), \quad \dot{x}(t_0) = \dot{x}_0(u_0, \theta) \tag{2.1b}$$
where $x, \dot{x} \in \mathbb{R}^{N_x}$ are vectors containing the state variables and their
first derivatives, respectively, $t \in \mathbb{R}$ is the independent variable (here $t$ is
time), $u \in \mathbb{R}^{N_u}$ contains the time-varying input actions (i.e., experiment
design vector or controls), and $\theta \in \Omega \subset \mathbb{R}^{N_\theta}$ is the
unknown parameter vector. Note that $\Omega$ is known as the parameter space and that $x$ is
considered to have sub-vectors, denoted $x_d$ and $x_a$, which are its differential and
algebraic parts, respectively. The mapping
$f(\cdot): \mathbb{R}^{N_x} \times \mathbb{R}^{N_u} \times \mathbb{R}^{N_\theta} \to \mathbb{R}^{N_x}$
is assumed to be twice continuously differentiable. The initial values $x_0, \dot{x}_0$ of
the DAE system (Eq. 2.1b) are both initialized to satisfy
$0 = f(t_0, x_0, \dot{x}_0, u_0, \theta)$ for fixed $u_0 = u(t_0)$ and $\theta$. The input
actions $u$ can be represented by piecewise polynomial approximations, e.g., constant,
linear or quadratic functions. Here, a piecewise constant approximation is considered. In
addition to the system in Eq. 2.1, consider the vector of the observed model responses
$y \in \mathbb{R}^{N_y}$ given by the mapping $h(\cdot): \mathbb{R}^{N_x} \to \mathbb{R}^{N_y}$,
$$y(t, u, \theta) = h(x(t, u, \theta)), \tag{2.2}$$
which is also assumed to be twice continuously differentiable. The observation mapping
$h(\cdot)$ relates the measured response variables $y(t, u, \theta)$ and the state variables
$x(t, u, \theta)$. Frequently, $h(\cdot)$ is just a function that selects those state
variables which are actually measured. The measured variables are collected in the vector
$y^m \in \mathbb{R}^{N_y}$. The $i$-th entries of the vectors $y$ and $y^m$ are denoted by
$y^{(i)}$ and $y^{m(i)}$, respectively.
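The formulation in Eqs. 2.1 and 2.2 can be made concrete with a minimal simulation sketch. It assumes a hypothetical scalar explicit ODE (a special case of the implicit DAE form), a piecewise constant input and an observation mapping that returns the state itself; the model, the function names and the classical Runge-Kutta integrator are illustrative assumptions and not the models or solvers used in this thesis.

```python
import math

def input_action(t, switch_times, levels):
    """Piecewise-constant input u(t): levels[k] is applied from switch_times[k] on."""
    k = 0
    while k + 1 < len(switch_times) and t >= switch_times[k + 1]:
        k += 1
    return levels[k]

def rhs(x, u, theta):
    """Toy explicit ODE dx/dt = -theta[0]*x + theta[1]*u (hypothetical model)."""
    return -theta[0] * x + theta[1] * u

def h(x):
    """Observation mapping of Eq. 2.2; here the state itself is measured."""
    return x

def rk4_step(x, u, theta, dt):
    """One classical Runge-Kutta step with the input held constant."""
    k1 = rhs(x, u, theta)
    k2 = rhs(x + 0.5 * dt * k1, u, theta)
    k3 = rhs(x + 0.5 * dt * k2, u, theta)
    k4 = rhs(x + dt * k3, u, theta)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def simulate(theta, x0, switch_times, levels, t_meas, n_sub=50):
    """Model responses y(t_k, u, theta) at the measurement times t_meas."""
    # Integrate between consecutive switching/measurement times so that the
    # input is constant within every integration interval.
    grid = sorted(set(switch_times) | set(t_meas))
    t, x = grid[0], x0
    y_at = {t: h(x)}
    for t_next in grid[1:]:
        u = input_action(t, switch_times, levels)
        dt = (t_next - t) / n_sub
        for _ in range(n_sub):
            x = rk4_step(x, u, theta, dt)
        t = t_next
        y_at[t] = h(x)
    return [y_at[tk] for tk in t_meas]

theta = (0.5, 1.0)            # hypothetical parameter vector
switch_times = [0.0, 4.0]     # input switching times
levels = [1.0, 0.0]           # input levels on the two intervals
t_meas = [1.0, 2.0, 4.0, 6.0] # measurement times
y = simulate(theta, 0.0, switch_times, levels, t_meas)
print(y)
```

The returned list corresponds to the model responses of one experiment in the ordering used below for the stacked response vector.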
Since in realistic experiments the response function in Eq. 2.2 is observed at discrete
times, a set of measurement time points $\mathcal{T} = \{t_1, \ldots, t_{N_m}\}$ is
established. Moreover, a set of switching time points of the input actions
$\mathcal{T}^u = \{t^u_1, \ldots, t^u_{N_{mu}}\}$ and the set of experiments
$\mathcal{E} := \{\xi_1, \ldots, \xi_{N_e}\}$ are considered. Each experiment
$\xi_j \in \mathcal{E}$, $j = 1, \ldots, N_e$, has a corresponding input action vector
$u_{\xi_j} \in \mathbb{R}^{N_u \cdot N_{mu}}$ and observed model responses
$y_{\xi_j}(t_k, u_{\xi_j}, \theta) \in \mathbb{R}^{N_y}$, $t_k \in \mathcal{T}$,
$\xi_j \in \mathcal{E}$. The inputs are collected in the vector
$u := (u_{\xi_1}, \ldots, u_{\xi_{N_e}}) \in \mathbb{R}^{N_u \cdot N_{mu} \cdot N_e}$ (for the
sake of simplicity, the same symbol $u$ as for the finite-dimensional input variables in
Eq. 2.1 is used here) and the model responses are collected in the vector
$$Y(u, \theta) := \left(y_{\xi_1}(t_1, u_{\xi_1}, \theta), \ldots, y_{\xi_1}(t_{N_m}, u_{\xi_1}, \theta), \ldots, y_{\xi_{N_e}}(t_1, u_{\xi_{N_e}}, \theta), \ldots, y_{\xi_{N_e}}(t_{N_m}, u_{\xi_{N_e}}, \theta)\right), \tag{2.3}$$
where $Y(u, \theta) \in \mathbb{R}^{N_y \cdot N_m \cdot N_e}$. The corresponding vector of
observed (measured) responses is $Y^m \in \mathbb{R}^{N_y \cdot N_m \cdot N_e}$. It is also
reasonably assumed that each measurement $y^{m(i)}_{\xi_j}(t_k)$ is affected by a random
error $\epsilon_i$ such that
$$y^{m(i)}_{\xi_j}(t_k) = y^{(i)}_{\xi_j}(t_k, u_{\xi_j}, \theta^*) + \epsilon_i, \quad i = 1, \ldots, N_y, \; t_k \in \mathcal{T}, \; \xi_j \in \mathcal{E}, \tag{2.4}$$
where $y^{(i)}_{\xi_j}(t_k, u_{\xi_j}, \theta^*)$ is the $i$-th true underlying output at the
true parameter vector $\theta^* \in \mathbb{R}^{N_\theta}$, $t_k \in \mathcal{T}$ and
$\xi_j \in \mathcal{E}$, which is corrupted by the measurement error $\epsilon_i$. The errors
$\epsilon_i$ for $i = 1, \ldots, N_y$ are assumed to be realizations of random variables
$\mathrm{E}_i$ which satisfy the following assumptions:
• the errors $\mathrm{E}_i$ are normally distributed with zero mean
($\mathbb{E}[\mathrm{E}_i] = 0$) and finite known variance
($\mathrm{Var}[\mathrm{E}_i] = (\sigma_{y,i})^2 < \infty$), i.e.,
$\mathrm{E}_i \sim \mathcal{N}(0, \sigma^2_{y,i})$,
• the errors $\mathrm{E}_i, \mathrm{E}_j$ are independent, i.e.,
$\mathrm{Cov}(\mathrm{E}_i, \mathrm{E}_j) = 0$ whenever $i \neq j$, and are also
identically distributed.
According to Eq. 2.4 the observations $y^{m(i)}_{\xi_j}(t_k)$ are also normally distributed
random variables. Hence, the total vector of measured responses $Y^m$ has a mean equal
to the model output at $\theta^*$, i.e., $\mathbb{E}[Y^m] = Y(u, \theta^*)$, and a covariance
matrix that is constant in time, $\mathrm{Var}[Y^m] = C_y$, i.e.,
$Y^m \sim \mathcal{N}(Y(u, \theta^*), C_y)$. Under these assumptions the measurement
covariance matrix $C_y$ is diagonal with entries given by the variances $\sigma^2_{y,i}$.
The measurement standard deviation matrix is defined as
$\Sigma_y := C_y^{1/2} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_y \cdot N_m \cdot N_e}$.
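The measurement model of Eq. 2.4 and the structure of the diagonal covariance matrix can be sketched as follows. The standard deviations, the dimensions and the placeholder for the true model output are hypothetical values chosen for illustration; only the diagonals of the covariance and standard deviation matrices are stored.

```python
import random

def covariance_diagonal(sigma_y, n_m, n_e):
    """Diagonal of C_y, ordered as Y in Eq. 2.3 (outputs, then times, then experiments)."""
    return [s ** 2 for _ in range(n_e) for _ in range(n_m) for s in sigma_y]

def measure(y_true, c_y_diag, rng):
    """One realization of Y^m = Y(u, theta*) + error with independent normal errors."""
    return [y + rng.gauss(0.0, v ** 0.5) for y, v in zip(y_true, c_y_diag)]

rng = random.Random(0)                      # fixed seed for reproducibility
sigma_y = [0.1, 0.5]                        # hypothetical sigma_{y,i} for N_y = 2 outputs
n_m, n_e = 3, 2                             # N_m time points, N_e experiments
c_y_diag = covariance_diagonal(sigma_y, n_m, n_e)
sigma_diag = [v ** 0.5 for v in c_y_diag]   # diagonal of Sigma_y = C_y^(1/2)

y_true = [1.0] * len(c_y_diag)              # placeholder for Y(u, theta*)
y_meas = measure(y_true, c_y_diag, rng)

# Monte Carlo check of E[Y^m] = Y(u, theta*) for the first entry of Y^m.
n_rep = 20000
mean_first = sum(measure(y_true, c_y_diag, rng)[0] for _ in range(n_rep)) / n_rep
print(len(c_y_diag), mean_first)
```

The sample mean of the repeated measurements approaches the noise-free output, in line with the stated assumption of zero-mean errors.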
2.3. Parameter estimation
Parameter estimation is one of the major areas of statistical inference whose methods use
sample information for drawing conclusions about a population. In the inverse problem
of recovering the unknown parameters θ from finite experimental data Y m of the model
described in Eq. 2.1, the sample data is utilized to calculate a reasonable single value of
each parameter close to its true value in θ∗.
2.3.1. Mathematical formulation
A parameter estimation is an optimization problem of the form
$$\hat{\theta} := \arg\min_{\theta} \Phi^{LSQ}(u, \theta) \tag{2.5a}$$
$$\Phi^{LSQ}(u, \theta) := \frac{1}{2} (Y(u, \theta) - Y^m)^T C_y^{-1} (Y(u, \theta) - Y^m), \tag{2.5b}$$
where $\Phi^{LSQ}$ denotes the cost function (objective function) of the weighted nonlinear
least-squares criterion, the weighting matrix
$C_y^{-1} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_y \cdot N_m \cdot N_e}$ is the inverse
of the experimental error covariance matrix $C_y$ of the measured data, and $\hat{\theta}$ is
a solution of the problem in Eq. 2.5, called the point estimate of a population parameter
$\theta$ [89]. The estimator of $\theta$ is denoted by $\hat{\Theta}$. The assumptions in
Section 2.2 of unbiased, independent, identically and normally distributed errors guarantee
that the estimator $\hat{\Theta}$ and hence the particular vector (or point estimate)
$\hat{\theta}$ of Eq. 2.5 are equivalent to the solution of the maximum likelihood problem
[3]. This means that minimizing the sum of squared errors in Eq. 2.5 is equivalent to
maximizing the likelihood of obtaining the available observed responses for a given
parameter value. An estimator obtained from maximum likelihood has good statistical
properties, i.e., it is unbiased, has an approximately normal distribution and has the
smallest variance [89]. When the estimation is well-posed, $\hat{\Theta}$ is the
asymptotically unbiased maximum likelihood estimator.
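The weighted least-squares criterion of Eq. 2.5b can be evaluated as in the following minimal sketch for a diagonal covariance matrix. The one-parameter linear model, the measurements and the variances are made-up values; the closed-form minimizer is available here only because the toy model is linear in the parameter.

```python
def phi_lsq(y_model, y_meas, variances):
    """Weighted least-squares cost of Eq. 2.5b for a diagonal C_y."""
    return 0.5 * sum((ym - yd) ** 2 / v
                     for ym, yd, v in zip(y_model, y_meas, variances))

def model(theta, times):
    """Hypothetical one-parameter model y = theta * t (linear in theta)."""
    return [theta * t for t in times]

times = [1.0, 2.0, 3.0]
y_meas = [1.1, 1.9, 3.2]          # made-up measurements
variances = [0.01, 0.01, 0.04]    # made-up measurement variances sigma_{y,i}^2

# For a model that is linear in theta, the minimizer of Eq. 2.5 has a closed form
# (weighted normal equation).
num = sum(t * y / v for t, y, v in zip(times, y_meas, variances))
den = sum(t * t / v for t, v in zip(times, variances))
theta_hat = num / den

cost_hat = phi_lsq(model(theta_hat, times), y_meas, variances)
print(theta_hat, cost_hat)
```

Measurements with a larger variance receive a smaller weight, so the third data point influences the estimate less than its residual alone would suggest.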
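For nonlinear parameter estimation, Eq. 2.5 is solved iteratively [3] using
$$\theta_{k+1} = \theta_k + \rho_k \upsilon_k, \tag{2.6}$$
where $\rho_k$ is the step size and $\upsilon_k$ is the step direction. Steepest-descent and
Newton-type algorithms (e.g., Gauss-Newton, Levenberg-Marquardt [82] and trust-region
methods [110]) are typical gradient-based solution methods. In those approaches, the
Jacobian matrix $J_\theta(u, \theta) \in \mathbb{R}^{N_\theta}$
$$J_\theta(u, \theta) = \nabla_\theta \Phi^{LSQ} \tag{2.7a}$$
$$\phantom{J_\theta(u, \theta)} = \nabla_\theta Y(u, \theta)^T C_y^{-1} (Y(u, \theta) - Y^m), \tag{2.7b}$$
and the Hessian matrix $H_\theta(u, \theta) \in \mathbb{R}^{N_\theta \times N_\theta}$
$$H_\theta(u, \theta) = \nabla^2_\theta \Phi^{LSQ} \tag{2.8a}$$
$$\phantom{H_\theta(u, \theta)} = \nabla_\theta Y(u, \theta)^T C_y^{-1} \nabla_\theta Y(u, \theta) + \nabla^2_\theta Y(u, \theta)^T C_y^{-1} (Y(u, \theta) - Y^m) \tag{2.8b}$$
of $\Phi^{LSQ}$ in Eq. 2.5 are required to compute the step direction $\upsilon_k$. For
instance, in Newton's method $\upsilon_k = -R_k q_k$, where $R_k = H_\theta^{-1}$ and
$q_k = J_\theta$ [3], so that with the update in Eq. 2.6 the step is a descent direction
whenever $H_\theta$ is positive definite.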
As seen in Eqs. 2.7b and 2.8b, the Jacobian matrix $J_\theta(u, \theta)$ and the Hessian
matrix $H_\theta(u, \theta)$ are based on the gradient of the predicted outputs with respect
to the parameters, $\nabla_\theta Y(u, \theta)$, which is the well-known parameter-output
sensitivity matrix $S(u, \theta) \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_\theta}$ (see
Section 2.3.2 for details of its calculation),
$$S(u, \theta) := \nabla_\theta Y(u, \theta), \tag{2.9}$$
and on the scaled residual vector
$Z = \Sigma_y^{-1} (Y(u, \theta) - Y^m) \in \mathbb{R}^{N_y \cdot N_m \cdot N_e}$.
The Hessian $H_\theta(u, \theta)$ in Eq. 2.8b can be further simplified with the Gauss
approximation (i.e., neglecting the second term in Eq. 2.8b) to
$$H_\theta(u, \theta) \approx \nabla_\theta Y(u, \theta)^T C_y^{-1} \nabla_\theta Y(u, \theta) \tag{2.10}$$
and, using the definition of $S(u, \theta)$, it becomes
$$H_\theta(u, \theta) \approx S^T C_y^{-1} S = (\Sigma_y^{-1} S)^T (\Sigma_y^{-1} S) = \tilde{S}^T \tilde{S}. \tag{2.11}$$
Note that in Eq. 2.11 the sensitivity matrix is taken as
$$\tilde{S} = \Sigma_y^{-1} S, \tag{2.12}$$
which, for the sake of simplicity, will be called the scaled form of the sensitivity matrix
(in the case of measurements with the same measurement variance, i.e.,
$\sigma_{y,i} = \sigma_y$ for $i = 1, \ldots, N_y$, the matrix $S$ is simply scaled by
$\sigma_y^{-1}$, i.e., $\tilde{S} = \sigma_y^{-1} S$).
The simplified Hessian in Eq. 2.11 is identical to the Fisher information matrix
$$F(u, \theta) = S^T C_y^{-1} S, \tag{2.13}$$
where $F$ is the $N_\theta \times N_\theta$ dispersion matrix, which belongs to the set of
positive semi-definite matrices $PSD(N_\theta)$ (see Section A.5.15 of Appendix A.5).
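The iteration of Eq. 2.6 with the Gauss approximation of the Hessian (Eqs. 2.10 to 2.13) can be sketched for a small example. The two-parameter exponential model, the noise-free synthetic data, the common standard deviation and the full step size are assumptions made for illustration; this is a plain Gauss-Newton sketch without any safeguard and not the solver used in this thesis.

```python
import math

def model(theta, times):
    """Hypothetical model y = theta1 * exp(-theta2 * t)."""
    return [theta[0] * math.exp(-theta[1] * t) for t in times]

def sensitivity(theta, times):
    """Analytic S = dY/dtheta; one row per measurement, one column per parameter."""
    return [[math.exp(-theta[1] * t), -theta[0] * t * math.exp(-theta[1] * t)]
            for t in times]

def fisher_and_gradient(theta, times, y_meas, sigma):
    """Return F = S~^T S~ (Eqs. 2.11, 2.13) and the gradient S~^T Z (Eq. 2.7b)."""
    y = model(theta, times)
    z = [(yi - ymi) / sigma for yi, ymi in zip(y, y_meas)]        # scaled residual Z
    s_t = [[sij / sigma for sij in row]                           # scaled S, Eq. 2.12
           for row in sensitivity(theta, times)]
    f = [[sum(r[a] * r[b] for r in s_t) for b in range(2)] for a in range(2)]
    g = [sum(r[a] * zi for r, zi in zip(s_t, z)) for a in range(2)]
    return f, g

def gauss_newton(theta0, times, y_meas, sigma, n_iter=20):
    """Gauss-Newton iteration theta_{k+1} = theta_k + rho_k * v_k with rho_k = 1."""
    theta = list(theta0)
    for _ in range(n_iter):
        f, g = fisher_and_gradient(theta, times, y_meas, sigma)
        det = f[0][0] * f[1][1] - f[0][1] * f[1][0]
        # Step direction v_k = -F^{-1} g, using the explicit inverse of a 2x2 matrix.
        step = [-(f[1][1] * g[0] - f[0][1] * g[1]) / det,
                -(-f[1][0] * g[0] + f[0][0] * g[1]) / det]
        theta = [theta[0] + step[0], theta[1] + step[1]]
    return theta

times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
theta_true = [2.0, 0.5]
y_meas = model(theta_true, times)          # noise-free synthetic data
theta_hat = gauss_newton([1.5, 0.3], times, y_meas, sigma=0.1)
fisher, _ = fisher_and_gradient(theta_hat, times, y_meas, sigma=0.1)
print(theta_hat, fisher)
```

Because the synthetic data are noise-free, the residual vanishes at the solution and the neglected second term of Eq. 2.8b is zero there, which is the situation in which the Gauss approximation is most accurate.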
2.3.2. Parameter-output sensitivity matrix
The parameter-output sensitivity matrix $S$ defined in Eq. 2.9 contains the local
first-order derivatives (sensitivities) of the model responses collected in $Y(u, \theta)$
with respect to the parameters collected in $\theta$. These derivatives show how strongly
perturbations of the estimated parameter vector $\hat{\theta}$ affect the predicted outputs
of the model $Y(u, \theta)$. The $j$-th column of the sensitivity matrix $S$ contains the
dynamic sensitivities of each measured response variable $y_i \in y$ with respect to the
parameter $\theta_j$, i.e., $\left.\frac{\partial y_i}{\partial \theta_j}\right|_t$ for
$t \in \mathcal{T}$.
The first-order derivative information in S may be computed by finite differences, by integrating the original model (Eq. 2.1) along with the so-called sensitivity equations [7, 81, 121], or by automatic differentiation. When finite differences are used, a differentiation scheme such as central finite differences,
$$\nabla_\theta Y(u,\theta) = \frac{\partial Y(u,\theta)}{\partial \theta} \approx \frac{Y(u,\theta + \Delta\theta) - Y(u,\theta - \Delta\theta)}{2\Delta\theta}, \tag{2.14}$$
can be applied, where $\Delta\theta = \theta\sqrt{U}$ is the parameter perturbation and U is the machine unit roundoff.
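The central-difference scheme of Eq. 2.14 can be sketched as follows. This is an illustrative example only: the exponential model, the function names and the parameter values are hypothetical, and the double-precision machine epsilon is used in place of the unit roundoff U. Each parameter is perturbed separately so that one column of S is obtained per parameter; a parameter equal to zero would need a different step rule.

```python
import numpy as np

def model_outputs(theta, t):
    """Hypothetical model response y(t) = theta[0] * exp(-theta[1] * t)."""
    return theta[0] * np.exp(-theta[1] * t)

def central_difference_sensitivities(theta, t):
    """Approximate S = dY/dtheta column by column with Eq. 2.14."""
    theta = np.asarray(theta, dtype=float)
    U = np.finfo(float).eps                 # machine epsilon, used here as roundoff U
    S = np.empty((t.size, theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = theta[j] * np.sqrt(U)     # perturbation of parameter j only
        y_plus = model_outputs(theta + step, t)
        y_minus = model_outputs(theta - step, t)
        S[:, j] = (y_plus - y_minus) / (2.0 * step[j])
    return S

t = np.linspace(0.0, 2.0, 5)
theta = np.array([2.0, 0.5])
S_fd = central_difference_sensitivities(theta, t)

# Analytical sensitivities of the toy model, for comparison.
S_exact = np.column_stack([np.exp(-theta[1] * t),
                           -theta[0] * t * np.exp(-theta[1] * t)])
```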
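When the sensitivity equations are employed, the so-called forward sensitivity analysis is performed. This analysis formulates the (forward) sensitivity equations
$$0 = \underbrace{\frac{\partial f}{\partial \dot{x}}}_{J_{\dot{x}}}\,\underbrace{\frac{\partial \dot{x}}{\partial \theta}}_{\dot{S}_x} + \underbrace{\frac{\partial f}{\partial x}}_{J_x}\,\underbrace{\frac{\partial x}{\partial \theta}}_{S_x} + \underbrace{\frac{\partial f}{\partial \theta}}_{J_\theta} \tag{2.15a}$$
$$\left.\frac{\partial x}{\partial \theta}\right|_{t_0} = \frac{\partial x_0}{\partial \theta}, \qquad \left.\frac{\partial \dot{x}}{\partial \theta}\right|_{t_0} = \frac{\partial \dot{x}_0}{\partial \theta}, \tag{2.15b}$$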
which are obtained by applying the chain rule of differentiation to the original DAEs in Eq. 2.1. The variable-coefficient matrices $J_{\dot{x}} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_y \cdot N_m \cdot N_e}$, $J_x \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_y \cdot N_m \cdot N_e}$ and $J_\theta \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_\theta}$ are the respective Jacobians of f with respect to $\dot{x}$, x and θ. The Jacobian of the system is represented by $J_x$. By simultaneously solving this linear matrix system of differential equations along with the model in Eq. 2.1, the sensitivity matrix of the state variables $S_x$ is obtained. In order to compute the sensitivity matrix S (i.e., only considering the observed model responses in y), the result $S_x$ from Eq. 2.15 and the chain
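rule applied to Eq. 2.2 with respect to θ are used:
$$\underbrace{\frac{\partial y}{\partial \theta}}_{S} = \frac{\partial h}{\partial x}\,\underbrace{\frac{\partial x}{\partial \theta}}_{S_x} \tag{2.16}$$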
If only state variables are measured, i.e., y ⊆ x, then $S = S_x$. The scaled form of the sensitivity matrix (i.e., $\tilde{S}$) is also computed when the precision of the measurements is available. Note that $\tilde{S}$ combines the influence of the parameters and the reliability of the available measurements in one matrix.
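The forward sensitivity approach of Eq. 2.15 can be illustrated on a scalar explicit ODE, which is a special case of the implicit form with $J_{\dot{x}} = 1$. The sketch below is a hypothetical example, not the thesis model: it assumes $\dot{x} = -\theta x$ with an initial condition that does not depend on θ, integrates the model and its sensitivity equation together with a hand-written classical Runge-Kutta scheme, and compares the result with the analytical sensitivity.

```python
import math

def rhs(state, theta):
    """Right-hand side of the model and of its forward sensitivity equation."""
    x, s = state
    dx = -theta * x          # model: dx/dt = g(x, theta) = -theta * x
    ds = -theta * s - x      # sensitivity: ds/dt = (dg/dx) * s + dg/dtheta
    return (dx, ds)

def integrate_rk4(theta, x0, t_end, n_steps):
    """Integrate model and sensitivity together with the classical RK4 scheme."""
    h = t_end / n_steps
    state = (x0, 0.0)        # s(t0) = dx0/dtheta = 0: x0 does not depend on theta
    for _ in range(n_steps):
        k1 = rhs(state, theta)
        k2 = rhs(tuple(v + 0.5 * h * k for v, k in zip(state, k1)), theta)
        k3 = rhs(tuple(v + 0.5 * h * k for v, k in zip(state, k2)), theta)
        k4 = rhs(tuple(v + h * k for v, k in zip(state, k3)), theta)
        state = tuple(v + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                      for v, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

theta, x0, t_end = 2.0, 1.0, 1.0
x_end, s_end = integrate_rk4(theta, x0, t_end, n_steps=1000)

# Analytical solution for comparison: x = x0*exp(-theta*t), dx/dtheta = -t*x.
x_exact = x0 * math.exp(-theta * t_end)
s_exact = -t_end * x_exact
```

Since the state itself is the measured output in this toy case, $S = S_x$ and no output Jacobian $\partial h / \partial x$ is needed.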
The sensitivity information is used in this research to determine the parameters with the greatest influence on the outputs, the most linearly independent parameters, the degree of ill-conditioning of the estimation and the identifiability of the model. In the forthcoming analysis the scaled sensitivity matrix $\tilde{S}$ will be used.
Sensitivity measure
The sensitivity measure $\delta_j$ in Eq. 2.17 is introduced to assess the weight of each individual parameter, in terms of sensitivities, in the parameter estimation problem. It is based on the Euclidean norm of the j-th column of $\tilde{S}$ and measures the mean sensitivity of the total observed model response vector Y to changes in the parameter $\theta_j$ [17]. A high value of $\delta_j$ means that the parameter $\theta_j$ has a considerable impact on the simulated responses, whereas a value of zero or near zero means that the simulated responses depend only weakly, or not at all, on $\theta_j$. Accordingly, the parameter with the largest $\delta_j$ for $j = 1, \dots, N_\theta$ is called the most sensitive parameter, whereas the parameter with the smallest $\delta_j$ is called the most insensitive.
$$\delta_j = \sqrt{\frac{1}{N_y \cdot N_m \cdot N_e} \sum_{i=1}^{N_y \cdot N_m \cdot N_e} \tilde{s}_{ij}^{\,2}} \tag{2.17}$$
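Eq. 2.17 is a root-mean-square over each column of the scaled sensitivity matrix. A minimal sketch of its evaluation is given below; the matrix entries and the function name are hypothetical and only serve to show the ranking that results.

```python
import numpy as np

def sensitivity_measure(S_scaled):
    """Return delta_j of Eq. 2.17 for every column j of the scaled sensitivity matrix."""
    n_rows = S_scaled.shape[0]              # N_y * N_m * N_e stacked outputs
    return np.sqrt(np.sum(S_scaled**2, axis=0) / n_rows)

# Hypothetical scaled sensitivity matrix: 4 stacked outputs, 2 parameters.
S_scaled = np.array([[3.0, 0.0],
                     [4.0, 0.0],
                     [0.0, 0.1],
                     [0.0, 0.0]])

delta = sensitivity_measure(S_scaled)
ranking = np.argsort(delta)[::-1]           # parameter indices, most sensitive first
```

Here the first parameter is ranked as the most sensitive one and the second as the most insensitive.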
2.4. Singular value decomposition
This section describes the singular value decomposition (SVD) of the sensitivity matrix $\tilde{S}$, which is the cornerstone of the identifiability and ill-conditioning analysis of the parameter estimation problem in Eq. 2.5. The connection between $\tilde{S}$ and F, by means of the singular values of $\tilde{S}$ and the eigenvalues of F, is exploited here.
2.4.1. SVD of the sensitivity matrix
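The SVD explained in Section A.5.9 and stated in Definition A.5.9 of Appendix A.5 is used here to decompose the sensitivity matrix $\tilde{S}$:
$$\tilde{S} = U S_v V^T = \sum_{i=1}^{N_\theta} \varsigma_i u_i v_i^T, \tag{2.18}$$
where $S_v \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_\theta}$ is a diagonal matrix $S_v = \mathrm{diag}(\varsigma_1, \varsigma_2, \dots, \varsigma_{N_\theta})$ with nonnegative diagonal elements $\varsigma_1 \ge \varsigma_2 \ge \dots \ge \varsigma_{N_\theta} > 0$, which are the singular values of $\tilde{S}$. The matrices $U \in \mathbb{C}^{N_y \cdot N_m \cdot N_e \times N_y \cdot N_m \cdot N_e}$ and $V \in \mathbb{C}^{N_\theta \times N_\theta}$ are unitary matrices. The triplet $(U, S_v, V)$ is called the singular value decomposition of $\tilde{S}$. The columns $v_i$ of V with $i = 1, \dots, N_\theta$ and $u_j$ of U with $j = 1, \dots, N_y \cdot N_m \cdot N_e$ are called the right and left singular vectors of $\tilde{S}$, respectively.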
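The decomposition in Eq. 2.18 can be reproduced with a standard linear algebra routine. The sketch below uses NumPy's `numpy.linalg.svd` on a hypothetical scaled sensitivity matrix and rebuilds the matrix from its rank-one terms; the numbers are illustrative only.

```python
import numpy as np

# Hypothetical scaled sensitivity matrix: 4 stacked outputs, 2 parameters.
S_scaled = np.array([[1.0, 0.9],
                     [0.5, 0.6],
                     [0.2, 0.1],
                     [0.1, 0.3]])

# U is (4, 4), sv holds the singular values in decreasing order, Vt is V^T (2, 2).
U, sv, Vt = np.linalg.svd(S_scaled, full_matrices=True)

# Rebuild the matrix from the rank-one terms of Eq. 2.18.
S_rebuilt = sum(sv[i] * np.outer(U[:, i], Vt[i, :]) for i in range(sv.size))
```

Only the first $N_\theta$ left singular vectors enter the sum, because the remaining columns of U are multiplied by the zero rows of $S_v$.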
2.4.2. SVD of the sensitivity matrix vs the eigensystem of Fisher-information matrix
The Fisher-information matrix F in Eq. 2.13 may also be expressed as a function of the singular values of $\tilde{S}$ by using the proposition in Section A.5.9 of Appendix A.5, which relates the SVD of $\tilde{S}$ in Eq. 2.18 to the eigendecomposition of $\tilde{S}^T \tilde{S}$, i.e., F. With this relation between SVD and eigendecomposition, F may be written as
$$F = V (S_v^T S_v) V^T = \sum_{i=1}^{N_\theta} \varsigma_i^2 v_i v_i^T. \tag{2.19}$$
Eq. 2.19 is the eigensystem of the Fisher-information matrix F: the diagonal elements of the matrix $S_v^2 = S_v^T S_v$ (the squares of the singular values $\varsigma_i$) are the eigenvalues of the real symmetric matrix F. This relation is shown in Eq. 2.20, where the eigenvalues of F are denoted as $\lambda_i(F)$, with $i = 1, \dots, N_\theta$:
$$\varsigma_i^2 = \lambda_i(F) \tag{2.20}$$
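The relation in Eqs. 2.19 and 2.20 can be verified numerically for a well-conditioned case. The following sketch uses a hypothetical scaled sensitivity matrix and compares the squared singular values with the eigenvalues of $F = \tilde{S}^T \tilde{S}$.

```python
import numpy as np

# Hypothetical scaled sensitivity matrix: 4 stacked outputs, 2 parameters.
S_scaled = np.array([[1.0, 0.9],
                     [0.5, 0.6],
                     [0.2, 0.1],
                     [0.1, 0.3]])

F = S_scaled.T @ S_scaled                      # Fisher-information matrix

_, sv, Vt = np.linalg.svd(S_scaled, full_matrices=False)
lam = np.linalg.eigvalsh(F)[::-1]              # eigvalsh sorts ascending; reversed to match sv

F_from_svd = Vt.T @ np.diag(sv**2) @ Vt        # Eq. 2.19
```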
The SVD of $\tilde{S}$ provides information that encompasses the information given by the eigensystem of F. As a practical matter, in identifiability and ill-conditioning analysis (both lumped together as the structural analysis) there are reasons for preferring the SVD of $\tilde{S}$ over the eigenvalues of F [12]. Firstly, the SVD applies directly to the sensitivity matrix $\tilde{S}$ instead of F, and the sensitivity matrix is also used for local identifiability analysis in nonlinear parameter estimation. Secondly, although the computation of the SVD is typically more expensive than that of the eigensystem, in the context of larger problems with sparse matrices the SVD can be performed more efficiently [14, 130]. Moreover, the computation of F in Eq. 2.13, corresponding to $(N_y \cdot N_m) \cdot N_\theta^2$ operations, can be saved. Thirdly, the SVD of
$\tilde{S}$, where $\tilde{S}$ is ill-conditioned, can be computed with much greater numerical stability than the eigensystem of F [12]. In this context, when the system is extremely ill-conditioned (see Section 3.2.1), the accuracy of the smallest eigenvalues of F is doubtful [12]. Furthermore, additional numerical errors may arise in the computation of F in Eq. 2.13, which could also lead to unexpected negative eigenvalues [78].
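The loss of accuracy caused by forming F explicitly can be made visible with a small textbook-style example. The sketch below is not taken from the thesis: it uses a hypothetical 3-by-2 matrix whose second singular value is exactly `eps` in exact arithmetic. In double precision the product $S^T S$ rounds $1 + \text{eps}^2$ to 1, so the information about the small singular value is lost before any eigenvalue routine is called, whereas the SVD applied to the matrix itself still recovers it.

```python
import numpy as np

eps = 1e-9                                   # hypothetical small sensitivity
S = np.array([[1.0, 1.0],
              [eps, 0.0],
              [0.0, eps]])                   # exact singular values: sqrt(2 + eps**2) and eps

sv = np.linalg.svd(S, compute_uv=False)      # SVD applied directly to S

F = S.T @ S                                  # 1 + eps**2 is rounded to 1 in double precision
lam = np.linalg.eigvalsh(F)                  # eigenvalues of the rounded matrix

# abs() guards against a slightly negative eigenvalue produced by round-off.
sv_from_F = np.sqrt(np.abs(lam[::-1]))

print("smallest singular value from SVD of S:", sv[1])
print("smallest singular value from eig of F:", sv_from_F[1])
```

Comparing the two printed values against `eps` shows which route preserves the smallest singular value.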
2.5. Parameter estimator analysis
The performance of the estimator Θ given the sample data $Y^m$ in the parameter estimation problem of Eq. 2.5 may be assessed by analyzing some of its statistical properties. Here the variance Var(Θ) and the bias β(Θ) are considered. The former arises from the variability generated by measurement errors, and the latter is the difference between the expected value of the estimator Θ and the true (or reference) parameter θ*. The bias (also referred to as accuracy) measures the systematic error, whereas the variance (also referred to as precision) measures the random error [3, 93, 126].
In practice, a good estimator is said to be one with a small bias and a small variance [51, 89]. A desirable estimator should be unbiased (β(Θ) = 0) and have minimum variance (min Var(Θ)). Unfortunately, these two properties are often unattainable (although the theoretical maximum likelihood estimator has these natural properties). The best that can be done is to test them and keep them as small as possible.
2.5.1. Estimator precision
The precision of the parameter estimator Θ may be assessed by using its parameter covariance matrix C and its confidence interval CI [3, 38, 74, 89, 127]. Before establishing the corresponding definitions of C and CI, the main assumption regarding the parameter probability distribution is clarified.
In the parameter estimation problem of Eq. 2.5 the point estimate $\hat{\theta}$ is a function of the total observed measured response vector $Y^m$. Therefore, $\hat{\theta}$ is considered a random vector because $Y^m$ is also considered random. In that sense, Θ is an estimator of θ such that $\hat{\theta} \in \Theta$. Furthermore, under a number of regularity and sampling conditions, as $N_y \cdot N_m \cdot N_e \to \infty$, the estimator Θ approximately follows a normal distribution
$$\Theta \sim \mathcal{N}(\theta^*, C) \tag{2.21}$$
with mean equal to the true parameter θ* and covariance matrix C [74, 89]. The probability distribution of Θ is called a sampling distribution.
Covariance matrix
The general definition of a covariance matrix is set up in Section A.5.14 (Eq. A.17) of the
Appendix A.5. In the context of parameter estimation, the parameter covariance matrix
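$C \in \mathbb{R}^{N_\theta \times N_\theta}$ reads as follows:
$$C := E\left[(\Theta - E(\Theta))(\Theta - E(\Theta))^T\right] \tag{2.22}$$
This is a symmetric positive semi-definite matrix whose (i, j)-element $\sigma^2_{\theta_{ij}}$ is the covariance of the i-th parameter $\theta_i$ with respect to the j-th parameter $\theta_j$ for $i, j = 1, \dots, N_\theta$. The diagonal element $\sigma^2_{\theta_j}$ is the variance of the j-th parameter $\theta_j$.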
Confidence interval
The confidence interval CI supplies bounds on the plausible values of the unknown parameter. The CI is constructed so that there is high confidence that it contains the unknown parameter, although it cannot be stated with certainty that a given interval contains the true, unknown parameter. The interval is a measure of the uncertainty inherent in the estimator $\Theta_j$. A narrow confidence interval indicates that the parameter value is known precisely. On the contrary, wide intervals indicate little knowledge about the true parameter (in well-posed estimations) and that further information is needed. Another interpretation of a 95% CI is that, if the estimation problem were solved an infinite number of times using the same experimental conditions but with different realizations of the random measurement errors (normally and independently distributed with mean zero and constant variance), the true value of the parameter $\theta_j$ would be contained within 95% of all observed confidence intervals.
A confidence interval for the parameter $\theta_j$ is an interval of the form
$$lb_j \le \theta_j^* \le ub_j, \tag{2.23}$$
where the end-points $lb_j$ and $ub_j$ are called the lower- and upper-confidence limits, respectively. $lb_j$ and $ub_j$ change with each data sample that is analyzed. These end-points are values of the random variables $LB_j$ and $UB_j$, and the CI is considered a random interval [89]. The probability of selecting a sample for which the interval contains the true value of $\theta_j$ is $1 - \alpha$ [84, 89]:
$$P\{LB_j \le \theta_j^* \le UB_j\} = 1 - \alpha. \tag{2.24}$$
This interval is known as the 100(1 − α)% confidence interval, where 1 − α is the confidence level and α the significance level. Because the CI in Eq. 2.24 provides both the lower and upper confidence limits, it is also referred to as a two-sided CI.
In parameter estimation the true value and the variance of $\theta_j$ are typically unknown and, moreover, the experimental data may be scarce. In that case, the form of the underlying parameter distribution must be assumed in order to obtain a valid CI. A reasonable assumption is a normal distribution [89]. The expected value or mean of the estimator $\Theta_j$ and its variance, i.e., $E[\Theta_j]$ and $\sigma^2_{\theta_j}$, respectively, are then used to characterize this arbitrary
normal distribution. In order to construct a two-sided CI on $\theta_j$, the normally (but not standard normally) distributed $E[\Theta_j]$ with unknown mean $\theta_j$ and unknown variance is usually standardized with the random variable
$$T = \frac{E[\Theta_j] - \theta_j^*}{\sigma_{\theta_j}}, \tag{2.25}$$
which has a t-distribution with $N_y \cdot N_m \cdot N_e - N_\theta$ degrees of freedom (DoF). $\sigma_{\theta_j}$ is the standard deviation of $\Theta_j$, computed as the square root of the j-th diagonal element of C. If $t_{\alpha/2,(N_y \cdot N_m \cdot N_e - N_\theta)}$ is the upper 100α/2 percentage point of the t-distribution with $N_y \cdot N_m \cdot N_e - N_\theta$ DoF, then the 1 − α probability for the t-distribution is
$$P\{-t_{\alpha/2,(N_y \cdot N_m \cdot N_e - N_\theta)} \le T \le t_{\alpha/2,(N_y \cdot N_m \cdot N_e - N_\theta)}\} = 1 - \alpha. \tag{2.26}$$
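Substituting Eq. 2.25 in Eq. 2.26 and rearranging yields
$$P\{E[\Theta_j] - t_{\alpha/2,(N_y \cdot N_m \cdot N_e - N_\theta)}\,\sigma_{\theta_j} \le \theta_j^* \le E[\Theta_j] + t_{\alpha/2,(N_y \cdot N_m \cdot N_e - N_\theta)}\,\sigma_{\theta_j}\} = 1 - \alpha. \tag{2.27}$$
The lower and upper limits of the inequalities in Eq. 2.27 correspond to the lower- and upper-confidence limits $LB_j$ and $UB_j$ in Eq. 2.24, respectively. This leads to the formulation of the 100(1 − α) percent two-sided confidence interval on $\theta_j$ from a data sample,
$$E[\Theta_j] - t_{\alpha/2,(N_y \cdot N_m \cdot N_e - N_\theta)}\,\sigma_{\theta_j} \le \theta_j^* \le E[\Theta_j] + t_{\alpha/2,(N_y \cdot N_m \cdot N_e - N_\theta)}\,\sigma_{\theta_j}, \tag{2.28}$$
where $lb_j = E[\Theta_j] - t_{\alpha/2,(N_y \cdot N_m \cdot N_e - N_\theta)}\,\sigma_{\theta_j}$ and $ub_j = E[\Theta_j] + t_{\alpha/2,(N_y \cdot N_m \cdot N_e - N_\theta)}\,\sigma_{\theta_j}$.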
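The interval of Eq. 2.28 only needs the point estimate, the corresponding diagonal entry of C and the critical value of the t-distribution. The sketch below is illustrative: the function name and the numbers are hypothetical, and the critical value is passed in as an argument (here the tabulated value 2.228 for α = 0.05 and 10 degrees of freedom) rather than computed.

```python
import math

def confidence_interval(theta_hat, variance, t_crit):
    """Two-sided confidence limits (lb_j, ub_j) of Eq. 2.28 for one parameter."""
    sigma = math.sqrt(variance)      # standard deviation from the diagonal of C
    half_width = t_crit * sigma
    return theta_hat - half_width, theta_hat + half_width

# Hypothetical values: estimate 1.5, variance 0.04, alpha = 0.05, 10 DoF.
theta_hat = 1.5
variance = 0.04
t_crit = 2.228                       # tabulated t_{0.025, 10}

lb, ub = confidence_interval(theta_hat, variance, t_crit)
```

With these numbers the half-width is 2.228 · 0.2 = 0.4456, so the interval is approximately [1.0544, 1.9456].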
2.5.2. Estimator accuracy
Accuracy is a qualitative term defined as the distance between the estimated (or observed) value and the true or reference value. Bias, on the other hand, is the quantitative term describing this difference. The bias is given by
$$\beta(\Theta) = E[\Theta] - \theta^*. \tag{2.29}$$
If an estimator is unbiased, the bias is zero and $E[\Theta] = \theta^*$. However, an estimator based on a finite sample can additionally be expected to differ from the true parameter due to, for instance, systematic errors. Such an estimator is called a biased estimator.
A measure of the expected performance of the biased parameter estimator Θ may be
computed by means of the Mean Square Error MSE which is the expected squared differ-
ence between Θ and θ∗. MSE simultaneously considers the effect of the bias β(Θ) and
the variance Var(Θ) and is given by
$$\mathrm{MSE} = \lVert \beta(\Theta) \rVert^2 + \lVert \mathrm{var}(\Theta) \rVert^2. \tag{2.30}$$
The first term on the right-hand side is the squared norm of the bias and the second term is a metric of the parameter variance. One scalar measure of the parameter variance is the trace of the parameter covariance matrix, Tr[C] = sum(diag(C)) [57, 83]. In fact, Tr[C] corresponds to the A-optimality criterion used in optimal experimental design [101].
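For the case in which the variance metric is taken to be the trace of C, the split of the mean square error into a bias part and a variance part follows directly from the definitions in Eqs. 2.22 and 2.29. The short derivation below is a sketch under the assumption that the MSE is the expected squared Euclidean distance between Θ and θ*, as stated in the text:

```latex
\begin{aligned}
E\left[\lVert \Theta - \theta^* \rVert^2\right]
  &= E\left[\lVert (\Theta - E[\Theta]) + (E[\Theta] - \theta^*) \rVert^2\right] \\
  &= E\left[\lVert \Theta - E[\Theta] \rVert^2\right]
     + 2\, E\left[\Theta - E[\Theta]\right]^T \beta(\Theta)
     + \lVert \beta(\Theta) \rVert^2 \\
  &= \mathrm{Tr}[C] + \lVert \beta(\Theta) \rVert^2 .
\end{aligned}
```

The cross term vanishes because $E[\Theta - E[\Theta]] = 0$, and the first term equals $\mathrm{Tr}[C]$ because the expected squared norm of the centered estimator is the sum of the diagonal entries of C.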
2.5.3. Reliability tests
Two statistical procedures are considered to assess the reliability of the model parameters
after performing an estimation, namely the hypothesis and confidence interval tests. Both
tests evaluate the statistical significance of the parameter θj by analyzing its corresponding
estimator Θj .
Hypothesis test
This test evaluates whether the parameter $\theta_j$ might be dropped from the model. The statistical significance of $\theta_j$ with $j = 1, \dots, N_\theta$ is obtained by the hypothesis-testing procedure in which the null hypothesis $H_0$ is contrasted with its negation, the alternative hypothesis $H_1$:
$$H_0 : \theta_j = 0 \tag{2.31a}$$
$$H_1 : \theta_j \neq 0. \tag{2.31b}$$
This test relies on the comparison of a test statistic with a reference value from the two-tailed Student's t-distribution with $N_y \cdot N_m \cdot N_e - N_\theta$ degrees of freedom [41, 84]. In order to accept or reject the null hypothesis $H_0$, the Student t-value (or test statistic) is defined as
$$T_{H_0} = \frac{E[\Theta_j]}{\sigma_{\theta_j}}, \tag{2.32}$$
which uses the expected value of the parameter estimator $E[\Theta_j]$ of the j-th parameter $\theta_j$ and its corresponding standard deviation $\sigma_{\theta_j}$ (the square root of the j-th diagonal entry of the parameter covariance matrix C). The null hypothesis is rejected if $|T_{H_0}| > t_{\alpha/2,(N_y \cdot N_m \cdot N_e - N_\theta)}$, meaning that the parameter $\theta_j$ contributes significantly to the model. On the contrary, very low $|T_{H_0}|$-values are evidence that the parameter could be statistically excluded from the model. It should be noted that in Eq. 2.32 the value of $T_{H_0}$ is inversely proportional to the parameter standard deviation $\sigma_{\theta_j}$. Hence, if a given parameter has a large variance or uncertainty, it might not pass the test and is a candidate to
leave the model. This is indeed the case when the observable predicted variables are not sensitive to a parameter. Insensitive parameters will by definition have a large variance (small $|T_{H_0}|$-value) and might then leave the model without any effect. A sensitivity analysis should therefore be performed with care. Notice that a local sensitivity analysis depends on the current experimental data set; therefore, any decision based on it should consider a wide range of experimental conditions. In general, ill-conditioned estimates (generated by identifiability problems) produce low $|T_{H_0}|$-values. This fact was already established in Refs. [41, 79], specifically with reference to the effects of high correlation between parameters (one cause of identifiability problems). Nevertheless, any claim based on low $|T_{H_0}|$-values should be carefully formulated because some parameters might just have a badly computed variance generated by the influence of the ill-conditioning (see variance-decomposition in Section 3.2.2). In that case, the parameters with large $|T_{H_0}|$-values (small parameter variances) should be considered statistically important, but no further conclusions about those with small values should be drawn without an appropriate ill-conditioning analysis.
Confidence interval test
The second measure of reliability of the parameter $\theta_j$ is the 100(1 − α)% confidence interval obtained in Section 2.5.1. The rationale is that parameters whose confidence intervals contain zero do not have statistical significance at the α-level and might be dropped from the model. According to Eq. 2.28, parameters with large variance have the largest probability of including zero in their confidence interval. In that case, the close relationship between confidence intervals and hypothesis testing indicates that the null hypothesis cannot be rejected at the same α-level. On the contrary, if a parameter is significantly different from zero at the α-level (null hypothesis rejected), then the 100(1 − α)% CI will not contain zero and the parameter should remain in the model structure.
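The equivalence between the t-test of Eq. 2.32 and the confidence interval test can be sketched in a few lines. The example below is illustrative only: the function names, the three (estimate, standard deviation) pairs and the tabulated critical value 2.228 (α = 0.05, 10 degrees of freedom) are hypothetical, and the standard deviations are assumed to be strictly positive.

```python
def passes_t_test(theta_hat, sigma, t_crit):
    """Reject H0 (theta_j = 0) if |T_H0| of Eq. 2.32 exceeds the critical value."""
    t_value = theta_hat / sigma
    return abs(t_value) > t_crit

def ci_excludes_zero(theta_hat, sigma, t_crit):
    """Check whether the two-sided interval of Eq. 2.28 leaves out zero."""
    lb = theta_hat - t_crit * sigma
    ub = theta_hat + t_crit * sigma
    return lb > 0.0 or ub < 0.0

t_crit = 2.228                                       # tabulated t_{0.025, 10}
estimates = [(1.5, 0.2), (0.3, 0.5), (-2.0, 0.4)]    # hypothetical (estimate, std. dev.)

t_test_result = [passes_t_test(th, s, t_crit) for th, s in estimates]
ci_test_result = [ci_excludes_zero(th, s, t_crit) for th, s in estimates]
```

Both tests flag the second parameter, whose standard deviation is large relative to its estimate, as a candidate to leave the model.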
2.6. Identifiability
The parameterization of the nonlinear process model described by Eq. 2.1 is carried out by solving the parameter estimation problem in Eq. 2.5. This solution (the parameters) might not be unique; its uniqueness is a specific attribute of the mathematical model structure and the set of observations used for the fitting. These observations are collected in well-defined stimulus-response experiments performed on the dynamic system [31]. On the one hand, suppose the model structure is distinguishable [124] and has already been selected as the best candidate for this dynamic system. The question of whether the corresponding parameters can be uniquely determined from the experimental data is studied by model identifiability. On the other hand, if the model structure has not yet been selected, model identifiability might also answer the question of whether the experimental set-up enables the selection of the best model structure [124]. If the parameter uniqueness
cannot be guaranteed, then either the mathematical model or the experiment itself should be modified [106]. Only a well-posed estimation problem will be free of identifiability issues.
There are two ways to consider identifiability, namely qualitatively and quantitatively. Qualitative identifiability assumes ideal conditions such as a correct model and ideal, continuous experiments [120], whereas quantitative identifiability deals with more realistic experimental conditions with noise and discrete measurements.
2.6.1. Qualitative identifiability
The basic assumptions of the qualitative or deterministic identifiability are the ideal con-
ditions of an error-free model structure and noise-free and continuous observations of
measurable quantities [120]. In terms of the model in Eq. 2.1, it is assumed that the map-
pings F and h are real analytic functions at every θ ∈ Ω. Besides that, these assumptions
imply that the observed model responses in y at some nominal parameter θ1 for all t over
the time interval [0, texp] are equivalent to their corresponding observed response variables
in ym under the operating conditions in u, i.e.,
y(t, u, θ1) = ym(t). (2.33)
If there exists another (nominal) parameter θ2 which yields the same predicted response
in all feasible experiments, i.e.,
y(t, u, θ1) = y(t, u, θ2), (2.34)
for all t ∈ [0, texp], it is said that θ2 is indistinguishable from θ1. Identifiability analysis attempts to find the number of different solutions θ2 of Eq. 2.34, first at the fixed nominal value θ1. Accordingly, the model in Eq. 2.1 is considered globally identifiable at θ1 if for any θ2 ∈ Ω Eq. 2.34 implies θ1 = θ2. Furthermore, the model in Eq. 2.1 is locally identifiable at θ1 if there exists an open neighborhood V of θ1 ∈ Ω such that for any θ2 ∈ V Eq. 2.34 implies θ2 = θ1.
Moreover, if the model is identifiable for almost every θ1 ∈ Ω, it is called structurally identifiable. In terms of global and local identifiability, the model is said to be structurally globally (locally) identifiable if it is globally (locally) identifiable for almost any θ1 ∈ Ω.
Techniques to diagnose global identifiability of nonlinear models, namely Taylor series, power series expansions, local state isomorphism or the similarity transformation approach, transform the differential system into an algebraic equation system to be solved for the parameters. If this system has a unique solution, then the model is shown to be globally identifiable [120, 124]. In order to diagnose local identifiability, the linear independence at θ1 of the sensitivity functions ∂y(t, u, θ1)/∂θ1 could be analyzed. If the sensitivity functions are linearly independent at θ1, then the model is locally identifiable at
θ1. Nevertheless, the linear independence of numerically computed sensitivity functions is difficult to establish, and this condition is not reliable [120].
2.6.2. Quantitative identifiability
This analysis applies to discrete systems, which is the typical case in real experiments, and has a local nature. Therefore, the model is considered either locally identifiable or not. It deals with parameters (from a possibly qualitatively identifiable model) with large variance generated, for instance, by low measurement precision, limited experimental data or poorly located sample points [120].
Methods to detect this quantitative local identifiability analyze the singularity or near singularity of the Fisher-information matrix F in Eq. 2.13 [1, 31, 120], or the linear dependence or near linear dependence of the columns of the sensitivity matrix S in Eq. 2.9 (or its scaled form $\tilde{S}$ in Eq. 2.12) [31, 78, 120]. The most common local identifiability condition in the literature is based on the analysis of the Fisher-information matrix. This identifiability condition, evaluated at some θ, may be summarized as follows:
Definition 2.1 Local identifiability condition based on the Fisher matrix.
The model M is locally identifiable if and only if the Fisher-information matrix F is nonsingular [31, 120].
The nonsingularity of F is typically assessed by testing that its determinant satisfies det(F) ≠ 0.
On the other hand, the equivalent condition in terms of the sensitivity matrix S evaluated at some θ reads:
Definition 2.2 Local identifiability condition based on the Sensitivity matrix.
The model M is locally identifiable if and only if the matrix S has numerical rank rϵ equal to the number of parameters Nθ [31, 78, 120].
It is important to highlight that the rank of a matrix gives the number of its linearly independent columns (if the matrix has fewer columns than rows). Consequently, evaluating the rank of S means determining the linear independence of its columns (see QR method in Section 4.4.2).
Both identifiability conditions are equivalent according to matrix theory and are called
(local) identifiability conditions (see Appendix A.4 for further justification of the link
between S and F ). They are practical tools to measure the quality of the information
available from real experimental data. In this thesis the structural analysis of the sensi-
tivity matrix S is the cornerstone, hence the identifiability will be analyzed based on the
structural properties of this matrix.
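The rank condition of Definition 2.2 can be sketched with the singular values of the sensitivity matrix. The example below is a simplified illustration and not the method of Section 4.4.2: the function names, the relative tolerance `rel_tol` and the two test matrices are hypothetical choices, and the numerical rank is taken as the number of singular values above the tolerance relative to the largest one.

```python
import numpy as np

def numerical_rank(S_scaled, rel_tol=1e-8):
    """Count the singular values above rel_tol times the largest singular value."""
    sv = np.linalg.svd(S_scaled, compute_uv=False)
    if sv[0] == 0.0:
        return 0
    return int(np.sum(sv > rel_tol * sv[0]))

def is_locally_identifiable(S_scaled, rel_tol=1e-8):
    """Definition 2.2: numerical rank equal to the number of parameters (columns)."""
    return numerical_rank(S_scaled, rel_tol) == S_scaled.shape[1]

# Second column is twice the first: the two parameters cannot be separated.
S_dependent = np.array([[1.0, 2.0],
                        [2.0, 4.0],
                        [3.0, 6.0]])

# Linearly independent columns.
S_independent = np.array([[1.0, 0.0],
                          [0.0, 1.0],
                          [1.0, 1.0]])
```

The outcome depends on the chosen tolerance, which is why a numerical rather than an exact rank is used in the definition.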
2.7. Optimal experimental design
This section focuses on the mathematical formulation of the optimal experimental design (OED) for improving the parameter precision of the fixed model structure given by Eq. 2.1. It is an optimization problem formulated as
$$u_{crit} := \arg\min_{u} \Psi_{crit}(C), \tag{2.35}$$
where $u_{crit}$ contains the optimal experimental conditions and the cost function $\Psi_{crit}$ is the information function that maps the positive semi-definite domain PSD($N_\theta$) into the real numbers, i.e., $\Psi_{crit} : \mathrm{PSD}(N_\theta) \to \mathbb{R}$. The information function is positively homogeneous, superadditive, nonnegative, nonconstant, and upper semicontinuous [101].
The vector $u_{crit}$ minimizes the size of the confidence region of the model parameters through a metric of C (given by $\Psi_{crit}$). The purpose of $\Psi_{crit}(C)$ is to capture the size of the parameter covariance matrix C. This mapping from a high-dimensional matrix to a real number unfortunately cannot retain all aspects of the variance of multiple parameters and therefore cannot suit all demands. The task is to find the appropriate metric for the respective application. Furthermore, due to the approximation of C via the Fisher-information matrix F in Eq. 4.3, the optimization problem in Eq. 2.35 can also be seen as the maximization of the information content of the experiment, expressed by a metric of F.
2.7.1. OED design criteria
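The information function $\Psi_{crit}(C)$ is called the OED criterion, which may be one of the alphabetic criteria A, D and E, such that crit ∈ {A, D, E}:
$$\Psi_A(C) = \frac{1}{N_\theta}\left[\mathrm{tr}(C)\right] \tag{2.36}$$
$$\Psi_D(C) = \left[\det(C)\right]^{\frac{1}{N_\theta}} \tag{2.37}$$
$$\Psi_E(C) = \lambda_{max}(C), \tag{2.38}$$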
where tr(C) in A-criterion (also called the average-variance criterion) is the trace of the
parameter covariance matrix C, det(C) in D-criterion (also called the generalized variance
criterion) is its determinant and λmax(C) in criterion E (also called the minimax criterion)
is its largest eigenvalue. As a matter of fact, the criteria A and D can also be declared as
a function of the eigenvalues of C. The former is a summation of eigenvalues whilst the
latter is a product of them (see first equality in Eqs. 2.39 and 2.40, respectively).
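$$\Psi_A = \frac{1}{N_\theta}\left[\sum_{i=1}^{N_\theta} \lambda_i(C)\right] = \frac{1}{N_\theta}\left[\sum_{i=1}^{N_\theta} \frac{1}{\lambda_i(F)}\right] = \frac{1}{N_\theta}\left[\sum_{i=1}^{N_\theta} \frac{1}{\varsigma_i^2}\right] \tag{2.39}$$
$$\Psi_D = \left[\prod_{i=1}^{N_\theta} \lambda_i(C)\right]^{\frac{1}{N_\theta}} = \left[\prod_{i=1}^{N_\theta} \frac{1}{\lambda_i(F)}\right]^{\frac{1}{N_\theta}} = \left[\prod_{i=1}^{N_\theta} \frac{1}{\varsigma_i^2}\right]^{\frac{1}{N_\theta}} \tag{2.40}$$
$$\Psi_E = \lambda_{max}(C) = \frac{1}{\lambda_{min}(F)} = \frac{1}{\varsigma_{min}^2} \tag{2.41}$$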
Taking into account the inverse relation between C, F and $\tilde{S}$ (see Section 4.3.1) and applying Theorem A.5.5 of Appendix A.5, the eigenvalues of C and F and the singular values of $\tilde{S}$ are related by
$$\lambda_i(C) = \frac{1}{\lambda_i(F)} = \frac{1}{\varsigma_i^2}, \quad \forall i = 1, \dots, N_\theta. \tag{2.42}$$
Eq. 2.42 uses the squared reciprocals of the singular values of $\tilde{S}$ as a simple approximation of the eigenvalues of C of the maximum likelihood estimator [15]. Consequently, all the mentioned OED criteria may also be defined as functions of the eigenvalues of F (see second equality of Eqs. 2.39 - 2.41) and as functions of the singular values of the sensitivity matrix $\tilde{S}$ (see third equality of Eqs. 2.39 - 2.41).
Finally, the most important qualities of each criterion are the invariance of D under re-parameterization [101], the reduced computational complexity of A (which only requires the diagonal entries of C) [101] and the ability of E to measure ill-conditioning [15] (see Section 3.2.1).
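The third equalities of Eqs. 2.39 - 2.41 allow the three criteria to be evaluated from the singular values alone, without forming or inverting F. The sketch below is a hypothetical illustration: the function name and the matrix are made up, and it assumes a scaled sensitivity matrix with full column rank and the approximation of C by the inverse of F.

```python
import numpy as np

def oed_criteria(S_scaled):
    """A-, D- and E-criterion values from the singular values (Eqs. 2.39 - 2.41)."""
    sv = np.linalg.svd(S_scaled, compute_uv=False)
    lam_C = 1.0 / sv**2                        # eigenvalues of C via Eq. 2.42
    n_theta = sv.size
    psi_A = np.sum(lam_C) / n_theta            # average variance
    psi_D = np.prod(lam_C) ** (1.0 / n_theta)  # generalized variance
    psi_E = np.max(lam_C)                      # largest eigenvalue of C
    return psi_A, psi_D, psi_E

# Hypothetical scaled sensitivity matrix with full column rank.
S_scaled = np.array([[1.0, 0.9],
                     [0.5, 0.6],
                     [0.2, 0.1],
                     [0.1, 0.3]])

psi_A, psi_D, psi_E = oed_criteria(S_scaled)

# Cross-check against the definitions in Eqs. 2.36 - 2.38 with C = F^{-1}.
C = np.linalg.inv(S_scaled.T @ S_scaled)
```

For well-conditioned matrices both routes agree; the singular-value route avoids the explicit inversion that becomes unreliable for ill-conditioned problems.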
2.7.2. Graphical interpretation of OED design criteria
Since the Fisher-information matrix F is positive definite, the approximation of the confidence region in Eq. (17) has an ellipsoidal shape centered at the estimate $\hat{\theta}$ [3]. The $N_\theta$ eigenvectors of C are the $N_\theta$ principal axes of the ellipsoid, whose lengths are proportional to $\sqrt{\lambda_i(C)}$, for $i = 1, \dots, N_\theta$, i.e., to the reciprocals of the singular values of $\tilde{S}$. The longest axis (associated with the largest eigenvalue of C, i.e., $\lambda_{max}(C)$) defines the worst-determined direction in the parameter space Ω, and the shortest axis (associated with the smallest eigenvalue of C, i.e., $\lambda_{min}(C)$) defines the best-determined direction. The target of the design problem in Eq. 2.35 is to shrink this ellipsoidal region to an acceptable size.
The application of the different design criteria in Eqs. 2.36-2.38 can be interpreted
geometrically [41, 109]. The D-criterion aims at minimizing the volume of this confidence
region, the A-criterion the dimensions of the enclosing box around the confidence region
and the E-criterion the size of its major axis (i.e., λmax(C)).
[Figure 2.2: plot of the singular value spectrum; axis label "Singular value (ς_i)", axis ticks 1E-02 to 1E+02 and 1 to 7; regions of the spectrum annotated D-design, A-design and E-design.]
Figure 2.2.: Influence of alphabetic experimental design criteria on the singular value spectrum (SVs) of the sensitivity matrix S. (Figure from publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)
It is worthwhile to point out that each criterion in Eqs. 2.36-2.38 directly affects the singular values $\varsigma_i$ for $i = 1, \dots, N_\theta$ of $\tilde{S}$, because of the inverse relation between $\varsigma_i^2$ and $\lambda_i(C)$ in Eq. 2.42. The reciprocal of the square of the smallest singular value of $\tilde{S}$, i.e., $1/\varsigma_{min}^2 = 1/\varsigma_{N_\theta}^2$, is equal to the largest eigenvalue of C, i.e., $\lambda_{max}(C) = 1/\varsigma_{min}^2$. Therefore, when the eigenvalues of C are minimized, some or all singular values $\varsigma_i$ of the sensitivity matrix are maximized and the singular value spectrum (SVs) of $\tilde{S}$ is partially or completely improved. For instance, the A-criterion preferentially maximizes the smallest singular values (especially those smaller than 1) because their reciprocals contribute most to the summation in the third equality of Eq. 2.39. The multiplicative nature of the D-criterion in Eq. 2.40 tends to maximize the largest singular values (especially those greater than 1). However, the smallest singular values of the sensitivity matrix could be indirectly reduced. In this case, $\tilde{S}$ would become worse-conditioned (see Section 3.2.1), possibly because of an intensification of parameter correlations. This is a well-known feature of the D-optimal criterion [41, 109]. Accordingly, the D-design can lead to very stretched confidence regions with a high correlation of parameters, but with a small volume [109]. Moreover, the D-design tends to give excessive importance to the parameters to which the model is most sensitive [41]. It is important to highlight that most of the sensitive parameters are related to the largest singular values of $\tilde{S}$. Finally, the E-criterion in Eq. 2.41 increases the smallest singular value of $\tilde{S}$ because of its effect on the largest eigenvalue of C. The tendency to improve the small singular values of the SVs of $\tilde{S}$ can therefore drive A- and E-optimal designs to similar solutions. This is especially true for strongly correlated parameters, see Ref. [109].
Fig. 2.2 shows the influence of the different criteria on the SVs of S in presence of
singular values smaller than 1. The largest values (top section) of the SVs determine the
D-criterion value and the smallest values (bottom section) the A- and E-criterion values,
with the E-criterion being solely determined by the smallest one. Note that the respective
portion of the SVs whose contribution can be neglected is improved only by chance.
2.7.3. Observation about singular matrices in OED
Most of the development in model-based experimental design is carried out for systems of full-rank matrices [101]. In this context, the Fisher-information matrix F and consequently the covariance matrix C (by the properties of an invertible matrix in Section A.5.6 of Appendix A.5) should be nonsingular. In this scenario the calculation of all design criteria in Eqs. 2.36-2.38 is consistent.
However, this is not necessarily true, considering that F in Eq. 2.13 is a positive semi-definite matrix. This means that F might be singular because of the presence of zero eigenvalues (see Proposition A.5.5 of Appendix A.5). If zero is an eigenvalue of F, then F is not invertible and its determinant equals zero (see Theorem A.5.6 of Appendix A.5). In that scenario, C cannot be approximated by the inverse of F (see Eq. 4.3) and the design criteria in Eqs. 2.36-2.38 are undefined. Furthermore, the assumption of $\hat{\theta}$ being an unbiased estimator of θ does not hold anymore [114], and the squared distance between the estimator $\hat{\theta}$ and the true parameter value θ* is unbounded [120]. Thus, the parameters will be highly uncertain (even biased) and the model is not locally identifiable (see Section 2.6).
These mathematically supported facts may be a common reality in computation because
of rounding errors and finite machine precision when matrix operations are performed on
ill-conditioned matrices. For instance, the computation of
C requires a matrix product in Eq. 2.13 to get the Fisher-information matrix F and
then its subsequent inversion. If the sensitivity matrix S in Eq. 2.13 is ill-conditioned (the
causes of having ill-conditioned matrices related to identifiability problems in parameter
estimation are addressed in Section 3.2.1), F might be numerically inaccurate and lose its
positive semi-definiteness, exhibiting some negative eigenvalues. The same applies
to the parameter covariance matrix C. An inaccurate C harms the parameter precision
assessment and all subsequent analysis. Inaccurate eigenvalues of C significantly affect the
computation of OED (see Section 2.7.1). This fact will be illustrated in Section 7.3.3 of
Chapter 7 for a severely ill-posed problem on a sequencing batch reactor for water treatment.
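The loss of accuracy described above can be made tangible with a small numerical sketch. It assumes that Eq. 2.13 has the form F = S^T S for an already scaled sensitivity matrix S (an assumption here, since the equation is not repeated in this section); the matrix itself and all names are hypothetical. Forming the product squares the condition number, so small singular values of S become eigenvalues of F that are comparable to rounding noise.

```python
import numpy as np

def fisher_from_sensitivities(S):
    """Form F = S^T S (assumed structure of Eq. 2.13 for a scaled S)."""
    return S.T @ S

# Hypothetical sensitivity matrix with prescribed singular values.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((20, 3)))   # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))    # orthogonal matrix
sigma = np.array([1.0, 1e-2, 1e-5])
S = U @ np.diag(sigma) @ V.T

F = fisher_from_sensitivities(S)
kappa_S = np.linalg.cond(S)            # ratio of extreme singular values
eig_F = np.linalg.eigvalsh(F)          # ascending order, F is symmetric
kappa_F = eig_F[-1] / eig_F[0]
print(f"cond(S) = {kappa_S:.3e}, cond(F) = {kappa_F:.3e}")

# A more severe case: the true smallest eigenvalue of F is 1e-20, far below
# machine precision relative to the largest one, so the computed value is
# dominated by rounding error and may even come out negative.
sigma_bad = np.array([1.0, 1e-5, 1e-10])
S_bad = U @ np.diag(sigma_bad) @ V.T
eig_bad = np.linalg.eigvalsh(S_bad.T @ S_bad)
print("smallest computed eigenvalue of F:", eig_bad[0])
```

Working with the singular values of S directly, instead of the eigenvalues of F, avoids this squaring of the condition number.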
2.7.4. Sequential OED
The quality of a computed optimal design ucrit in Eq. 2.35 depends on the quality of the
unknown parameters [40]. Usually previously estimated parameters which depended on
the initial guesses determine the future optimal design. Thus, the information content
of an OED is attached to the accuracy of the assumed parameter initial guess. In order
to diminish the effect of these uncertainties and guarantee robustness in computing an
OED for nonlinear models two methods are typically implemented, namely direct and
indirect approaches. In the direct method, uncertainties are defined and considered a priori,
whereas in the indirect method an iterative refinement of the experimental design together
with a repeated update of the parameter values is employed. The indirect method, also called
the “sequential” or “adaptive” method, is an iterative procedure where a locally optimal design
is computed in each iteration. The experiment is carried out, an initial parameter guess
is defined, parameters are re-estimated from the available observations, and subsequent
optimized experiments are computed, executed and analyzed [8] as shown in the outer cycle
of Figure 2.1. The design is obtained by solving the optimization problem in Eq. 2.35
using the parameter variance criteria in Eqs. 2.36, 2.37 and 2.38. This sequential solution
of experimental design and parameter estimation problems is then continued until the
estimated parameters meet predefined parameter precision specifications (see Refs. 69, 6
for examples). The sequential approach can provide reliable estimates (high parameter
precision with small confidence regions) in the presence of uncertainties. In the case of a
dynamic process the collection in time of measurements has the potential of a real-time
redesign implementation which might drastically reduce the experimental effort. The basis
of the real-time (online) design of experiments is explained in the following.
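The sequential (indirect) procedure can be sketched for a toy problem. Everything in the following block is an illustrative assumption and not the implementation used in this thesis: a one-parameter model y = exp(-θ t), a noise-free simulated "experiment", a plain Gauss-Newton estimator, and a design step that picks the next sampling time from a grid by maximizing the squared sensitivity at the current estimate.

```python
import numpy as np

def model(theta, t):
    return np.exp(-theta * t)

def sensitivity(theta, t):
    # d y / d theta for the toy model
    return -t * np.exp(-theta * t)

def estimate(theta0, t, y, iters=50):
    """Plain Gauss-Newton iterations for the scalar parameter."""
    theta = theta0
    for _ in range(iters):
        r = model(theta, t) - y
        J = sensitivity(theta, t)
        theta = theta - (J @ r) / (J @ J)
    return theta

theta_true, sigma_y = 0.5, 0.05     # hypothetical truth and noise level
theta_hat = 2.0                     # poor initial guess
grid = np.linspace(0.1, 5.0, 50)    # candidate sampling times
times, data = [], []

for iteration in range(20):
    # Design step: locally optimal sampling time for the current estimate.
    t_next = grid[np.argmax(sensitivity(theta_hat, grid) ** 2)]
    # Experiment step (simulated here without noise).
    times.append(t_next)
    data.append(model(theta_true, t_next))
    # Estimation step using all observations collected so far.
    t_arr, y_arr = np.array(times), np.array(data)
    theta_hat = estimate(theta_hat, t_arr, y_arr)
    # Stop once the approximate parameter variance meets the specification.
    variance = sigma_y**2 / np.sum(sensitivity(theta_hat, t_arr) ** 2)
    if variance < 1e-3:
        break

print(theta_hat, variance, len(times))
```

The first design is computed with the poor guess and is therefore placed at an uninformative time; later designs move to the informative region once the estimate improves, which is the motivation for the iterative refinement.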
2.7.5. Online OED
The online model-based redesign of experiments is an extension of the adaptive design,
which was first presented in the 1970s in Ref. 88. This approach was originally proposed
for nonphysical models; however, various studies have since been carried out for
mechanistic models [113, 68, 43, 62, 136, 44]. In contrast to the offline adaptive design, in
the online approach new measurement data is exploited as soon as it is available and before
designing a new experiment. Hence, the experiment is executed collecting one or several
measurement sampling points, the parameter values are updated and the experiment is
redesigned. This procedure iterates between local solutions of the OED and PE problems
and is repeated until the end of the experiment. The result is an iterative refinement
of both parameter estimates and input actions.
Some authors have adopted concepts from model predictive control (MPC) (see, e.g.
Ref. 22) and recursive state and parameter estimation methods for the implementation
of the online redesign of experiments. Those implementation schemes are also referred
to as receding horizon experiment design (see Ref. 113, 62, 136). In the following the
mathematical formulation of real-time experimentation, parameter estimation and design
of experiments (proposed and used in Refs. 8, 9) as well as the corresponding notation
are introduced.
Finite time horizon schemes
The real-time implementation is based on a finite time horizon scheme, see Fig. 2.3. The
scheme considers a repeated update of current parameter estimates (solution of the PE problem)
and planned input actions (solution of the OED problem). Limited time horizons are
defined and measurement data is assumed to be taken at equally spaced sampling times
tk. Moreover, input actions are implemented as piece-wise constant trajectories which
match the time grid defined by the measurement sampling times.
[Figure 2.3: time axis t0, t1, …, tk-1, tk, tk+1, …, tk+h-1, tk+h with input actions u0|k, …, uk+h-1|k and outputs y1|k, …, yk+h|k; marked regions: increasing horizon, implementation time, receding horizon.]
Figure 2.3.: Discretization grids and time horizons used in the online algorithm. (Figure taken from Barz et al. (2013) [8] - reprinted with permission from AIChE Journal)
During the experimentation, for any sampling time instant $t_k$ a new measurement
$y^m_k = y^m(t_k)$ is obtained. Considering all prior measurements at instants
$t_1, \dots, t_{k-1}$, the current measurement vector reads
$Y^m_k = (y^m_1, y^m_2, \dots, y^m_{k-1} \mid y^m_k)^T$. Due to the growing number of
elements in the measurement vector with ongoing experiment time, the time horizon
$(t_0, t_k)$ is referred to here as increasing horizon. In Figure 2.3 a discretization grid with
elements of length $\Delta t = (t_k - t_{k-1})$ is shown which meets the measurement sampling
instants $t_k$.
In every instant $t_k$ all available measurements $Y^m_k$ are used to update the current
parameter estimate $\theta_k$. This is done by fitting the vector of simulated response variables
$Y^-_k = (y_{1|k}, y_{2|k}, \dots, y_{k|k})^T$ (see footnote 3) to the measurement vector $Y^m_k$
(solution of the problem in Eq. 2.43). The simulated response variables $Y^-_k$ are obtained from
the solution of Eq. 2.1 for $t = [t_0, t_k]$, taking into account the discrete input actions in all
past intervals within the increasing horizon, $U^-_k = [u_{0|k}, u_{1|k}, \dots, u_{k-1|k}]^T$, as well
as the initial states $x_0 = x(t_0)$.
In the same way, for any sampling instant $t_k$, a prediction or receding horizon $[t_{k+1}, t_{k+h}]$
is considered, where the predicted outputs $Y^+_k = [y_{k+2|k}, y_{k+3|k}, \dots, y_{k+h|k}]^T$ are
obtained from the solution of Eq. 2.1 for $t = [t_{k+1}, t_{k+h}]$, taking into account the future
discrete input actions $U^+_k = [u_{k+1|k}, u_{k+2|k}, \dots, u_{k+h-1|k}]^T$ and the initial states
$x_0 = x(t_{k+1})$. All future input actions $U^+_k$ are updated for each $t_k$ by the solution of an
OED problem and based on the current parameter estimate $\theta_k$.
3 The notation $y_{i|j}$ indicates the values of the predicted variable vector at sampling instant i which is calculated at instant j.
According to the iterative implementation strategy, the parameter estimation as well
as the generation of new input actions are repeated at each time instant tk. Thus, all
corresponding computations have to be finished within one sampling interval ∆t. In Figure
2.3 this sampling interval is denoted as implementation time. It is important to point out
that during the implementation time the corresponding input actions uk|k are executed in
the experiment so these are not considered in the OED problem formulation (update of
the future input actions in the time horizon [tk+1, tk+h]). Moreover, the predicted output
yk+1|k is not considered in the PE problem (update of the current parameter estimate
using measurement from the time horizon [t0, tk]) [8]. However, initial states x0 = x(tk+1)
are needed for the definition of the OED problem. The initial states x(tk+1) are obtained
from a simulation step, solving Eq. 2.1 for t = [tk, tk+1] with uk|k, θk and x0 = x(tk).
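The index bookkeeping of the finite horizon scheme is easy to get wrong, so a small sketch may help. The function name and the dictionary keys are hypothetical; the index ranges follow the definitions of $U^-_k$, $u_{k|k}$, $U^+_k$, $Y^-_k$ and $Y^+_k$ given above.

```python
def horizon_indices(k, h):
    """Index sets used at sampling instant t_k for a prediction horizon h."""
    return {
        # u_{0|k}, ..., u_{k-1|k}: past inputs U^-_k
        "past_inputs": list(range(0, k)),
        # u_{k|k}: applied during the implementation time [t_k, t_{k+1}]
        "current_input": k,
        # u_{k+1|k}, ..., u_{k+h-1|k}: decision variables U^+_k of the OED
        "future_inputs": list(range(k + 1, k + h)),
        # y_{1|k}, ..., y_{k|k}: simulated outputs Y^-_k fitted in the PE
        "past_outputs": list(range(1, k + 1)),
        # y_{k+2|k}, ..., y_{k+h|k}: predicted outputs Y^+_k
        "future_outputs": list(range(k + 2, k + h + 1)),
    }

idx = horizon_indices(k=3, h=4)
print(idx)
```

Note that the number of future input intervals is h - 1, and that neither $u_{k|k}$ nor $y_{k+1|k}$ appears as a decision variable or fitted quantity, as stated above.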
Online mathematical formulation of PE and OED
In the adaptive online OED for increasing parameter precision, a parameter estimation
problem and an optimal experimental design problem are solved at each sampling instant tk.
They update the current parameter estimate θk and the planned input actions U+k, respectively.
The discrete formulation of the PE problem in Eq. 2.5 reads
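$$\theta_k = \arg\min_{\theta_k} \Phi^{LSQ}_k(U^-_k, \theta_k) \qquad (2.43)$$
with
$$\Phi^{LSQ}_k(U^-_k, \theta_k) = \left(Y^-_k(U^-_k, \theta_k) - Y^m_k\right)^T \cdot (C^-_{Y,k})^{-1} \cdot \left(Y^-_k(U^-_k, \theta_k) - Y^m_k\right),$$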
where $Y^-_k \in \mathbb{R}^{N_y \cdot k}$ is the vector of past outputs and $Y^m_k$ and $C^-_{Y,k}$ are the
available measurement vector and the covariance matrix of the experimental errors, respectively.
$C^-_{Y,k}$ is assumed to be a diagonal matrix with the variance $\sigma^2_{y,i}$ of each measurement i
in its diagonal entries. Moreover, $\theta_k \in \mathbb{R}^{N_\theta}$ is the parameter (point) estimate at the
k-th time instant and $U^-_k \in \mathbb{R}^{N_u \cdot k}$ are the input actions which were implemented in all
past intervals. The corresponding sensitivity matrix
$S^-_k(U^-_k, \theta_k) \in \mathbb{R}^{N_y \cdot k \times N_\theta}$ is given as
$$S^-_k(U^-_k, \theta_k) = (\Sigma^-_{Y,k})^{-1} \nabla_{\theta_k} Y^-_k(U^-_k, \theta_k), \qquad (2.44)$$
where the weighting matrix $\Sigma^-_{Y,k} \in \mathbb{R}^{N_y \cdot k \times N_y \cdot k}$ is the standard deviation
matrix of the experimental errors of the measurements such that
$C^-_{Y,k} = \Sigma^-_{Y,k} \Sigma^-_{Y,k}$.
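For a diagonal covariance matrix, Eqs. 2.43 and 2.44 reduce to element-wise scalings. The following sketch illustrates this with hypothetical numbers; the function names are assumptions, and the Jacobian of the simulated outputs is taken as given rather than computed from a model.

```python
import numpy as np

def lsq_objective(y_sim, y_meas, sigma):
    """Phi = r^T C^{-1} r with C = diag(sigma^2), cf. Eq. 2.43."""
    r = (y_sim - y_meas) / sigma       # residuals scaled by standard deviations
    return float(r @ r)

def scaled_sensitivity(jacobian, sigma):
    """S = Sigma^{-1} * dY/dtheta for diagonal Sigma, cf. Eq. 2.44."""
    return jacobian / sigma[:, None]   # divide each row by its sigma

# Hypothetical data: two measurements, two parameters.
y_sim = np.array([1.0, 2.0])
y_meas = np.array([1.1, 1.8])
sigma = np.array([0.1, 0.2])
jacobian = np.array([[1.0, 2.0],
                     [3.0, 4.0]])

phi = lsq_objective(y_sim, y_meas, sigma)
S = scaled_sensitivity(jacobian, sigma)
print(phi)
print(S)
```

Both residuals in this example are exactly one standard deviation in size, so each contributes about one to the objective.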
The discrete formulation of the OED problem in Eq. 2.35 reads
$$U^{+*}_k = \arg\min_{U^+_k} \Psi^{crit}_k(U^+_k, u_{k|k}, U^-_k, \theta_k) \qquad (2.45)$$
with $\Psi^{crit}_k \in \{\Psi^A, \Psi^D, \Psi^E\}$,
where $U^{+*}_k \in \mathbb{R}^{N_u \cdot (h-1)}$ is the optimal input vector at the k-th time instant,
$\Psi^{crit}_k$ is one of the criteria described in Eqs. 2.36-2.38 evaluated in the k-th time instant,
and $\theta_k$ is the parameter estimate at the k-th time instant. The Fisher-information matrix
$H_{\theta,k}$ at the k-th iteration for the approximation of the parameter covariance matrix $C_k$
according to Eq. 4.3 is calculated as
$$H_{\theta,k}(U^+_k, u_{k|k}, U^-_k, \theta_k) = H^-_{\theta,k}(U^-_k, \theta_k) + H_{\theta,k|k}(u_{k|k}, \theta_k) + H^+_{\theta,k}(U^+_k, \theta_k), \qquad (2.46)$$
where the last term $H^+_{\theta,k}$ is a non-constant contribution in Eq. 2.45 because it depends on
the future input actions $U^+_k$, i.e., the decision variables in the optimization of Eq. 2.45.
The first and second terms $H^-_{\theta,k}$ and $H_{\theta,k|k}$ are constant contributions in the OED because
they depend on past and currently implemented input actions, respectively. Nevertheless, all
contributions in Eq. 2.46 have to be updated for the current parameter estimate $\theta_k$. It
is important to highlight that the computation of $H_{\theta,k}(U^+_k, u_{k|k}, U^-_k, \theta_k)$ according to Eq.
2.13 is based on the corresponding scaled sensitivity matrix
$$S_k(U^+_k, u_{k|k}, U^-_k, \theta_k) = \begin{bmatrix} S^-_k(U^-_k, \theta_k) \\ S_{k|k}(u_{k|k}, \theta_k) \\ S^+_k(U^+_k, \theta_k) \end{bmatrix}. \qquad (2.47)$$
$S_k(U^+_k, u_{k|k}, U^-_k, \theta_k) \in \mathbb{R}^{N_y \cdot (k+h) \times N_\theta}$ is calculated using the last
available estimated parameter vector $\theta_k$ at the time instant $t_k$, considering in its structure the
past, present and future measurements generated by the input actions $U^-_k$, $u_{k|k}$ and $U^+_k$,
respectively.
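The link between the additive form in Eq. 2.46 and the stacked matrix in Eq. 2.47 can be checked numerically. The sketch assumes that Eq. 2.13 has the form H = S^T S for a scaled sensitivity matrix (an assumption, since the equation is not repeated here) and uses random placeholder blocks instead of model sensitivities.

```python
import numpy as np

rng = np.random.default_rng(1)
n_theta = 3

# Hypothetical scaled sensitivity blocks (rows: outputs, columns: parameters).
S_past = rng.standard_normal((5, n_theta))     # S^-_k, past measurements
S_now = rng.standard_normal((1, n_theta))      # S_{k|k}, current interval
S_future = rng.standard_normal((4, n_theta))   # S^+_k, depends on U^+_k

# Stacked sensitivity matrix as in Eq. 2.47.
S_k = np.vstack([S_past, S_now, S_future])

# Fisher information from the stacked matrix and from the three blocks.
H_stacked = S_k.T @ S_k
H_sum = S_past.T @ S_past + S_now.T @ S_now + S_future.T @ S_future

print(np.max(np.abs(H_stacked - H_sum)))
```

Within one OED solve only the block belonging to $U^+_k$ changes between optimizer iterations, so the two constant contributions can be computed once per sampling instant.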
2.8. Initial guess sampling
In nonlinear optimization problems there exists a large probability of finding multiple local
minima. The nonlinear least squares estimation in Eq. 2.5 and the optimal experimental
design in Eq. 2.35 are subject to this issue. When gradient-based algorithms are applied,
finding the global solution depends mainly on the selection of the parameter initial guess (IG).
The high nonlinearity of the chemical and bioprocess models of interest in this
thesis makes this issue particularly important. Strategies to tackle this weakness consider data
from literature, a priori calculations using the model, modeler experience or solutions of
different parameter estimation problems using several IGs as starting points. This section
explains a sampling strategy called Minimum bias Latin hypercube design to statistically
generate initial guesses for either PE or OED.
2.8.1. Minimum bias Latin hypercube design (MBLHD)
In order to find an appropriate IG for parameters in parameter estimation a sampling
type algorithm within a prescribed parameter range might be applied. The Minimum bias
Latin hypercube design (MBLHD) is based on Latin hypercube sampling (LHS), which
provides unique values for each point and exhibits better dispersion than other sampling
procedures such as the random and grid sampling [27, 86]-[94].
According to Ref. [86] a LHS is constructed by dividing the range for each of the m
input variables into N strata of equal marginal probability 1/N . If each input variable
is assumed to be distributed uniformly on the interval [−1, 1], then the strata are all of
width 2/N . A single value is selected randomly from each stratum, producing N sample
values for each input variable. The values are randomly matched to create N sets of
values for the N simulator runs. MBLHD follows the same sampling structure as LHS
but has additionally been found to satisfy the minimum bias conditions of symmetry
and orthogonality described in Ref. [94], with sample values for each input
variable $x_{iu}$, with i = 1, ..., m and sample point u = 1, ..., N, given as follows:
$$x_{iu} = \frac{2u - N - 1}{\sqrt{N^2 - 1}}, \qquad (2.48)$$
where N is the total number of sampling elements, which is chosen for each sample value.
The results of applying the MBLHD approach will be discussed in Chapter 6.
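A minimal sketch of how initial guesses could be generated from the levels in Eq. 2.48 is given below. The function names, the random matching of levels across variables, and the linear mapping from [-1, 1] to user-defined parameter bounds are assumptions made for illustration. In particular, random matching does not by itself enforce the orthogonality condition between variables.

```python
import numpy as np

def mblhd_levels(n):
    """Levels x_u = (2u - N - 1) / sqrt(N^2 - 1) for u = 1, ..., N (Eq. 2.48)."""
    u = np.arange(1, n + 1)
    return (2 * u - n - 1) / np.sqrt(n**2 - 1)

def mblhd_sample(n, lower, upper, seed=0):
    """N initial guesses for m parameters within [lower, upper] (assumed mapping)."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    levels = mblhd_levels(n)
    # Each column uses every level exactly once, in a random order.
    design = np.column_stack([rng.permutation(levels) for _ in lower])
    return lower + 0.5 * (design + 1.0) * (upper - lower)

levels = mblhd_levels(5)
guesses = mblhd_sample(5, lower=[0.0, 10.0], upper=[1.0, 20.0])
print(levels)
print(guesses)
```

The levels are symmetric about zero and their mean square equals 1/3, which is the second moment of a uniform distribution on [-1, 1].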
3. Theoretical background II: Ill-posed
problems and numerical regularization
In this chapter, the main characteristics of ill-posed problems and the most common
regularization strategies are discussed.
3.1. Direct and inverse problems
Before giving the formal definition of well-posedness in the sense of Hadamard [50], let
us define the general problem
g = A(q) (3.1)
in the function spaces G, Q (e.g., Hilbert spaces), where q ∈ Q, g ∈ G and A (linear
or nonlinear) is an operator from Q into G. The data space and the space of unknowns are G
and Q, respectively. It should be noted that g is a function of q, i.e., A denotes the operator
which acts on q and produces g. In the linear
case Eq. 3.1 becomes
g = Aq. (3.2)
From the formulation in Eq. 3.1 two problems can be considered:
1. the direct problem in which g is computed given q,
2. the inverse problem in which for some prescribed g the target is to find the solution
q.
Direct problems are basically concerned with finding a function that represents (models) a
process, phenomenon or physical field at any point of a given domain at any instant of time
(if the field is nonstationary). In contrast, inverse problems deal with determining the causes
of a desired or an observed effect [64]. Those causes might be the present state or physical
parameters of a system, and the effects might be future observations or measurement data
of a process. A solution of Eq. 3.1 exists if and only if g is in the range of the operator A
[13]. Generally direct problems of mathematical physics are well-posed, whereas inverse
problems turn out to be ill-posed [13, 64].
Examples of inverse problems are present in physics (astronomy, quantum mechanics,
acoustics, electrodynamics, etc.), medicine (X-ray and NMR tomography, ultrasound test-
ing, etc.), geophysics (seismic exploration, electrical, magnetic and gravimetric prospect-
ing, logging, magnetotelluric sounding, etc.), economics (optimal control theory, financial
mathematics, etc.), ecology (air and water quality control, space monitoring, etc.), chemi-
cal engineering (chemical reactors, bioreactors, etc.) [64].
In the context of nonlinear modeling, the simulation using the model for a given param-
eter vector is considered the direct problem (see Figure 3.1). The parameter estimation
(identification) problem in Eq. 2.1, on the other hand is considered the inverse prob-
lem. The experimental measurement vector Y m (as the data of the inverse problem, i.e.,
g = Y m) is used to recover the unknown parameter vector θ, i.e., q = θ (see Figure 3.1)
which depends on the structure of the operator A, i.e., the Fisher-information matrix F
as a function of the sensitivity matrix S. In Figure 3.1 the relation between F and the
sensitivity matrix S is denoted by ϕ.
[Figure 3.1: (a) input (parameters) → ϕ(S) → output (measurements); (b) input (measurements) → ϕ(S) → output (parameters).]
Figure 3.1.: Graphical representation of a) the direct problem and b) the inverse problem in nonlinear modeling.
3.2. Ill-posed problems
At the beginning of the 20th century, Hadamard [50] established the definition of a
well-posed problem. Such a problem must simultaneously fulfill three conditions, namely
existence, uniqueness and continuity.
Definition 3.1 Well-posed problem in the sense of Hadamard [13, 50, 64].
The problem in Eq. 3.1 is well-posed if the following three conditions hold:
1. the existence condition: for each data g in a given class of functions G there exists
a solution q in a prescribed class Q,
2. the uniqueness condition: the solution q is unique in Q, i.e., there exists an inverse
operator A−1 from G into Q, and
3. the continuity condition: the dependence of q upon g is continuous, i.e., when the
error on the data g tends to zero, the induced error on the solution q tends also to
zero.
If at least one of the three conditions in Definition 3.1 is violated, the problem
is called ill-posed in the sense of Hadamard [50]. In other words, the problem is ill-posed
if it has no solution in the desired class, or has many (two or more) solutions, or if the
solution procedure is unstable (i.e., arbitrarily small changes of the experimental data may
lead to large perturbations of the solution).
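The instability in the third case can be illustrated with a small linear problem g = Aq. The sketch below is a hypothetical example, not taken from the thesis: the operator A is built with prescribed singular values, and the data are perturbed along the direction that belongs to the smallest one.

```python
import numpy as np

rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # left singular vectors
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # right singular vectors
sigma = np.array([1.0, 1e-2, 1e-6])                # prescribed singular values
A = U @ np.diag(sigma) @ V.T

q_true = np.array([1.0, 2.0, 3.0])
g = A @ q_true                                     # direct problem: g from q

# Small data error along the left singular vector of the smallest singular value.
delta_g = 1e-8 * U[:, -1]
q_perturbed = np.linalg.solve(A, g + delta_g)      # inverse problem: q from g

amplification = np.linalg.norm(q_perturbed - q_true) / np.linalg.norm(delta_g)
print(f"error amplification = {amplification:.3e}")
```

For this worst-case direction the error amplification equals the reciprocal of the smallest singular value, so a data error of size 1e-8 becomes a solution error of size about 1e-2.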
Continuity is a necessary condition for stability or robustness of the solution [13]. Never-
theless, it is not sufficient especially for discrete ill-posed problems (coming from discrete
available data or discretization of ill-posed problems), where lack of robustness against
noise is usually evidenced [13]. Most difficulties in solving ill-posed problems are caused
by the instability of the solution. Consequently, the term ill-posed problems frequently
(but not only) makes reference to unstable problems.
In terms of stability, the third condition in Definition 3.1 can be formulated as a stability
condition which states that for any neighborhood O(q) ⊂ Q of the solution q to Eq. 3.1,
there is a neighborhood O(g) ⊂ G of the data g such that for all gδ ∈ O(g) the corresponding
solution qδ belongs to the neighborhood O(q) [64].
When the stability condition does not hold, the error in the data is amplified without bound
in the solution because the inverse of the operator in Eq. 3.1 is unbounded [13, 64].
However, some problems although formally well-posed (i.e., the three conditions in Defini-
tion 3.1 are fulfilled) suffer from practical instability when the system has ill-conditioned
matrices [13, 64]. In that case, a careful investigation about error propagation should be
conducted. To do so, the description of ill-conditioned problems and the concept of the
condition number as a measure of numerical stability of the problem must be introduced.
Other quantities as auxiliary ill-conditioning metrics [78] useful in the computational frame-
work of Chapter 4, namely the collinearity index and the sensitivity measure, will be also
introduced.
3.2.1. Ill-conditioned problems
Noisy measurement data may be harmfully propagated in the solution and lead to signifi-
cant misinterpretations of it [13, 64, 66]. A problem is considered ill-conditioned if small
errors in the data produce large errors in the solution. On the contrary, if small errors in
the data produce small errors in the solution, a problem is considered well-conditioned.
Ill-conditioning promotes numerical instability. In parameter estimation, the
ill-conditioning is determined by the linear dependencies or collinearity (see Definition A.5.1
of Appendix A.5) among the rows/columns of a specific matrix. Thus, a practical analysis
of the ill-posedness of an estimation problem might be related to the ill-conditioning of
this specific matrix. If this matrix is severely ill-conditioned, the estimation problem is
treated as ill-posed. However, well-posedness in the sense of Hadamard is not a guarantee
of robustness against noise. Consequently, an ill-conditioning analysis is always highly
recommended.
In the general inverse problem of Eq. 3.1, the problem is considered ill-conditioned if
the operator A is ill-conditioned [12, 13, 52, 57, 64, 83]. For instance, in linear estimation
(the linear regression model y = Xβ + ϵ) the ill-conditioning is analyzed on the data
matrix X. On the other hand, in the nonlinear case, such as least-squares estimation,
the ill-conditioning is locally analyzed based on the Fisher-information matrix F (see
Eq. 2.13) [1, 3, 10, 42, 60, 63, 83, 120]. However, because of the construction of F
in Eq. 2.13 the ill-conditioning properties of F are generated by the properties of the
sensitivity matrix S. Thus, in the nonlinear parameter identification context the matrix
A is the sensitivity matrix S. That means if S is far from orthogonal, then F is highly
ill-conditioned [78, 83, 120] and the estimation is considered ill-posed.
This ill-conditioning (generated by exact or near collinearity) is of paramount impor-
tance to the efficacy of least-squares estimation though it is a non-statistical problem [12].
In the presence of collinearity the uniqueness and stability of the least-squares estimator
may not be guaranteed. Moreover, the effect of linear dependencies (or collinearity) on
model regression is well known to statisticians, who note that they inflate the variances of
regression coefficients and magnify the impact of errors on the regression variables [112].
Notice that, for instance, in the nonlinear estimation of Eq. 2.5 by using Newton-type
algorithms, the solution in Eq. 2.6 might be inflated and thus, unstable by inverting an
ill-conditioned Hessian (which is the Fisher-information matrix).
Ill-conditioning and collinearity measures
The indicator of ill-conditioning is the condition number κ (see Section A.5.12 of Appendix
A.5). When κ is not far from 1, the matrix is said to be well-conditioned, on the contrary
when κ is large the matrix is ill-conditioned. Only a matrix with orthonormal (perfectly
linearly independent) columns has κ = 1. At the other extreme, if the matrix has exact linear
dependencies among its columns, it has at least one zero singular value [12, 11] and then
κ = ∞.
An ill-posed problem has an infinite condition number; thus, extremely ill-conditioned
problems behave in practice as ill-posed problems and have to be treated by the same
techniques [13]. In most applications on nonlinear parameter estimation all singular values of
the sensitivity matrix are nonzero, because its columns are rarely exactly linearly depen-
dent (except the case that there are some structural non-identifiabilities). Nonetheless, if
among the columns of the sensitivity matrix there exists a near linear dependence (also
called collinearity) [12] this will manifest itself in a “small“ singular value and then a large
condition number. This also makes the condition number a measure of near collinearity
[112].
The collinearity index (see Section A.5.13 of Appendix A.5), as its name suggests,
measures collinearity of the columns of a matrix. In the extreme case, when A is
singular, γ(A) = ∞. In other cases, when this quantity is large, it indicates that A
has columns (nearly) linearly dependent and is ill-conditioned, and the parameter vector
is poorly identifiable [17].
Empirical upper bounds for the condition number and the collinearity index are defined
by κmax ≈ 1000 [46] and γmax ≈ 10 to 15 [17], respectively. The maximum condition
number κmax and collinearity index γmax are used to define a well-conditioned sensitivity
matrix and a well-posed problem. Therefore, this problem should not have numerical
instabilities due to error propagation and nearly linear dependencies. Accordingly, γmax
controls linear dependencies and κmax assures numerical stability. Furthermore, because
these thresholds are related to the definition of a “large“ ratio in the SVs and “smallness“
of singular values, the user should tune them according to the magnitude (scaling) of
singular values in the particular application. However the aforementioned values are good
references and could be used as a starting point.
When the matrix S is scaled by the scalar a, this scaling does not affect the condition
number because it is scale invariant, i.e., κ(aS) = (aς1)/(aςNθ) = κ(S). However, it
modifies the collinearity index, i.e., γ(aS) = 1/(aςNθ) = γ(S)/a. The scale-invariance
makes the condition number a robust indicator of ill-conditioning.
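Both indicators and their behaviour under scaling can be checked with a few lines of code. The sketch uses the expressions given in the text, κ(S) = ς1/ςNθ and γ(S) = 1/ςNθ; the example matrix with two nearly collinear columns and the function names are hypothetical.

```python
import numpy as np

def condition_number(S):
    """kappa(S): ratio of the largest to the smallest singular value."""
    s = np.linalg.svd(S, compute_uv=False)   # sorted in descending order
    return s[0] / s[-1]

def collinearity_index(S):
    """gamma(S): reciprocal of the smallest singular value."""
    s = np.linalg.svd(S, compute_uv=False)
    return 1.0 / s[-1]

# Hypothetical sensitivity matrix with two nearly collinear columns.
S = np.array([[1.0, 1.000],
              [1.0, 1.001],
              [1.0, 0.999]])
a = 100.0

print("kappa(S)  =", condition_number(S))
print("kappa(aS) =", condition_number(a * S))
print("gamma(S)  =", collinearity_index(S))
print("gamma(aS) =", collinearity_index(a * S))
```

For this matrix κ(S) lies above the empirical bound κmax ≈ 1000, so the near collinearity of the two columns is flagged regardless of the scaling factor a, whereas γ changes by the factor 1/a.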
Classification of ill-conditioned problems
For ill-conditioned problems a distinction is made, being either rank-deficient or having an
ill-determined rank [52, 53]. If the numerical rank (see Section A.5.11 of Appendix A.5)
of S can be reliably calculated, the problem is considered rank-deficient, otherwise it is of
ill-determined rank [52]. Problems whose SVs decays gradually to zero without a gap
between singular values belong to the ill-determined rank problem class [52, 53].
In order to define a problem as one of the above classes, the first step is to compute
its numerical rank, specifically its ϵ-numerical rank (see Definition A.5.11). To do so, the
lower bound on the SVs, namely the ϵ-threshold, is calculated from the maximum condition
number (κmax) and collinearity index (γmax) defined in Section A.5.12 such that
$$\epsilon = \max\{\epsilon_\kappa = \varsigma_1/\kappa_{max},\ \epsilon_\gamma = 1/\gamma_{max}\} \qquad (3.3)$$
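The ϵ-threshold in Eq. 3.3 and the resulting numerical rank can be sketched as follows. The function names, the default bounds and the example spectrum are hypothetical, and counting singular values strictly above the threshold is an assumption about Definition A.5.11, which is not repeated here.

```python
import numpy as np

def epsilon_threshold(singular_values, kappa_max=1000.0, gamma_max=10.0):
    """Lower bound on the singular values according to Eq. 3.3."""
    s1 = np.max(singular_values)
    eps_kappa = s1 / kappa_max      # bound from the maximum condition number
    eps_gamma = 1.0 / gamma_max     # bound from the maximum collinearity index
    return max(eps_kappa, eps_gamma)

def epsilon_rank(singular_values, kappa_max=1000.0, gamma_max=10.0):
    """Number of singular values above the epsilon-threshold (assumed strict)."""
    eps = epsilon_threshold(singular_values, kappa_max, gamma_max)
    return int(np.sum(np.asarray(singular_values) > eps))

# Hypothetical singular value spectrum with a gap after the third value.
svs = np.array([500.0, 20.0, 3.0, 0.2, 1e-3, 1e-5])
eps = epsilon_threshold(svs)
r_eps = epsilon_rank(svs)
print(eps, r_eps)
```

In this example the condition number bound is the active one, and three singular values are classified as well-conditioned.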
In Fig. 3.2 and 3.3, SVs for both, rank-deficient and ill-determined rank problems are
depicted. For the rank-deficient problem in Fig. 3.2, the numerical ϵ-rank (i.e., number of
linearly independent columns of S) can be well-determined by using the ϵ-threshold.
Rank-deficient problems have a large condition number κ(S) and/or a large collinearity
index γ(S). Furthermore, the singular value ς1 is larger than 1 although the scaling of S
determines its magnitude.
For the ill-determined rank problem in Fig. 3.3, the dominant feature is the presence of
singular values close to the conventional or selected “zero“. Large values of both κ(S) and
γ(S) are expected. Moreover, almost the entire SVs lies below 1 and even the largest
singular value ς1 is small. Although ill-determined rank problems do not possess a
well-defined rϵ, the ϵ-threshold can still be used to establish a new well-conditioned problem
(with only well-conditioned singular values) which is related to the amount of available
linearly independent information.
[Figure 3.2: singular values ςi (logarithmic axis, 1E-06 to 1E+03) versus index i = 1, ..., 10; a gap at the ϵ-threshold separates the large (well-conditioned) from the small (ill-conditioned) singular values.]
Figure 3.2.: Singular value spectrum (SVs) of a rank-deficient problem. (Figure taken from publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)
As mentioned for the condition number, the rank of S is also scale invariant. However,
when applying the collinearity index to a scaled matrix aS, the ϵ-threshold for computing
rϵ must be modified as aϵ (see footnote 1) such that $r_{a\epsilon}(aS) = r_\epsilon(S)$. This fact shows
that the ill-conditioning of a problem is not affected by this scaling, but indicators based on the
singular values (and also eigenvalues) need to be applied carefully. On the other hand,
it is recommended to conduct PE and OED normalizing predicted response variables
and parameters (see Eqs. 4.1 and 4.2, respectively) to decrease the influence of different
magnitudes in the optimization.
Relationship between identifiability problems and ill-conditioning
Typically, identifiability problems in nonlinear parameter estimation are detected by analyzing
whether the Fisher-information matrix F (see Eq. 2.13) is singular or “almost” singular
from a numerical point of view [1, 42, 120, 124]. When F is singular (a case rarely
encountered in actual practice), it is not invertible; equivalently, det(F) = 0, the smallest
eigenvalue λmin(F) = 0, F is not of full rank, and it has exact linear dependence among its
columns or some column norms equal to zero (see Theorem A.5.6). When F is almost
singular, its columns have nearly linear dependencies or column norms close to zero
[12], and its det(F) and λmin(F) are close to zero. In that case, F is still invertible, but it
might numerically run into problems (ill-conditioning). Thus, the estimation problem is
exactly or nearly singular. In those cases the assumption that F is of full rank, belonging
to the set of positive definite matrices PD(Nθ) (see Lemma A.5.5), does not hold [3].
1 For the scaled sensitivity matrix aS the maximum collinearity index should also be scaled as γmax/a such that $\epsilon_\gamma(aS) = a\epsilon_\gamma$ and $\epsilon(aS) = \max\{\epsilon_\kappa, a\epsilon_\gamma\}$.
[Figure 3.3: singular values ςi (logarithmic axis, 1E-16 to 1E+02) versus index i = 1, ..., 43; the ϵ-threshold separates a few well-conditioned singular values from the many small, ill-conditioned ones.]
Figure 3.3.: Singular value spectrum (SVs) of an ill-determined rank problem. (Figure taken from publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)
The eigensystem (eigenvalues and eigenvectors) of F has been used to analyze local
identifiability problems [17, 19, 120, 122, 123]. The presence of a small eigenvalue of F
is a clear indication of identifiability problems. Indeed, small eigenvalues of F evidence
its ill-conditioning [12]. Nonetheless, in cases where F is extremely ill-conditioned the
computation of its eigensystem has weak numerical stability [12].
At this point, it is necessary to return to the intimate connection between F and S
in Eqs. 2.11 and 2.19. As explained in Section 2.6, the structural problems of S carry
over to F. Consequently, if F is ill-conditioned, this is a consequence of an
ill-conditioned S, which is then not of full rank (at least numerically). This happens, for
example, for columns of S with small values (even close to or equal to zero) due to insensitive
parameters, or for nearly identical columns or columns which are linear combinations of others
because of correlated parameters. Expressed differently, if S is ill-conditioned then F is
ill-conditioned, the parameter estimation is ill-posed and it has at least local identifiability problems.
Effect of ill-conditioning on parameter estimation
Indefinite Hessians (and even positive definite Hessians with small eigenvalues) in nonlinear
estimation can cause unacceptable (inflated) search directions [83], which increase
the estimation variability and instability. To better understand this claim,
see Eq. 2.6, where the step direction υk along with the step size controls the value
of the new solution. In gradient-based methods υk is typically determined as a function
of the inverse of Hθ in Eq. 2.11. Therefore, if Hθ is ill-conditioned this inverse is
degenerate, leading to unstable solutions. Moreover, for problems where Hθ is ill-conditioned
(nearly singular problems), a large variance (uncertainty) in some sub-space of the pa-
rameter space Ω can be found [3, 120, 63]. Estimates may be obtained far away from
the true parameter values and thus some parameters cannot be accurately estimated and
are non-identifiable [120]. It may be explained taking a look at the approximation of the
parameter covariance matrix C in Eq. 4.3, where the inverse of F ≈ Hθ is also required.
It should be noted that well-established solution methods such as Levenberg-Marquardt
[82, 83] and trust region [90, 110] are adapted to handle indefinite matrices [3, 66].
The Levenberg-Marquardt method is based on the Tikhonov regularization of the linearized
problem in Newton’s method, and the trust-region method is a modification of Newton’s
method that originates from the Levenberg-Marquardt method. However, for nonlinear PE
problems with highly ill-conditioned matrices, Levenberg-Marquardt methods exhibit reduced
convergence rates [66]. For trust-region methods, efficiency and stability are negatively
affected.
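The inflation of the search direction can be made visible with a small numerical experiment. The sketch below assumes a Gauss-Newton Hessian approximation H = SᵀS, a hypothetical nearly collinear sensitivity matrix, a hypothetical residual vector Z and a hypothetical damping value mu; it compares the undamped step with a Levenberg-Marquardt-type step. It is an illustration of the mechanism, not the implementation used in this thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 30)
S = np.column_stack([t, t + 1e-5 * t**2])   # hypothetical, nearly collinear columns
Z = 1e-3 * rng.standard_normal(t.size)      # hypothetical small residual vector

g = S.T @ Z   # gradient of 0.5 * ||Z||^2 in Gauss-Newton form
H = S.T @ S   # Gauss-Newton approximation of the Hessian

# Undamped step: relies on the inverse of the ill-conditioned H.
step_gn = -np.linalg.solve(H, g)

# Damped step: H + mu*I is the Tikhonov-regularized linearized problem.
mu = 1e-3     # hypothetical damping parameter
step_lm = -np.linalg.solve(H + mu * np.eye(2), g)

print("cond(H)            =", np.linalg.cond(H))
print("norm undamped step =", np.linalg.norm(step_gn))
print("norm damped step   =", np.linalg.norm(step_lm))
```

Although the residuals are small, the undamped step is dominated by the direction of the smallest eigenvalue of H, whereas the damping term bounds the contribution of that direction.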
Effect of ill-conditioning on optimal experimental design
It is strongly recommended to carefully analyze the conditioning of S when applying the
alphabetic design criteria in Section 2.7.1. For example, the D-criterion cannot be
calculated if at least one of the SVs of S is zero. This means that an eigenvalue of F is
zero (i.e., λi(F) = 0), so F is not invertible and hence the covariance matrix C in Eq. 4.3
cannot be evaluated. Even if none of the SVs of S is exactly zero, the presence of one or
several values near zero can degrade the inversion and produce criterion values
approaching infinity. Moreover, negative eigenvalues can appear because of inherent
numerical errors in the computation of F (Eq. 2.10), which can affect the positive
semi-definiteness of that matrix. Indeed, the theoretically positive semi-definite matrix F
(see Definition A.5.15) can lose its positive semi-definite property despite conserving the
symmetry (Definition A.5.3) of its structure. In those cases, the calculated design criteria
are meaningless and an optimal experimental design should not be performed.
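One way to guard a design criterion against these failures is to evaluate it from the singular values of S instead of forming and inverting F. The sketch below assumes C = (SᵀS)⁻¹ and the commonly used definitions A = trace(C), D = det(C) and E = largest eigenvalue of C (the exact forms of Section 2.7.1 may differ); the function name and the relative tolerance are hypothetical.

```python
import numpy as np

def design_criteria(S, tol=1e-12):
    """A-, D- and E-criterion values computed from the singular values of S.

    Assumes C = (S^T S)^{-1}; returns None when S is numerically
    rank-deficient, i.e. when the criteria cannot be evaluated.
    """
    sv = np.linalg.svd(S, compute_uv=False)
    if S.shape[0] < S.shape[1] or sv[-1] <= tol * sv[0]:
        return None
    inv_ev = 1.0 / sv**2   # eigenvalues of C, non-negative by construction
    return {"A": inv_ev.sum(), "D": inv_ev.prod(), "E": inv_ev.max()}

S_ok = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
S_bad = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])  # column 2 = 2 * column 1

print(design_criteria(S_ok))   # finite criterion values
print(design_criteria(S_bad))  # None: F = S^T S is singular
```

Since singular values are non-negative by definition, this route cannot produce the negative eigenvalues that rounding errors may introduce into an explicitly computed F.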
3.2.2. Parameter variance-decomposition
In Eq. 2.19 the Fisher-information matrix F is defined in terms of the singular values
of the sensitivity matrix S. Moreover, Eq. 4.3 describes the relationship between F and C.
Combining these equations, C can also be expressed in terms of the singular values of S:

$$C = V (S_v^T S_v)^{-1} V^T = \sum_{i=1}^{N_\theta} \frac{v_i v_i^T}{\varsigma_i^2} \tag{3.4}$$

where $\varsigma_i$ and $v_i \in \mathbb{R}^{N_\theta}$ with $i = 1, \ldots, N_\theta$ are the i-th singular value and the
i-th right singular vector of S, respectively. Once the parameter covariance matrix C is
defined as a function of $\varsigma_i$, the decomposition of the variance can be achieved. As can be
observed in Eq. 3.4, the inverse of each squared singular value $\varsigma_i^2$ contributes to the
parameter covariance matrix C. Clearly, the variance $\sigma^2_{\theta_j}$ of the j-th parameter $\theta_j$ is
inversely affected by $\varsigma_i$ in the following form

$$\sigma^2_{\theta_j} = \sum_{i=1}^{N_\theta} \frac{v_{i,j} \cdot v_{i,j}}{\varsigma_i^2} \tag{3.5}$$

where $v_{i,j}$ is the j-th element of the i-th right singular vector of S. The variance $\sigma^2_{\theta_j}$
in Eq. 3.5 is obtained as a sum of components $v_{i,j} \cdot v_{i,j} / \varsigma_i^2$, which are here called
variance components $\sigma^2_{\theta_j|\varsigma_i}$, each associated with one of the singular values $\varsigma_i$ of S for
$i = 1, \ldots, N_\theta$ such that

$$\sigma^2_{\theta_j} = \sum_{i=1}^{N_\theta} \sigma^2_{\theta_j|\varsigma_i}. \tag{3.6}$$

Eq. 3.6 is known as the parameter variance-decomposition. Finally, in order to
quantify the contribution of each singular value $\varsigma_i$, $i = 1, \ldots, N_\theta$, to the variance $\sigma^2_{\theta_j}$
of each j-th parameter $\theta_j$, $j = 1, \ldots, N_\theta$, the proportions

$$\pi_{i,j} = \frac{\sigma^2_{\theta_j|\varsigma_i}}{\sigma^2_{\theta_j}} \tag{3.7}$$

are computed.
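Eqs. 3.5 to 3.7 translate almost literally into a few array operations. The following sketch assumes that S has full column rank and uses a hypothetical test matrix; the function name is illustrative. It returns the parameter variances and the matrix of proportions, whose columns sum to one.

```python
import numpy as np

def variance_decomposition(S):
    """Return (var, pi) following Eqs. 3.5-3.7.

    var[j]   : variance of parameter j (diagonal of C in Eq. 3.4)
    pi[i, j] : proportion of var[j] associated with the i-th singular value
    """
    _, sv, Vt = np.linalg.svd(S, full_matrices=False)
    comp = Vt**2 / sv[:, None]**2   # comp[i, j] = v_{i,j}^2 / sv_i^2
    var = comp.sum(axis=0)          # Eq. 3.6
    pi = comp / var                 # Eq. 3.7
    return var, pi

# Hypothetical sensitivity matrix with two strongly correlated columns.
t = np.linspace(0.0, 1.0, 20)
S = np.column_stack([t, t + 0.01 * t**2, np.ones_like(t)])

var, pi = variance_decomposition(S)
print("variances:", var)
print("proportions (rows: singular values, columns: parameters):")
print(np.round(pi, 3))
```

A proportion close to one in the row of a small singular value indicates that the variance of the corresponding parameter is dominated by that ill-conditioned direction.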
3.3. Numerical regularization for parameter estimation
Handling ill-posed PE problems aims at “trying to control the parameter variance”,
either by imposing upper and lower limits on the admissible parameter values [25] or by
using regularization techniques [13, 66, 52]. Although the former seems to control the
instability (at an amplitude as large as that allowed by the imposed limits), the solution
may remain meaningless, merely oscillating between the limits [25]. The latter eliminates
unwarranted oscillations (with a meaningful reduction of the variance level), but the
estimated parameters can be biased [4, 48, 49, 52, 63, 83].
All prior knowledge or information about the desired solution may be incorporated into
the identification problem of Eq. 2.5 in the form of constraints. This is the basic idea
of any regularization technique, which makes it numerically possible to deal with ill-posed
parameter estimations. These constraints or transformations stabilize the problem (because
it becomes better conditioned) and lead the regularized solution θReg to a useful and stable
region which is hopefully near the desired solution (θ∗) [52, 63, 66].
When a regularization technique (Reg) is applied to Eq. 2.5, the solution norm, either the
2-norm (i.e., $\|\theta^{Reg}\|_2$) or an appropriate seminorm, and the residual value (i.e.,
$Z = (Y(u, \theta^{Reg}) - Y^m)$) should remain small. Nevertheless, the regularized solution θReg is
obtained from a modified problem. Hence θReg is generally biased but also has a reduced
variance, as the regularization decreases the size of the solution’s covariance matrix
[38, 52, 63].
For rank-deficient problems, i.e., problems with a matrix S for which rϵ is well defined,
regularization strategies such as parameter Subset Selection (SsS) and Truncated Singular
Value Decomposition (TSVD), which filter a subset of either the parameters or the singular
values, can be applied. The key feature of these strategies is to extract the linearly
independent information of S and to transform the problem into a well-conditioned one [52].
Ill-determined rank problems (i.e., rϵ of S cannot be determined exactly, independently of
the value of ϵ used) can generally be treated with strategies for rank-deficient problems.
However, better suited are regularization techniques which introduce a penalty term to
ensure smoothness of the solution, for instance the Tikhonov regularization (Tikh), known
in linear cases as ridge regression [4, 52, 63, 83]. Here the ill-conditioning of S is also
directly modified. Instead of selecting its well-conditioned information (maintaining the
formulation in Eq. 2.5), a new well-conditioned matrix (i.e., the regularization parameter
multiplied by a predefined full-row-rank matrix) is appended to S [52, 57, 56, 83].
The three regularization techniques SsS, TSVD and Tikh are studied in the sequel. These
techniques were chosen because of their broad application, good performance and easy
implementation. Moreover, SsS and TSVD are closely related to the ill-conditioning analysis
presented here because they are also based on the singular value analysis. It is
important to point out that all mentioned regularization techniques aim at getting
rid of the ill-conditioning of the original problem and turning it into a new well-posed
problem [52, 70]. In the field of parameter subset selection, more sophisticated methods
which are also based on orthogonalization have been proposed recently [28, 30,
129]. The main differences are in the chosen orthogonalization algorithm (e.g.,
Gram-Schmidt or Householder transformation) and in how parameter combinations are selected
for estimation (possible criteria are the maximum determinant of the reduced FIM, the
largest column norm of the reduced sensitivity matrix or the lowest MSE). In Ref. 30 the
selection is formulated as an optimization problem. However, all strategies coincide in
finding a parameter subset which generates a well-conditioned sensitivity matrix in the
reduced parameter space.
Furthermore, each technique uses a regularization parameter (ϵ-threshold for SsS and
TSVD, and λ for Tikhonov) which must be tuned according to the severity of the
ill-conditioning in the particular application. The regularized sensitivity matrix $S^{Reg}$ for
Reg = SsS, TSVD, Tikh is computed and used to obtain $H^{Reg}_\theta$ (see Eq. 2.11) and
$F^{Reg} \approx H^{Reg}_\theta$. The (more stable) inverse of $H^{Reg}_\theta$ is the basis not only for the
calculation of the step direction υk in Eq. 2.6 but also for the construction of the
parameter covariance matrix in Eq. 4.3.
3.3.1. Parameter subset selection (Reg=SsS)
The idea of dealing with an ill-conditioned estimation by selecting a well-conditioned
parameter subset and solving a new (but reduced) estimation was first applied
in linear least squares problems with non-orthogonal data [56] and in nonlinear
estimation with inflated step directions υk in Eq. 2.6 [83]. More recently, systematic
strategies for parameter subset selection have been proposed (see for instance Refs.
18, 19, 26, 28, 30, 45, 52, 87, 79, 122, 123, 129, 134).
In this thesis, the identifiable parameters used in this regularization are
the most linearly independent parameters, selected by applying orthogonal projections to
the columns of S [45]. This orthogonal method (here called QR method) is explained
in Section 4.4.2. It is a popular way of determining identifiable parameters by
computing the numerical rank rϵ (see Section A.5.11 of Appendix A.5) of the sensitivity
matrix [70, 78]. Using the results of the QR method ensures a reliable
inversion of the Fisher-information matrix approximated in Eq. 2.13. This regularization
is based on estimating only the so-called identifiable parameters. Although this means
that only certain parameters are estimated, their values are obtained with improved
reliability [120].
For the solution of nonlinear problems this local regularization technique should be
applied repeatedly during the search. The status of the parameters (active or nonactive)
along with the estimator performance analysis are then iteratively refined, and nonactive or
unidentifiable parameters ($\theta^{(N_\theta - r_\epsilon)}$) are fixed at the currently best available estimate
($\hat{\theta}^{(N_\theta - r_\epsilon)}$) [79]. The corresponding reduced PE problem takes the following form:

$$\theta^{SsS} := \arg\min_{\theta^{(r_\epsilon)}} \Phi^{SsS}(u, \theta^{(r_\epsilon)}) \tag{3.8a}$$

$$\Phi^{SsS}(u, \theta^{(r_\epsilon)}) := \Phi^{LSQ}(u, \theta^{(r_\epsilon)}). \tag{3.8b}$$

The regularized Jacobian $J^{SsS}_\theta \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times r_\epsilon}$ and Hessian $H^{SsS}_\theta \in \mathbb{R}^{r_\epsilon \times r_\epsilon}$ of the cost
function $\Phi^{SsS}$ in Eq. 3.8a are given in Table 3.1. They are obtained from the regularized
sensitivity matrix $S^{SsS} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times r_\epsilon}$, which stores the linearly independent columns
of S (i.e., the sensitivities corresponding to the $r_\epsilon$ active parameters of θ, see
Section 4.4.2). Therefore, the regularized matrix $S^{SsS}$ is well-conditioned and has full
column rank with $\kappa(S^{SsS}) \le \kappa_{max}$ and $\gamma(S^{SsS}) \le \gamma_{max}$ [79].
Furthermore, according to the interlacing inequality of the singular values, each i-th
singular value of the sub-matrix $S^{SsS}$ is at most as large as the corresponding i-th singular
value of the original matrix S, i.e., $\varsigma_i(S^{SsS}) \le \varsigma_i(S)$ [117, 24]. Also note that $H^{SsS}_\theta$ is
computed according to the assumptions in Eq. 2.10. The application-dependent threshold
ϵ (see Eq. 3.3) is considered the regularization parameter in this technique, which should
be tuned according to the severity of the ill-conditioning in each case study.
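The construction of the reduced sensitivity matrix can be sketched as follows. The code is not the QR method of Section 4.4.2; it uses a greedy pivoted Gram-Schmidt selection as a simple stand-in, with a hypothetical absolute threshold eps and a hypothetical sensitivity matrix containing one duplicated direction and one insensitive parameter. It also checks the interlacing inequality numerically.

```python
import numpy as np

def select_subset(S, eps):
    """Greedy column selection by pivoted Gram-Schmidt (stand-in for pivoted QR).

    Returns the sorted indices of the columns judged linearly independent;
    eps is a hypothetical absolute threshold on the remaining column norm.
    """
    R = S.astype(float).copy()
    active = []
    for _ in range(S.shape[1]):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))
        if norms[j] <= eps:
            break                        # remaining columns carry no new information
        active.append(j)
        q = R[:, j] / norms[j]
        R -= np.outer(q, q @ R)          # remove the selected direction everywhere
    return sorted(active)

t = np.linspace(0.0, 1.0, 25)
# Columns: t, 2t (collinear with t), exp(-t), and an insensitive parameter.
S = np.column_stack([t, 2.0 * t, np.exp(-t), 1e-10 * t**2])

active = select_subset(S, eps=1e-6)
S_sss = S[:, active]                     # reduced (regularized) sensitivity matrix

sv_full = np.linalg.svd(S, compute_uv=False)
sv_sub = np.linalg.svd(S_sss, compute_uv=False)
print("active parameters:", active)
print("cond(S_SsS) =", sv_sub[0] / sv_sub[-1])
print("interlacing holds:", bool(np.all(sv_sub <= sv_full[:len(active)] + 1e-12)))
```

The inactive columns are simply dropped from the estimation, which corresponds to fixing the associated parameters at their currently best available values.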
3.3.2. Truncated singular value decomposition (Reg=TSVD)
In this regularization a new problem with a well-conditioned but rank-deficient sensitivity
matrix ($S^{TSVD} \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_\theta}$) is derived [52, 131]. The matrix $S^{TSVD}$ is obtained by
truncating the SVs of S at $r_\epsilon$ and substituting the small nonzero singular values
$\varsigma_{r_\epsilon+1}, \ldots, \varsigma_{N_\theta}$ with exact zeros (see Table 3.1). It should be noted that the same
truncation principle is used to compute the generalized inverse of $A = X^T X$ when the
existence of a distribution of eigenvalues near zero complicates the calculation of its
inverse [83]. In TSVD, the numerical rank $r_\epsilon$ is computed as described in Section A.5.11.
Thus, the regularization parameter in this technique is once again the ϵ-threshold (see
Eq. 3.3 for its determination). The matrix $S^{TSVD}$ is rank-deficient. It is the closest
rank-$r_\epsilon$ approximation to S, with residual 2-norm given by
$\|S - S^{TSVD}\|_2 = \varsigma_{r_\epsilon+1}$ [52].
By applying the TSVD technique, the dimension of the parameter space Ω and the cost
function of the original PE problem in Eq. 2.5b remain unchanged:

$$\theta^{TSVD} := \arg\min_{\theta} \Phi^{TSVD}(u, \theta) \tag{3.9a}$$

$$\Phi^{TSVD}(u, \theta) := \Phi^{LSQ}(u, \theta). \tag{3.9b}$$

The regularized Jacobian $J^{TSVD}_\theta \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_\theta}$ and Hessian $H^{TSVD}_\theta \in \mathbb{R}^{N_\theta \times N_\theta}$ matrices
are calculated from the regularized sensitivity matrix $S^{TSVD}$ according to Eqs. 2.7b and
2.11, respectively (see Table 3.1). It should be noted that, since in most parameter
estimation problems the parameters are correlated to some extent, all parameter
sensitivities are affected by the truncation of one or several singular values of S.
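The truncation itself takes only a few lines. The following sketch assumes an absolute threshold eps on the singular values (the rule of Eq. 3.3 may be defined differently) and a hypothetical sensitivity matrix; it verifies that the residual 2-norm equals the first discarded singular value and that all columns of S are modified by the truncation.

```python
import numpy as np

def truncate_svd(S, eps):
    """Return (S_tsvd, r, sv): rank-r approximation of S, r = number of sv > eps."""
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    r = int(np.sum(sv > eps))                  # numerical rank for this threshold
    S_tsvd = (U[:, :r] * sv[:r]) @ Vt[:r, :]   # sum of the first r dyads
    return S_tsvd, r, sv

t = np.linspace(0.0, 1.0, 25)
# Hypothetical S: the first two columns are nearly identical.
S = np.column_stack([t, t + 1e-3 * t**2, np.exp(-t)])

S_tsvd, r, sv = truncate_svd(S, eps=1e-2)
print("singular values :", sv)
print("numerical rank  :", r)
print("||S - S_tsvd||_2:", np.linalg.norm(S - S_tsvd, 2))
print("max change per column:", np.abs(S - S_tsvd).max(axis=0))
```

Unlike subset selection, the matrix keeps all of its columns, so every parameter remains in the estimation while the weakly determined direction is removed.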
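Table 3.1.: Definition of the specific derivatives used for “SsS”, “TSVD” and “Tikh” regularization; regularization “Reg” equal to “None” refers to the original parameter estimation problem. (Table from publication III - Lopez et al. (2015) - reprinted with permission from Elsevier Science)

| Reg | $S^{Reg}$ | $J^{Reg}_\theta$ | $H^{Reg}_\theta$ | $\dfrac{\partial^2 \Phi^{Reg}}{\partial\theta\,\partial Y^m}$ | $C^{Reg}$ |
|---|---|---|---|---|---|
| None | $(\Sigma_y)^{-1} \dfrac{\partial Y}{\partial \theta}$ | $(S^{None})^T Z$ | $(S^{None})^T S^{None}$ | $-(S^{None})^T$ | $[H^{None}_\theta]^{-1}$ |
| SsS | $(\Sigma_y)^{-1} \dfrac{\partial Y}{\partial \theta^{(r_\epsilon)}}$ | $(S^{SsS})^T Z$ | $(S^{SsS})^T S^{SsS}$ | $-(S^{SsS})^T$ | $[H^{SsS}_\theta]^{-1}$ |
| TSVD | $\sum_{i=1}^{r_\epsilon} \varsigma_i u_i v_i^T$ | $(S^{TSVD})^T Z$ | $\sum_{i=1}^{r_\epsilon} \varsigma_i^2 v_i v_i^T$ | $-\sum_{i=1}^{r_\epsilon} \varsigma_i u_i v_i^T$ | $\sum_{i=1}^{r_\epsilon} \dfrac{v_i v_i^T}{\varsigma_i^2}$ |
| Tikh | $\begin{bmatrix} S^{None} \\ \lambda L \end{bmatrix}$ | $J^{None}_\theta + \frac{1}{2}\lambda^2 J_\Omega(\theta)$ | $H^{None}_\theta + \frac{1}{2}\lambda^2 H_\Omega(\theta)$ | $-(S^{None})^T$ | $[H^{Tikh}_\theta]^{-1}$ |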
3.3.3. Tikhonov regularization (Reg=Tikh)
This regularization modifies the identification problem in Eq. 2.5 by incorporating
assumptions about the size and smoothness of the desired solution [52, 63, 118]. This a priori
information is considered as an additional penalty (regularizing) term in the cost function
of the regularized parameter estimation problem

$$\theta^{Tikh} := \arg\min_{\theta} \Phi^{Tikh}(u, \theta) \tag{3.10a}$$

$$\Phi^{Tikh}(u, \theta) := \Phi^{LSQ}(u, \theta) + \frac{1}{2}\lambda^2 \Omega(\theta), \tag{3.10b}$$

where $\Omega(\theta) := \|L(\theta - \theta_R)\|_2^2$ is the penalty term, also called the discrete smoothing
norm, which is assumed to be twice continuously differentiable. Its purpose is to attract
non-identifiable parameters towards their corresponding values in a predefined vector
$\theta_R \in \mathbb{R}^{N_\theta}$ [63]. $\theta_R$ should contain the most reliable parameter values (prior information)
available.
The scalar parameter λ, known as the regularization parameter, controls the contribution
of the regularization term in the regularized cost function $\Phi^{Tikh}$ relative to the original,
ill-posed cost function $\Phi^{LSQ}$. If λ is close to zero the regularization is weak and
the problem again becomes ill-posed; on the other hand, if λ is large the regularization
is strong and the solution degenerates to $\theta = \theta_R$. Heuristic and systematic techniques
to determine λ for the linear case are available in the literature (e.g., the L-curve [52]
and the METER criterion [4]). The matrix $L \in \mathbb{R}^{N_\theta \times N_\theta}$ is a diagonal matrix, typically the
identity matrix $I_{N_\theta}$. Other approaches [4, 52, 63, 118] define L as a discrete approximation
of a derivative operator (e.g., first or second derivative operators) or λL as the inverse
of the parameter variances [52]. The most common choice of the functional Ω(θ) is the ridge
regression stabilizer [82, 57, 56], where $\theta_R = 0_{N_\theta}$. In this work, $L = I_{N_\theta}$ and no a priori
information, i.e., $\theta_R = 0_{N_\theta}$ (zero vector), are considered. It is important to point out
that a priori information refers to the best available estimate or value of the
parameters. The Jacobian $J^{Tikh}_\theta \in \mathbb{R}^{N_y \cdot N_m \cdot N_e \times N_\theta}$ and Hessian $H^{Tikh}_\theta \in \mathbb{R}^{N_\theta \times N_\theta}$ matrices are
given in Table 3.1. Therein, the first and second derivatives of Ω(θ) with respect to θ read
$J_\Omega(\theta) = \partial\Omega(\theta)/\partial\theta = 2\theta$ and $H_\Omega(\theta) = \partial^2\Omega(\theta)/\partial\theta^2 = 2 I_{N_\theta}$, respectively.
In this regularization the model can be improved without modifying the model structure,
which means that the parameter space remains unchanged. Indeed, the ill-conditioning of
S is eliminated by introducing a new well-conditioned matrix $S^{Tikh}$ of full rank (see Table
3.1). The regularized sensitivity matrix $S^{Tikh} \in \mathbb{R}^{(N_y \cdot N_m \cdot N_e + N_\theta) \times N_\theta}$ is a row-augmented
matrix whose last rows are formed by a user-defined full-row-rank matrix [52, 57, 56, 83].
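The effect of the row augmentation can be checked directly. The sketch below assumes L = I and θR = 0 as chosen in this work, together with a hypothetical sensitivity matrix and a hypothetical value of λ. It confirms that the augmented matrix reproduces the Tikhonov Hessian of Table 3.1, H_None + ½λ²·H_Ω with H_Ω = 2I, and that its condition number is bounded by the regularization term.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 25)
S = np.column_stack([t, t + 1e-6 * t**2])   # hypothetical ill-conditioned S
lam = 1e-2                                   # hypothetical regularization parameter
L = np.eye(S.shape[1])                       # L = I, theta_R = 0

S_tikh = np.vstack([S, lam * L])             # row-augmented sensitivity matrix
H_none = S.T @ S
H_tikh = H_none + 0.5 * lam**2 * (2.0 * L.T @ L)   # H_None + 0.5*lam^2*H_Omega

print("shape of S_tikh :", S_tikh.shape)
print("cond(S)         =", np.linalg.cond(S))
print("cond(S_tikh)    =", np.linalg.cond(S_tikh))
print("Hessians agree  :", np.allclose(S_tikh.T @ S_tikh, H_tikh))
```

The smallest singular value of the augmented matrix cannot fall below λ, which is why a value of λ close to zero lets the problem become ill-posed again.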
3.3.4. Regularized parameter covariance matrix
The computation of the covariance matrix of the biased estimator θReg depends on the
chosen regularization technique. Here this matrix is denoted as $C^{Reg}$. The general
approximation, from which Eq. 4.3 also originates, is given in Eq. 3.11 (Bard, 1974;
Fessler, 1996). The specific terms used for each regularization technique are given in
Table 3.1.

$$C^{Reg} \approx \left[H^{Reg}_\theta\right]^{-1} \left[\frac{\partial^2 \Phi^{Reg}}{\partial\theta\,\partial Y^m}\right] \left[\frac{\partial^2 \Phi^{Reg}}{\partial\theta\,\partial Y^m}\right]^T \left[H^{Reg}_\theta\right]^{-1} \tag{3.11}$$

Using Tikhonov regularization, especially for small regularization parameter values
(λ → 0), the parameter covariance matrix $C^{Tikh}$ in Eq. 3.11 can be approximated by
$[H^{Tikh}_\theta]^{-1}$ (as shown in Table 3.1). Hereafter this simplified version is referred to
throughout this manuscript as Tikhonov regularization (“Reg=Tikh”).
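The quality of the simplified Tikhonov covariance can be examined numerically. The following sketch evaluates Eq. 3.11 with the Tikhonov terms of Table 3.1, assuming L = I, θR = 0 and a hypothetical, well-conditioned sensitivity matrix, and compares the result with the simplified form [H_Tikh]⁻¹ for several hypothetical values of λ.

```python
import numpy as np

def tikhonov_covariances(S, lam):
    """Return (C_full, C_simple) for L = I and theta_R = 0.

    C_full   : Eq. 3.11 with d2Phi/(dtheta dY^m) = -S^T
    C_simple : [H_Tikh]^{-1}, the simplified form
    """
    H = S.T @ S + lam**2 * np.eye(S.shape[1])
    H_inv = np.linalg.inv(H)
    M = -S.T                                 # mixed second derivative
    C_full = H_inv @ M @ M.T @ H_inv
    return C_full, H_inv

t = np.linspace(0.0, 1.0, 25)
S = np.column_stack([np.ones_like(t), t])    # hypothetical sensitivity matrix

for lam in (1e-6, 1e-1, 1.0):
    C_full, C_simple = tikhonov_covariances(S, lam)
    print(f"lambda = {lam:g}")
    print("  variances, Eq. 3.11  :", np.diag(C_full))
    print("  variances, simplified:", np.diag(C_simple))
```

For L = I the difference between the two forms is λ²[H_Tikh]⁻², so the simplified form never underestimates the variances of Eq. 3.11 and coincides with them as λ → 0.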
4. Computational framework
This chapter presents a computational framework for the analysis of ill-conditioning and
identifiability issues as well as for the implementation of numerical regularization in
several stages of the model development (see Figure 4.1).
The framework may be used as a whole or in parts, to work either on parameter
estimation or on optimal experimental design for a previously selected model structure. It
is also conceived to treat adaptive designs and online parameterizations. The existence
of ill-posed problems is a challenge in the experimental implementation. To make PE and
OED robust against ill-posed problems, this chapter systematically combines the techniques
described in Chapters 2 and 3 into the main procedural components of the computational
framework (see Figure 4.1), i.e.,
1. the estimator performance assessment,
2. the ill-conditioning analysis, and
3. the identifiability diagnosis.
The framework offers the option of analyzing the estimator with two major paradigms,
namely the sensitivity and Monte Carlo methods described in Publication I [80]. The former
uses numerical techniques that perform singular value analysis of the sensitivity matrix.
The latter uses statistical techniques based on Monte Carlo simulations. Furthermore, in
model-based experimental design, the framework considers the most common optimality
criteria for parameter variance reduction (i.e., the alphabetic criteria), even under
regularization.
In the sequel the algorithm of the consolidated global computational framework
displayed in Figure 4.1 is described. Additional theoretical details for assessing the
estimator according to the sensitivity and Monte Carlo paradigms of analysis are given.
Then, guidelines for conducting the ill-conditioning analysis, the local identifiability
diagnosis and the implementation of regularization are presented. The chapter ends with two
additional guidelines regarding sensitivity analysis and the selection of the parameter
initial guess.
4.1. Algorithm
[Figure 4.1: flowchart connecting the phases Experiment, Modeling, Parameter estimation and Optimal experimental design with the evaluations 1. Estimator performance assessment, 2. Ill-conditioning analysis and 3. Identifiability diagnosis, and with the decisions "Precision reached?" and "Ill-posed problem?"]
Figure 4.1.: Consolidated framework for development and experimental validation of process models with possible ill-posed problems
Figure 4.1 outlines the algorithm of the computational framework. It is
based on the iterative work cycle of model-based experimentation for model development
explained in Section 2.1. Consequently, it considers the four major phases, namely
the experiment, modeling, parameter estimation and optimal experimental design.
Furthermore, it includes the evaluation of the quality of the estimator, the analysis of
latent ill-posedness and the constant diagnosis of identifiability before and after the
parameter estimation. The implementation of regularization in parameter estimation as
well as in optimal design is included. The iterative algorithm finishes when the desired
parameter precision is reached. The complete framework is originally depicted for the
sensitivity method (see Section 4.2.1); thus in Figure 4.1 the three components (the
estimator performance assessment, the ill-conditioning analysis, and the identifiability
diagnosis) are based on the sensitivity matrix. It is structured in this way to allow for
the online redesign of experiments (Section 2.7.5), which needs fast strategies for the
experimental implementation. Nevertheless, the algorithm may also be executed using the
Monte Carlo method described in Section 4.2.2. In that case, the sample mean (expected
value) of the parameter estimates (Eq. 4.5) is used to optimally design the new experiment
in the sequential OED approach (Section 2.7.4).
The algorithm is presented as follows:
Experiment: Before starting an experiment, an initial experimental design
vector uIG has to be chosen. Then these experimental conditions are implemented and
the experimental data Y m are collected.
Modeling: The structure of the model m should have been previously selected and
fixed. An initial parameter guess θIG should be defined and the model should be solved to
get the corresponding model outputs Y (uIG, θIG) as well as the corresponding sensitivity
matrix S(uIG, θIG). With this information the evaluation of estimator quality,
ill-conditioning and identifiability starts.
1. Estimator performance assessment: Precision measures, namely parameter variances
from the covariance matrix (Section 4.3.1) and confidence intervals (Section 4.3.1), are
computed.
2. Ill-conditioning analysis: The singular value analysis of the sensitivity matrix in
Section 4.4.1 is conducted to detect structural problems. If the sensitivity matrix is
ill-conditioned, the ill-conditioned singular values, the numerical rank and the class of
ill-posedness are determined to be used in the identifiability diagnosis in 3. (when
applicable).
3. Identifiability diagnosis: The unidentifiable parameters are determined by using the
guidelines in Section 4.4.2. The results from 2. (when applicable) are used here.
Precision reached?: The precision of the initial guess θIG (parameter variances
calculated in 1.) is compared to a predefined variance threshold. Yes: The current
parameter variances are below the threshold; the available parameter initial guess is then
adequate for the model and the model calibration is finished. No: The current parameter
initial guess is still uncertain and new estimates should be obtained.
Ill-posed problem?: The results from 2. about ill-posedness are employed here. Yes:
If the problem is ill-posed, the parameter estimation should be regularized, i.e.,
Reg=Active. One of the regularization techniques explained in Section 3.3 should be applied
according to the kind of ill-posed problem. No: If the problem is not ill-posed, the
parameter estimation can be performed without regularization, i.e., Reg=None.
Parameter estimation: The solution of the PE problem in Eq. 2.5, either with or
without regularization, is calculated. The parameter estimate θReg, the corresponding
regularized sensitivity matrix SReg(uIG, θReg) in Table 3.1 and the original sensitivity
matrix S(uIG, θReg)¹, both evaluated at uIG and θReg, are then obtained.
The three evaluations 1. Estimator performance assessment, 2. Ill-conditioning
analysis and 3. Identifiability diagnosis are then applied once more to the regularized
and original sensitivity matrices SReg(uIG, θReg) and S(uIG, θReg), respectively.
Ill-posed problem?: The results from 2. about the ill-posedness of SReg(uIG, θReg)
are employed here. Yes: If the problem is ill-posed, the parameter estimation should
be regularized, i.e., Reg=Active. A regularization technique explained in Section 3.3 should
be implemented according to the kind of ill-posed problem. The parameter estimation
should be repeated iteratively with θIG = θReg until this condition is no longer fulfilled.
No: If the problem is not ill-posed, the evaluation of the parameter precision should be
performed.
Precision reached?: The precision of the estimate θReg is evaluated.
To do so, the parameter variances calculated in 1. after the parameter estimation are
compared with a predefined variance threshold. Yes: The current parameter variances are
below the threshold; the available parameter estimates are then adequate for the model and
the model calibration is finished. No: The current parameter estimates are still uncertain
and a new experiment should be designed.
Ill-posed problem?: The results from 2. about the ill-posedness of S(uIG, θReg)
after parameter estimation are employed here. Yes: If the problem is still ill-posed,
the optimal experimental design should remain regularized, i.e., Reg=Active,
according to the regularization chosen in the parameter estimation. No: If the problem
is not ill-posed, the optimal experimental design is performed without regularization, i.e.,
Reg=None.
¹ This matrix is obtained without regularization.
Optimal experimental design: The optimal design problem in Eq. 2.35 for the
corresponding regularized parameter covariance matrix or sensitivity matrix SReg(uIG, θReg)
in Table 3.1 is solved. The optimal design uReg and the corresponding updated original and
regularized sensitivity matrices S(uReg, θReg) and SReg(uReg, θReg), respectively, are then
obtained. At this point the possibility of using several OED criteria should be exploited.
The selection of the best optimal design should be based not only on the parameter variance
reduction but also on ill-conditioning and identifiability improvements.
Updating of the input design and parameter vectors: In the sequential and online
approaches for optimal experimental design in Sections 2.7.4 and 2.7.5 these vectors are
repeatedly updated. Accordingly, the initial input design and parameter vector are updated
with the optimal design uReg and the parameter estimate θReg, respectively. The
cycle continues until the desired precision is reached.
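The two decisions taken before the first parameter estimation can be sketched in code. The following is a simplified illustration, not the implementation of the framework: it assumes C = (SᵀS)⁻¹ for the variances, uses the condition number of S as the only indicator of ill-posedness, and introduces the hypothetical thresholds var_max and kappa_max as well as the hypothetical function name.

```python
import numpy as np

def first_pass_decision(S, var_max, kappa_max):
    """Return the next step of the cycle for a given sensitivity matrix S.

    var_max and kappa_max are hypothetical user thresholds for the parameter
    variances and for the condition number of S.
    """
    _, sv, Vt = np.linalg.svd(S, full_matrices=False)
    full_rank = S.shape[0] >= S.shape[1] and sv[-1] > 1e-14 * sv[0]
    if full_rank:
        variances = (Vt**2 / sv[:, None]**2).sum(axis=0)  # Eq. 3.5
        kappa = sv[0] / sv[-1]
    else:
        variances = np.full(S.shape[1], np.inf)           # C cannot be evaluated
        kappa = np.inf

    if np.all(variances < var_max):   # Precision reached?
        return "finished"
    if kappa > kappa_max:             # Ill-posed problem?
        return "estimate, Reg=Active"
    return "estimate, Reg=None"

S_precise = 100.0 * np.eye(2)
S_uncertain = np.eye(2)
S_illposed = np.array([[1.0, 1.0], [1.0, 1.0 + 1e-6]])

for S in (S_precise, S_uncertain, S_illposed):
    print(first_pass_decision(S, var_max=1e-2, kappa_max=1e3))
```

The same two checks are repeated after each parameter estimation, with the regularized and the original sensitivity matrices evaluated at the new estimate.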
4.2. Analysis paradigms
This section describes two paradigms to analyze estimated parameters. The first paradigm
is based on one parameter estimation and the inversion of the Fisher-information matrix
to form the parameter covariance matrix C. The Fisher-information matrix is computed
from parameter sensitivities (Section 4.3.1); therefore this approach is called the
sensitivity method. The second paradigm, on the other hand, requires multiple parameter
estimations and assesses the parameter uncertainty with data from finite samples. It needs
repeated optimizations and is called the Monte Carlo method.
Both strategies follow the same structure of analysis: they solve the parameter
estimation problem, analyze the ill-conditioning and identifiability issues, evaluate the
performance of the estimator and (where applicable) conduct the optimal experimental design.
Moreover, all computations (PE, OED, ill-conditioning and identifiability analysis) in the
case studies analyzed in this thesis use normalized predicted response variables

$$\bar{y}_i(u, \theta, t_k) = \frac{y_i(u, \theta, t_k)}{\max_{t_k \in T} y^m_i(t_k)}, \tag{4.1}$$

with $i = 1, \ldots, N_y$ and for all $t_k \in T$, and normalized parameters

$$\bar{\theta}_j = \frac{\theta_j}{\theta_{j,IG}}, \tag{4.2}$$

with $j = 1, \ldots, N_\theta$. Hereinafter, for notational simplicity, $y_i \leftarrow \bar{y}_i$ and $\theta_j \leftarrow \bar{\theta}_j$. In this way,
the estimation problem is normalized and the computed sensitivity matrix is considered
in its standard form. This matrix preserves all information of the system [20]. A
standardized sensitivity matrix² reduces the influence of parameters and predicted response
variables with different orders of magnitude, which may otherwise mask the ill-conditioning
analysis and identifiability diagnosis.
² In linear estimation one recommended column equilibration is to scale the columns of the sensitivity matrix to unit length [11].
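The effect of the normalization in Eqs. 4.1 and 4.2 on the sensitivity matrix follows from the chain rule: each entry ∂yi/∂θj is multiplied by θj,IG and divided by the maximum measured value of the corresponding response. The sketch below illustrates this with a hypothetical one-response model whose two parameters differ by six orders of magnitude; the model, the numerical values and the function name are assumptions, and the model output stands in for measured data.

```python
import numpy as np

def normalize_sensitivities(S_raw, theta_ig, y_scale):
    """Scale dy_i/dtheta_j to the sensitivities of the normalized problem.

    theta_ig : initial guess used in Eq. 4.2, one entry per column of S_raw
    y_scale  : maximum measured value of the response each row belongs to (Eq. 4.1)
    """
    return S_raw * theta_ig[None, :] / y_scale[:, None]

# Hypothetical model y = theta_1 * exp(-theta_2 * t) with badly scaled parameters.
t = np.linspace(0.0, 1000.0, 30)
theta_ig = np.array([1.0e3, 1.0e-3])
y = theta_ig[0] * np.exp(-theta_ig[1] * t)        # stands in for measured data
S_raw = np.column_stack([
    np.exp(-theta_ig[1] * t),                      # dy/dtheta_1
    -theta_ig[0] * t * np.exp(-theta_ig[1] * t),   # dy/dtheta_2
])
y_scale = np.full(t.size, y.max())
S_norm = normalize_sensitivities(S_raw, theta_ig, y_scale)

print("cond(S_raw)  =", np.linalg.cond(S_raw))
print("cond(S_norm) =", np.linalg.cond(S_norm))
```

In this example the large condition number of the raw matrix is purely an artifact of the units, which is exactly the kind of masking the normalization is meant to avoid.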
4.2.1. Sensitivity method
This is the most common strategy to estimate parameters in nonlinear models. The inputs
of this paradigm are a fixed model structure, experimental data (Y m) collected under
the experimental design (u), and the initial guess of the parameters (θIG). The parameter or
point estimate θ is computed by solving the parameter estimation problem in Eq. 2.5.
The sensitivity matrix S (obtained either by using finite differences in Eq. 2.14 or the
sensitivity equations in Eq. 2.15), the Fisher-information matrix F (Eq. 2.10) and the
parameter covariance matrix (Eq. 4.3) are then obtained.
The sensitivity-based method is straightforward to implement and provides fast local
information on the model behavior around the parameter estimates. Sensitivity information
can also be used in conjunction with singular value analysis to perform structural model
analysis in order to detect ill-conditioning and diagnose identifiability problems. Because
of its inherently local nature, however, the covariance matrix obtained from sensitivities
might not provide a good covariance approximation (especially when S is ill-conditioned).
Consequently, this method is most useful for a qualitative analysis of parameter uncertainty,
the presence of ill-conditioning and (at least practical) identifiability problems.
The techniques used when model parameters are determined with the sensitivity method
are summarized in Figure 4.2. Notice that precision (see Section 4.3.1) but not
accuracy is considered, and that the ill-conditioning is analyzed by one method (see
Section 4.4.1), whereas the identifiability diagnosis may be performed by using three
different methods, namely the variance (see Section 4.4.2), SVD (see Section 4.4.2) and QR
(see Section 4.4.2) methods.
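The chain from sensitivities to parameter standard deviations can be sketched as follows. The example assumes a hypothetical two-parameter model, forward finite differences with a hypothetical relative step (Eq. 2.14 may use a different scheme), a constant measurement standard deviation sigma_y, and F = SᵀS with C = F⁻¹; the parameter values merely stand in for a point estimate.

```python
import numpy as np

def model(theta, t):
    """Hypothetical two-parameter model y = theta_1 * exp(-theta_2 * t)."""
    return theta[0] * np.exp(-theta[1] * t)

def fd_sensitivity(theta, t, rel_step=1e-6):
    """Forward-difference approximation of dy/dtheta, one column per parameter."""
    y0 = model(theta, t)
    S = np.empty((t.size, theta.size))
    for j in range(theta.size):
        h = rel_step * max(abs(theta[j]), 1.0)
        pert = theta.copy()
        pert[j] += h
        S[:, j] = (model(pert, t) - y0) / h
    return S

t = np.linspace(0.0, 5.0, 20)
theta_hat = np.array([2.0, 0.7])   # stands in for the point estimate
sigma_y = 0.05                      # assumed measurement standard deviation

S = fd_sensitivity(theta_hat, t) / sigma_y   # weighted sensitivity matrix
F = S.T @ S                                   # Fisher-information approximation
C = np.linalg.inv(F)                          # parameter covariance approximation
std = np.sqrt(np.diag(C))

print("cond(S) =", np.linalg.cond(S))
print("parameter standard deviations:", std)
```

The result describes the uncertainty only locally, around the point at which the sensitivities were evaluated.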
4.2.2. Monte Carlo method
This method is a simple and useful but computer-intensive technique to explore the unknown
form of the parameter probability distribution [89] and to reach conclusions about
ill-conditioning and identifiability [11, 80]. The idea behind this method is to propagate
random measurement errors in the experimental data through the parameter estimation and
to recover the corresponding parameters, which will have a specific probability
distribution. The measurement errors are sampled several times from a known probability
distribution such as a normal distribution. The various experimental data sets are
considered virtual replications of the same experiment.
Two versions of this paradigm can be implemented, depending on the available experimental
information. In both versions the results are employed to compute the expected
value E[Θ] in Eq. 4.5, the precision measures (i.e., the approximate covariance matrix C
in Eq. 4.4 and confidence intervals in Eq. 2.28) and, when applicable, the accuracy measures
(i.e., the bias in Eq. 2.29 and the empirical averaged MSE in Eq. 4.7). Both are based on
a fixed structure m, a fixed experimental design u and a fixed parameter initial guess θIG
(see Figure 4.3).
[Figure 4.2: diagram of the sensitivity method. Experiment (initial guess, experimental data) and Modeling (model, initial guess θ, model outputs) feed the parameter estimation; the sensitivity matrices at the initial guess and at the estimated parameter feed 1. Estimator performance assessment (precision: covariance matrix, confidence interval; reliability tests: hypothesis test), 2. Ill-conditioning analysis and 3. Identifiability analysis (variance, SVD and QR methods)]
Figure 4.2.: Model parameters estimated and analyzed based on the sensitivity method
The first version uses the available experimental data vector Y m and the measurement
error covariance matrix Cy to generate synthetic data. In this setting, NMC data sets
Y m(j), j = 1, · · · , NMC, are obtained by sampling around the available observations,
i.e., Y m(j) ∼ N (Y m, Cy). For each synthetic data set Y m(j), the problem in Eq. 2.5 is
solved to obtain the estimates θ(1), · · · , θ(NMC).
In the second version, the NMC data sets are sampled from the normal distribution with
mean Y (u, θ∗) and covariance Cy, i.e., Y m(j) ∼ N (Y (u, θ∗), Cy). The true parameter vector
(or reference vector) θ∗ is used to obtain the model responses Y (u, θ∗) at the fixed design u.
The estimation in Eq. 2.5 is then performed for each synthetic data set Y m(j) and the j-th
point estimate θ(j) is recovered.
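The two sampling versions can be illustrated with a minimal sketch. It assumes a toy model that is linear in the parameters, so that ordinary least squares can stand in for the estimation problem in Eq. 2.5. The names `monte_carlo_estimates`, `estimate`, `A`, `theta_star` and all numerical values are hypothetical and not part of the framework.

```python
import numpy as np

def monte_carlo_estimates(estimate, y_center, C_y, n_mc, seed=0):
    """Estimate parameters for n_mc synthetic data sets drawn from N(y_center, C_y).

    estimate : callable that maps a data vector to a parameter vector
               (hypothetical stand-in for solving the problem in Eq. 2.5).
    y_center : Y^m for the first version, Y(u, theta*) for the second version.
    """
    rng = np.random.default_rng(seed)
    data_sets = rng.multivariate_normal(y_center, C_y, size=n_mc)
    return np.array([estimate(y_j) for y_j in data_sets])

# Toy model (assumption): linear in the parameters, Y(u, theta) = A(u) @ theta.
u = np.linspace(0.0, 1.0, 20)
A = np.column_stack([np.ones_like(u), u])
theta_star = np.array([1.0, 2.0])
C_y = 0.01 * np.eye(u.size)

def estimate(y):
    # Ordinary least squares replaces the nonlinear estimator in this sketch.
    return np.linalg.lstsq(A, y, rcond=None)[0]

# Second version: sample around the model response at the reference vector.
thetas_v2 = monte_carlo_estimates(estimate, A @ theta_star, C_y, n_mc=500)

# First version: sample around one available (here simulated) measurement vector.
y_measured = A @ theta_star + np.random.default_rng(1).normal(0.0, 0.1, u.size)
thetas_v1 = monte_carlo_estimates(estimate, y_measured, C_y, n_mc=500)
```

Each row of the returned array is one point estimate θ(j). In the first version the estimates scatter around the estimate obtained from the measured data, in the second around the reference vector.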
The replications can also be interpreted as perturbations of the data that help
to assess the stability of the estimates. When the true parameter θ∗ is known (as in
theoretical studies), the accuracy of the estimator can also be assessed by computing the
bias and the empirical mean squared error of the estimate. The Monte Carlo approach
enables a more quantitative analysis, but it is computationally demanding. It is a purely
numerical procedure which quantifies the parameter uncertainties more precisely
[100]. Moreover, it is a more generally applicable computational alternative to diagnose
ill-conditioning [11] and analyze identifiability problems [125].
The techniques used when the model parameters are determined with the Monte Carlo method
are summarized in Figure 4.3. Note that both precision (see Section 4.3.1) and
accuracy (see Section 4.3.2) are considered, that the ill-conditioning is analyzed by only one
method (see Section 4.4.1), and that the identifiability diagnosis is performed using
only the variance method (see Section 4.4.2).
[Figure 4.3: flowchart. Modeling (model, initial guess θ, model outputs) and experiment
(experimental data) feed the parameter estimation, which is repeated for j = 1, · · · , NMC
with Y m(j) ∼ N (Y m, Cy) or Y m(j) ∼ N (Y (u, θ∗), Cy). The parameter estimates θ(j) feed
three blocks: (1) estimator performance assessment (precision: covariance matrix and
confidence interval; accuracy: bias and empirical MSE; reliability tests: hypothesis test),
(2) ill-conditioning analysis (Monte Carlo method) and (3) identifiability analysis
(variance method).]
Figure 4.3.: Model parameters estimated and analyzed based on the Monte Carlo method
4.3. Estimator performance assessment
Precision and accuracy are the criteria used in this framework to assess
the performance of an estimator. The theory of these statistical properties is summarized in
Sections 2.5.1 and 2.5.2 in Chapter 2. In the following, the computation of the precision and
accuracy measures corresponding to the sensitivity and Monte Carlo methods is given.
Notice that the accuracy computation in terms of bias and MSE is only possible when a
reference vector is available.
4.3.1. Estimator precision
The parameter variance σ2θj, the standard deviation σθj and the confidence interval for
the j-th parameter are calculated. Two strategies to compute the covariance matrix are
described here. The first strategy computes the so-called average estimated covariance
matrix based on sensitivity information. The second strategy calculates the so-called true
covariance matrix based on Monte Carlo simulations.
Covariance based on the Sensitivity Matrix
The matrix C is approximated by the inverse of the Fisher-information matrix in Eq. 2.13,
which is denoted as F . This approximation is known as the average estimated covariance matrix
[3],
$$C \approx F^{-1} = \left[S^T S\right]^{-1}. \tag{4.3}$$
The Fisher-information matrix F is guaranteed to be positive definite only when the
sensitivity matrix S is full-rank [101]. The relationship between the Fisher matrix and the
covariance matrix is derived by applying a first-order Taylor expansion of the maximum
likelihood function around θ, assuming a Gauss-Newton approximation of its Hessian and
using the Cramer-Rao bound [3, 38, 74]. Although this relationship holds only
approximately for nonlinear models, it is widely used [3, 38].
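As an illustration of Eq. 4.3, the following sketch inverts SᵀS after checking that S has full column rank. It assumes that S is already scaled by the measurement errors, as Eq. 4.3 suggests. The function name and the example matrix are hypothetical.

```python
import numpy as np

def covariance_from_sensitivities(S):
    """Approximate the parameter covariance as the inverse of F = S^T S (Eq. 4.3)."""
    S = np.asarray(S, dtype=float)
    if np.linalg.matrix_rank(S) < S.shape[1]:
        # F is singular when S is rank deficient, so no inverse exists.
        raise np.linalg.LinAlgError("S is rank deficient; F = S^T S is singular")
    F = S.T @ S
    return np.linalg.inv(F)

# Hypothetical sensitivity matrix with three measurements and two parameters.
S_example = np.array([[1.0, 0.0],
                      [0.0, 2.0],
                      [0.0, 0.0]])
C_example = covariance_from_sensitivities(S_example)
```

For an ill-conditioned S the explicit inverse amplifies rounding errors, which is one reason why the singular value analysis of Section 4.4.1 is carried out first.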
Covariance based on Monte Carlo
In this approach the so-called true parameter covariance matrix C is estimated as the
sample covariance of the NMC parameter point estimates θ(j) with j = 1, · · · , NMC obtained
from the Monte Carlo method (see Section 4.2.2),
$$C = \frac{1}{N_{MC} - 1} \sum_{j=1}^{N_{MC}} \left(\theta_{(j)} - E[\Theta]\right)\left(\theta_{(j)} - E[\Theta]\right)^T \tag{4.4}$$
where E[Θ] is approximated using the sample mean
$$E[\Theta] \approx \frac{1}{N_{MC}} \sum_{j=1}^{N_{MC}} \theta_{(j)}. \tag{4.5}$$
This statistical approach can better capture the effect of nonlinearities on the posterior
distribution of the parameters, but it does not provide much insight on specific sources of
identifiability issues.
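Eqs. 4.4 and 4.5 translate directly into a few lines of code. The sketch below assumes that the Monte Carlo point estimates are stored row-wise in an array. The function name and the three example estimates are hypothetical.

```python
import numpy as np

def mc_mean_and_covariance(thetas):
    """Sample mean (Eq. 4.5) and sample covariance (Eq. 4.4) of the estimates.

    thetas : array of shape (N_MC, N_theta), one point estimate per row.
    """
    thetas = np.asarray(thetas, dtype=float)
    n_mc = thetas.shape[0]
    mean = thetas.sum(axis=0) / n_mc
    deviations = thetas - mean
    cov = deviations.T @ deviations / (n_mc - 1)
    return mean, cov

# Three hypothetical point estimates of a two-parameter vector.
thetas_example = np.array([[1.0, 2.0],
                           [3.0, 2.0],
                           [2.0, 5.0]])
mean_example, cov_example = mc_mean_and_covariance(thetas_example)
```

The result agrees with `np.cov(thetas, rowvar=False)`, which uses the same N − 1 normalization.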
Confidence intervals
The computation of the confidence intervals according to Section 2.5.1 is based on the
estimator Θj and the standard deviation σθj of the j-th parameter, the significance level α
and the degrees of freedom. In the sensitivity method the j-th estimator Θj is the point
estimate θj , while in the Monte Carlo method the j-th element of the sample mean E[Θ]
in Eq. 4.5 is used. σθj is extracted from the covariance matrices in Eqs. 4.3 and 4.4
depending on the selected method. α is typically 0.05.
As a further metric the confidence length
$$\mathrm{CIL}(j) = \mathrm{UB}(j) - \mathrm{LB}(j) \tag{4.6}$$
is defined and conveniently expressed relative to the estimated value as 100 · CIL(j)/θj .
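A minimal sketch of the interval bounds and of the confidence length in Eq. 4.6 follows. Since Eq. 2.28 is not reproduced in this section, the sketch assumes a symmetric interval θj ± q · σθj, where the quantile q (e.g., from a Student t-distribution at the chosen α and degrees of freedom) is supplied by the user. The function name and all numbers are hypothetical.

```python
import numpy as np

def confidence_intervals(theta, C, quantile):
    """Symmetric intervals theta_j +/- quantile * sigma_j (assumed form of Eq. 2.28)."""
    theta = np.asarray(theta, dtype=float)
    sigma = np.sqrt(np.diag(C))            # standard deviations from the covariance
    lower = theta - quantile * sigma
    upper = theta + quantile * sigma
    cil = upper - lower                    # confidence length, Eq. 4.6
    relative_cil = 100.0 * cil / theta     # expressed relative to the estimate
    return lower, upper, cil, relative_cil

theta_hat = np.array([2.0, 0.5])
C_hat = np.diag([0.04, 0.01])
lower, upper, cil, relative_cil = confidence_intervals(theta_hat, C_hat, quantile=2.0)
```

The relative length makes parameters of very different magnitude comparable. It is undefined for an estimate equal to zero.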
4.3.2. Estimator accuracy
The evaluation of the accuracy needs a reference parameter vector (preferably the true
parameter vector when applicable) to compute the bias in Eq. 2.29 and the MSE in Eq. 2.30.
In the case of the sensitivity method the parameter expected value E[Θ] is taken to be the
point estimate θ in Eq. 2.5a, whilst in Monte Carlo the empirical sample mean in Eq. 4.5
is used.
Although there are well-known analytical derivations for the MSE in the case of linear
problems, in nonlinear cases this expression can only be estimated numerically [3, 93].
To do so, a series of Monte Carlo replications can be performed to obtain the empirical
averaged MSE as follows,
$$\mathrm{MSE} \approx \frac{1}{N_{MC}} \sum_{j=1}^{N_{MC}} \left\| \theta_{(j)} - \theta^* \right\|^2. \tag{4.7}$$
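The accuracy measures can be sketched as follows. The bias is assumed here to be the difference between the sample mean and the reference vector, which is the usual definition; Eq. 2.29 itself is not reproduced in this section. The function name and the example values are hypothetical.

```python
import numpy as np

def bias_and_empirical_mse(thetas, theta_ref):
    """Bias (assumed as E[Theta] - theta*) and empirical averaged MSE (Eq. 4.7)."""
    thetas = np.asarray(thetas, dtype=float)
    theta_ref = np.asarray(theta_ref, dtype=float)
    bias = thetas.mean(axis=0) - theta_ref
    squared_errors = np.sum((thetas - theta_ref) ** 2, axis=1)  # ||theta_(j) - theta*||^2
    return bias, squared_errors.mean()

thetas_mc = np.array([[1.0, 2.0],
                      [3.0, 2.0],
                      [2.0, 5.0]])
bias, mse = bias_and_empirical_mse(thetas_mc, theta_ref=[2.0, 2.0])
```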
4.4. Structural Analysis
The previous section evaluates estimator performance by computing parameter
variances, confidence intervals, biases and MSEs. These computations by
themselves, however, do not detect the ill-conditioning resulting from structural
deficiencies of the parameter covariance and Fisher matrices [1, 31, 41, 70, 120, 124]. Therefore,
guidelines to evaluate this ill-conditioning and to detect identifiability problems are
presented here. The guidelines to detect ill-conditioning are given separately for the
sensitivity and Monte Carlo methods in the two subsections of Section 4.4.1. All
identifiability methods in Section 4.4.2 are suitable for the sensitivity method, but
only the variance method also applies to the Monte Carlo approach.
Figures 4.4 and 4.5 display the ill-conditioning strategy and all identifiability techniques,
respectively, based on the sensitivity method explained in Section 4.2.1.
4.4.1. Ill-Conditioning Analysis
Sensitivity method
The guideline outlined in Publication III [78] to analyze ill-conditioning is here followed.
It is a singular value analysis summarized as follows:
1. Perform SVD of S (see Section 2.4.1) to obtain SVs = ς1, · · · , ςNθ.
2. Compute the condition number κ(S) (Eq. A.14) and collinearity index γ(S) (Eq.
A.16).
3. Check if κ(S) and γ(S) satisfy κ(S) ≤ κmax or γ(S) ≤ γmax. If false, S is said to be
ill-conditioned and the problem is ill-posed.
4. Analyze the kind of ill-conditioning of S according to Figures 3.2 and 3.3, i.e.,
rank-deficiency or ill-determined rank class.
5. Define the ϵ-threshold (the lower bound on the SVs) according to Eq. 3.3.
6. Cut the singular value spectrum SVs at the ϵ-threshold. Singular values above the
ϵ-threshold are said to be well-conditioned and those below are said to be
ill-conditioned. The number of well-conditioned singular values reveals the numerical
rank rϵ of the sensitivity matrix, which ideally should be Nθ.
[Figure 4.4: flowchart of the ill-conditioning analysis. From the sensitivity matrix S, the
singular values (SVs) are obtained by SVD and κ(S) and γ(S) are computed. If S is
well-conditioned, the problem is well-posed and identifiable. Otherwise the problem is
ill-posed: the ill-posed problem is classified, the ϵ-threshold of the SVs is set, the SVs are
cut at the ϵ-threshold to determine the numerical rank rϵ, and the identifiability diagnosis
follows.]
Figure 4.4.: Ill-conditioning analysis based on the sensitivity method.
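The singular value analysis above can be sketched in a few lines. Since Eqs. A.14, A.16 and 3.3 are not reproduced in this section, the sketch assumes the common definitions κ(S) = ς1/ςNθ and γ(S) = 1/ςmin of the column-normalized sensitivity matrix, and it takes the ϵ-threshold as a user-supplied input. The function name and the example matrix are hypothetical.

```python
import numpy as np

def ill_conditioning_analysis(S, eps):
    """Singular values, condition number, collinearity index and numerical rank."""
    S = np.asarray(S, dtype=float)
    svs = np.linalg.svd(S, compute_uv=False)          # sorted in descending order
    kappa = np.inf if svs[-1] == 0.0 else svs[0] / svs[-1]
    # Collinearity index on the column-normalized matrix (assumed definition).
    S_normalized = S / np.linalg.norm(S, axis=0)
    sv_min = np.linalg.svd(S_normalized, compute_uv=False)[-1]
    gamma = np.inf if sv_min == 0.0 else 1.0 / sv_min
    # Numerical rank: number of singular values above the eps-threshold.
    rank_eps = int(np.sum(svs > eps))
    return svs, kappa, gamma, rank_eps

# Hypothetical example: orthogonal but badly scaled columns.
S_example = np.diag([10.0, 1.0, 1e-6])
svs, kappa, gamma, rank_eps = ill_conditioning_analysis(S_example, eps=1e-3)
```

In this example the condition number is large while the collinearity index equals one, because the columns are orthogonal and only their scaling is poor. This illustrates why the two indices are checked separately against their own thresholds.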
Monte Carlo method
As with all Monte Carlo studies, this straightforward alternative is a computationally intensive
procedure. The basic idea is to select "reasonable" perturbations for creating many
fictitious experimental data sets Y m(j) by using prior information on the measurement errors (ϵ(j)),
estimate the parameter vectors (θ(j)) corresponding to each fictitious data set, and check
whether the estimator remains stable [11, 100, 125]. Small parameter variability
indicates a stable estimator. Otherwise, the estimator is unstable and the problem is
ill-conditioned. Reasonable a priori limits for this variability should be set.
[Figure 4.5: flowchart of the identifiability diagnosis techniques, which start from the
sensitivity matrix S, the parameter vector θ and the ill-conditioning analysis and end in the
unidentifiable parameters. QR method: compute Π by QRP (SΠ = QR), reorder θ, cut at rϵ and
select the last (Nθ − rϵ) elements. SVD method: calculate the parameter variances from the
SVD of S, compute the variance-decomposition proportions πi,j , define the maximum variance
proportion threshold πmax and select the parameters with the largest influence of
ill-conditioning. Variance method: calculate the parameter variances based on the Fisher
information matrix, define the parameter variance threshold ρ and select the parameters with
the largest variance (σ2θj > ρ).]
Figure 4.5.: Identifiability diagnosis techniques based on the sensitivity method.
4.4.2. Identifiability diagnosis
In this section, techniques to detect unidentifiable parameters based on parameter variances
and orthogonal decompositions of the sensitivity matrix are discussed, namely the variance,
SVD and QR methods. The last technique is also called parameter subset selection or
orthogonalization method. Each technique constructs an identifiability ranking based on
a different metric.
Variance Method
Unidentifiable parameters may be defined as those with variances outside some given
ranges [125] following this simple procedure:
1. Compute C either from the sensitivity method as shown in Eq. 4.3 or Eq. 3.5, or
from the Monte Carlo method as shown in Eq. 4.4.
2. Take the diagonal elements of C as variances σ2θj for each parameter with
j = 1, · · · , Nθ.
3. Rank parameters of θ in ascending order according to parameter variances.
4. Define the variance threshold ρ.
5. Select the unidentifiable parameters as those with parameter variance larger than
the variance threshold (i.e., σ2θj > ρ).
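The five steps above amount to sorting and thresholding the diagonal of C, as the following sketch shows. The parameter names, the covariance matrix and the threshold ρ are hypothetical values chosen for illustration.

```python
import numpy as np

def variance_method(C, rho, names):
    """Rank parameters by variance and flag those with variance above rho."""
    variances = np.diag(C)                          # step 2
    order = np.argsort(variances, kind="stable")    # step 3, ascending order
    ranking = [names[i] for i in order]
    unidentifiable = [names[i] for i in order if variances[i] > rho]  # step 5
    return ranking, unidentifiable

C_example = np.diag([0.5, 0.01, 20.0])
ranking, unidentifiable = variance_method(
    C_example, rho=1.0, names=["theta1", "theta2", "theta3"]
)
```

Because variances depend on the parameter scaling, a single threshold ρ is only meaningful when the parameters are normalized or of similar magnitude.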
SVD Method
As seen in the singular value analysis of Section 4.4.1 and in the variance-decomposition
in Section 3.2.2, small (ill-conditioned) singular values inflate parameter variances. The
following procedure relies on a parameter variance-decomposition that quantifies the con-
tribution of each singular value to the variance of each parameter (see Section 3.2.2). The
rationale behind this ranking is that ill-conditioned singular values have large contributions
to parameters with large variance. In other words, a parameter that is strongly influenced
by ill-conditioned singular values provides evidence of identifiability issues. The procedure
reads:
1. Compute the parameter variances σ2θj for j = 1, · · · , Nθ according to the SVD in Eq. 3.5.
Perform the procedure of Section 4.4.1 to obtain the numerical rank rϵ and the
ill-conditioned singular values ςi, i = rϵ + 1, · · · , Nθ.
2. Compute the variance-decomposition proportions πi,j , i = 1, · · · , Nθ for each j-th
parameter by using Eq. 3.7 in Section 3.2.2.
3. Rank parameters of θ in ascending order according to the sum of their proportions
∑_{i=rϵ+1}^{Nθ} πi,j . Parameters at the top of the list are not strongly influenced by
ill-conditioned singular values.
4. Define the maximum proportion threshold πmax (typically set to 0.5).
5. Select the unidentifiable parameters as those satisfying ∑_{i=rϵ+1}^{Nθ} πi,j ≥ πmax, for
j = 1, · · · , Nθ. If rϵ = Nθ then all parameters are identifiable.
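The procedure can be sketched as follows. Eqs. 3.5 and 3.7 are not reproduced in this section, so the sketch assumes the standard variance decomposition in which, for S = UΣVᵀ, the variance of the j-th parameter is proportional to ∑ᵢ v²ⱼᵢ/ς²ᵢ and πi,j is the share of the i-th term in that sum. The function name and the example matrix are hypothetical.

```python
import numpy as np

def svd_method(S, r_eps, pi_max=0.5):
    """Variance-decomposition proportions and parameters dominated by small SVs."""
    _, svs, Vt = np.linalg.svd(np.asarray(S, dtype=float), full_matrices=False)
    # phi[i, j] = v_ji^2 / sv_i^2 is the contribution of singular value i
    # to the (unscaled) variance of parameter j. Note that Vt[i, j] = v_ji.
    phi = Vt ** 2 / svs[:, None] ** 2
    variances = phi.sum(axis=0)
    pi = phi / variances
    # Sum of the proportions that belong to the ill-conditioned singular values.
    ill_share = pi[r_eps:, :].sum(axis=0)
    order = np.argsort(ill_share, kind="stable")
    unidentifiable = np.flatnonzero(ill_share >= pi_max)
    return variances, pi, order, unidentifiable

# Columns 0 and 1 are nearly collinear, column 2 is independent of both.
S_example = np.array([[1.0, 1.0, 0.0],
                      [1.0, 1.001, 0.0],
                      [0.0, 0.0, 1.0],
                      [1.0, 0.999, 0.0]])
variances, pi, order, unidentifiable = svd_method(S_example, r_eps=2)
```

In this example the smallest singular value carries almost the entire variance of the first two parameters, so both are flagged, whereas the third parameter is unaffected. When rϵ = Nθ the slice of ill-conditioned singular values is empty and no parameter is flagged.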
QR Method
In this work, the algorithm for local identifiability analysis presented in
Refs. 45, 122 and 123 is used. The original algorithm is based on orthogonal projections of the
Hessian matrix (see Eq. 2.10) and is one of the most widely used among several approaches
for parameter subset selection [70]. It was recently modified and applied in Ref. [79] using
the sensitivity matrix.
This method selects a subset of linearly independent parameters based on orthogonal
projections of the columns of the sensitivity matrix [70, 78, 79]. It is also called parameter
subset selection or orthogonalization method and is a popular way of determining
unidentifiable parameters. Other studies dedicated to parameter subset selection can be found
in Refs. 17, 19, 29, 30, 45, 79, 87, 123, 129, 134. The rationale behind this ranking is that
the unidentifiable parameters are expected to have the largest variance among the whole
parameter vector and this can be detected by identifying the linearly dependent columns
of the sensitivity matrix.
The algorithm involves the calculation of the numerical rank rϵ of S (see Section 4.4.1),
which is the dimension of the identifiable parameter subset. In addition, the analyzed
parameter vector θ is reordered according to the linear independence of the columns of S
by applying QRP decomposition (via Householder transformation). This decomposition
(see Definition A.5.10 of Appendix A.5) can be written as:
$$S\Pi = QR, \tag{4.8}$$
where Q ∈ R^(Ny·Nm·Ne × Ny·Nm·Ne) is an orthogonal matrix, R ∈ R^(Ny·Nm·Ne × Nθ) is an
upper-triangular matrix whose diagonal elements decrease in magnitude and Π ∈ R^(Nθ × Nθ)
is a permutation matrix. Π orders the columns of S according to their linear independence.
This means that the first columns of SΠ are the largest independent set of columns of S.
The reordered parameter vector θ̃ reads
$$\tilde{\theta} = \Pi^T \theta. \tag{4.9}$$
The result is θ̃ = ((θ̃(rϵ))T , (θ̃(Nθ−rϵ))T )T. Here, θ̃(rϵ) contains the first rϵ elements of θ̃,
which are the identifiable parameters, and θ̃(Nθ−rϵ) contains the last Nθ − rϵ elements of
θ̃, which are not identifiable.
The summarized procedure is:
1. Perform the procedure of Section 4.4.1 to obtain the numerical rank rϵ. The dimension
of the identifiable parameter subset is rϵ.
2. Compute the permutation matrix Π from the QRP decomposition of S according to
Eq. 4.8.
3. Build the identifiability ranking θ̃ = ΠT θ.
4. Select the unidentifiable parameters as the last (Nθ − rϵ) entries of the ordered vector
θ̃.
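The column ordering produced by the QRP decomposition can be illustrated with a small sketch. To keep it free of dependencies beyond NumPy, it replaces the Householder-based QRP with a greedy Gram-Schmidt pivoting, which selects the columns by the same largest-remaining-norm criterion but is not the algorithm used in this work. As an alternative, `scipy.linalg.qr(S, pivoting=True)` returns a pivot vector directly. The function names, the example matrix and the parameter names are hypothetical.

```python
import numpy as np

def pivoted_column_order(S):
    """Greedy column pivoting: repeatedly pick the column with the largest norm
    after projecting out the columns that were already chosen."""
    R = np.array(S, dtype=float)
    remaining = list(range(R.shape[1]))
    perm = []
    while remaining:
        norms = [np.linalg.norm(R[:, j]) for j in remaining]
        k = remaining[int(np.argmax(norms))]
        perm.append(k)
        remaining.remove(k)
        norm_k = np.linalg.norm(R[:, k])
        if norm_k > 0.0:
            q = R[:, k] / norm_k
            for j in remaining:
                R[:, j] -= (q @ R[:, j]) * q   # remove the component along q
    return perm

def qr_method(S, r_eps, names):
    """Split the reordered parameter vector at the numerical rank r_eps."""
    perm = pivoted_column_order(S)             # S[:, perm] corresponds to S @ Pi
    ranking = [names[j] for j in perm]         # reordered vector Pi^T theta
    return ranking[:r_eps], ranking[r_eps:]

# Column 1 is almost twice column 0, column 2 is independent of both.
S_example = np.array([[1.0, 2.0, 0.0],
                      [1.0, 2.002, 0.0],
                      [0.0, 0.0, 1.0],
                      [1.0, 1.998, 0.0]])
identifiable, unidentifiable = qr_method(
    S_example, r_eps=2, names=["theta1", "theta2", "theta3"]
)
```

Of the two nearly collinear columns, the one with the larger norm is kept as identifiable and the other one is moved to the end of the ranking.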
4.5. Regularization
When the problem is diagnosed as ill-posed due to the ill-conditioning of the corresponding
sensitivity matrix (see Section 4.4.1), a treatment is required. Various possibilities are
available, for instance conducting more experiments, adding new experimental information,
restructuring the model or treating the deficiency numerically. This last numerical treatment
is called regularization. In the following, short guidelines to implement three regularization
techniques in parameter estimation and optimal experimental design are given.
4.5.1. Regularization in parameter estimation
In this section a general guideline to apply regularization to the parameter estimation is
formulated which is suitable for the sensitivity method.
1. Determine a convenient regularization according to the kind of ill-posed problem
(i.e., rank-deficient or ill-determined rank) obtained in Section 4.4.1. Rank-deficient
problems are handled well by either SsS or TSVD, whilst ill-determined rank problems
are handled well by Tikhonov regularization. Nevertheless, any of the techniques
may be applied to either class.
2. Tune the corresponding regularization parameter following the recommendations in
Section 4.5.3 depending on the chosen regularization, i.e., ϵ-threshold for SsS or
TSVD, whilst λ for Tikhonov.
3. Define the regularized cost function depending on the chosen regularization. If SsS is
applied then the cost function in Eq. 3.8 is selected and the unidentifiable parameters
determined in the QR method (see Section 4.4.2) are required. If TSVD is applied
then the cost function in Eq. 3.9 is selected. If Tikhonov is applied then the cost
function in Eq. 3.10 is selected and a priori information of the parameters expressed
in variance σθ and reliable nominal values θR are required.
4. Calculate the regularized sensitivity matrix SReg and the corresponding Jacobian
based on SReg for each regularization according to Table 3.1. If TSVD is applied then
the SVD of S and its numerical rank rϵ determined in the ill-conditioning analysis
(see Section 4.4.1) are required to build the respective regularized sensitivity matrix.
The residual Z is needed to compute the Jacobian in all regularizations.
5. Compute the solution of the corresponding regularized optimization problem in Eqs.
3.8, 3.9 and 3.10. The Jacobian corresponding to each regularization should be
supplied to the algorithm.
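The effect of two of these regularizations can be illustrated on a single linearized step, i.e., the linear least-squares problem min ‖S Δθ − Z‖². This is a simplified sketch and not the cost functions in Eqs. 3.9 and 3.10, which are not reproduced in this section. In particular, the Tikhonov variant below penalizes λ²‖Δθ‖² and ignores the prior variances and nominal values θR. The function names and numbers are hypothetical.

```python
import numpy as np

def tsvd_step(S, Z, r_eps):
    """Truncated-SVD solution: keep only the r_eps largest singular values."""
    U, svs, Vt = np.linalg.svd(S, full_matrices=False)
    coefficients = (U[:, :r_eps].T @ Z) / svs[:r_eps]
    return Vt[:r_eps].T @ coefficients

def tikhonov_step(S, Z, lam):
    """Standard-form Tikhonov solution with filter factors sv / (sv^2 + lam^2)."""
    U, svs, Vt = np.linalg.svd(S, full_matrices=False)
    filtered = svs / (svs ** 2 + lam ** 2)
    return Vt.T @ (filtered * (U.T @ Z))

# Hypothetical ill-conditioned problem: the second direction is barely excited.
S_example = np.array([[2.0, 0.0],
                      [0.0, 1e-4]])
Z_example = np.array([2.0, 1.0])

unregularized = np.linalg.solve(S_example, Z_example)   # second entry blows up
step_tsvd = tsvd_step(S_example, Z_example, r_eps=1)
step_tikhonov = tikhonov_step(S_example, Z_example, lam=0.1)
```

Both regularized steps leave the well-conditioned direction almost unchanged and suppress the ill-conditioned one, TSVD by discarding it and Tikhonov by damping it.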
4.5.2. Regularization in optimal experimental design
In this optimization the regularized sensitivity matrix SReg or the regularized covariance
matrix CReg computed at the parameter estimate θReg from Table 3.1 are used. Then the
criteria in Eqs. 2.39, 2.40 and 2.41 are obtained and the optimal experimental design in
Eq. 2.35 is solved.
4.5.3. Selection of the regularization parameter
This section outlines a simple procedure to select regularization parameters suitable for
the SsS, TSVD and Tikhonov regularizations. The idea is to explore the effect of a discrete set of
regularization parameters on fitting, parameter precision, parameter accuracy (when
applicable) and ill-conditioning. The analyzed regularization parameter range is determined by
the span of the singular values of the sensitivity matrix. The procedure reads as follows:
• Determine the regularization parameter range to conduct the search. To do so,
the maximum and minimum singular values of the sensitivity matrix, i.e., ς1 and ςNθ,
respectively, are used. They are the limits for the regularization parameters ϵ for
SsS and TSVD, and λ for Tikhonov. Note that regularization parameters smaller than
ςNθ do not generate a regularization effect, whereas values larger than ς1 over-regularize
the problem and introduce more bias.
• Sample some values from this range. The MBLHD
in Section 2.8.1 would be an appropriate sampling algorithm.
• Solve the corresponding regularized PE problem, i.e., Eqs. 3.8, 3.9 and 3.10, for each
selected regularization parameter.
• Take the regularization parameters with best simultaneous performance regarding
fitting, precision, ill-conditioning and when applicable accuracy. Regularization pa-
rameters which yield an adequate model fitting, a small parameter variance, ill-
conditioning improvement and a small bias (when applicable) are selected as suit-
able.
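The search over the regularization parameter can be sketched for the Tikhonov case on a linearized problem. The sketch makes several simplifying assumptions: a log-spaced grid between ςNθ and ς1 replaces the MBLHD sampling, the penalty is λ²‖Δθ‖², and only the fitting residual, the step norm and the condition number of the regularized problem are reported. The function name and the example data are hypothetical.

```python
import numpy as np

def scan_tikhonov(S, Z, n_values=5):
    """Evaluate fit and conditioning for lambda between the smallest and largest SV."""
    U, svs, Vt = np.linalg.svd(S, full_matrices=False)
    lambdas = np.geomspace(svs[-1], svs[0], n_values)   # requires svs[-1] > 0
    rows = []
    for lam in lambdas:
        step = Vt.T @ (svs / (svs ** 2 + lam ** 2) * (U.T @ Z))
        residual = np.linalg.norm(S @ step - Z)
        # The augmented matrix [S; lam * I] has singular values sqrt(sv^2 + lam^2).
        kappa_reg = np.sqrt((svs[0] ** 2 + lam ** 2) / (svs[-1] ** 2 + lam ** 2))
        rows.append((lam, residual, np.linalg.norm(step), kappa_reg))
    return rows

S_example = np.array([[2.0, 0.0],
                      [0.0, 1e-4],
                      [0.0, 0.0]])
Z_example = np.array([2.0, 1.0, 0.5])
scan = scan_tikhonov(S_example, Z_example)
```

Along the grid the residual grows while the condition number falls, which is the trade-off between fitting and ill-conditioning that the last step of the procedure resolves.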
4.6. Other analysis
4.6.1. Parameter sensitivity analysis
In this section, the parameter-output dynamic sensitivity information is combined with
a suitable sensitivity metric for each j-th parameter, i.e., the sensitivity measure δj .
Parameter-output dynamic sensitivity information can be found in the
sensitivity matrix of Eq. 2.9. The sensitivity measure δj in Eq. 2.17 allows the parameters
to be ranked according to their overall influence on the whole vector Y .
In the following, a simple procedure to rank sensitive parameters according to their
overall effect on the outputs is described:
1. Compute the sensitivity matrix either by finite differences in Eq. 2.14 or solving the
sensitivity equations in Eq. 2.15.
2. Take the j-th column of the sensitivity matrix and compute δj in Eq. 2.17 for
j = 1, · · · , Nθ.
3. Rank parameters of θ by ascending order according to the sensitivity measure δj for
j = 1, · · · , Nθ.
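The three steps can be sketched as follows. Eqs. 2.14 and 2.17 are not reproduced in this section, so the sketch assumes a forward finite difference for the sensitivities and the root-mean-square of each column of S as the sensitivity measure δj. The toy model, the function names and the parameter values are hypothetical.

```python
import numpy as np

def finite_difference_sensitivities(model, theta, rel_step=1e-6):
    """Forward-difference approximation of S[i, j] = d y_i / d theta_j."""
    theta = np.asarray(theta, dtype=float)
    y0 = model(theta)
    S = np.empty((y0.size, theta.size))
    for j in range(theta.size):
        h = rel_step * max(abs(theta[j]), 1.0)
        perturbed = theta.copy()
        perturbed[j] += h
        S[:, j] = (model(perturbed) - y0) / h
    return S

def sensitivity_measure(S):
    """Root-mean-square of each column of S (assumed form of delta_j)."""
    return np.sqrt(np.mean(S ** 2, axis=0))

# Toy model (assumption): exponential decay y(t) = theta_0 * exp(-theta_1 * t).
t = np.linspace(0.0, 2.0, 11)

def model(theta):
    return theta[0] * np.exp(-theta[1] * t)

theta_nominal = np.array([2.0, 0.5])
S = finite_difference_sensitivities(model, theta_nominal)
delta = sensitivity_measure(S)
ranking = np.argsort(delta, kind="stable")   # ascending order, least sensitive first
```

When the parameters differ by orders of magnitude, the columns of S are usually scaled by the parameter values before the measure is computed. That scaling is omitted here.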
For simple problems, the individual dynamic effect of each parameter θj on each output
yi ∈ y may be represented graphically. To this end, the corresponding sensitivity
profile ∂yi/∂θj |t for t ∈ T is drawn. There are Ny · Nθ sensitivity time profiles for each
experiment ξ ∈ E . The comparison of the profiles shows which parameters
excite the outputs the most. In this way, the information that the corresponding measurable
variable ymi ∈ ym provides to recover each parameter can be assessed. Insensitive parameters cannot
be reliably recovered from the available data.
Finally, the ranking results computed by the QR method in Section 4.4.2 may be connected
with the sensitivity analysis. This makes it possible to determine whether the unidentifiable
parameters selected according to their linear dependence may introduce a large bias in the solution. The
rationale is that unidentifiable parameters (in the sense of linear dependence) with small
sensitivities will introduce less bias than those with large sensitivities. It is important
to point out that this analysis is also local in nature, since it is based on the current experimental
information.
4.6.2. Selection of a parameter initial guess
To define an adequate initial guess (IG) in nonlinear optimization problems several possi-
bilities may be taken into account. Some of them are values from literature, values of a
priori calculations using the model, or the best solution for several IGs randomly generated
as starting points. The latter can be based on a sampling type algorithm e.g., MBLHD
(see Section 2.8.1), to obtain initial guess candidates within a prescribed parameter range.
The rationale is that the PE problem in Eq. 2.5 is solved from different parameter initial
guesses and the starting point whose solution shows the best fitting and the least
ill-conditioning is selected as the appropriate parameter initial guess θIG.
The procedure reads as follows:
1. Define several initial guess candidates IGi, for instance from literature, MBLHD,
etc.
2. Solve the PE problem in Eq. 2.5 starting from each IGi and collect the corresponding
i-th cost function CFi.
3. Follow the ill-conditioning analysis in Section 4.4.1 after each i-th parameter estima-
tion and gather the numerical rank rϵ,i.
4. Select the best solution as the one with simultaneously a low cost function and as
large a numerical rank as possible.
It is important to highlight that the solution for the IG with the best fitting (the lowest
cost function) does not necessarily show the least ill-conditioned behavior (the largest
numerical rank). Thus, a trade-off should be found between these two properties.
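One possible way to resolve this trade-off is a lexicographic rule that prefers the largest numerical rank and breaks ties with the lowest cost function. This rule is only an assumption made for illustration, and other weightings of the two properties are equally valid. The candidate labels and values below are hypothetical.

```python
def select_initial_guess(results):
    """Pick the candidate with the largest numerical rank, then the lowest cost.

    results : list of (label, cost_function_value, numerical_rank) tuples,
              one per initial guess candidate IG_i.
    """
    return min(results, key=lambda item: (-item[2], item[1]))

# Hypothetical outcomes of three parameter estimations.
candidates = [
    ("IG1", 0.8, 5),   # best fit, but lower numerical rank
    ("IG2", 1.1, 7),
    ("IG3", 1.5, 7),
]
best = select_initial_guess(candidates)
```

Here the candidate with the best fit is not selected because its solution has a lower numerical rank than the other two.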
5. Lithium-ion battery: Finding adequate
experimental data
5.1. Abstract
1 The lack of informative experimental data and the complexity of first-principles battery
models make the recovery of kinetic, transport, and thermodynamic parameters compli-
cated. This chapter explores how different sources of experimental data affect parameter
structural ill-conditioning and identifiability by using the computational framework in
Chapter 4. The sensitivity method is used to analyze the estimators. Accordingly, its
techniques related to ill-conditioning analysis, practical identifiability diagnosis and
estimator performance assessment are employed here. Moreover, dynamic sensitivity profiles
to assess the excitation provided by different parameters on different outputs are also
presented. Monte Carlo studies are carried out in order to support intermediate conclusions
obtained by the sensitivity method. The study is conducted on a modified version of the
Doyle-Fuller-Newman model of a Lithium-ion battery. It is here demonstrated that the use
of typical experimental data in Lithium-ion batteries i.e., the voltage discharge curves, only
enables the identification of a small parameter subset, regardless of the number of experi-
ments considered. Furthermore, it is shown that the inclusion of a new measurable variable
(i.e., a single electrolyte concentration measurement) significantly aids identifiability and
mitigates ill-conditioning. Doing so, the impact of poor and informative experimental data
on ill-conditioning and identifiability is illustrated.
5.2. Li-Ion Battery Modeling
The Li-Ion battery model under study is that proposed in Ref. 33 and experimentally
validated in Ref. 34. The Li-Ion cell sandwich in Figure 5.1 consists of a lithiated car-
bon anode (LixC6), a polymer electrolyte, and a lithium-manganese-oxide-spinel cathode
(LiyMn2O4). The active material in the composite electrodes is assumed to be made
up of spherical particles and supported on an inert material. The polymer electrolyte in
the separator uses a LiPF6 salt in a non-aqueous liquid mixture of ethylene carbonate
and dimethyl carbonate with a random co-polymer matrix of vinylidene fluoride and
hexafluoropropylene. The lithium ions (Li+) travel through the electrolyte from one porous
electrode to the other whereas the electrons travel through an external closed circuit. The
Li+ ions react and diffuse in the electrodes towards the inner regions of metal oxide active
material particles (the solid phase). The discharge process takes place when Li+ ions
diffuse from the anode to the cathode.
1 The content of this chapter is reprinted (adapted) with permission from (D. C. Lopez C., G. Wozny, A. Flores-Tlacuahuac, R. Vasquez-Medrano, and V. M. Zavala. A computational framework for identifiability and ill-conditioning analysis of lithium-ion battery models. Industrial & Engineering Chemistry Research, 55(11):3026-3042, 2016). Copyright (2016) American Chemical Society. (Publication I in Appendix A.2)
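[Figure 5.1: schematic of the Li-Ion cell sandwich of total length L along the axial
coordinate x, showing the anode (LixC6, thickness la), the separator (thickness ls) and the
cathode (LiyMn2O4, thickness lc), the spherical particles with radial coordinate r, the Li+
transport through the electrolyte and the electron (e-) flow through the external circuit.]
Figure 5.1.: Li-Ion cell during discharge process. Cell consists of a LixC6 negative electrode, a LiyMn2O4 positive electrode, and a separator with LiPF6 salt-based electrolyte. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).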
The governing equations presented in Refs. 33, 34 are based on the porous electrode
and concentrated electrolyte theories. These equations consist of mass transport balances
in the electrolyte including migration, diffusion, and reaction; Ohm’s law in the electrolyte
which includes the diffusion potential and the variation of the electrolyte resistivity with
concentration; Fick’s laws in the solid active material which assumes a constant solid
diffusion coefficient; Ohm’s law in the solid electrode matrix; Butler-Volmer kinetics; and
current conservation. Radial diffusion is considered to be the transport mechanism of Li+
ions into the spherical particles in the electrodes.
The space-time model comprises a set of highly complex partial differential and
algebraic equations. Strategies to simplify the model are discussed in Refs. 115, 116, 85.
The subscripts a, s, and c denote the anode, separator, and cathode regions, respectively.
The subscripts e and s denote the electrolyte and solid phases, respectively. Three
independent variables (axial coordinate x, radial coordinate r and time t) are considered.
The model includes twenty dependent variables (states), which are summarized in Table 5.1
and given by:
• The electrolyte concentration ce,k(x, t), electrolyte potential Φe,k(x, t), and local
current density ik(x, t) in all regions k = a, s, c.
• The solid concentration cs,k(x, r, t), solid potential Φs,k(x, t), and reaction rate
jn,k(x, t) in the electrodes k = a, c.
• The conductivity of the electrolyte κ0,k(x, t) in all regions k = a, s, c.
• The open-circuit potential Uk(x, t) in the electrodes k = a, c.
Table 5.1.: Variables in Li-Ion Model. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
| Variable | Description | Unit | Anode (k = a) | Separator (k = s) | Cathode (k = c) |
| ce,k(x, t) | Electrolyte concentration in region k | mol m−3 | X | X | X |
| Φe,k(x, t) | Electrolyte-phase potential in region k | V | X | X | X |
| ik(x, t) | Local current density in region k | A m−2 | X | X | X |
| cs,k(x, r, t) | Concentration of Li+ on the intercalation particle of electrode k | mol m−3 | X | - | X |
| Φs,k(x, t) | Solid-phase potential of electrode k | V | X | - | X |
| jn,k(x, t) | Pore wall flux of Li+ on the intercalation particle of electrode k | mol m−2 s−1 | X | - | X |
| κ0,k(x, t) | Ionic conductivity of the electrolyte in region k | S cm−1 | X | X | X |
| Uk(x, t) | Open-circuit potential of electrode k | V | X | - | X |
Table 5.2.: Estimated parameters in Li-Ion Model.
| Parameter | Description | Unit | Initial Guess θIG | True Value θ∗ |
| Ds,a | Li+ diffusion coeff. in anode solid particle | m2 s−1 | 1.50 × 10−13 | 1.00 × 10−13 |
| Ds,c | Li+ diffusion coeff. in cathode solid particle | m2 s−1 | 1.13 × 10−10 | 7.50 × 10−11 |
| D | Electrolyte salt diffusion coeff. | m2 s−1 | 5.85 × 10−14 | 3.90 × 10−14 |
| ka | Anode reaction rate const. | m2.5 mol−0.5 s−1 | 3.00 × 10−11 | 2.00 × 10−11 |
| kc | Cathode reaction rate const. | m2.5 mol−0.5 s−1 | 3.00 × 10−11 | 2.00 × 10−11 |
| p | Bruggman coeff. | - | 2.25 | 1.5 |
| Rf | Anode film resistance | Ω m2 | 0.135 | 0.090 |
| t+ | Transport number | - | 0.363 | 0.363 |
The kinetic and transport parameters to be estimated are presented in Table 5.2. They
are the Li+ diffusion coefficient in the solid particle of the anode Ds,a, the Li+ diffusion
coefficient in the solid particle of the cathode Ds,c, the salt diffusion coefficient in the
electrolyte D, the reaction rate constant in the anode ka, the reaction rate constant in
the cathode kc, the Bruggman coefficient p, the film resistance at the anode Rf , and the
transport number t+. The design and operating variables, constants and fixed parameters
are summarized in Table 5.3.
Table 5.3.: Operating and design variables, constants and fixed parameters in Li-Ion Model. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society). A single value applies to all regions.
| Variable/Parameter | Description | Unit | Anode (k = a) | Separator (k = s) | Cathode (k = c) |
Operating variables
| I | Discharge current (1.0 C) | A m−2 | 17.5 |
| T | Temperature | K | 298 |
Design variables
| c(0)e | Initial electrolyte concentration in region k | mol m−3 | 2000 |
| c(0)s,k | Initial concentration of Li+ in electrode k | mol m−3 | 14870 | - | 3900 |
| Φe,0 | Electrolyte potential at x = 0 | V | 0 | - | - |
| Φs,c,0 | Electrode potential at x = ℓa + ℓs | V | - | - | 4.2 |
| ℓk | Thickness of region k | µm | 100 | 52 | 174 |
| ϵk | Porosity of region k | - | 0.357 | 1 | 0.444 |
| ϵf,k | Volume fraction of fillers in region k | - | 0.172 | - | 0.259 |
Physical constants
| F | Faraday's constant | C mol−1 | 96487 |
| R | Ideal gas constant | J mol−1 K−1 | 8.314 |
Kinetic and transport parameters
| cs,k,max | Max. concentration of Li+ in electrode k | mol m−3 | 26390 | - | 22860 |
| Rs,k | Radius of active material in electrode k | µm | 12.5 | - | 8.5 |
| σk | Electronic conductivity of electrode k | S m−1 | 100 | - | 3.8 |
The model here identified was modified in Ref. 85 to aid the computational performance.
The main modifications are summarized as follows:
• The local current density ik with k = a, s, c was eliminated at the electrodes and
the electrolyte by substituting Faraday’s equation in Ohm’s equation. With this
simplification the model reduced to seventeen dependent variables given by ce,a(x, t),
ce,s(x, t), ce,c(x, t), Φe,a(x, t), Φe,s(x, t), Φe,c(x, t), cs,a(x, r, t), cs,c(x, r, t), Φs,a(x, t),
Φs,c(x, t), jn,a(x, t), jn,c(x, t), κ0,a(x, t), κ0,s(x, t), κ0,c(x, t), Ua(x, t) and Uc(x, t).
• The three PDEs representing the electrolyte phase concentrations across the three
regions (i.e., anode ce,a(x, t), separator ce,s(x, t) and cathode ce,c(x, t)) were approx-
imated by a single PDE with the axis dimension spanning x ∈ [0, L]. The new
variable was called ce(x, t). The same simplification was made for the electrolyte
phase potentials (i.e., anode Φe,a(x, t), separator Φe,s(x, t) and cathode Φe,c(x, t)).
The new continuous variable was Φe(x, t). This reduced the system to thirteen depen-
dent variables given by ce(x, t), Φe(x, t), cs,a(x, r, t), cs,c(x, r, t), Φs,a(x, t), Φs,c(x, t),
jn,a(x, t), jn,c(x, t), κ0,a(x, t), κ0,s(x, t), κ0,c(x, t), Ua(x, t) and Uc(x, t).
• The boundary conditions for the potential of both electrodes and the electrolyte
were modified. Specifically, Φe(0, t) = Φe,0 and Φs,c(ℓa + ℓs, t) = Φs,c,0 were set.
Two additional equations related to the integral form of Faraday's law (see integral
equations in Table 5.5) were introduced. These transformations rely on the
fact that the current supplied by each portion of the anode and cathode must add
up to the total current I. The modification is intended to prevent the model from
admitting a family of solutions instead of a unique solution, which can occur when a
zero potential flux is imposed as the boundary condition.
• L'Hôpital's rule was applied to the Li-ion diffusion equation in the solid active
material (derived from Fick's law) to avoid the indeterminate term at the
sphere center (r = 0). Consequently, an additional equation was introduced
to represent the concentration of Li+ ions in the solid phase of the electrodes
(cs,a(x, r, t) and cs,c(x, r, t)).
• A dimensionless model in the axial coordinate x and the spherical coordinate r was
used. Each region was normalized according to its width. For instance, x∗ = x/ℓa
and r∗ = r/Rs,a were defined in the anode.
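The indeterminate term removed in the L'Hôpital step can be made explicit. The short derivation below is a sketch written in the dimensional radial coordinate r for electrode k = a, c; it assumes the symmetry condition ∂cs,k/∂r = 0 at r = 0 and uses the symbols of Table 5.4.

```latex
% Fick's law in a sphere, valid for r > 0
\frac{\partial c_{s,k}}{\partial t}
  = D_{s,k}\left(\frac{\partial^2 c_{s,k}}{\partial r^2}
  + \frac{2}{r}\frac{\partial c_{s,k}}{\partial r}\right)

% At r = 0 the second term has the form 0/0; L'Hopital's rule gives
\lim_{r \to 0} \frac{2}{r}\frac{\partial c_{s,k}}{\partial r}
  = 2\,\frac{\partial^2 c_{s,k}}{\partial r^2}\bigg|_{r=0}

% so the equation used at the sphere center becomes
\frac{\partial c_{s,k}}{\partial t}\bigg|_{r=0}
  = 3\,D_{s,k}\,\frac{\partial^2 c_{s,k}}{\partial r^2}\bigg|_{r=0}
```

With the normalization r∗ = r/Rs,k the prefactor becomes Ds,k/Rs,k², which is the form of the two solid-phase concentration equations (r = 0 and r > 0) in Table 5.4.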
5. Lithium-ion battery: Finding adequate experimental data
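Table 5.4.: Governing equations for modified Li-Ion PDAE model. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Region: Anode
Solid-phase potential Φs,a(x, t)
  Governing equation: $\sigma_{eff,a} \frac{\partial^2 \Phi_{s,a}(x,t)}{\ell_a^2 \, \partial x^2} = F a_a j_{n,a}(x,t)$
  Boundary condition I (x = 0): $\Phi_{s,a}(x,t) = 0$
  Boundary condition II (x = ℓa): $\frac{\partial \Phi_{s,a}(x,t)}{\partial x} = 0$
Solid-phase concentration cs,a(x, r, t)
  Governing equation (if r = 0): $\frac{\partial c_{s,a}(x,r,t)}{\partial t} = 3 \frac{D_{s,a}}{R_{s,a}^2} \frac{\partial^2 c_{s,a}(x,r,t)}{\partial r^2}$
  Governing equation (if r > 0): $\frac{\partial c_{s,a}(x,r,t)}{\partial t} = \frac{D_{s,a}}{R_{s,a}^2} \left( \frac{\partial^2 c_{s,a}(x,r,t)}{\partial r^2} + \frac{2}{r} \frac{\partial c_{s,a}(x,r,t)}{\partial r} \right)$
  Boundary condition I (r = 0): $\frac{\partial c_{s,a}(x,r,t)}{\partial r} = 0$
  Boundary condition II (r = Rs,a): $\frac{\partial c_{s,a}(x,r,t)}{\partial r} = -\frac{R_{s,a} \, j_{n,a}(x,t)}{D_{s,a}}$
  Initial condition (t = 0): $c_{s,a}(x,r,0) = c_{s,a}^{(0)}$

Region: Cathode
Solid-phase potential Φs,c(x, t)
  Governing equation: $\sigma_{eff,c} \frac{\partial^2 \Phi_{s,c}(x,t)}{\ell_c^2 \, \partial x^2} = F a_c j_{n,c}(x,t)$
  Boundary condition I (x = ℓa + ℓs): $\Phi_{s,c}(x,t) = \Phi_{s,c,0}$
  Boundary condition II (x = ℓa + ℓs + ℓc): $\frac{\partial \Phi_{s,c}(x,t)}{\partial x} = -\frac{\ell_c I}{\sigma_{eff,c}}$
Solid-phase concentration cs,c(x, r, t)
  Governing equation (if r = 0): $\frac{\partial c_{s,c}(x,r,t)}{\partial t} = 3 \frac{D_{s,c}}{R_{s,c}^2} \frac{\partial^2 c_{s,c}(x,r,t)}{\partial r^2}$
  Governing equation (if r > 0): $\frac{\partial c_{s,c}(x,r,t)}{\partial t} = \frac{D_{s,c}}{R_{s,c}^2} \left( \frac{\partial^2 c_{s,c}(x,r,t)}{\partial r^2} + \frac{2}{r} \frac{\partial c_{s,c}(x,r,t)}{\partial r} \right)$
  Boundary condition I (r = 0): $\frac{\partial c_{s,c}(x,r,t)}{\partial r} = 0$
  Boundary condition II (r = Rs,c): $\frac{\partial c_{s,c}(x,r,t)}{\partial r} = -\frac{R_{s,c} \, j_{n,c}(x,t)}{D_{s,c}}$
  Initial condition (t = 0): $c_{s,c}(x,r,0) = c_{s,c}^{(0)}$

Region: Anode / Separator / Cathode
Electrolyte-phase potential Φe(x, t)
  Governing equation: $\kappa_{eff}(x,t) \frac{\partial^2 \Phi_e(x,t)}{l^2 \, \partial x^2} = \frac{2 \kappa_{eff}(x,t) R T}{F} (1 - t_+) \frac{\partial^2 c_e(x,t)}{l^2 \, \partial x^2} - F a \, j_n(x,t)$
  Boundary condition I (x = 0): $\Phi_e(x,t) = \Phi_{e,0}$
  Boundary condition II (x = ℓa + ℓs + ℓc): $\frac{\partial \Phi_e(x,t)}{\partial x} = 0$
Electrolyte-phase concentration ce(x, t)
  Governing equation: $\epsilon \frac{\partial c_e(x,t)}{\partial t} = D_{eff} \frac{\partial^2 c_e(x,t)}{l^2 \, \partial x^2} + a (1 - t_+) j_n(x,t)$
  Boundary condition I (x = 0): $\frac{\partial c_e(x,t)}{\partial x} = 0$
  Boundary condition II (x = ℓa + ℓs + ℓc): $\frac{\partial c_e(x,t)}{\partial x} = 0$
  Initial condition (t = 0): $c_e(x,0) = c_e^{(0)}$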
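Table 5.5.: Auxiliary equations of modified Li-Ion PDAE model. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Region: Anode
  $\sigma_{eff,a} = \sigma_a \epsilon_a^{p}$
  $a_a = \frac{3}{R_{s,a}} (1 - \epsilon_a - \epsilon_{f,a})$
  $j_{n,a}(x,t) = 2 k_a \left(c_{s,a}(x,R_{s,a},t)\right)^{0.5} \left(c_{e,a}(x,t)\right)^{0.5} \left(c_{s,a,max} - c_{s,a}(x,R_{s,a},t)\right)^{0.5} \sinh\left( \frac{0.5 F}{R T} \left( \Phi_{s,a}(x,t) - \Phi_{e,a}(x,t) - U_a(x,t) - F j_{n,a}(x,t) R_f \right) \right)$
  $U_a(x,t) = -0.16 + 1.32 \exp\left( -3 \frac{c_{s,a}(x,R_{s,a},t)}{c_{s,a,max}} \right) + 10 \exp\left( -2000 \frac{c_{s,a}(x,R_{s,a},t)}{c_{s,a,max}} \right)$
  $\int_{0}^{\ell_a} \left( F a_a j_{n,a}(x,t) \big|_{\Phi_{s,a} = \Phi_{e,0}} \right) dx = I$

Region: Cathode
  $\sigma_{eff,c} = \sigma_c \epsilon_c^{p}$
  $a_c = \frac{3}{R_{s,c}} (1 - \epsilon_c - \epsilon_{f,c})$
  $j_{n,c}(x,t) = 2 k_c \left(c_{s,c}(x,R_{s,c},t)\right)^{0.5} \left(c_{e,c}(x,t)\right)^{0.5} \left(c_{s,c,max} - c_{s,c}(x,R_{s,c},t)\right)^{0.5} \sinh\left( \frac{0.5 F}{R T} \left( \Phi_{s,c}(x,t) - \Phi_{e,c}(x,t) - U_c(x,t) \right) \right)$
  $U_c(x,t) = 4.198 + 0.0565 \tanh\left( -14.554 \frac{c_{s,c}(x,r,t)|_{r=R_{s,c}}}{c_{s,c,max}} + 8.609 \right) - 0.0275 \left( \frac{1}{\left( 0.9984 - \frac{c_{s,c}(x,r,t)|_{r=R_{s,c}}}{c_{s,c,max}} \right)^{0.492}} - 1.901 \right) - 0.157 \exp\left( -0.0473 \left( \frac{c_{s,c}(x,r,t)|_{r=R_{s,c}}}{c_{s,c,max}} \right)^{8} \right) + 0.8102 \exp\left( -40 \left( \frac{c_{s,c}(x,r,t)|_{r=R_{s,c}}}{c_{s,c,max}} - 0.133 \right) \right)$
  $\int_{\ell_a + \ell_s}^{\ell_a + \ell_s + \ell_c} \left( F a_c j_{n,c}(x,t) \big|_{\Phi_{s,c} = \Phi_{s,c,0}} \right) dx = -I$

Region: Anode / Separator / Cathode (k = a, s, c)
  $\Phi_e(x,t) = (\Phi_{e,a}(x,t), \Phi_{e,s}(x,t), \Phi_{e,c}(x,t))$
  $c_e(x,t) = (c_{e,a}(x,t), c_{e,s}(x,t), c_{e,c}(x,t))$
  $j_n(x,t) = (j_{n,a}(x,t), 0, j_{n,c}(x,t))$
  $\kappa_{eff}(x,t) = (\kappa_{eff,a}(x,t), \kappa_{eff,s}(x,t), \kappa_{eff,c}(x,t))$
  $\kappa_{eff,k}(x,t) = (1 \times 10^{2} \, \kappa_{0,k}(x,t)) \, \epsilon_k^{p}$
  $\kappa_{0,k}(x,t) = 1.0793 \times 10^{-4} + 6.7461 \times 10^{-3} (1 \times 10^{-3} c_{e,k}(x,t)) - 5.2245 \times 10^{-3} (1 \times 10^{-3} c_{e,k}(x,t))^{2} + 1.3605 \times 10^{-3} (1 \times 10^{-3} c_{e,k}(x,t))^{3} - 1.1724 \times 10^{-4} (1 \times 10^{-3} c_{e,k}(x,t))^{4}$
  $D_{eff} = (D_{eff,a}, D_{eff,s}, D_{eff,c})$
  $D_{eff,k} = D \, \epsilon_k^{p}$
  $\epsilon = [\epsilon_a, \epsilon_s, \epsilon_c]$
  $l = [l_a, l_s, l_c]$
  $a = [a_a, 0, a_c]$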
The complete set of equations of the modified model is presented in Tables 5.4 and 5.5.
The PDAE system was discretized by the method of lines in the axial and radial coordinates
to obtain a set of DAEs according to Ref. 85. Consistent initial conditions for this
DAE system were obtained as follows. The algebraic equations of the discretized system,
in particular those of the most interconnected variables (i.e., jn,a(x, t), jn,c(x, t),
Φs,a(x, t), Φs,c(x, t) and Φe(x, t)), were decoupled from the whole model at t = 0. The
concentration-dependent functions κ0,k with k = a, s, c and Uk with k = a, c (Table 5.5)
were also included. These equations were solved by assuming all concentrations equal to
their initial values (ce(x, 0) = c_e^(0), cs,a(x, r, 0) = c_s,a^(0) and cs,c(x, r, 0) = c_s,c^(0)).
With this initial solution for the pore wall flux of Li+ ions (i.e., jn,a(x, 0) = j_n,a^(0),
jn,c(x, 0) = j_n,c^(0)) and the potentials (Φs,a(x, 0) = Φ_s,a^(0), Φs,c(x, 0) = Φ_s,c^(0) and
Φe(x, 0) = Φ_e^(0)), the discretized PDEs for the concentrations (ce(x, t), cs,a(x, r, t) and
cs,c(x, r, t)) were finally solved.
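To illustrate the method-of-lines step for one of the equations, the following minimal sketch discretizes the dimensionless solid-phase diffusion equation of Table 5.4 in the radial coordinate, with the L'Hôpital form at r∗ = 0 and the flux condition at r∗ = 1. The function names, the grid size, the parameter values and the explicit Euler time stepping are assumptions chosen for brevity; they are not the discretization or solver of Ref. 85.

```python
def sphere_rhs(c, a, g_surface):
    """Method-of-lines right-hand side for diffusion in a sphere.

    c         : concentrations at n+1 equally spaced nodes r* = 0, ..., 1
    a         : D_s / R_s**2 (hypothetical value)
    g_surface : prescribed dc/dr* at r* = 1, i.e. -R_s * j_n / D_s
    """
    n = len(c) - 1
    dr = 1.0 / n
    dcdt = [0.0] * (n + 1)
    # Sphere center: L'Hopital form; the symmetry condition dc/dr* = 0
    # implies a ghost node equal to c[1].
    dcdt[0] = 3.0 * a * 2.0 * (c[1] - c[0]) / dr**2
    # Interior nodes: central differences for both derivatives.
    for i in range(1, n):
        r = i * dr
        d2 = (c[i + 1] - 2.0 * c[i] + c[i - 1]) / dr**2
        d1 = (c[i + 1] - c[i - 1]) / (2.0 * dr)
        dcdt[i] = a * (d2 + 2.0 * d1 / r)
    # Particle surface (r* = 1): ghost node from the flux boundary condition.
    ghost = c[n - 1] + 2.0 * dr * g_surface
    d2 = (ghost - 2.0 * c[n] + c[n - 1]) / dr**2
    dcdt[n] = a * (d2 + 2.0 * g_surface)
    return dcdt


def integrate(c0, a, g_surface, dt, steps):
    """Explicit Euler in time; adequate only for this small illustration."""
    c = list(c0)
    for _ in range(steps):
        rhs = sphere_rhs(c, a, g_surface)
        c = [ci + dt * ri for ci, ri in zip(c, rhs)]
    return c


c_init = [1.0] * 21                                  # uniform normalized profile
c_rest = integrate(c_init, 1.0, 0.0, 1e-4, 1000)     # no surface flux
c_dis = integrate(c_init, 1.0, -0.5, 1e-4, 1000)     # Li+ leaving the particle
```

With zero surface flux the uniform profile is left unchanged, whereas an outward flux lowers the surface concentration first. In practice a stiff DAE integrator would replace the explicit Euler loop.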
5.3. Results and Discussion
For the modified Li-Ion model, the differential and algebraic state variable
vectors defined in Eq. 2.1 are given by x = (ce, cs,a, cs,c) and z =
(Φe, Φs,a, Φs,c, jn,a, jn,c, κ0,a, κ0,s, κ0,c, Ua, Uc), respectively (states are assumed to be
discretized). The parameter vector is θ = (D, Ds,a, Ds,c, ka, kc, p, Rf, t+). The cell voltage
Vcell(t) and the electrolyte concentration in the separator core ce(ℓa + ℓs/2, t) are
considered as the predicted response variables, i.e., y = (Vcell(t), ce(ℓa + ℓs/2, t)). The cell
voltage Vcell(t) is computed from the solid-phase potential at the right-hand side of the
cathode, Φs,c(ℓa + ℓs + ℓc, t), and at the left-hand side of the anode, Φs,a(0, t), as
Vcell(t) = Φs,c(ℓa + ℓs + ℓc, t)− Φs,a(0, t) (5.1)
Constant discharge current rates I (galvanostatic process) are assumed as the inputs or
controls, u = I. The change of Vcell(t) for a discharge rate I is known as the voltage discharge
curve. The cell voltage Vcell(t) and the electrolyte concentration in the separator core
ce(ℓa + ℓs/2, t) are assumed to be measured at 100 equally spaced time points, Y^m =
(Vcell(tk), ce(ℓa + ℓs/2, tk)) for k = 1, · · · , 100. For both measured variables a
standard deviation of 1% is employed.
In order to investigate the effect of experimental data collection on ill-conditioning
and identifiability issues of Li-Ion battery models, three different cases are set. These
cases address two main aspects, namely the collection of several experiments with the
same output information (Cases 1 and 2) and the addition of new output information
(Case 3) to the parameter estimation. Cases 1 and 2 study whether the information provided
by discharge curves is sufficient to reliably estimate the key parameters of interest, whereas
Case 3 demonstrates that the use of new experimental information, e.g., the electrolyte
concentration, dramatically improves identifiability. The three cases can be summarized as follows:
• Case 1 (single discharge curve): one experiment that only uses discharge curve
information is considered (I1 is set as the standard discharge rate). The only observable
variable is assumed to be Vcell(·).
• Case 2 (multiple discharge curves): the effect of progressively adding experiments
with discharge curve information is considered (Ii, i = 2, . . . , 6 are set). The only
observable variable is assumed to be Vcell(·).
• Case 3 (discharge curves and electrolyte concentration profile): the effect of
progressively adding experiments with discharge curve information and of including the
electrolyte concentration in the middle of the separator ce(ℓa + ℓs/2, ·) as an observable
variable is considered (Ii, i = 1, . . . , 4 are set). The observable variables are then
Vcell(·) and ce(·)|x=ℓa+ℓs/2.
Case 1 is analyzed in detail to discuss the different techniques of the framework in
Chapter 4, while Cases 2 and 3 are presented in summarized form. The discharge curve at I1
is shown in Figure 5.2. The sensitivity and Monte Carlo methods of Section 4.2 are applied
to analyze the parameters. The results of Case 1 (base case), Case 2 and Case 3 are presented
in Sections 5.3.1, 5.3.2, and 5.3.3, respectively. It is important to point out that the framework
outlined in Chapter 4 may be readily used to determine whether other sources of experimental
information (including type of measurements, measurement error, sampling frequency, etc.)
indeed reduce the ill-conditioning and improve the identifiability of any battery model.
In Table 5.2 the true parameter vector θ∗ taken from Ref. 34 and the initial guess
vector θIG are displayed. In all cases, the experimental data Y^m are generated virtually
by perturbing the model solution Y(u, θ∗) at u and θ∗ with measurement error samples
drawn from a normal distribution with mean zero and variance σy² = 1 × 10^−4. To improve
numerical stability and reduce ill-conditioning, the outputs in Y^m and the parameters are
properly normalized.
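The generation of virtual measurements described above can be sketched as follows. The normalized curve used here is a hypothetical stand-in for the model solution Y(u, θ∗), and the function name `make_virtual_data` and the seed are assumptions; only the noise model (zero mean, σy = 1 × 10^−2, 100 time points) follows the text.

```python
import math
import random


def make_virtual_data(y_model, sigma_y=1e-2, seed=0):
    """Perturb normalized model outputs with zero-mean Gaussian noise.

    sigma_y = 1e-2 corresponds to the variance 1e-4 quoted in the text.
    """
    rng = random.Random(seed)
    return [y + rng.gauss(0.0, sigma_y) for y in y_model]


# Hypothetical normalized "discharge curve" standing in for Y(u, theta*).
t = [k / 99.0 for k in range(100)]
y_true = [1.0 - 0.3 * tk - 0.05 * math.exp(8.0 * (tk - 1.0)) for tk in t]
y_meas = make_virtual_data(y_true)

# Empirical check of the noise level on this single realization.
residuals = [m - y for m, y in zip(y_meas, y_true)]
sample_var = sum(r * r for r in residuals) / len(residuals)
```

Fixing the seed makes a data set reproducible; a Monte Carlo study would instead draw a new seed for every replication.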
In Figure 5.2 the discharge curves for Cases 1 and 2 are displayed. The discharge rates I
are expressed relative to the base current C = 17.5 A/m2, with I1 = 1C (base discharge),
I2 = 2C, I3 = 3C, I4 = 4C (fast discharge), and I5 = 0.5C and I6 = 0.1C (slow discharge).
A cut-off voltage of 2.8 V is assumed.
5.3.1. Case 1: Single Discharge Curve.
Sensitivity Method
The results of the sensitivity method discussed in Section 4.2.1 are summarized in Table
5.6. The parameter estimates θ and their variances σ²θj (diagonal elements of C computed
from Eq. 4.3) are also presented. The estimator accuracy is measured in terms of the squared
bias β(Θ)² computed from Eq. 2.29 and the MSE computed from Eq. 2.30. The model predictions
at θ are shown in Figure 5.2.
[Plot "Model fitting": cell voltage Vcell (V) versus time (s) in three panels, one for the standard (I1) and fast (I2, I3, I4) discharge rates and one each for the slow discharge rates I5 and I6.]
Figure 5.2.: Discharge curves for Case 1 (base rate I1 = 1C) and Case 2 simultaneously considering fast (I2 = 2C, I3 = 3C, I4 = 4C), and slow rates (I5 = 0.5C and I6 = 0.1C). Markers are experimental data and solid lines are model predictions after parameter estimation at estimator θ. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
Table 5.6.: Case 1 for Sensitivity method. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
Columns 4 to 6 report the estimator performance (precision: variance; accuracy: squared bias and MSE); columns 7 to 9 report the identifiability analysis (parameter ranking).
Pars | True θ∗ | Estimated θ | Variance σ²θ | Bias β(Θ)² | MSE | Variance Method | SVD Method | QRP Method
Ds,a | 6.67 × 10^−1 | 2.54 × 10^−1 | 3.56 × 10^−2 | 1.70 × 10^−1 | 2.06 × 10^−1 | (1)∗ | (1) | (2)∗
Ds,c | 6.67 × 10^−1 | 9.13 × 10^−1 | 5.79 × 10^+2 | 6.07 × 10^−2 | 5.79 × 10^+2 | (7) | (7) | (7)
D | 6.67 × 10^−1 | 1.32 × 10^+0 | 1.48 × 10^+1 | 4.25 × 10^−1 | 1.52 × 10^+1 | (5) | (5) | (5)
ka | 6.67 × 10^−1 | 3.95 × 10^+3 | 4.40 × 10^+15 | 1.56 × 10^+7 | 4.40 × 10^+15 | (8) | (8) | (8)
kc | 6.67 × 10^−1 | 7.40 × 10^−1 | 3.84 × 10^+1 | 5.45 × 10^−3 | 3.85 × 10^+1 | (6) | (6) | (6)
p | 6.67 × 10^−1 | 7.93 × 10^−1 | 1.55 × 10^+0 | 1.59 × 10^−2 | 1.56 × 10^+0 | (2) | (2) | (1)∗
Rf | 6.67 × 10^−1 | 5.98 × 10^−1 | 6.13 × 10^+0 | 4.65 × 10^−3 | 6.13 × 10^+0 | (3) | (4) | (4)
t+ | 1.00 × 10^+0 | 2.92 × 10^−1 | 9.83 × 10^+0 | 5.01 × 10^−1 | 1.03 × 10^+1 | (4) | (3) | (3)∗
Identifiable Subset Dimension | | | | | | 1 | 0 | 3
Estimator Analysis. Despite the good fit exhibited in Figure 5.2, the estimated
parameters have large variances if only one discharge curve (at I1) is used as experimental
data. The most precise parameter is the Li+ ion diffusion coefficient in the solid particle
of the anode Ds,a, with a variance of 3.56 × 10^−2, and the least precise is the reaction rate
constant in the anode ka, with a variance of 4.40 × 10^15. The precision of each parameter in
terms of the length of its confidence interval (Eq. 2.28) is shown in Table 5.7. The
lengths are computed as percentages relative to θ∗.
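As a rough check of how a variance translates into a relative interval length, the sketch below assumes a symmetric two-sided interval of the form θ ± q·σθ. The quantile q = 1.98 is an assumption (Eq. 2.28 is not reproduced in this section); with it, the Case 1 sensitivity entries of Table 5.7 for Ds,a and Rf are recovered to their two printed digits.

```python
import math


def relative_ci_length(variance, theta_true, quantile=1.98):
    """Two-sided interval length, in percent of the true parameter value."""
    return 100.0 * 2.0 * quantile * math.sqrt(variance) / theta_true


# Variances and true values as printed in Table 5.6 (Case 1, sensitivity).
len_dsa = relative_ci_length(3.56e-2, 6.67e-1)   # D_s,a, about 1.1e2 %
len_rf = relative_ci_length(6.13e+0, 6.67e-1)    # R_f, about 1.5e3 %
```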
Table 5.7.: Case 1, 2, and 3: Confidence interval lengths for Sensitivity and Monte Carlo methods. Lengths are expressed as percentages relative to the true parameter. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
Columns 2 to 4: Sensitivity Method; columns 5 to 9: Monte Carlo Method.
Pars | Case 1, SC1 | Case 2, SC4 | Case 2, SC6 | Case 1, SC1 | Case 2, SC4 | Case 2, SC6 | Case 3, SC1 | Case 3, SC4
Ds,a | 1.1 × 10^2 % | 5.3 × 10^1 % | 6.1 × 10^1 % | 6.0 × 10^3 % | 1.9 × 10^2 % | 1.6 × 10^2 % | 4.6 × 10^2 % | 5.8 × 10^1 %
Ds,c | 1.4 × 10^4 % | 5.5 × 10^2 % | 3.6 × 10^2 % | 8.7 × 10^5 % | 1.3 × 10^5 % | 9.3 × 10^3 % | 2.4 × 10^5 % | 1.4 × 10^2 %
D | 2.3 × 10^3 % | 2.3 × 10^2 % | 2.2 × 10^2 % | 8.4 × 10^4 % | 1.7 × 10^2 % | 8.2 × 10^1 % | 6.0 × 10^1 % | 2.4 × 10^1 %
ka | 3.9 × 10^10 % | 2.2 × 10^2 % | 1.5 × 10^2 % | 1.4 × 10^6 % | 1.4 × 10^6 % | 9.3 × 10^3 % | 4.4 × 10^6 % | 8.7 × 10^5 %
kc | 3.7 × 10^3 % | 9.1 × 10^2 % | 9.0 × 10^2 % | 6.3 × 10^5 % | 5.3 × 10^2 % | 5.1 × 10^2 % | 1.1 × 10^2 % | 8.4 × 10^1 %
p | 7.4 × 10^2 % | 8.3 × 10^1 % | 7.1 × 10^1 % | 1.4 × 10^2 % | 4.9 × 10^1 % | 3.7 × 10^1 % | 2.2 × 10^1 % | 8.0 × 10^0 %
Rf | 1.5 × 10^3 % | 6.5 × 10^1 % | 6.4 × 10^1 % | 1.4 × 10^2 % | 3.7 × 10^1 % | 2.9 × 10^1 % | 1.4 × 10^2 % | 1.2 × 10^1 %
t+ | 1.2 × 10^3 % | 5.8 × 10^1 % | 4.8 × 10^1 % | 2.4 × 10^2 % | 4.5 × 10^1 % | 3.8 × 10^1 % | 4.2 × 10^1 % | 1.0 × 10^1 %
The estimator accuracy is now quantified in terms of its bias. The film resistance at
the anode Rf presents a squared bias of 4.65 × 10^−3, which is equivalent to a relative
bias (with respect to θ∗) of 10%. The parameter ka exhibits the largest squared bias of
1.56 × 10^7, equivalent to a relative bias of 5.92 × 10^5 %. The overall performance metrics of
this parameter estimator are 4.40 × 10^15 for precision (related to Tr[C]) and 1.56 × 10^7 for
bias (as the squared norm of β(Θ)). These results show that the estimator
for Case 1 is highly unstable.
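The relative biases quoted above follow directly from the squared biases in Table 5.6. The short calculation below reproduces them and also checks one MSE entry under the usual decomposition MSE = variance + squared bias, which is assumed here to be the content of Eq. 2.30.

```python
import math

theta_true = 6.67e-1   # normalized true value of R_f and k_a (Table 5.6)

# Squared biases as printed in Table 5.6.
rel_bias_rf = 100.0 * math.sqrt(4.65e-3) / theta_true   # R_f, in percent
rel_bias_ka = 100.0 * math.sqrt(1.56e+7) / theta_true   # k_a, in percent

# MSE of D_s,a as variance plus squared bias.
mse_dsa = 3.56e-2 + 1.70e-1
```

The results are about 10% for Rf, about 5.92 × 10^5 % for ka and 2.06 × 10^−1 for the MSE of Ds,a, in agreement with the printed values.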
Ill-Conditioning Analysis. Structural issues are now explored by applying the procedures
of Section 4.4.1. The singular values of S at the parameter estimate θ vary from ς1 =
9.1087 × 10^1 to ς8 = 1.5074 × 10^−8. The spectrum of the singular values (SVs) is the
black solid line with markers in Figure 5.3 (left panel). The condition number and the
collinearity index are κ = 6.0428 × 10^9 and γ = 6.6341 × 10^7, respectively.
On the left-hand side of Figure 5.3, the lower bounds with respect to the condition number
and the collinearity index, ϵκ and ϵγ, respectively, are presented for Case 1. These values
are computed by using the predefined empirical upper bounds κmax = 1000 [46, 79, 78]
and γmax = 15σy [17, 79, 78]. The bound γmax is scaled by the measurement standard
deviation σy because S is the scaled sensitivity matrix, i.e., the sensitivities premultiplied
by Σy^−1. With these bounds, the lower bound on the SVs used to select the well-conditioned
singular values is ϵ = ϵγ = 6.67 × 10^0. For Case 1, only three singular values pass this test,
which implies that S has a numerical rank of three (rϵ = 3) with five ill-conditioned
singular values.
[Plot: singular values ςi (ς1 to ς8) on a logarithmic axis from 10^−3 to 10^3, marked as well-conditioned or ill-conditioned. Left panel: Case 1-SC1 with the lower bounds ϵκ and ϵγ. Right panel: Case 1-SC1 and Case 2-SC2 to Case 2-SC6 with the bound ϵ = ϵγ.]
Figure 5.3.: Singular value spectra. Left panel is Case 1 (single discharge curve) and right panel is Case 2 (multiple discharge curves). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
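The ill-conditioning metrics and thresholds quoted for Case 1 can be reproduced from the two extreme singular values. The sketch assumes the definitions κ = ς1/ς8 and γ = 1/ς8 and the thresholds ϵκ = ς1/κmax and ϵγ = 1/γmax; these are inferred from the reported numbers rather than taken from Section 4.4.1, and the helper `numerical_rank` is hypothetical.

```python
sigma_y = 1e-2            # measurement standard deviation (1%)
s_max = 9.1087e1          # largest singular value, Case 1
s_min = 1.5074e-8         # smallest singular value, Case 1

kappa = s_max / s_min     # condition number
gamma = 1.0 / s_min       # collinearity index

kappa_max = 1000.0
gamma_max = 15.0 * sigma_y

eps_kappa = s_max / kappa_max   # SVs below this violate kappa <= kappa_max
eps_gamma = 1.0 / gamma_max     # SVs below this violate gamma <= gamma_max
eps = max(eps_kappa, eps_gamma)  # binding lower bound on the SVs


def numerical_rank(singular_values, threshold):
    """Count the singular values at or above the threshold."""
    return sum(1 for s in singular_values if s >= threshold)
```

Here ϵγ ≈ 6.67 is larger than ϵκ ≈ 0.09, so the collinearity bound is the binding one, which matches ϵ = ϵγ in the text. Calling `numerical_rank` on the full spectrum of S with this ϵ would give the numerical rank rϵ.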
Identifiability Analysis. The results of applying the three identifiability analysis
methods (variance, SVD, and QRP) described in Section 4.4.2 are now presented. The
identifiability ranking obtained with each method is shown in Table 5.6. The numbers in
parentheses indicate the position of each parameter in the ranking and the stars indicate
the identifiable parameters according to each method.
• Variance Method: under this method the most identifiable parameter is Ds,a (the
smallest variance, σ²Ds,a = 3.56 × 10^−2), and the least identifiable parameter is ka
(the largest variance, σ²ka = 4.40 × 10^15). The variance threshold ρ = 1.5 × 10^−1
according to Ref. 108 is used, which yields only one identifiable parameter.
• SVD Method: under this method a similar ranking is found. In Figure 5.4 the
contribution of each singular value to the variance of each parameter is displayed. The
strong influence of the ill-conditioned singular values (i.e., ςi for i = 4, · · · , 8) on the
variance should be highlighted. These are responsible for the large variances observed
in Table 5.6. It is also seen that the last two parameters in the identifiability ranking
(Ds,c and ka) are fully influenced by the smallest singular values ς7 and ς8,
respectively (proportions of 100%). The most identifiable parameter Ds,a exhibits the
smallest impact from the ill-conditioned singular values. The proportion, however, is
still significant (79.5%). In fact, if the proportion threshold πmax = 50% according
to Ref. 11 is employed to select the identifiable parameters, Ds,a cannot be classified
as identifiable. Accordingly, all parameters are considered unidentifiable, and a
parameter subset dimension equal to zero is reported in Table 5.6. From Figure 5.4 it
also becomes evident that the smallest singular values ςi for i = 4, · · · , 8
simultaneously affect many parameters. This is an indication of linear dependence.
• QR Method: under this method the parameter with the largest effect on the outputs
is the Bruggeman coefficient p. Consequently, this parameter is placed first in the
ranking of Table 5.6. The reaction rate constant in the anode ka is the last
in the ranking. This means that, after all orthogonal projections, this parameter
has the smallest effect on the measured variables. To select the number
of identifiable parameters, rϵ = 3 (the numerical rank of S obtained in the previous
ill-conditioning analysis) is used as the identifiable parameter subset dimension. Under
this threshold, parameters p, Ds,a and t+ are obtained as identifiable.
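The ranking by successive orthogonal projections used in the QR method can be illustrated with a small sketch. The code below uses Gram-Schmidt with column pivoting as a compact stand-in for a pivoted QR factorization; the function name `qrp_ranking`, the toy matrix and the subset dimension r = 2 are assumptions and do not correspond to the sensitivity matrix of the battery model.

```python
import math


def qrp_ranking(S):
    """Rank the columns of S by greedy orthogonal projection.

    S is a list of rows. At each step the column with the largest
    remaining norm is selected and its direction is projected out of
    the remaining columns (Gram-Schmidt with column pivoting).
    Returns the column order and the norms at selection time.
    """
    n_rows, n_cols = len(S), len(S[0])
    cols = [[S[i][j] for i in range(n_rows)] for j in range(n_cols)]
    remaining = list(range(n_cols))
    order, diag = [], []
    while remaining:
        norms = {j: math.sqrt(sum(v * v for v in cols[j])) for j in remaining}
        pivot = max(remaining, key=lambda j: norms[j])
        order.append(pivot)
        diag.append(norms[pivot])
        remaining.remove(pivot)
        if norms[pivot] == 0.0:
            continue
        q = [v / norms[pivot] for v in cols[pivot]]
        for j in remaining:
            proj = sum(qi * vi for qi, vi in zip(q, cols[j]))
            cols[j] = [vi - proj * qi for qi, vi in zip(q, cols[j])]
    return order, diag


# Hypothetical 3 x 3 sensitivity matrix: columns 0 and 2 are nearly collinear.
S_toy = [
    [10.0, 0.0, 10.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.01],
]
order, diag = qrp_ranking(S_toy)
identifiable = order[:2]   # keep the first r columns, here with r = 2
```

In this toy case column 0 has a large norm but is ranked last, because almost all of its effect is already explained by column 2. This is the sense in which a parameter can have "the smallest effect after all orthogonal projections".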
[Plot: stacked bars of the variance-decomposition proportion (%) for each parameter with its ranking position, Ds,a (1), Ds,c (7), D (5), ka (8), kc (6), p (2), Rf (4), t+ (3), split by singular value ς1 to ς8 into well-conditioned and ill-conditioned contributions, with the threshold πmax marked.]
Figure 5.4.: Case 1: Variance decomposition for SVD identifiability method of Section 4.4.2. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
From the previous results several conclusions can be drawn. Firstly, the three methods
indicate that the whole constant parameter vector cannot be identified using one discharge
curve. This is because the discharge signal u = I1 does not sufficiently excite the system,
which is manifested as ill-conditioning of the sensitivity matrix S. This is reinforced
by Figure 5.5, where the sensitivity time profiles of the cell voltage Vcell with respect to the
different parameters are presented. As can be seen, the measured variable Vcell is not
significantly excited by several parameters (several time profiles are flat and close to zero).
Thus, insensitive parameters (e.g., ka, Ds,c and kc) are found, which can take any value in a
broad space without affecting the output. It is important to highlight that the estimation
of a set of constant parameters is considered here; however, the same analysis may be
applied when estimating time-varying parameters, for instance to analyze capacity
fade [104]. In that case, the same methods could be used in each cycle to address similar
issues associated with their estimation.
[Plot: eight panels showing dVcell/dDs,a, dVcell/dDs,c, dVcell/dD, dVcell/dka, dVcell/dkc, dVcell/dp, dVcell/dRf and dVcell/dt+ versus time (s).]
Figure 5.5.: Sensitivity time profiles of cell voltage with respect to parameters dVcell(t)/dθ at nominal discharge rate I1 for Case 1. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
Secondly, it can also be concluded that parameter variability and identifiability issues
are closely related to the ill-conditioning of the sensitivity matrix S. Accordingly, any
reduction in the ill-conditioning of S will have a beneficial impact on parameter variances,
confidence intervals and identifiability. This can be achieved by providing alternative
discharge signals that excite the system more adequately.
Monte Carlo Method
The sensitivity method provides several indications of poor identifiability. In this section,
the Monte Carlo method described in Section 4.2.2 is employed to validate these
observations. To obtain the data sets Y^m_j, the measurement variance σy² = 1 × 10^−4 and
L = 200 replications are used. The results are summarized in Table 5.8. The empirical
mean E[Θ] of Eq. 4.5 and the parameter variances σ²θkk, taken as the diagonal elements of
the approximate matrix C, are presented. The accuracy performance in terms of the squared
bias β(Θ)² and the MSE is also displayed. The marginal probability density functions (pdfs)
of the parameters are exhibited in Figure 5.6.
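The Monte Carlo procedure can be summarized with a toy example. The sketch below replaces the battery model with a hypothetical one-parameter linear model y = θ·t, so that each replication has a closed-form least-squares estimate; the names `estimate_slope` and `monte_carlo`, the seed and the 1/L variance normalization are assumptions. Only the noise level and L = 200 follow the text.

```python
import random


def estimate_slope(t, y):
    """Least-squares estimate of theta in the toy model y = theta * t."""
    return sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)


def monte_carlo(theta_true, sigma_y=1e-2, n_rep=200, seed=1):
    """Empirical mean, variance, squared bias and MSE of the estimator."""
    rng = random.Random(seed)
    t = [k / 99.0 for k in range(100)]
    y_model = [theta_true * tk for tk in t]
    estimates = []
    for _ in range(n_rep):
        # One virtual experiment: perturb the model output and re-estimate.
        y_meas = [y + rng.gauss(0.0, sigma_y) for y in y_model]
        estimates.append(estimate_slope(t, y_meas))
    mean = sum(estimates) / n_rep
    variance = sum((e - mean) ** 2 for e in estimates) / n_rep
    bias_sq = (mean - theta_true) ** 2
    mse = sum((e - theta_true) ** 2 for e in estimates) / n_rep
    return mean, variance, bias_sq, mse


mean, variance, bias_sq, mse = monte_carlo(theta_true=0.667)
```

With this normalization the identity MSE = variance + squared bias holds exactly for the empirical quantities. For the nonlinear battery model each replication requires a full parameter estimation instead of the closed-form slope.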
Table 5.8.: Case 1: Summary of results for Monte Carlo. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
Columns 4 to 6 report the estimator performance (precision: variance; accuracy: squared bias and MSE); column 7 reports the identifiability analysis (parameter ranking).
Parameter θ | True θ∗ | Mean θ | Variance σ²θ | Bias β(Θ)² | MSE | Variance Method
Ds,a | 6.67 × 10^−1 | 2.14 × 10^+0 | 1.01 × 10^+2 | 2.16 × 10^+0 | 1.03 × 10^+2 | (4)
Ds,c | 6.67 × 10^−1 | 2.24 × 10^+2 | 2.12 × 10^+6 | 5.00 × 10^+4 | 2.17 × 10^+6 | (7)
D | 6.67 × 10^−1 | 1.24 × 10^+1 | 1.98 × 10^+4 | 1.38 × 10^+2 | 1.99 × 10^+4 | (5)
ka | 6.67 × 10^−1 | 6.91 × 10^+2 | 5.42 × 10^+6 | 4.76 × 10^+5 | 4.40 × 10^+6 | (8)
kc | 6.67 × 10^−1 | 1.04 × 10^+2 | 1.13 × 10^+6 | 1.07 × 10^+4 | 1.14 × 10^+6 | (6)
p | 6.67 × 10^−1 | 6.97 × 10^−1 | 5.77 × 10^−2 | 9.48 × 10^−4 | 5.86 × 10^−2 | (2)∗
Rf | 6.67 × 10^−1 | 6.30 × 10^−1 | 5.55 × 10^−2 | 1.37 × 10^−3 | 5.69 × 10^−2 | (1)∗
t+ | 1.00 × 10^+0 | 8.92 × 10^−1 | 3.74 × 10^−1 | 1.17 × 10^−2 | 3.86 × 10^−1 | (3)
Performance metric | | | 8.68 × 10^6 | 5.37 × 10^5 | 9.22 × 10^6 | −
Identifiable Subset Dimension | | | | | | 2
Estimator Analysis. From Table 5.8 it can be observed that the variances do not match
those obtained with the sensitivity method presented in Table 5.6. This provides evidence
that the variances obtained from the Fisher information matrix are poorly approximated.
From Table 5.8 it can also be seen that parameters Ds,a, Ds,c, D, ka and kc have large
variances while parameters p, Rf and t+ have small ones. This becomes more evident when
observing the relative confidence interval lengths presented in Table 5.7. According to the
Monte Carlo method, the most precise parameters are the film resistance at the anode
Rf (σ²Rf = 5.55 × 10^−2) and the Bruggeman coefficient p (σ²p = 5.77 × 10^−2). The relative
lengths of the confidence intervals for Rf and p, however, are quite large (140% and 143%,
respectively). The most uncertain parameter is ka, with a variance of σ²ka = 5.42 × 10^6 and
a relative confidence interval length of 1.4 × 10^6 %.
In terms of estimator accuracy, it is observed that the mean E[Θ] (Eq. 4.5) deviates
from θ∗ for parameters Ds,a, Ds,c, D and kc more than the parameter estimate θ obtained
with the sensitivity method and presented in Table 5.6.
This reflects the instability of the parameter estimates. Parameter ka is the most biased
parameter obtained by Monte Carlo. The most precise parameters p, Rf and t+ are also
the least biased parameters. This ranking is very close to that obtained with the variance,
SVD, and QRP methods of the sensitivity setting.
From Figure 5.6 it is clearly seen that only parameters p, Rf and t+ are identifiable
and that their distributions are close to normal. These results confirm that the
parameter estimator obtained in Case 1 is unstable.
Identifiability Analysis. The variance method of Section 4.4.2 is now applied to analyze
identifiability under Monte Carlo. Under the variance threshold ρ = 1.5 × 10^−1, two
identifiable parameters, Rf and p, are found, which means an identifiable parameter subset
dimension equal to two, as indicated in Table 5.8. From this table it is also seen that the
most identifiable parameter is Rf with variance σ²Rf = 5.55 × 10^−2 and the least identifiable
parameter is ka with variance σ²ka = 5.42 × 10^6. This is partially consistent with the
identifiability results of the sensitivity method.
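The variance-based ranking and selection can be reproduced from the printed variances. The sketch assumes that a parameter is declared identifiable when its variance lies below the threshold ρ; the function name `variance_ranking` is hypothetical.

```python
RHO = 1.5e-1   # variance threshold quoted in the text

# Monte Carlo variances as printed in Table 5.8 (Case 1).
mc_variances = {
    "Ds,a": 1.01e+2, "Ds,c": 2.12e+6, "D": 1.98e+4, "ka": 5.42e+6,
    "kc": 1.13e+6, "p": 5.77e-2, "Rf": 5.55e-2, "t+": 3.74e-1,
}


def variance_ranking(variances, rho=RHO):
    """Return parameters ordered by variance and those below the threshold."""
    ranking = sorted(variances, key=variances.get)
    identifiable = [name for name in ranking if variances[name] < rho]
    return ranking, identifiable


ranking, identifiable = variance_ranking(mc_variances)
```

The resulting order (Rf, p, t+, Ds,a, D, kc, Ds,c, ka) matches the ranking column of Table 5.8, and only Rf and p fall below ρ, which gives the subset dimension of two.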
Figure 5.6.: Case 1: Marginal pdfs obtained from Monte Carlo. Solid black lines and filled regions represent the normal and the non-parametric distributions of each estimator, respectively. Parameters with a star are nominated as identifiable. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
5.3.2. Case 2: Multiple Discharge Curves.
From Case 1 it can be concluded that only a very small parameter subset is identifiable.
In particular, our analysis indicates that parameters Rf, p and t+ are the only
identifiable parameters. Their variances, however, are large. The effect of incorporating
additional discharge curves is now analyzed. New curves obtained at different discharge
rates (I2 = 2I1, I3 = 3I1, I4 = 4I1, I5 = 0.5I1 and I6 = 0.1I1) are progressively
added. The first scenario (denoted as SC1) uses the signal u1 = I1 and the data
Y^m_1 = V^1_cell(tk) (this corresponds to Case 1). Scenario SC2 uses the signal vector
u2 = (I1, I2) and the data Y^m_2 = (V^1_cell(tk), V^2_cell(tk)), and so on until scenario SC6
with the signal vector u6 = (I1, I2, . . . , I6) and the data
Y^m_6 = (V^1_cell(tk), V^2_cell(tk), · · · , V^6_cell(tk)).
Sensitivity Method.
For each scenario SC1, ..., SC6 an estimation is conducted in order to obtain the estimates
θξ, the model prediction vector Yξ and its corresponding scaled sensitivity matrix Sξ. The
estimator performance for each scenario is then analyzed based on its average estimated
covariance matrix Cξ, confidence intervals, and bias β(θξ).
Estimator Analysis. Table 5.7 shows the confidence interval lengths in relative
terms. It is clearly seen that the addition of experimental information reduces the
confidence interval lengths. The main reduction is observed when four experiments (SC4)
are used. Interestingly, considering two additional experiments (SC6) only provides a slight
improvement over SC4. Moreover, despite the reduction in parameter variance over Case 1,
the parameters still have large uncertainties. For instance, for scenario SC6,
D, Ds,c and kc have interval lengths of 216%, 365% and 902%, respectively.
Considering the large parameter variability, it can be concluded that the information from
discharge curves does not seem sufficient to completely identify the parameter vector.
These results also seem to indicate that the slow discharge rates I5 and I6 are not informative.
This last observation is corroborated by analyzing the sensitivity profiles for different
rates. In Figure 5.7 the sensitivity profiles of the cell voltage Vcell with respect to the
parameters are presented for the discharge rates I1 (standard discharge), I4 (fast discharge),
and I6 (slow discharge). Therein it is observed that the slow discharge rate I6 provides
significantly less excitation than I1 and I4. In addition, it is interesting to observe that the
fast discharge rate I4 induces richer dynamic behavior (this is particularly evident from
the profiles of the Bruggeman coefficient and of the diffusion coefficients).
[Plot: eight panels showing dVcell/dDs,a, dVcell/dDs,c, dVcell/dD, dVcell/dka, dVcell/dkc, dVcell/dp, dVcell/dRf and dVcell/dt+ versus time (s), each with curves for I6, I1 and I4.]
Figure 5.7.: Sensitivity time profiles of cell voltage with respect to parameters dVcell(t)/dθ at slow I6, nominal I1, and fast I4 discharge rates for scenario Case 2-SC6. (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
Ill-Conditioning Analysis. The changes in ill-conditioning as discharge curves are added
are now evaluated. The results are presented in the right panel of Figure 5.3, where the lift
in the singular value spectrum with each scenario can be observed. This lift is accompanied
by a reduction in the spectrum slope (related to the condition number κ) and an increase
in the smallest singular value ς8 (related to the collinearity index γ). In particular, the
condition numbers for SC4 and SC6 are 6.668 × 10^2 and 6.024 × 10^2, respectively, whereas
the collinearity indexes are 1.564 × 10^0 and 1.395 × 10^0. The spectra and the ill-conditioning
metrics (κ and γ) demonstrate a significant reduction in ill-conditioning from SC1
to SC4 but just a slight one from SC4 to SC6. In Figure 5.3 it is
also seen that the number of well-conditioned singular values is smaller than the length of
the parameter vector. In other words, matrix S has a numerical rank of only five (rϵ = 5).
Identifiability Analysis. In Table 5.9 the subset dimension of the identifiable parameters
after applying the three methods of Section 4.4.2 is presented. Here, the same thresholds
used in Section 5.3.1 for Case 1 are applied for each identifiability method. The variance
method selects two and six parameters as identifiable for scenarios SC2 and SC6, respectively.
This is the result of the progressive reduction of the parameter variance. For SC4 and SC6
the identifiable parameters are Ds,a, kc, p, Rf and t+. Parameters Ds,c and ka remain
unidentifiable in all scenarios. The QR method selects only five parameters as identifiable
for scenario SC6 while the SVD method selects only one.
Table 5.9.: Case 2: Summary of results for Sensitivity and Monte Carlo methods. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
Columns 5 to 8 report the identifiable subset dimension (columns 5 to 7: Sensitivity Method; column 8: Monte Carlo).
SCξ | New Iξ | Input action uξ | Experimental data Y^m_ξ | Variance Method | SVD Method | QR Method | Variance Method (Monte Carlo)
1 | 1.0 C = 17.5 | [I1] | [V^1_cell] | 1 | 0 | 3 | 2
2 | 2.0 C = 35.0 | [I1, I2] | [V^1_cell, V^2_cell] | 2 | 1 | 4 | 3
3 | 3.0 C = 52.5 | [I1, I2, I3] | [V^1_cell, V^2_cell, V^3_cell] | 3 | 1 | 5 | 4
4 | 4.0 C = 70.0 | [I1, I2, I3, I4] | [V^1_cell, V^2_cell, V^3_cell, V^4_cell] | 5 | 1 | 5 | 5
5 | 0.5 C = 8.75 | [I1, I2, I3, I4, I5] | [V^1_cell, V^2_cell, V^3_cell, V^4_cell, V^5_cell] | 6 | 1 | 5 | 5
6 | 0.1 C = 1.75 | [I1, I2, I3, I4, I5, I6] | [V^1_cell, V^2_cell, V^3_cell, V^4_cell, V^5_cell, V^6_cell] | 6 | 1 | 5 | 5
Monte Carlo Method.
The variances obtained with Monte Carlo again differ by several orders of magnitude
from those estimated with the sensitivity method. This is illustrated in Table 5.7. It can
also be observed that, because the conditioning improves as experimental information is
added, the qualitative behavior of both methods becomes similar. Table 5.7 confirms
that the most precise parameters predicted by Monte Carlo are p, Rf and t+,
and that a total of five parameters are considered identifiable for SC4, SC5, and SC6. This
is consistent with the variance and QRP methods under the sensitivity setting, which
again suggests that the sensitivity method can qualitatively diagnose variance behavior.
Parameters Ds,c, ka and kc retain large variances and are unidentifiable.
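The comparison between the two variance estimates can be illustrated on a toy problem. The sketch below is not the battery model: it assumes a hypothetical one-parameter linear model y = θ·x with Gaussian measurement noise, for which the sensitivity-based (linearized) variance is σ²/(SᵀS), and compares it with the sample variance of the least-squares estimates over 200 Monte Carlo replications. The function names, sampling grid and noise level are all illustrative assumptions.

```python
import random


def sensitivity_variance(x, sigma):
    """Linearized variance of theta for y = theta * x: sigma^2 / (S^T S)."""
    sts = sum(xi * xi for xi in x)  # S^T S, with sensitivities dy/dtheta = x
    return sigma ** 2 / sts


def monte_carlo_variance(x, theta_true, sigma, replications=200, seed=0):
    """Sample variance of least-squares estimates over noisy synthetic data."""
    rng = random.Random(seed)
    sts = sum(xi * xi for xi in x)
    estimates = []
    for _ in range(replications):
        y = [theta_true * xi + rng.gauss(0.0, sigma) for xi in x]
        # Closed-form least-squares estimate for the one-parameter model
        estimates.append(sum(xi * yi for xi, yi in zip(x, y)) / sts)
    mean = sum(estimates) / replications
    return sum((e - mean) ** 2 for e in estimates) / (replications - 1)


x = [0.1 * k for k in range(1, 21)]  # hypothetical sampling grid
var_sens = sensitivity_variance(x, sigma=0.05)
var_mc = monte_carlo_variance(x, theta_true=2.0, sigma=0.05)
print(f"sensitivity: {var_sens:.3e}, Monte Carlo: {var_mc:.3e}")
```

For this linear, well-conditioned toy problem the two estimates are expected to agree up to sampling error. Large gaps such as those reported in Table 5.7 would then point to nonlinearity or ill-conditioning rather than to the Monte Carlo procedure itself.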
In Figure 5.8 the marginal probability density functions (pdfs) for scenarios SC4 and
SC6 are presented. Comparing these pdfs with the pdfs of SC1 (Case 1) in Figure 5.6
clearly shows an improvement in the stability of the estimates. Figure 5.8 also shows
that there is no noticeable difference between the pdfs of scenarios SC4 and SC6 for the
most precise parameters p, Rf and t+. This indicates that, even if the spectrum of singular
values does not improve from SC4 to SC6 (as suggested by the sensitivity method), there
is additional information provided by the discharge rates I5 and I6. This information,
however, is still insufficient to determine the rest of the parameters (particularly ka, kc,
and Ds,c), which still present large variances.
Figure 5.8.: Case 2: Marginal pdfs for parameters obtained with Monte Carlo analysis for scenarios Case 2-SC4 (left) and Case 2-SC6 (right). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
5.3.3. Case 3: Discharge curves and electrolyte concentration profile.
In Case 3 two observable outputs are used: the cell voltage Vcell(t) and the electrolyte
concentration in the middle of the separator ce(x, t)|x=ℓa+ℓs/2. Scenarios SC1, SC2, SC3,
and SC4 are considered here; they progressively add new experimental information, but
this time each experiment measures both outputs. Accordingly, SC1 and SC4 are
obtained by collecting the experiments at u1 = I1 and u4 = (I1, . . . , I4),
respectively. For consistency, the same discharge rates as in Case 1 and Case 2 are used.
Scenarios SC5 and SC6 are not considered here because the inclusion
of slow discharge rates has not been shown to provide additional aid to identifiability.
Moreover, these scenarios are computationally highly demanding.
Figure 5.9.: Case 3: Voltage and electrolyte concentration profile at separator for scenario Case 3-SC1; cell voltage (V) and electrolyte concentration (mol/m3) are plotted against time (s). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

In Figure 5.9, the model fit for both variables after solving the parameter estimation
for SC1 is displayed. The nonlinear response of the concentration profile should be
highlighted. Figure 5.10 shows the singular value spectra for all considered scenarios.
A considerably large lift in the spectrum is found even when only one experiment is used
(Case 3-SC1 compared to Case 1). The form of the new spectrum defines five singular
values as well-conditioned, compared to three in Case 1. A small singular value of ς8 =
9.6113 × 10^−9, however, is still observed for Case 3-SC1. This small singular value makes
the condition number large (κ = 6.2796 × 10^10) and the collinearity index large as well
(γ = 1.0404 × 10^8). When more experiments are added (Case 3-SC2 to Case 3-SC4) the
conditioning improves, with an extra lift in the spectra and larger values for ς8. In
Case 3-SC4 seven well-conditioned singular values are observed, compared to the five in
Case 2-SC4. The effect of adding electrolyte concentration information is thus highly
beneficial from an ill-conditioning standpoint.
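The link between the smallest singular value and the two metrics quoted for Case 3-SC1 can be checked with a few lines of arithmetic. The sketch below assumes the usual definitions κ = ς1/ς8 and γ = 1/ς8 (the definitions themselves belong to Chapter 4 and are not restated in this section); under that assumption the quoted γ is reproduced, and the largest singular value implied by κ can be back-calculated. The function names are illustrative.

```python
def condition_number(singular_values):
    """kappa = largest singular value / smallest singular value."""
    return max(singular_values) / min(singular_values)


def collinearity_index(singular_values):
    """gamma = 1 / smallest singular value (assumed definition)."""
    return 1.0 / min(singular_values)


# Values quoted in the text for Case 3-SC1
sigma_min = 9.6113e-9
kappa = 6.2796e10

gamma = collinearity_index([sigma_min])
sigma_max = kappa * sigma_min  # largest singular value implied by kappa

print(f"gamma     = {gamma:.4e}")   # agrees with the quoted 1.0404e8
print(f"sigma_max = {sigma_max:.1f}")
```

Because both metrics are driven by the smallest singular value, any experiment that lifts the tail of the spectrum improves κ and γ at the same time.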
Figure 5.10.: Case 3: Spectrum of singular values ςi (ς1 to ς8, logarithmic scale) under different scenarios (Case 1-SC1, Case 3-SC1, Case 3-SC2, Case 3-SC3, Case 3-SC4). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
Figures 5.11 and 5.12 further illustrate that the improvement in the ill-conditioning
is associated with the information supplied by the new observable variable.
By comparing the figures it can be seen that the cell voltage is excited in similar ways
for currents I1 and I4, while this is not the case for the electrolyte concentration. In
particular, the electrolyte concentration presents rich and different dynamic responses at fast
and slow rates, which aids the identification of parameters. From Figure 5.12 it is also
observed that the electrolyte concentration at the separator core ce(ℓa + ℓs/2, t) is highly
excited by parameters D, p and t+. Parameters Ds,c and kc also provide more excitation
than in Cases 1 and 2 (recall that Ds,c and kc remain unidentifiable in
Case 2).
Figure 5.11.: Sensitivity time profiles of cell voltage with respect to parameters dVcell(t)/dθ at nominal I1 and fast I4 discharge rates for scenario Case 3-SC4; one panel per parameter (Ds,a, Ds,c, D, ka, kc, p, Rf, t+), each plotted against time (s). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
Figure 5.12.: Sensitivity time profiles of electrolyte concentration in the separator with respect to parameters dce(ℓa + ℓs/2, t)/dθ at nominal I1 and fast I4 discharge rates for scenario Case 3-SC4; one panel per parameter (Ds,a, Ds,c, D, ka, kc, p, Rf, t+), each plotted against time (s). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
The observations for Case 3 are further validated by using Monte Carlo. Confidence
interval lengths are presented in Table 5.7, and the pdfs of each parameter
for scenarios SC1 and SC4 are depicted in Figure 5.13. An important reduction
in the confidence intervals is observed compared to those of Case 2, also presented in
Table 5.7. In particular, parameters Ds,c and kc have large variances and are unidentifiable
in Case 2-SC4, whereas they become identifiable in Case 3-SC4. These results confirm the
observations obtained from the sensitivity analysis presented in Figures 5.11 and 5.12. It can
also be concluded that voltage and electrolyte concentration at a single location are
sufficient to reliably identify seven of the eight parameters (roughly 90%).
An intriguing finding of this study is that the anode reaction constant ka remains highly
unstable, despite the addition of electrolyte concentration information. This is
particularly evident from the marginal pdfs presented in Figure 5.13. It is also confirmed by
the spectrum analysis, which gives only seven well-conditioned singular values. From the
sensitivity profiles it can be seen that this parameter indeed excites the output variables.
This, however, does not seem sufficient to reliably estimate the parameter. This situation
can be explained by the observations of the authors of Ref. 103, who note that
very strong variations (orders of magnitude) of the anode reaction constant are needed to
have an impact on the voltage curve. Our results indicate that, at its nominal value, this
parameter has little influence on the discharge curve. It is possible, however, that around
another nominal point such insensitivity disappears. This issue will be explored in future
work.
Finally, it should be noted that any change in the number, type or quality of
experimental information can be analyzed by using the techniques outlined in this study.
In this way, experimental data can be selected with the target of increasing the number
of identifiable parameters. That is, a change in experimental information should be
taken into consideration (by experimenters and modelers) only if it gives evidence of
improvements in ill-conditioning and identifiability, so that the quality of the
model parameters increases.
5.3.4. Computational Issues
The PDAE system is implemented in Matlab and is discretized with the method of lines
according to Ref. 85, using 11 points in the axial direction for each region and 11 points
in the radial direction. The discretization routines dss010 and dss044 are used
in MATLAB Release 2013a (The MathWorks Inc., Natick, Massachusetts, United States).
The routine dss010 computes a tenth-order finite difference approximation of a
first-order derivative, whereas the routine dss044 computes a fourth-order approximation
of a second-order derivative. In contrast to the solution reported in Ref. 85, the
resulting DAE system containing 561 equations is solved here with the integrator IDAS
from SUNDIALS, which provides forward and adjoint sensitivity analysis capabilities [55].
The parameter estimation problems are solved using the nonlinear least-squares routine
lsqnonlin of MATLAB (trust-region-reflective algorithm). All results were obtained on
an Intel(R) Core(TM) i7-4770K CPU running at 3.50 GHz with 32.0 GB of available
RAM.

Figure 5.13.: Case 3: Marginal pdfs for scenarios Case 3-SC1 (left) and Case 3-SC4 (right). (Figure taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).
Table 5.10.: Computational results for simulation and parameter estimation problems. (Table taken from publication I - Lopez et al. (2016) in Appendix A.2 - reprinted from Industrial & Engineering Chemistry Research with permission from American Chemical Society).

Simulation                        Estimation
Discharge Rate   Time [seconds]   Instance      Time [seconds]   Iterations
1.0 C            6.1              Case 2-SC1    476.6            73
2.0 C            5.4              Case 2-SC2    711.8            54
3.0 C            4.3              Case 2-SC3    520.6            31
4.0 C            3.6              Case 2-SC4    343.6            18
0.5 C            6.6              Case 2-SC6    3751.8           32
0.1 C            48.1             Case 3-SC1    388.8            70
                                  Case 3-SC2    664.0            57
                                  Case 3-SC4    915.2            47
The computational results are summarized in Table 5.10. The average simulation time
(including the time to compute the consistent initial solution) was in the range of 3-7
seconds, except for the slowest discharge rate (0.1 C), which required 48 seconds. This is
because the slow discharge rate requires a long integration time to reach the cut-off voltage.
The number of iterations required to solve the parameter estimation problems decreases as
experimental information is added, because the problem becomes better conditioned and
the optimization algorithm can more easily identify a solution. The only exception is
Case 2-SC6, which requires more iterations. This can be attributed to the lack of information
introduced by scenario SC6 at slow discharge rates. The solution times for Case 2 are longer
than those of Case 3 because the latter requires the computation of additional sensitivity
information. A Monte Carlo procedure with 200 replications for Case 3-SC4 currently
requires around two days to complete. It is noted, however, that these replications can
be performed in parallel, which could potentially reduce the time to about fifteen minutes.
In addition, simulations for estimation problems with multiple experiments can also be
performed in parallel. For Case 3-SC4 this could reduce the time to about four minutes.
Parallel parameter estimation approaches have been proposed in Refs. 37 and 135. The use of
reduced models, such as those proposed in Ref. 103, will also be investigated.
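The timing estimates above follow from the Case 3-SC4 estimation time in Table 5.10. The sketch below redoes that arithmetic under idealized assumptions that are not stated in the source: perfect parallel speed-up, no communication overhead, one worker per replication, and an estimation time dominated by the four experiment simulations. The function name and worker counts are illustrative.

```python
def monte_carlo_hours(time_per_estimation_s, replications, workers=1):
    """Wall-clock hours, assuming ideal speed-up and no overhead."""
    batches = -(-replications // workers)  # ceiling division
    return batches * time_per_estimation_s / 3600.0


t_case3_sc4 = 915.2  # seconds, Case 3-SC4 estimation time from Table 5.10

serial_days = monte_carlo_hours(t_case3_sc4, 200) / 24.0
parallel_min = monte_carlo_hours(t_case3_sc4, 200, workers=200) * 60.0
# Four experiments simulated concurrently inside each estimation (ideal case)
nested_min = parallel_min / 4

print(f"serial:                    {serial_days:.1f} days")
print(f"200 workers:               {parallel_min:.1f} min")
print(f"plus parallel experiments: {nested_min:.1f} min")
```

Under these assumptions the serial run takes about 2.1 days, the fully parallel run about 15 minutes, and the nested parallel run just under 4 minutes, in line with the figures quoted in the text.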
5.4. Conclusions and Future Work
The computational framework of Chapter 4, which includes sensitivity and Monte Carlo
methods to evaluate the quality of parameter estimates, detect ill-conditioning issues and
diagnose identifiability problems, was applied to the lithium-ion battery model. It was
demonstrated that sensitivity methods can qualitatively detect unidentifiable parameters
using only structural information of the sensitivity matrix. This provides an advantage
over the more rigorous but also more expensive Monte Carlo method. The framework
can easily be used to determine whether other sources of experimental information indeed
improve the conditioning and identifiability of any battery model.
In this case study the analysis indicated that cell voltage profile information collected
from constant discharge experiments only enabled the estimation of a small parameter
subset. The incorporation of electrolyte concentration profiles at a single axial point
was sufficient to estimate seven of the eight parameters: the Li+ diffusion coefficient in the
solid particle of the anode, the Li+ diffusion coefficient in the solid particle of the cathode,
the salt diffusion coefficient in the electrolyte, the reaction rate constant in the cathode, the
Bruggeman coefficient, the film resistance at the anode, and the transport number. The
only unidentifiable parameter was the reaction rate constant in the anode.
As part of future work, the framework will be used to investigate the impact of other
sources of experimental information (including type of measurements, measurement error,
sampling frequency, etc.) on identifiability and ill-conditioning. It is also highlighted
that the methods described here might be used to address similar issues associated with
the estimation of not only constant parameters but also time-varying parameters. Moreover,
operating strategies other than constant discharge current, such as constant power mode,
will be analyzed. An implementation tailored to high-performance computing architectures
will be developed as well to accelerate analysis times. Finally, the benefit of regularizing
the parameter estimation as a direct way to deal with the ill-conditioning of the
sensitivity matrix will be evaluated.
6. Bioethanol: Identifying an over-parameterized model with large parameter correlations
6.1. Abstract
Nonlinear models of biological reaction networks are usually over-parameterized, with
large correlations among parameters, which hampers structure and parameter identification.¹
Hence, the related inverse problems for parameter determination are mathematically ill-posed
and numerically difficult to solve. This chapter is aimed at the parameterization and
validation of a highly nonlinear process model for the Simultaneous Saccharification and
Fermentation (SSF) process for producing ethanol from lignocellulosic waste materials.
To do so, several components of the computational framework outlined in Chapter 4 are
addressed here: tracking of an adequate initial guess, iterative parameter estimation,
ill-conditioning analysis, local identifiability analysis, implementation of the regularization
technique called parameter subset selection (SsS), and estimator precision evaluation.
Model selection and reduction are also carried out, although only at an incipient level.
The ill-conditioning analysis and the identifiable parameter selection are based on the
analysis of the sensitivity matrix by rank-revealing factorization methods. In this way,
the parameter search space is reduced to a reasonable subset, which can be reliably and
efficiently estimated from available measurements. The successful application of the
iterative regularized parameter estimation with ill-conditioning and identifiability
analysis to the SSF process yields a relatively large reduction in the identified parameter
space. A cross-validation shows that, despite the reduction of the search space, the
model with the practically identified parameters is still able to properly predict the
experimental data. Moreover, it is shown that the model is easily and efficiently adapted
to new process conditions by solving reduced and well-conditioned problems.

¹ The content of this chapter is reprinted (adapted) with permission from (D. C. Lopez C., T. Barz, M. Penuela, A. Villegas, S. Ochoa, and G. Wozny. Model-based identifiable parameter determination applied to a simultaneous saccharification and fermentation process model for bio-ethanol production. Biotechnology Progress, 29(4):1064-1082, 2013). Copyright (2013) John Wiley and Sons. (Publication II in Appendix A.2)
6.2. Bio-ethanol from cane bagasse by SSF process
Bio-ethanol has been an excellent candidate for the replacement of traditional fossil fuels
by alternative sources of liquid fuels. However, bio-ethanol made from sugar sources
such as starch, corn or sugar cane has not been competitive with traditional
petroleum-derived fuels because of relatively high costs and concerns about food security.
Therefore, alternative raw materials, for instance (ligno)cellulosic waste materials such as
crop residues, grass, wood chips, wheat straw or bagasse, which can act as an abundant and
cheap source of fermentable sugars, have become increasingly interesting [26, 36, 76, 95].
Experimental data were taken from the SSF process using sugarcane bagasse as biomass
[95]. The SSF process combines enzymatic hydrolysis with ethanol fermentation to keep
the concentration of glucose low. Among the various cellulose bioconversion schemes
this process is considered to be the most promising regarding its efficiency [26, 35]. If
fermentation occurs simultaneously with saccharification, glucose produced during the
saccharification will be rapidly converted into ethanol, reducing the inhibition caused by
high sugar concentration, and therefore the hydrolysis rate can be accelerated [92]. In
comparison with the process where these two stages are sequential, the SSF method enables
attainment of higher (up to 40%) yields of ethanol by removing end-product inhibition, as
well as by eliminating the need for separate reactors for saccharification and fermentation
[76].
Other advantages of SSF are an easier operation; a shorter fermentation time; a
reduced risk of contamination with undesired microorganisms, due to the high temperature
of the process, the presence of ethanol in the reaction medium, and the anaerobic
conditions; and a lower equipment requirement than the sequential process, since no
hydrolysis reactors are needed [76, 23]. In spite of these clear advantages, the drawback
of SSF is that the optimal conditions for hydrolysis and fermentation differ, which makes
control and optimization of the process parameters difficult [23].
Besides, ethanol itself and some toxic substances arising from the pretreatment of the
lignocellulose inhibit the action of the fermenting microorganisms, as well as the cellulase
activity [76]. In addition, some compounds (e.g., proteolytic enzymes) that are released
on cell lysis or are secreted by a particular strain can degrade the cellulase, affecting
the microorganism-enzyme compatibility. On the whole, several process parameters must
be optimized: substrate concentration, enzyme to substrate ratio, dosage of the active
components (β-glucosidase) in the enzymatic mixture, and yeast concentration [76].
In the experimental case study, the parameters of a model for bio-ethanol production from
sugarcane bagasse in an SSF process were determined.
6.2.1. Experiments
Table 6.1 shows the five different experiments E1, E2, E3, E4, and E5 considered in
this case study. Experiments E1, E2, E3 and E4 were reported in Ref. 95 and carried
out in the Bioprocess Development Laboratory of the School of Chemistry of the Federal
University of Rio de Janeiro. The experimental conditions of E1-E4 were defined in Ref. 95
by using the analysis of variance method (ANOVA) in factorial designs (e.g. 2^3 and 3^4
designs) and subsequent optimizations with response surface analysis. Those experimental
designs were intended a) to reveal the significant factors in the enzymatic hydrolysis
(analyzed factors: temperature, pH, enzymatic load and cellulignin content), and b) to
investigate the responses of the SSF process under the manipulation of the corresponding
three main factors (i.e. cellulignin content, pre-hydrolysis time and initial yeast).
Experiment E5 was carried out in the Bioprocesses Laboratory of the Engineering Faculty
of the University of Antioquia, Colombia; this experiment was designed using the optimal
experimental conditions identified in Ref. 95 for cellulignin content and pre-hydrolysis
time but changing the values of all remaining input factors of the SSF process in Table 6.1.
Table 6.1.: Experimental Conditions for Bio-Ethanol Production from Sugarcane Bagasse in the SSF Process. (Table taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

                                                        Enzymatic load*
No.  Experiment  Cellulignin dry solid  Pre-hydrolysis  GC-220   β-glucosidase  Initial yeast,  Experiment
                 content (w/w)          time (h)        (FPU/g)  (IU/g)         Cx0 (g/L)       duration (h)
1    E1          30%                    12              26       17             6               33
2    E2          20%                    8               26       27.3           6               30
3    E3          20%                    8               26       -              6               33
4    E4          20%                    12              30       -              2               33
5    E5          30%                    12              35       -              3               33
* With respect to cellulignin dry mass (g)
Each experiment was repeated twice and a very good repeatability was obtained for
all experiments. Experimental data E2 and E3 (E2 & E3) were used in the parameter
estimation, and experiments E1, E4, and E5 were considered for the cross-validation of
the identified model.
The common characteristics of all five experiments are listed below:
1. The lignocellulosic material (cellulignin) was a pretreated residue from sugarcane
(i.e., sugarcane bagasse). The cellulignin was obtained by acid hydrolysis of sugarcane
bagasse, from which the hemicellulosic fraction was removed. The resulting solid
residue was pretreated to increase the accessibility of the cellulose to enzymes by a
partial removal of the lignin. The pretreatment of the lignocellulosic biomass and the
measurements of enzymatic load were conducted as described in Ref. 96.
2. The initial suspension contained dry weight solid composed of the lignocellulosic
material considering a content of cellulose of 67% (w/w).
3. Before the SSF process, an enzymatic pre-hydrolysis at 50 °C was carried out to
allow for the buildup of fermentable glucose.
4. Commercial enzymatic preparations with GC 220-Genencor and β-glucosidase
(activities of 104 FPU/mL and 439 IU/mL, respectively) were used; the protein
concentration was 220 mg/mL for GC 220 and 109 mg/mL for β-glucosidase.
The enzymatic load in Table 6.1 refers to grams of cellulignin dry mass.
5. After the pre-hydrolysis stage, microorganisms were added at 37 °C; the microorganism
was a commercially available Saccharomyces cerevisiae (Fleischmann's yeast).
6. Glucose, cellobiose, and ethanol concentrations were determined by
high-performance liquid chromatography (HPLC) using a Shodex SC1011 ion exchange
column for sugars (300 × 8 mm²; Shoko Co., Tokyo) at 80 °C as the stationary phase
and degassed Milli-Q (Molsheim, France) water as the mobile phase at a flow rate
of 0.6 mL/min [95]. Standard deviations of the cellobiose, glucose, and ethanol
concentrations in the HPLC were 0.022 g/L, 0.016 g/L, and 0.020 g/L, respectively.
6.2.2. Modeling
The SSF process is described by a generic dynamic model taken from Ref. 35 4 which
is a compilation of models of different complexity presented in Refs. 36, 99, 98, 97.
For parameterizing these models, several experimental substrate-enzyme-microorganism
systems have been used in the literature, depending on the goal of the study. For instance,
the systems Avicel-Cellubrix²-Saccharomyces cerevisiae [35] and waste paper-Econase³-
Saccharomyces cerevisiae [97] were used for parameterizing both the hydrolysis and
fermentation processes, whereas the systems Avicel-Cellubrix⁴-Saccharomyces cerevisiae
and wheat straw-Cellubrix⁵-Saccharomyces cerevisiae were used only for the hydrolysis
stage [36], and the system glucose-Brettanomyces custersii only for the fermentation stage [99].
The generic dynamic model considers the four most influential factors for the kinetics of
the SSF process, namely, the cellulosic substrate concentration, the cellulase and
β-glucosidase enzyme system, the substrate-enzyme interaction, and the enzyme-yeast
interaction. The simplified reaction mechanisms are presented in Figure 6.1, where
cellulose is simultaneously hydrolyzed to cellobiose (υ1 as production rate of cellobiose
from cellulose by cellulase) and glucose (υ3 as production rate of glucose from cellulose
by cellulase), cellobiose is then converted to glucose (υ2 as production rate of glucose
from cellobiose by β-glucosidase), and glucose is catabolized to ethanol, biomass, and
carbon dioxide by the fermentative microorganism. Yeast growth and glucose consumption
rates are expressed by υ4 (i.e., biomass production rate) and υ5 (i.e., substrate
consumption rate), respectively. The hydrolysis of cellulose by cellulase is a reaction
that takes place on the surface of the insoluble substrate (heterogeneous catalysis),
whereas the hydrolysis of cellobiose by β-glucosidase is carried out in the aqueous phase
(homogeneous catalysis).

² Enzyme complex of cellulase and β-glucosidase of Novozymes Corp., Denmark.
³ Enzyme complex of cellulase and β-glucosidase of Enzyme Development Corp., New York, NY.
⁴ Enzyme complex of cellulase and β-glucosidase of Novozymes Corp., Denmark.
⁵ Enzyme complex of cellulase and β-glucosidase of Novozymes Corp., Denmark.
Figure 6.1.: Simplified reaction mechanisms in SSF processes [36]. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).
The dynamic behavior of the cellulose (Cc), cellobiose (Ccb), glucose (CG), biomass (CX),
and ethanol (CEtOH) concentrations during a batch SSF process is described by the
following mass balances in Eqs. 6.1-6.6. The change in enzyme concentration (CE) [36] is
presented in Eq. 6.6.
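\[ \frac{\partial C_c}{\partial t} = -\left[\upsilon_1 + \upsilon_3\right] \tag{6.1} \]
\[ \frac{\partial C_{cb}}{\partial t} = 1.056\,\upsilon_1 - \upsilon_2 \tag{6.2} \]
\[ \frac{\partial C_G}{\partial t} = 1.053\,\upsilon_2 + 1.111\,\upsilon_3 + \upsilon_5 \tag{6.3} \]
\[ \frac{\partial C_X}{\partial t} = \upsilon_4 \tag{6.4} \]
\[ C_{EtOH} = -0.511\left[1.111\,(C_c - C_{c0}) + 1.053\,(C_{cb} - C_{cb0}) + (C_G - C_{G0}) + (C_X - C_{X0})\right] \tag{6.5} \]
\[ \frac{\partial C_E}{\partial t} = -K_D\,C_E \tag{6.6} \]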
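Eq. 6.5 is algebraic rather than differential, so the ethanol concentration can be evaluated directly from the other states. The following sketch shows this evaluation; the dictionary layout, the function name and the numerical values are illustrative assumptions, and the comments interpreting the factors 1.111 and 1.053 as mass conversion factors to glucose are a reading of the equation, not a statement from the source.

```python
def ethanol_concentration(state, initial):
    """Eq. 6.5: ethanol from the change in the carbohydrate and biomass pools.

    state and initial are dicts with keys 'Cc', 'Ccb', 'CG', 'CX' (g/L).
    """
    pool_change = (
        1.111 * (state["Cc"] - initial["Cc"])      # cellulose as glucose equivalents
        + 1.053 * (state["Ccb"] - initial["Ccb"])  # cellobiose as glucose equivalents
        + (state["CG"] - initial["CG"])            # glucose
        + (state["CX"] - initial["CX"])            # biomass formed
    )
    return -0.511 * pool_change


# Hypothetical numbers: 10 g/L glucose consumed, 1 g/L biomass formed
initial = {"Cc": 50.0, "Ccb": 0.0, "CG": 20.0, "CX": 6.0}
state = {"Cc": 50.0, "Ccb": 0.0, "CG": 10.0, "CX": 7.0}
c_etoh = ethanol_concentration(state, initial)
print(f"{c_etoh:.3f} g/L")
```

With these numbers the pool change is -9 g/L, so the balance returns 0.511 × 9 = 4.599 g/L of ethanol; at the initial state it returns zero.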
The kinetics of the hydrolysis model (υ1, υ2, and υ3, expressed in g L^−1 h^−1) are
presented in Eqs. 6.7 - 6.9, in which kmax,i, with i = 1, 2, 3, is the maximum specific
rate at full saturation of the substrate with enzyme in the respective reaction. The active
amount of enzyme for reactions with cellulose as substrate (υ1 and υ3) is assumed to be
determined by enzyme adsorption onto the cellulose substrate according to the principles
of heterogeneous catalysis (Langmuir adsorption constant KL). Inhibition of cellulase
and β-glucosidase by glucose is described through K1,G and K2,G, respectively. For all
three reactions, υ1, υ2, and υ3 in Figure 6.1, the zero-order rate constant is given as a
function of temperature (activation energy Ea); in addition, all enzyme activity is assumed
to be subject to thermal inactivation (KD). Moreover, inhibition of cellulase by ethanol is
assumed to affect the rates of reactions υ1 and υ3 through the inhibition constant K1,EtOH [35].
The thermal inactivation constant KD follows an Arrhenius-type relationship,
$K_D(T) = A_D\,e^{-\Delta H / T}$. Reductions in glucose production caused by changes in
the nature of the cellulose substrate during the SSF process are accounted for by a
recalcitrance constant Krec [35].
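\[ \upsilon_1 = \left[k_{max,1} \cdot \frac{C_E}{K_L + C_E}\right] \cdot C_c \left[\frac{K_{1,G}}{K_{1,G} + C_G}\right] \cdot \left[e^{-K_D(T)\cdot t}\right] \cdot \left[\frac{e^{-E_a/(RT)}}{e^{-E_a/(RT_{ref})}}\right] \cdot \left[\frac{K_{1,EtOH}}{K_{1,EtOH} + C_{EtOH}}\right] \cdot \left[e^{-K_{rec}\cdot\left(1 - \frac{C_c}{C_{c0}}\right)}\right] \tag{6.7} \]
\[ \upsilon_2 = \left[k_{max,2} \cdot e_g \cdot e_T\right] \cdot \left[\frac{C_{cb}}{K_m\left(1 + \frac{C_G}{K_{2,G}}\right) + C_{cb}}\right] \cdot \left[e^{-K_D(T)\,t}\right] \cdot \left[\frac{e^{-E_a/(RT)}}{e^{-E_a/(RT_{ref})}}\right] \tag{6.8} \]
\[ \upsilon_3 = \left[k_{max,3} \cdot \frac{C_E}{K_L + C_E}\right] \cdot C_c \left[\frac{K_{1,G}}{K_{1,G} + C_G}\right] \cdot \left[e^{-K_D(T)\cdot t}\right] \cdot \left[\frac{e^{-E_a/(RT)}}{e^{-E_a/(RT_{ref})}}\right] \cdot \left[\frac{K_{1,EtOH}}{K_{1,EtOH} + C_{EtOH}}\right] \cdot \left[e^{-K_{rec}\cdot\left(1 - \frac{C_c}{C_{c0}}\right)}\right] \tag{6.9} \]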
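Eqs. 6.7 and 6.9 share the same structure: a maximum rate multiplied by a chain of dimensionless factors. The sketch below evaluates that product term by term. It is only an illustration: the parameter values are placeholders rather than values from the thesis, the inactivation constant is passed in already evaluated at the operating temperature, and interpreting Ea in J/mol with R = 8.314 J/(mol K) is an assumption.

```python
import math

R_GAS = 8.314  # J/(mol K); assumes Ea is given in J/mol


def cellulose_hydrolysis_rate(k_max, CE, Cc, CG, CEtOH, t, T, p):
    """Eqs. 6.7 and 6.9: rate of a cellulase reaction with cellulose as substrate."""
    adsorption = CE / (p["KL"] + CE)                 # Langmuir saturation
    glucose_inhibition = p["K1G"] / (p["K1G"] + CG)
    inactivation = math.exp(-p["KD"] * t)            # KD already evaluated at T
    arrhenius = (math.exp(-p["Ea"] / (R_GAS * T))
                 / math.exp(-p["Ea"] / (R_GAS * p["Tref"])))
    ethanol_inhibition = p["K1EtOH"] / (p["K1EtOH"] + CEtOH)
    recalcitrance = math.exp(-p["Krec"] * (1.0 - Cc / p["Cc0"]))
    return (k_max * adsorption * Cc * glucose_inhibition * inactivation
            * arrhenius * ethanol_inhibition * recalcitrance)


# Placeholder values, chosen so that most factors equal 1 or 0.5
p = {"KL": 1.0, "K1G": 5.0, "KD": 0.01, "Ea": 5.0e4, "Tref": 310.15,
     "K1EtOH": 40.0, "Krec": 1.0, "Cc0": 10.0}
v1 = cellulose_hydrolysis_rate(k_max=2.0, CE=1.0, Cc=10.0, CG=5.0,
                               CEtOH=0.0, t=0.0, T=310.15, p=p)
print(v1)
```

At t = 0, T = Tref and Cc = Cc0 the inactivation, Arrhenius and recalcitrance factors all equal one, so the rate reduces to the saturation and inhibition terms. This makes the contribution of each factor easy to inspect in isolation.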
For modeling glucose consumption and biomass formation, standard Monod kinetics
expanded to include ethanol inhibition on yeast [35] were assumed. Yeast growth rate (υ4)
and substrate consumption rate (υ5) are described in Eqs. 6.10 and 6.11.
\[ \upsilon_4 = \frac{\mu_{max} \cdot C_G}{K_G + C_G} \cdot C_X \cdot \left[\frac{K_{iy,EtOH}}{K_{iy,EtOH} + C_{EtOH}}\right] \tag{6.10} \]
\[ \upsilon_5 = -\frac{\upsilon_4}{Y_{xg}} - C_X \cdot m_s \tag{6.11} \]
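Eqs. 6.10 and 6.11 translate directly into two small functions. The sketch below is a minimal illustration with placeholder parameter values chosen only to make the arithmetic easy to follow; they are not the estimated values of the thesis, and the function names are hypothetical.

```python
def yeast_growth_rate(CG, CX, CEtOH, mu_max, KG, Kiy_EtOH):
    """Eq. 6.10: Monod growth on glucose with ethanol inhibition."""
    monod = mu_max * CG / (KG + CG)
    inhibition = Kiy_EtOH / (Kiy_EtOH + CEtOH)
    return monod * CX * inhibition


def glucose_consumption_rate(v4, CX, Yxg, ms):
    """Eq. 6.11: growth-associated uptake plus maintenance (negative sign)."""
    return -v4 / Yxg - CX * ms


# Placeholder parameter values
params = dict(mu_max=0.4, KG=1.0, Kiy_EtOH=50.0)
v4 = yeast_growth_rate(CG=1.0, CX=6.0, CEtOH=50.0, **params)
v5 = glucose_consumption_rate(v4, CX=6.0, Yxg=0.1, ms=0.01)
print(v4, v5)
```

With CG = KG and CEtOH = Kiy,EtOH both saturation terms equal one half, so υ4 = 0.4 × 0.5 × 6 × 0.5 = 0.6 and υ5 = −0.6/0.1 − 6 × 0.01 = −6.06 (in the units of the chosen placeholders).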
The model in Eqs. 6.1 - 6.11 considers neither the inhibition of cellulases by compounds
released on cell lysis or secreted by a particular strain, nor the inhibition by
components in the enzyme preparation, which might reduce microbial viability and lead to
cell lysis [76]. Moreover, the inhibition of the fermenting microorganisms and of the
cellulase activity by toxic substances arising from the pretreatment of the
lignocellulose [76] is also not included in the model.
6.3. Results and discussion
All computations were carried out in Matlab 2008®. The solver ode15s was used for
integration of the process model; nonlinear regression was performed using the function
lsqnonlin with the Levenberg-Marquardt least-squares minimization algorithm. The
parameter set of the DAE system Eqs. 6.1-6.6 analyzed in this chapter is given in Table 6.2,
where the parameter set dimension is Nθ = 14. An analysis of the concentration
measurements showed that the errors for all three analyzed components (cellobiose, glucose,
and ethanol) were in the same range, with no correlations between measurements. Thus,
the covariance matrix of experimental errors Cy was set to a diagonal matrix based on
the standard deviations reported in Section 6.2.1. The measured variables in the experiments
conducted by [95] were the cellobiose Ccb, glucose CG, and ethanol CEtOH concentrations,
such that the experimental data vector was Y^m = (Ccb, CG, CEtOH)^T with Ny = 3. E2 and E3,
described in Section 6.2.1 and summarized in Table 6.1, were considered simultaneously in
the parameter estimation. With 15 sampling points for each measured variable in E2 and
18 points in E3, the total number of sampling points for the experimental set, from now
on named E2&E3, is Nm = 3 × (15 + 18) = 99. Because of the difference between the
magnitudes of the parameters involved in this problem, the i, j-th element of the
sensitivity matrix S was normalized as follows
s_{ij} = \frac{\partial y_i}{\partial \theta_j} \cdot \frac{\max(|\theta_j|, \theta_{trsh})}{\max(|y_i|, y_{trsh})}, \qquad (6.12)
where θtrsh and ytrsh are user-defined thresholds, e.g., chosen as a function of the square root of the machine precision, √ϵmach. The normalized sensitivity matrix S was then used to perform the ill-conditioning and local identifiability analysis. The methodology used to determine and analyze model parameters was the sensitivity method (see Section 4.2.1). The stages “Experimentation”, “Modeling” and “Parameter estimation” of the consolidated computation framework depicted in Figure 4.1 were executed. In this case study, optimal experimental design was not conducted. The results for the different stages of the algorithm are discussed in the following. It should be pointed out that only the QR method (Section 4.4.2) is considered for the local identifiability analysis, whilst the SsS approach is applied as the regularization technique.
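As an illustration, the scaling in Eq. 6.12 can be sketched in a few lines of Python. This is only a minimal sketch and not the Matlab implementation used in this work: the function name, the list-of-rows matrix layout, and the default thresholds (set to the square root of the machine precision) are assumptions, and the numbers are hypothetical.

```python
import sys


def normalize_sensitivities(dy_dtheta, y, theta, y_trsh=None, theta_trsh=None):
    """Scale raw sensitivities dy_i/dtheta_j following Eq. 6.12.

    dy_dtheta is a list of rows (one row per output y_i).
    """
    # Assumed default: both thresholds equal sqrt(machine precision).
    eps_root = sys.float_info.epsilon ** 0.5
    y_trsh = eps_root if y_trsh is None else y_trsh
    theta_trsh = eps_root if theta_trsh is None else theta_trsh

    scaled = []
    for i, row in enumerate(dy_dtheta):
        y_scale = max(abs(y[i]), y_trsh)
        scaled.append([
            row[j] * max(abs(theta[j]), theta_trsh) / y_scale
            for j in range(len(theta))
        ])
    return scaled


# Hypothetical numbers, chosen only to show the effect of the scaling.
raw = [[2.0, 0.001],
       [4.0, 0.0]]
y = [10.0, 0.0]            # second output is zero, so y_trsh takes over
theta = [0.5, 1000.0]      # parameters of very different magnitude
S = normalize_sensitivities(raw, y, theta, y_trsh=1.0, theta_trsh=1e-8)
```

In the first row both raw sensitivities differ by a factor of 2000, yet their normalized values coincide, which is the purpose of the scaling when parameter magnitudes differ strongly.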
6.3.1. Model selection
In the first step of the “Modeling” stage in Figure 4.1, an appropriate model for the SSF process was selected. To this end, several model candidates were proposed. These candidates originated from the SSF models formulated in Refs. [36, 35], and additional modifications of these preliminary candidates were analyzed. For each candidate structure, a parameter estimation problem was solved using the same experimental data Y^m of all experiments described in Table 6.1. The best model structure was then selected based on the model fitting performance, i.e., the minimum value of the cost function in Eq. 2.5b.
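The selection rule can be sketched as follows. This is a minimal sketch: the weighted sum of squared residuals is an assumed stand-in for the cost function of Eq. 2.5b (whose exact form is not repeated here), and the function names are hypothetical. The two cost function values are those quoted further below for the fit to E1 (Figure 6.2).

```python
def weighted_sse(y_meas, y_model, sigma):
    """Sum of squared residuals, each scaled by its measurement std.

    Assumed form of a least-squares cost with a diagonal error covariance.
    """
    return sum(((ym - yp) / s) ** 2
               for ym, yp, s in zip(y_meas, y_model, sigma))


def best_candidate(cost_by_candidate):
    """Return the name of the candidate with the smallest cost value."""
    return min(cost_by_candidate, key=cost_by_candidate.get)


# Cost function values reported in the text for experiment E1.
costs = {"original model": 33.1, "selected model": 18.5}
chosen = best_candidate(costs)

# Tiny hypothetical residual example for the cost itself.
example_cost = weighted_sse([1.0, 2.0], [0.0, 0.0], [1.0, 2.0])
```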
The model in Eqs. 6.1-6.11 was subject to three main changes. The first two changes affected production rates with cellulose as substrate (i.e., υ1 and υ3), whereas the last change affected the production rate with cellobiose as substrate (i.e., υ2). These changes are summarized in the following:
• Change 1: the factor Cc was removed from Eqs. 6.7 and 6.9. Moreover, the new lumped parameters k∗max,1 and k∗max,3 in Eq. 6.13, with units g L−1 h−1, were used for the production rates υ1 and υ3. This change may be supported by assuming that the hydrolytic system is independent of the substrate concentration, because all experiments were conducted with higher initial substrate concentrations (Cc0 of 133.3 and 200 g/L, see Table 6.1) than those in Ref. [36] (i.e., Cc0 = 40 g/L). Hence, it is assumed that an excess amount of cellulignin was available for the enzymes at all times.

\upsilon_i = \left[ k^{*}_{max,i} \cdot \frac{C_E}{K_L + C_E} \right] \cdot \left[ \frac{K_{1,G}}{K_{1,G} + C_G} \right] \cdot \left[ e^{-K_D(T) \cdot t} \right] \cdot \left[ \frac{e^{-E_a/(R T)}}{e^{-E_a/(R T_{ref})}} \right] \cdot \left[ \frac{K_{1,EtOH}}{K_{1,EtOH} + C_{EtOH}} \right] \cdot \left[ e^{-K_{rec} \cdot \left( 1 - \frac{C_c}{C_{c0}} \right)} \right], \quad i = 1, 3 \qquad (6.13)
• Change 2: the term representing the substrate recalcitrance effect, which uses the parameter Krec, was removed from Eqs. 6.7 and 6.9. This term had previously been considered in Ref. [35] to explain some reductions in the glucose production observed in its experimental data. The reason for the deletion was that a decrease in glucose production by this effect was not expected in the current system, because the cellulose substrate used in each experiment of Table 6.1 was highly pretreated. The pretreatment physically transformed the structure of the sugarcane bagasse (see Ref. [95]), making it more susceptible to the enzymatic action and thus reducing the possibility for this effect to take place. Values of Krec close to zero, obtained by solving initial parameter estimation problems (results not shown here), confirmed this assumption. With this additional change, Eq. 6.13 takes the following form:

\upsilon_i = \left[ k^{*}_{max,i} \cdot \frac{C_E}{K_L + C_E} \right] \cdot \left[ \frac{K_{1,G}}{K_{1,G} + C_G} \right] \cdot \left[ e^{-K_D(T) \cdot t} \right] \cdot \left[ \frac{e^{-E_a/(R T)}}{e^{-E_a/(R T_{ref})}} \right] \cdot \left[ \frac{K_{1,EtOH}}{K_{1,EtOH} + C_{EtOH}} \right], \quad i = 1, 3. \qquad (6.14)
• Change 3: the ethanol inhibition proposed by Ref. [99] was included here by using the constant K2,EtOH, which affects the rate of the reaction of cellobiose to glucose (i.e., υ2) in Eq. 6.8. The new formulation is shown in Eq. 6.15:

\upsilon_2 = \left[ k_{max,2} \cdot e_g \cdot e_T \right] \cdot \left[ \frac{C_{cb}}{K_m \left( 1 + \frac{C_G}{K_{2,G}} \right) + C_{cb}} \right] \cdot \left[ e^{-K_D(T) \cdot t} \right] \cdot \left[ \frac{e^{-E_a/(R T)}}{e^{-E_a/(R T_{ref})}} \right] \cdot \left[ \frac{K_{2,EtOH}}{K_{2,EtOH} + C_{EtOH}} \right] \qquad (6.15)
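To make the structure of the modified rate expressions concrete, Eqs. 6.14 and 6.15 can be written as plain functions. The following is a sketch under stated assumptions: all function and argument names are hypothetical, the thermal inactivation constant K_D is passed in as a number rather than computed from temperature, and the temperature of 310.15 K in the example is an illustrative value, not one taken from the experiments.

```python
import math

R_GAS = 8.314  # J mol^-1 K^-1, universal gas constant


def arrhenius_ratio(ea, temp, temp_ref):
    """Temperature term exp(-Ea/(R T)) / exp(-Ea/(R Tref))."""
    return math.exp(-ea / (R_GAS * temp)) / math.exp(-ea / (R_GAS * temp_ref))


def rate_cellulose(k_star, c_e, k_l, k_1g, c_g, k_1etoh, c_etoh,
                   k_d, t, ea, temp, temp_ref):
    """Eq. 6.14 (i = 1 or 3): no C_c factor and no recalcitrance term."""
    return (k_star * c_e / (k_l + c_e)          # enzyme saturation
            * k_1g / (k_1g + c_g)               # glucose inhibition
            * math.exp(-k_d * t)                # thermal inactivation
            * arrhenius_ratio(ea, temp, temp_ref)
            * k_1etoh / (k_1etoh + c_etoh))     # ethanol inhibition


def rate_cellobiose(k_max2, e_g, e_t, c_cb, k_m, c_g, k_2g, k_2etoh, c_etoh,
                    k_d, t, ea, temp, temp_ref):
    """Eq. 6.15: cellobiose to glucose with added ethanol inhibition."""
    return (k_max2 * e_g * e_t
            * c_cb / (k_m * (1.0 + c_g / k_2g) + c_cb)
            * math.exp(-k_d * t)
            * arrhenius_ratio(ea, temp, temp_ref)
            * k_2etoh / (k_2etoh + c_etoh))


# At t = 0, T = Tref and without glucose or ethanol, Eq. 6.14 reduces to
# k_star * C_E / (K_L + C_E). Parameter values here are illustrative.
common = dict(k_star=52.8, c_e=5200.0, k_l=1000.0, k_1g=3.75, k_1etoh=6.97,
              k_d=0.0, t=0.0, ea=2.98e4, temp=310.15, temp_ref=310.15)
v_no_inhibition = rate_cellulose(c_g=0.0, c_etoh=0.0, **common)
v_inhibited = rate_cellulose(c_g=20.0, c_etoh=10.0, **common)
```

Each bracket of the equations maps to one factor in the product, so removing an effect (as in Change 2) corresponds to dropping a single factor.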
The finally selected model, including the three changes described above, is composed of Eqs. 6.1-6.6, 6.10, 6.11, 6.14, and 6.15. Figure 6.2 shows the experimental data of E1 together with the adjusted predictions of the original model proposed by Ref. [35] (Eqs. 6.1-6.11) and of the finally selected model, with cost function values of 33.1 and 18.5, respectively.

[Figure 6.2: two panels, concentration (g/L) versus time (h); measured cellobiose, glucose, and ethanol together with the model predictions Ccb, CG, and CEtOH.]
Figure 6.2.: Model fitting to experimental data E1 by using (left panel) the original model proposed in Ref. [36] and (right panel) the finally selected model after the model selection step. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

The parameter values for the thermal inactivation constant KD(T) = AD · e^(∆H/T) as well as the activation energy Ea in Eqs. 6.14 and 6.15 of the final model were kept constant, because these parameters had a limited impact on the solution according to a previous sensitivity analysis. Reference values were taken from Ref. [35], i.e., ∆H = 1.48 × 10^5 J mol−1, AD = 3.64 × 10^18 h−1, and Ea = 2.98 × 10^4 J mol−1. On the other hand, the initial enzyme concentration CE0 at t = 0 and the values of eT and eg depended on the enzyme dosage of the specific experiment. For both experiments E2 and E3, the value of CE0 was 5200 FPU/L, whereas the values of eT were 5.35 g/L and 4.58 g/L, and the values of eg were 2,900 U/g and 1,800 U/g for experiments E2 and E3, respectively.
6.3.2. Parameter initial guess selection
There are different options for defining an initial parameter guess (see Section 4.6.2) for the initialization of the computational framework in Figure 4.1. Here, the first attempt was to use literature values taken from Refs. [35] and [97], summarized in Table 6.2 as IGDrissen and IGPhilippidis, respectively.

Table 6.2.: Initial guesses taken from literature and generated by MBLHD. (Table taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

No.  Parameter   Unit        IGDrissen  IGPhilippidis  IG4    IG22
                             θ0         θ0             θ0     θ0
1    kmax,1      h−1         0.081      0.0827         3.49   21.50
2    kmax,2      g U−1 h−1   0.0108     0.00406        4.42   0.92
3    kmax,3      h−1         0.058      0.0834         11.5   8.5
4    ms          h−1         0.02       0              0.884  0.183
5    Yxg         g g−1       0.11       0.113          0.050  0.750
6    µmax        h−1         0.25       0.19           0.582  2.083
7    KG          g L−1       0.0252     0.000037       150    683
8    KL          FPU L−1     18.2       544.89         984    917
9    K1,G        g L−1       6.3        53.16          275    75
10   K1,EtOH     g L−1       95         50.35          15.0   45.0
11   Kiy,EtOH    g L−1       50         50             48.3   81.7
12   Km          g L−1       1.92       10.56          425    358
13   K2,G        g L−1       0.54       0.62           58.2   491.8
14   K2,EtOH*    g L−1       -          -              225    108

The solutions of the PE problem in Eq. 2.5, simultaneously considering experimental data from E2 and E3 and starting from the initial values IGDrissen and IGPhilippidis, are shown in Figure 6.3. It can be seen that extremely low values for Ccb, CG, and CEtOH compared with the measured concentrations were calculated after successful termination of the parameter estimation algorithm. The worst result was obtained for IGPhilippidis, with cost function values of CF_IGDrissen = 191.6 and CF_IGPhilippidis = 290.5 (see Table 6.2). The poor fitting might be attributed to remarkable differences between the substrate, enzymatic load, and operating conditions used in the literature experiments and those of experiments E2 and E3. Moreover, the parameters of the literature models differ from those of the selected model because of the dissimilarity in the model structure (e.g., kmax,1 and kmax,3 in Table 6.2 are different from k∗max,1 and k∗max,3 in Eq. 6.13); consequently, their true values are far away from those considered in the literature [35, 97].

[Figure 6.3: two panels, concentration (g/L) versus time (h); measured cellobiose, glucose, and ethanol together with the predictions Ccb, CG, and CEtOH obtained from the Drissen and Philippidis initial guesses.]
Figure 6.3.: Fitting of experimental data using the model selected in step 1: (left panel) results for E2 and (right panel) results for E3. Different initial guesses, IGDrissen [35] and IGPhilippidis [97], were used for the solution of the parameter estimation problems where measured data from E2 and E3 was considered simultaneously. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

The second strategy for selecting the parameter initial guess was the data collection plan MBLHD (see Section 2.8.1). This method allows different parameter guesses to be sampled within a prescribed parameter range. From the parameter range, 30 initial guesses (i.e., NIG = 30) were generated using MBLHD(θ), and their corresponding parameter estimation problems were solved. In order to select the most appropriate initial guess, the procedure in Section 4.6.2 was applied. The cost function CF of the parameter estimation was the model fitting measure, whereas the numerical rank rϵ (see Section A.5.11) of the sensitivity matrix was the ill-conditioning measure. The best initial guess was that with minimum CF and maximum rϵ.
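The idea of sampling initial guesses within a prescribed range can be illustrated with a plain Latin hypercube design. The sketch below is not the MBLHD plan of Section 2.8.1, whose details are not reproduced here; it is a generic Latin hypercube sampler, and the function name, the seed, and the parameter bounds are assumptions.

```python
import random


def latin_hypercube(bounds, n_samples, seed=0):
    """Draw n_samples points inside the given (low, high) bounds.

    Each parameter range is split into n_samples equal strata and every
    stratum is used exactly once per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        # One uniformly placed point inside each stratum, then shuffled so
        # that strata of different parameters are paired at random.
        col = [low + (k + rng.random()) * width for k in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    # Transpose: one row per initial guess.
    return [list(point) for point in zip(*columns)]


# Hypothetical bounds for two parameters; N_IG = 30 as in the text.
bounds = [(0.0, 25.0), (0.0, 1000.0)]
guesses = latin_hypercube(bounds, n_samples=30)
```

Compared with independent uniform sampling, this stratification guarantees that each parameter range is covered evenly even for a small number of guesses.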
The obtained CF values as well as the corresponding numerical ranks rϵ are shown in Figure 6.4. It should be noted that points where no convergence could be reached (i.e., IG8, IG11, and IG27) are not shown in Figure 6.4. Additionally, the dashed line indicates a user-defined upper bound of the cost function (“CF Bound=50”); all IGi with a CF smaller than this value generated an appropriate fitting to the experimental data. In that context, the best initial guess was IG4, which had the lowest CF value, i.e., CF4 = 27.0; however, it did not have a large numerical rank, i.e., rϵ = 8. On the other hand, in terms of ill-conditioning, an initial guess is adequate if it yields a sensitivity matrix that is as well-conditioned as possible, which in this case study means a system with a large numerical rank rϵ. The largest numerical rank among the evaluated initial guesses was rϵ = 12, achieved by IG22. IG22 also had an acceptable fitting performance (CF22 = 40.1 ≤ CF Bound). With the requirements of fitting and ill-conditioning fulfilled, IG22 was selected as the appropriate initial guess θIG to continue the parameter determination. Notice that IG3 could also have been selected as an adequate initial guess. In summary, the selection of an appropriate initial guess for nonlinear models follows the idea of generating an acceptable model fit to the data while also achieving the highest degree of well-posedness.
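The selection logic described above can be condensed into a short sketch. It assumes that candidates are first filtered by the cost function bound and then ranked by numerical rank; breaking ties by the lower cost function value is an assumption, as are the function name and the third, hypothetical entry. The values for IG4 and IG22 are those given in the text.

```python
def select_initial_guess(results, cf_bound):
    """Pick an initial guess from {name: (cost function value, rank)}.

    Only guesses with CF <= cf_bound are acceptable; among those the one
    with the largest numerical rank is chosen.
    """
    acceptable = {name: v for name, v in results.items() if v[0] <= cf_bound}
    if not acceptable:
        raise ValueError("no initial guess satisfies the CF bound")
    # Highest rank first; ties broken by the lower cost function value.
    return max(acceptable,
               key=lambda name: (acceptable[name][1], -acceptable[name][0]))


results = {
    "IG4": (27.0, 8),      # lowest CF, but small numerical rank (from text)
    "IG22": (40.1, 12),    # acceptable CF and largest rank (from text)
    "IG_x": (120.0, 13),   # hypothetical entry that violates the CF bound
}
selected = select_initial_guess(results, cf_bound=50.0)
```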
[Figure 6.4: cost function (CF, axis from 1 to 1000) and numerical rank (rϵ, axis from 0 to 14) plotted against the initial guess; series CF, CF Bound, and Rank; the points IG4 and IG22 are labeled.]
Figure 6.4.: Cost function (CF) of parameter estimation and numerical rank (rϵ) of the sensitivity matrix obtained for 30 different initial guesses generated by MBLHD considering data from E2 and E3. The maximum acceptable cost function value to accept an initial guess is denominated “CF Bound”. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).
6.3.3. Iterative parameter estimation with structural analysis
In this case study, an iterative procedure for estimating parameters including structural analysis, i.e., ill-conditioning and identifiability analysis, was performed. The ill-conditioning diagnosis was based on the sensitivity method (see Section 4.4.1). Accordingly, the condition number κ, the collinearity index γ, and the numerical rank rϵ of the sensitivity matrix were computed as ill-conditioning measures (see Section 3.2.1). For the identifiability analysis, the local QR method detailed in Section 4.4.2 was implemented. The parameter estimation was then regularized using the SsS approach (see Section 3.3.1). Integrating the structural analysis into the iterative cycle of parameter estimation makes it possible to find those parameters which are actually identifiable in the system according to their linear independence, thereby reducing the dimension of the parameter vector. The new reduced problem is then well-conditioned and its solution is more accurate than that of the initial problem including all parameters and their correlations. Accordingly, when SsS regularization is applied, unidentifiable parameters associated with the ill-conditioned singular values are fixed at prior estimates (referred to as “inactive” parameters), and reduced-order problems are considered for the determination of the remaining “active” parameters.
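The splitting of parameters into active and inactive subsets according to linear independence can be illustrated with a column-pivoted orthogonalization of the sensitivity matrix. The sketch below is a generic greedy Gram-Schmidt procedure with column pivoting, which orders columns in the same spirit as a QR decomposition with column pivoting; it is not claimed to be the algorithm of Section 4.4.2, and the function name, the tolerance, and the example matrix are assumptions.

```python
import math


def rank_columns(S, tol=1e-8):
    """Order the columns of S (list of rows) by linear independence.

    Returns (order, rank): order lists the column indices from most to
    least independent, rank counts the pivots whose residual norm exceeds
    tol times the first pivot norm.
    """
    n_rows, n_cols = len(S), len(S[0])
    cols = [[S[i][j] for i in range(n_rows)] for j in range(n_cols)]
    remaining = list(range(n_cols))
    order, pivot_norms = [], []
    while remaining:
        norms = {j: math.sqrt(sum(x * x for x in cols[j])) for j in remaining}
        best = max(remaining, key=lambda j: norms[j])
        order.append(best)
        pivot_norms.append(norms[best])
        remaining.remove(best)
        if norms[best] > 0.0:
            q = [x / norms[best] for x in cols[best]]
            # Remove the component along q from every remaining column.
            for j in remaining:
                proj = sum(a * b for a, b in zip(q, cols[j]))
                cols[j] = [b - proj * a for a, b in zip(q, cols[j])]
    rank = sum(1 for p in pivot_norms if p > tol * pivot_norms[0])
    return order, rank


# Hypothetical 3x3 sensitivity matrix: column 2 is twice column 0, so only
# two of the three parameters can be identified independently.
S = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0],
     [1.0, 1.0, 2.0]]
order, rank = rank_columns(S)
active, inactive = order[:rank], order[rank:]
```

In this toy case the perfectly correlated column ends up in the inactive subset, which mirrors how correlated parameters are fixed at prior estimates while the remaining ones are re-estimated.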
Figure 6.5 shows the results of 6 iterations of parameter estimation and structural analysis. Four parameters (i.e., µmax, kmax,2, Km, K1,G) were finally identified in the 6th iteration using data from experiments E2 and E3 (see footnote 6). The iterative process stopped because the ill-conditioning measures κ and γ in the current iteration k were smaller than their thresholds (κmax = 1000 and γmax = 15) and the rank of the current iteration rk equaled the rank of the immediately preceding iteration rk−1. In Figure 6.5, the following results are given for each iteration k = 1, ..., 6: cost function value (CFk), condition number (κk), collinearity index (γk), rank of the sensitivity matrix (rk), the current estimated parameter vector (θ^(rk−1)), its relative standard deviation (%σθ), and its variance (σ²θ). The last row for the k-th iteration in Figure 6.5 contains the newly ordered parameter vector

\theta = \left( (\theta^{(r_k)})^T, (\theta^{(N_\theta - r_k)})^T \right)^T,

such that the most identifiable parameter is located at the first position (shaded cells in Figure 6.5 contain the elements of the identifiable parameter vector (θ^(rk))^T) and the least identifiable parameter is at the last position (the elements of (θ^(Nθ−rk))^T include the current nonidentifiable parameters θ^(rk−1−rk) and those found to be nonidentifiable in previous iterations).

6 It has to be noted that, in order to overcome local solutions, the identifiable parameter subset of the reduced PE problems was initialized using its corresponding elements from θ0 = IG22.
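The stopping test of the iterative procedure can be written as a single predicate. This is a minimal sketch with a hypothetical function name; the thresholds are those stated in the text, the values for k = 1 and k = 2 are quoted in the text, and the values for k = 6 are approximate readings from Figure 6.5.

```python
def should_stop(kappa, gamma, rank, prev_rank,
                kappa_max=1000.0, gamma_max=15.0):
    """Stop when both ill-conditioning measures are below their thresholds
    and the rank did not change with respect to the previous iteration."""
    return kappa < kappa_max and gamma < gamma_max and rank == prev_rank


# k = 1 and k = 2: values quoted in the text (r0 = N_theta = 14).
stop_k1 = should_stop(17811, 81.3, rank=12, prev_rank=14)
stop_k2 = should_stop(22245, 23.7, rank=8, prev_rank=12)
# k = 6: kappa and gamma as read from Figure 6.5 (approximate), r6 = r5 = 4.
stop_k6 = should_stop(620, 0.4, rank=4, prev_rank=4)
```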
[Figure 6.5: table of results for iterations k = 1 to 6, listing for each iteration the estimated parameters with their variance σ² and relative standard deviation %σ, and the reordered parameter vector with the identifiable subset shaded. Scalar values legible in the figure:

k    CFk    κk      γk     rk
1    40.1   17811   81.3   12
2    28.1   22245   23.7   8
3    27.9   1315    2.6    7
4    27.9   6593    3.4    5
5    27.9   2176    0.9    4
6    28.0   620     0.4    4

Final parameter vector (last row): µmax = 1.74, kmax,2 = 4.56E-03, Km = 8.85, K1,G = 3.75, k∗max,3 = 35.7, K1,EtOH = 6.97, KL = 1000, K2,G = 2.34, Kiy,EtOH = 28.0, k∗max,1 = 52.8, Yxg = 4.73E-02, K2,EtOH = 23.5, KG = 883, ms = 2.29E-03.]
Figure 6.5.: Results of parameter estimation with identifiability analysis. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).
[Figure 6.6: two panels, concentration (g/L) versus time (h); measured cellobiose, glucose, and ethanol together with the predictions Ccb, CG, and CEtOH for k = 1, k = 4, and k = 6.]
Figure 6.6.: Experimental vs. predicted concentrations using the parameter vector θ calculated in iteration k = 1, 4, 6 using experimental data of E2 and E3. Results for E2 (left panel) and results for E3 (right panel). (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).
Figure 6.6 shows the measured data for Ccb, CG, and CEtOH of experiments E2 and E3 and the corresponding predicted concentrations using the parameter vector calculated in iteration k = 1 (original parameter vector with 14 estimated elements), in iteration k = 4 (7 estimated elements in the reordered parameter vector), and in iteration k = 6 (4 estimated elements in the reordered parameter vector).
Figure 6.7 depicts the ranking of the parameters according to their sensitivity measure δj (see Eq. 2.17). In k = 1, the highest parameter sensitivity measure (i.e., the maximum value of δj over the columns of the sensitivity matrix S^(r0)) corresponds to Km and the lowest to ms. It can be seen that the ranking based on sensitivity measures does not give the same results as the ranking according to identifiability (shaded cells in Figure 6.7), e.g., in k = 1, the parameter KG had a medium sensitivity measure but was nonidentifiable. The QR method for selecting the identifiable parameters (Section 4.4.2) considered parameters with high sensitivity measures first for the construction of the identifiable parameter subset. For all iterations k = 1 to k = 6, the parameters Km, kmax,2, µmax, and K1,G, which were the identifiable parameters in iteration k = 6, were in the first positions of the sensitivity ranking, which demonstrates that the QR method preferentially selects the most uncorrelated parameters with high sensitivity. However, some parameters which also strongly affected the predicted response variables, e.g., k∗max,3 in Figure 6.7, were not identifiable due to correlations with identifiable parameters (i.e., µmax, kmax,2, Km, and K1,G).

[Figure 6.7: sensitivity ranking per iteration; the shading of the identifiable subset is not reproduced here.]

Rank   k=1        k=2        k=3       k=4       k=5       k=6
       r=12       r=8        r=7       r=5       r=4       r=4
1      Km         Km         Km        kmax,2    Km        Km
2      k*max,1    kmax,2     k*max,3   k*max,3   kmax,2    kmax,2
3      k*max,3    k*max,3    kmax,2    Km        k*max,3   K1,G
4      K1,G       K1,G       K2,G      K1,G      K1,G      µmax
5      kmax,2     KL         K1,G      µmax      µmax
6      µmax       k*max,1    K1,EtOH   K1,EtOH
7      KG         K2,G       KL        KL
8      Kiy,EtOH   µmax       µmax
9      K2,G       Yxg
10     KL         K1,EtOH
11     K1,EtOH    K2,EtOH
12     Yxg        Kiy,EtOH
13     K2,EtOH
14     ms

Figure 6.7.: Ranking of parameters according to sensitivity (the most sensitive parameter above in position 1). The identifiable parameter subset in every iteration k is marked by shaded cells. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).

In the first iteration (k = 1) with r0 = Nθ = 14, after applying the PE and the structural analysis, the sensitivity matrix rank of the original problem was r1 = 12, with CF1 = 40.1, κ1 = 17811, and γ1 = 81.3 (see Figure 6.5). The large ill-conditioning measures κ1 and γ1 confirmed that the parameter estimation problem was ill-conditioned, despite the good fitting of the experimental data (see dash-dot line in Figure 6.6). Strong correlations among the parameters in θ^(r0) were therefore expected. The local identifiability analysis yielded a new ordered active parameter vector

(\tilde{\theta}^{(r_1)})^T = (K_{1,G}, K_m, k^{*}_{max,1}, k^{*}_{max,3}, K_L, Y_{xg}, K_{iy,EtOH}, K_{1,EtOH}, k_{max,2}, K_{2,G}, \mu_{max}, K_{2,EtOH})^T = (\theta^{(r_1)})^T

and the nonactive vector

\tilde{\theta}^{(r_0 - r_1)} = (K_G, m_s)^T = (\theta^{(N_\theta - r_1)})^T;

thus, the complete parameter vector for the next iteration k = 2 reads

\theta = \left( (\theta^{(r_1)})^T, (\theta^{(N_\theta - r_1)})^T \right)^T

(see Figure 6.5, where the shaded cells refer to the identifiable parameter subset \tilde{\theta}^{(r_k)} in iteration k). In the second iteration (k = 2), the two unidentifiable parameters obtained from iteration k = 1 were removed from the parameter estimation problem by fixing them to the values found in the previous iteration (KG = 883 and ms = 2.29 × 10−3).
The reduced parameter estimation problem was solved for the previously identifiable vector

\theta^{(r_1)} = \tilde{\theta}^{(r_1)} = (k^{*}_{max,1}, k_{max,2}, k^{*}_{max,3}, Y_{xg}, \mu_{max}, K_L, K_{1,G}, K_{1,EtOH}, K_{iy,EtOH}, K_m, K_{2,G}, K_{2,EtOH})^T

and the identifiability analysis was performed again in the second iteration. The results were r2 = 8, CF2 = 28.1, κ2 = 22245, and γ2 = 23.7. The reduction in γ2 indicated that this reduced problem was better conditioned than the problem in k = 1; however, four further parameters were still not identifiable, as indicated by r2 = 8 and by κ2 >> κmax and γ2 > γmax. The new identifiable parameter vector reads

(\tilde{\theta}^{(r_2)})^T = (K_m, k^{*}_{max,3}, k_{max,2}, K_{2,G}, K_{1,G}, K_{1,EtOH}, \mu_{max}, K_L)^T = (\theta^{(r_2)})^T,

and the nonactive parameters to be fixed are stored in

(\tilde{\theta}^{(r_1 - r_2)})^T = (K_{iy,EtOH}, k^{*}_{max,1}, Y_{xg}, K_{2,EtOH})^T.

The total vector of nonactive parameters to be held constant at previous estimates in the next iteration k = 3 is stored in

\theta^{(N_\theta - r_2)} = (K_{iy,EtOH}, k^{*}_{max,1}, Y_{xg}, K_{2,EtOH}, K_G, m_s)^T.

The vector of active parameters in k = 3 is \theta^{(r_2)} = \tilde{\theta}^{(r_2)}.

The iterative parameter estimation stopped at k = 6, with κ and γ being smaller than their defined thresholds (κ6 < κmax and γ6 < γmax). Consequently, the active parameter subset of this reduced parameter estimation problem did not need further reduction, and thus all its estimated parameters (θ^(r5))^T = (µmax, kmax,2, Km, K1,G)^T were identifiable with r6 = 4.
6.3.4. Estimator performance assessment
A statistical analysis of the estimated parameter vector \tilde{\theta}^{(r_k)} obtained in each iteration k was carried out. The reliability tests for parameter estimates of Section 2.5.3, namely the 95% confidence interval and the hypothesis test with a significance level of α = 0.05, were applied here. The results are shown in Figure 6.8, where DoF is the number of degrees of freedom, rk is the current parameter vector dimension in the k-th iteration, Nm is the number of experimental data points, TH0 is the Student t-value for each j-th parameter (\tilde{\theta}_j^{(r_k)} ∈ \tilde{\theta}^{(r_k)}), and L and U are the lower and upper confidence limits described by Eq. 2.28, such that L ≤ θj^(rk) ≤ U. Shaded cells in Figure 6.8 indicate significant parameter values in the k-th iteration, i.e., values statistically accepted to be different from zero (alternative hypothesis H1 in Eq. 2.31b) with an error probability of 5%. The low number of significant parameters in the initial iterations demonstrated how parameter correlations influence the results of this hypothesis test. Under strong correlations, it is not possible to indicate accurately which parameters are not significant. Therefore, parameters with low TH0-values in the early iterations, where correlations are present, should not be interpreted as negligible for the model, whereas those with higher TH0-values should definitely be considered statistically important.
[Figure 6.8: table listing, for each iteration k = 1 to 6, the degrees of freedom (DoF), the parameter vector dimension rk, the number of data points Nm = 99, the significance level α = 0.05, and for every estimated parameter its t-value (tH0), its estimate, and its lower (L) and upper (U) confidence limits; significant parameters are shaded.]
Figure 6.8.: Estimator performance assessment: parameter statistical significance (tj-value) and 95% confidence intervals (L ≤ θ(rk) ≤ U). Shaded cells indicate parameters with a statistical significance of 95%. (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).
All components of the last identifiable parameter vector θ^(r5) obtained in iteration k = 6 showed a statistical significance of 95%, which also means that their 95% confidence intervals do not include zero (0.97 ≤ µmax ≤ 2.52, 2.03 × 10−3 ≤ kmax,2 ≤ 7.09 × 10−3, 8.84 ≤ Km ≤ 8.86, 1.50 ≤ K1,G ≤ 6.01 in Figure 6.8). Therefore, they have a strong effect on the predicted variables and should remain in the model.
Moreover, comparing the results of the initial problem (k = 1, Nθ = 14) with those of the last reduced PE problem (k = 6, r6 = 4), enormous improvements in parameter accuracy were observed, demonstrated by reductions in the relative standard deviation and narrower confidence intervals. For instance, the relative standard deviation of the most uncertain parameter µmax improved from 537% to 22% (see Figure 6.5), and the corresponding confidence interval narrowed from −99.7 ≤ µmax ≤ 120 to 0.97 ≤ µmax ≤ 2.52 (see Figure 6.8).
6.3.5. Validation of the identified model
The estimated parameter vector θ with four identifiable and ten unidentifiable parameters, shown at the end of Figure 6.5, properly fits the data of two experimental data sets (i.e., E2 and E3 in Table 6.1). Those data sets had different experimental conditions, such as the enzymatic load due to the presence of β-glucosidase and the substrate pretreatment, but also common experimental conditions, such as the pre-hydrolysis time, the initial cellulose concentration CC0, and the initial yeast concentration CX0. Accordingly, the fact that ten parameters of θ were fixed might be explained not only by parameter correlations (nonidentifiability) and weak effects on the predicted variables (low sensitivities), but also by the aforementioned common conditions of both experiments. The different and shared conditions of E2 and E3 are reflected in the experimental data used for parameter determination. Those parameters which are not influenced by the differences are nonidentifiable. In contrast, the four identifiable parameters of θ are related to the variability of the predicted data for E2 and E3. However, the range of experimental conditions for which the model is valid is certainly limited, and structural errors can be partially compensated by adjusting the parameter values. Therefore, during the validation process, it is tested whether the identified model is able to predict new experimental conditions and also whether the selection of identifiable and unidentifiable parameter subsets is valid for these conditions. The model validation was carried out using experiments E1, E4, and E5 in Table 6.1; it included:
1. cross-validation, in order to test the validity of the model predictions for new operating conditions, and
2. validation of the identifiable parameter subset, where the ability to adjust the model predictions to different operating conditions (here E1-E5 in Table 6.1) is tested by re-estimating the identifiable parameter values only.
In the former analysis (cross-validation), model predictions based on the identifiable estimated parameter vector θ after the sixth iteration (see Figure 6.5) were calculated for new experimental conditions which were not considered for parameter determination:
• Cellulose initial concentration CC0 (E1 and E5),
• Pre-hydrolysis time (E1, E4, and E5),
• GC-220 and β-glucosidase enzyme load (E1, E4, and E5), and
• Initial yeast concentration CX0 (E4 and E5).
These predictions were then compared with the experimental data, and the quality of the fitting was analyzed.
The latter analysis evaluated the fitting of the model predictions to different experimental conditions when a parameter estimation problem was solved in which only the identifiable parameter subset (i.e., θ^(r6)) was re-estimated. By doing so, the capacity of the active parameters to describe the variability in the measured data caused by different experimental conditions was assessed.
Cross-Validation
The validity of the model predictions based on the identifiable estimated parameter vector θ (see last row in Figure 6.5) was tested using experiments E1, E4, and E5 of Table 6.1. For each experimental condition, SSF process simulations using θ were carried out. The predicted variables are shown as solid lines in the left, middle, and right panels of Figure 6.9, respectively. It can be seen that the identified model predicted the behavior under new experimental conditions reasonably well, with an appropriate fitting for E5 (cost function of 22.6) and a moderate fitting for E1 and E4 (cost functions of 56.4 and 57.1, respectively). During the pre-hydrolysis stage, the model overestimated the cellobiose (Ccb-PE E2&E3) and glucose production (CG-PE E2&E3) for E1 and E5, whereas it underestimated the same concentrations for E4. During the fermentation stage, the model underestimated the glucose production (CG-PE E2&E3) and overestimated the ethanol production (CEtOH-PE E2&E3) for E1 and E4.
Assessment of the identifiable parameter subset
The glucose (CG) and ethanol (CEtOH) model predictions for E1 and E4 in Figure 6.9
demonstrated that the identified model could not properly predict the behavior under
these new experimental conditions, whereas very acceptable predictions were obtained for
E5. Therefore, the active (identifiable) parameter subset of θ was re-estimated. The
identifiable subset, i.e., the vector (θ(r6=4))T = (µmax, kmax,2, Km, K1,G)T, was newly
estimated for E1, E4, and E5 individually, such that the dimension of the corresponding
parameter estimation problem was Nθ = 4.
6.3. Results and discussion
[Figure 6.9: three panels (E1, E4, E5), each plotting concentration (g/L, 0 to 100) against time (h, 0 to 30). Legend: Cellobiose, Glucose, Ethanol (experimental); Ccb-PE E2&E3, CG-PE E2&E3, CEtOH-PE E2&E3 (predicted).]
Figure 6.9.: Cross-validation using parameter vectors obtained from parameter estimation with E2&E3: experimental vs. predicted concentrations for E1 (left panel), for E4 (middle panel), and for E5 (right panel). (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).
[Figure 6.10: three panels (E1, E4, E5), each plotting concentration (g/L, 0 to 100) against time (h, 0 to 30). Legend: Cellobiose, Glucose, Ethanol (experimental); Ccb-PE Np=4, CG-PE Np=4, CEtOH-PE Np=4 and Ccb-PE Np=14, CG-PE Np=14, CEtOH-PE Np=14 (predicted).]
Figure 6.10.: Validation of the identifiable parameter subset using parameter vectors obtained after solving parameter estimation problems with Nθ = 4 and Nθ = 14: experimental vs. predicted concentrations for E1 (left panel), for E4 (middle panel), and for E5 (right panel). (Figure taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).
The fitted model predictions for E1, E4, and E5 are shown as solid lines (Nθ = 4) in the
left, middle, and right panels of Figure 6.10, respectively. Comparing the results for E1
and E4 in the left and middle panels of Figure 6.10 with the cross-validation results in
the left and middle panels of Figure 6.9, large improvements can be observed, whereas
the fit for E5 in the right panel of Figure 6.10 remained almost the same (compare with
the right panel of Figure 6.9). These results are reflected by the cost functions for E1,
E4, and E5 of 25.4, 42.3, and 20.7, which correspond to reductions of 55%, 26%, and 8%
with respect to the cross-validation residuals, respectively. Finally, all parameters were
re-estimated (Nθ = 14) for E1, E4, and E5 individually. The dashed lines in Figure 6.10
show the model predictions for Nθ = 14, which fit the data better than the predictions
obtained with Nθ = 4. Their cost functions were 9.5, 35.9, and 9.3, corresponding to
reductions of 83%, 37%, and 59% with respect to the cross-validation residuals for E1,
E4, and E5, respectively.
Comparing the parameter estimations with Nθ = 14 with those with Nθ = 4, only limited
improvements in the data fitting are visible. These improvements have to be weighed
against a large increase in computation time due to the increased problem dimension.
Additionally, the problems with Nθ = 14 are highly ill-conditioned, with condition
numbers κ of 1.08 × 10^6, 3.10 × 10^5, and 1.17 × 10^7 for E1, E4, and E5, respectively
(similar to the first iteration step with k = 1 of the initial problem in Figure 6.5).
Table 6.3 contains the parameter vectors estimated for E1, E4, and E5. The columns
“E2&E3” display the estimated parameter vector θ, the columns “Nθ = 4” display the
parameter vector θ when the four identifiable parameters (elements of θ(r6)) are
re-estimated, and the columns “Nθ = 14” display the parameter vector θ when all
parameters are re-estimated. Underlined values refer to estimated parameters.
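The percentage reductions quoted above can be checked directly from the cost-function values listed in Table 6.3. The following is a minimal sketch of that arithmetic; the dictionary keys and the function name are illustrative choices, not names used in the original work.

```python
# Cost-function values copied from Table 6.3 (last row).
# "cross_validation" is the column labeled E2&E3 (no re-estimation).
cost = {
    "E1": {"cross_validation": 56.4, "N_theta=4": 25.4, "N_theta=14": 9.5},
    "E4": {"cross_validation": 57.1, "N_theta=4": 42.3, "N_theta=14": 35.9},
    "E5": {"cross_validation": 22.6, "N_theta=4": 20.7, "N_theta=14": 9.3},
}


def reduction_percent(reference, value):
    """Relative reduction of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


# Rounded reductions with respect to the cross-validation residuals.
reductions = {
    experiment: {
        case: round(reduction_percent(values["cross_validation"], values[case]))
        for case in ("N_theta=4", "N_theta=14")
    }
    for experiment, values in cost.items()
}

for experiment, result in reductions.items():
    print(experiment, result)
```

Rounded to whole percentages, the computed values agree with the reductions stated in the text.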
Table 6.3.: Estimated parameter vector using E1, E4, and E5 experimental data after solution of different PE problems. Columns labeled “E2&E3” contain the estimated parameters after finishing the iterative parameter estimation with structural analysis (see Figure 6.5). (Table taken from publication II - Lopez et al. (2013) in Appendix A.2 - reprinted from Biotechnology Progress with permission from American Institute of Chemical Engineers).
No. Parameter  E1                                  E4                                  E5
               E2&E3       Nθ=4        Nθ=14       E2&E3       Nθ=4        Nθ=14       E2&E3       Nθ=4        Nθ=14
1   k*max,1    52.84       52.84       3.00        52.84       52.84       58.64       52.84       52.84       4.68
2   kmax,2     4.56×10^-3  1.37×10^-2  2.11×10^-1  4.56×10^-3  7.81×10^-3  3.65×10^-1  4.56×10^-3  7.03×10^-3  2.78×10^-1
3   k*max,3    35.7        35.7        4.4         35.7        35.7        64.2        35.7        35.7        4.9
4   ms         2.29×10^-3  2.29×10^-3  7.79×10^-4  2.29×10^-3  2.29×10^-3  4.64×10^-1  2.29×10^-3  2.29×10^-3  2.10×10^-5
5   Yxg        0.047       0.047       0.124       0.047       0.047       0.079       0.047       0.047       0.250
6   µmax       1.74        0.92        2.47        1.74        2.29        2.58        1.74        1.58        7.88
7   KG         883         883         743         883         883         863         883         883         813
8   KL         1000        1000        167         1000        1000        293         1000        1000        501
9   K1,G       3.75        3.60        220.56      3.75        3.92        1.97        3.75        3.69        109
10  K1,EtOH    6.97        6.97        20.4        6.97        6.97        1.54        6.97        6.97        46.3
11  Kiy,EtOH   28.0        28.0        13.8        28.0        28.0        13.3        28.0        28.0        6.7
12  Km         8.85        25.80       962         8.85        15.5        823         8.85        14.0        875
13  K2,G       2.34        2.34        4.63        2.34        2.34        1.66        2.34        2.34        4.02
14  K2,EtOH    23.5        23.5        569.9       23.5        23.5        21.2        23.5        23.5        110
Cost function  56.4        25.4        9.5         57.1        42.3        35.9        22.6        20.7        9.3
6.3.6. Discussion of the results
The model used in this chapter describes the dynamic behavior of cellulose, glucose,
cellobiose, ethanol, and biomass. The cellulose hydrolysis model involves parameters and
constants related to the nature and dosage of the enzyme (e.g., eT, KD), the
enzyme-cellulosic substrate interaction (k*max,1, kmax,2, k*max,3, KL, Km), the
enzyme-glucose interaction (K1,G, K2,G), and the enzyme-ethanol interaction (K1,EtOH,
K2,EtOH). Moreover, the glucose fermentation model takes into account parameters
describing cell growth (µmax, KG), substrate consumption (Kiy,EtOH), and enzyme-yeast
interaction (Yxg, ms). Differences between the parameter values obtained here and the
values presented by other authors may be explained both by the presence of local minima
due to the nonlinearity of the model and by variations in the structural features of the
pretreated sugarcane bagasse, variations in the enzymatic mixture (i.e., the addition of
pure β-glucosidase to the enzymatic complex GC-220), and variations in the performance of
Saccharomyces cerevisiae under the specific conditions of the hydrolysis used in this
case study. The accuracy of the estimated parameters depends not only on the frequency
and accuracy of the measurements but also on the selection of the measured process
variables. Two key variables of the hydrolysis stage were measured (i.e., glucose and
cellobiose concentrations), whereas only one key variable of the fermentation stage was
measured (i.e., ethanol concentration). According to the experimental fittings, the model
describes the enzymatic hydrolysis behavior of the experimental data better than the
fermentative behavior. Consequently, in the fermentation stage, the residuals are larger
in the time region when the microorganisms were added; this may be explained by the
limited experimental information on the fermentation process (i.e., the biomass
concentration was not measured; such a measurement would contribute significantly to the
parameter identifiability and might be considered as an additional response variable in
future work). Therefore, the fermentation-related parameters were not identifiable and
could not be estimated properly.
Finally, besides the limited experimental information mentioned above, a more detailed
model of the fermentative and hydrolytic processes (e.g., including mass transfer
phenomena and additional inhibition factors not explored here) should be considered in
order to describe the real behavior of the process variables more accurately.
In this application, the fact that ten parameters were fixed might be explained not only by
parameter correlations (nonidentifiability) and weak effects on predicted variables (low
sensitivities) but also by the conditions shared by both experiments (E2 and E3) displayed
in Table 6.1. In contrast, the four identifiable parameters are related to the variability
of the predicted data for E2 and E3. Re-estimating them improved the fitting of model
predictions to new experimental conditions, reflecting the high sensitivity and
independence of this selected subset.
The identified model was the result of a model selection step in which different
literature sources and two different experimental data sets were analyzed. The prediction
capacity of this model for new experimental conditions was assessed in Section 6.3.5.
Differences between some of the model output predictions and the measured variables were
evident in Figure 6.9. These deviations were expected, taking into account that the
experimental conditions of the new experiments E1, E4, and E5 were in regions not covered
by the experiments used in the parameterization (see E2 and E3 in Table 6.1). However,
proper fittings were also shown in Figure 6.9, indicating that the identified model could
still predict new experimental conditions well (e.g., the fitting of experiment E5). In
Section 6.3.5, only limited improvements in data fitting were obtained when parameter
estimation problems with Nθ = 14 were solved instead of problems with Nθ = 4. Totally
different parameter vectors, a large increase in computation time, and ill-conditioned
matrices were the features of each PE with Nθ = 14. According to this evaluation, the
four identifiable parameters had the ability to adapt to new experimental conditions.
Therefore, only these parameters should be re-estimated for a new experiment, while the
others (unidentifiable parameters) are kept fixed at their previous estimates.
6.4. Summary and Conclusions
In this chapter, an SSF process model for bio-ethanol production from sugarcane bagasse
was identified and critically evaluated. A systematic identification methodology was
presented, including an iterative parameter estimation with structural analysis (i.e.,
ill-conditioning and identifiability) to determine the kinetic parameters of interest. The
strategy to estimate and analyze the parameters was based on the sensitivity method.
The model structure, the selection of the parameter initial guess, and the evaluation of
the estimator performance from a statistical point of view were also investigated.
SSF models proposed in literature by Refs. [35] and [97] are generic formulations for
the hydrolysis and fermentation process of any cellulose-enzyme-microorganism system.
However, the stated parameter values were identified for a specific cellulosic substrate,
enzymatic complex preparation, and microorganisms. Moreover, different phenomena were
studied independently in order to estimate individual parameters or parameter subsets.
The quality and concentration of the specific cellulosic substrate, enzyme dosage, and
mode of substrate-enzyme interactions play a dominant role for the performance of the
SSF process. Thus, it was necessary to evaluate the adequacy of literature models and
to adapt them (neglecting or adding terms) to the specific characteristics of the precise
cellulosic substrate-enzyme-microorganisms system (i.e., pretreated sugarcane bagasse /
GC-220 and β-glucosidase / Saccharomyces cerevisiae system).
The application of the framework in Chapter 4 proved efficient at detecting and dealing
with a strongly over-parameterized model. After successful termination, the number of
considered parameters was reduced to a relatively small subset of the original parameter
space in order to regularize the ill-posed problem. Thus, the most influential parameters
for the selected operating conditions were identified and their uncertainty was
significantly decreased.
The assessment of the identifiability of the original parameter vector (with fourteen
elements) revealed that a maximum of four model parameters were identifiable (i.e., µmax,
kmax,2, Km, K1,G) from the data used (experiments E2 and E3). The uncertainties in the
identified parameters, expressed by their relative standard deviations, decreased from
537% to 22%, from 319% to 28%, from 35% to 0.07%, and from 41% to 30% for µmax, kmax,2,
Km, and K1,G, respectively. These large reductions were also reflected in smaller
confidence intervals compared to the original parameter vector (e.g., the confidence
interval was reduced from −99.7 ≤ µmax ≤ 120 to 0.97 ≤ µmax ≤ 2.52).
Due to differences between the cellulosic substrate-enzyme-microorganism system used in
the literature and the one used here, the use of literature parameters as initial guesses
for the parameter estimation was not successful. Instead, a parameter data collection
plan for sampling different parameter guesses within a prescribed parameter range was
applied (i.e., MBLHD).
It has been shown by a cross-validation that the identified model is able to predict
different operating conditions well. The identified model properly fitted the data of two
different experimental data sets (E2 and E3) with some shared experimental conditions. A
cross-validation using three new experimental data sets (E1, E4, and E5) conducted in
different bioprocess laboratories (Federal University of Rio de Janeiro, Brazil and
University of Antioquia, Colombia) was also carried out. The identified model proved to
be able to predict the behavior under new experimental conditions with different initial
cellulose concentration, pre-hydrolysis time, GC-220 and β-glucosidase enzyme load, and
initial yeast concentration. The identifiable subset captured the data variability for
the new experimental conditions. Moreover, it has been shown that the identifiable
parameter subspace should also be used to adapt the model to different experimental
conditions. The corresponding parameter estimation problems could be solved efficiently,
as they were of reduced order and well-conditioned, while giving nearly the same
prediction error as if all parameters were considered.
7. More cases from bioprocessing: the effect
of ill-posed parameter estimation on
optimal experimental design
7.1. Abstract
Chapters 5 and 6 have already considered the estimation of parameters of ill-posed
problems.¹ They have also explained the detection and mitigation of their structural
problems, i.e., ill-conditioning and identifiability, by using several techniques. This
mitigation has been achieved by increasing the quantity and quality of the experimental
information and by a proper selection of the parameter initial guess. However, they have
neither dealt with numerical strategies to stabilize the solution nor analyzed the effect
of those strategies on optimal experimental design. To address this, this chapter
discusses different numerical approaches for handling nonlinear ill-posed problems within
the offline parameter estimation and optimal experimental design framework of Chapter 4.
Literature models of increasing complexity for three bioprocesses, namely a
semi-continuous (fed-batch) fermentation [2], a reconstruction of biochemical networks
[71], and a biological system for water treatment [16, 65], are used to illustrate the
effects of ill-posed problems on nonlinear parameter estimations and optimal experimental
designs. The combination of the ill-conditioning analysis by the sensitivity method (see
Section 4.4.1) and the local identifiability analysis by the SVD method (see Section
4.4.2) to diagnose identifiable parameters is detailed.
The case studies are classified as either rank-deficient or of ill-determined rank [52,
78]. A link is established between the measures of the presented analysis and the common
alphabetic design criteria in OED, and information for the solution of ill-posed OED is
derived. It is important to point out that this analysis in the context of OED had not
been available in the literature until the publication of the peer-reviewed article in
Ref. [78] (Publication III), on which this chapter is based. Moreover, regularization
techniques for handling nonlinear ill-posed PE problems are discussed from the point of
view of singular value analysis, namely orthogonal decomposition based techniques (i.e.,
SsS and TSVD) [19, 45, 52, 79, 83, 131] and the Tikhonov strategy [4, 48, 49, 52, 63,
118]. These techniques are then applied to ill-posed OED problems. To avoid the
computationally demanding bi-level optimization approach addressed by the authors in
Refs. [49, 59], only the variance contribution of the biased estimator is considered
[3, 38].
¹ The content of this chapter is reprinted (adapted) with permission from (D. C. Lopez C., T. Barz, S. Korkel, and G. Wozny. Nonlinear ill-posed problem analysis in model-based parameter estimation and experimental design. Computers & Chemical Engineering, 77:24-42, 2015). Copyright (2015) Elsevier. (Publication III in Appendix A.2)
Finally, this chapter includes two new sections containing material that has not been
published or used to prepare a peer-reviewed article. This content is marked as “(New)”.
The first new section (Section 7.3.5) shows the evolution of ill-conditioning after
adding new experimental information, demonstrating how informative experimental data
help to reduce ill-conditioning. The second new section (Section 7.3.6) presents Monte
Carlo studies carried out to test the efficacy of optimal designs computed from ill-posed
parameter estimations and regularized parameter estimations. The efficacy is measured in
terms of estimator performance and the new states of ill-conditioning and identifiability
of the parameter estimation.
7.2. Applications
The effect of ill-posed parameter estimations on optimal experimental designs is studied
for three different case studies E1, E2, and E3. These applications are derived from
dynamic systems in bioprocessing: E1 is a semi-continuous (fed-batch) fermentation
reactor [2], E2 is a biochemical growth model in a reactor [71], and E3 is a sequencing
batch reactor for water treatment [65]. The degree of complexity increases from E1 to E3
according to the number of parameters and the severity of ill-posedness of the PE. The
starting information (i.e., model, experimental design uIG, parameter initial guess θIG,
and true parameters θ∗) is taken from the literature when available or defined in this
study. For instance, convergence tests on the PE are used to select the values of θIG. A
brief problem overview for each case study is given in Table 7.1. Model details for E1,
E2, and E3 are described in Refs. [2], [71], and [65], respectively.
Table 7.1.: Problem description for case studies E1, E2 and E3. (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
Case studies
  Example E1: Fed batch fermentation (see Asprey and Macchietto, 2002); variants E1, E1a, E1b, E1c
  Example E2: Biochemical network (see Kremling et al., 2004)
  Example E3: Activated Sludge Model (ASM3) (see Kaelin et al., 2009)
State variables, x ∈ R^Nx
  E1, E1a, E1b, E1c: (x1, x2)
  E2: (V, B, S, M1, M2, M3)
  E3: (SO, SS, SNH, SNO2, SNO3, SN2, SALK, SI, XI, XH, XS, XSTO, XAOB, XNOB, XTSS)
Measured state variables, y ∈ R^Ny
  E1: (x1)
  E1a, E1b, E1c: (x1, x2)
  E2: (B, S, M1, M2, M3)
  E3: (SO, SS, SNH, SNO2, SNO3, SALK, XSTO)
Data sampling time grid, t ∈ R^Nm
  E1: (2:2:20)
  E1a: (2:2:20)
  E1b: (0.5:0.5:20)
  E1c: (0.25:0.25:20)
  E2: (2:2:60)
  E3: (0.002:0.002:0.2) a
Initial conditions, x(t0) = x0
  E1: (5.5, 0.1)
  E2: (1.0, 0.1008, 1.9134, 0.0620, 0.0079, 0.0749)
  E3: (0, 1000, 20, 20, 20, 0, 5, 0, 0, 200, 200, 50, 20, 20, 1616.7)
Input action variables, u ∈ R^Nu
  E1: (u1, u2)
  E2: (qin, qout, cin)
  E3: u1
Input action time grid, t ∈ R^Nmu
  E1: (2 14 20)
  E2: (20 30 60)
  E3: (0.02:0.02:0.2) a
Input action initial design, ∈ R^(Nu·Nmu)
  E1: u = (0.12, 0.12, 0.12); u = (15, 15, 15)
  E2: u = (0.25 0.35 0.35); u = (0.25 0.35 0.35); u = (2.0 2.0 0.5)
  E3: u = (0.05, 1.0, 0.05, 0.3, 0.5, 1.0, 0.7, 0.2, 0.05, 1.0)
Parameters, θ ∈ R^Np
  E1: θi, i = 1,…,4
  E2: θi, i = 1,…,10: (r1max, Ks, K2, KM1, r3max, KM2, Ksynmax, KIB, Yxs, KIA). In contrast to Kremling, 2004, where some parameter values are assumed to be exactly known to overcome identifiability issues, in this work all parameters are considered as unknown.
  E3: θi, i = 1,…,44: (iNSS, iNXI, iNXS, iNBM, fXI, YHO2, YHNO3, YHNO2, YSTOO2, YSTONO3, YSTONO2, YAOB, YNOB, KH, ksto, µH, µAOB, µNOB, bHO2, bSTOO2, bAOB, bNOB, ηHNO3, ηHNO2, ηHendNO3, ηHendNO2, ηNend, KX, KHO2, KHO2inh, KHSS, KHNH4, KHNO3, KHNO2, KHALK, KHSTO, KAOBO2, KNOBO2, KAOBNH4, KNOBNO2, KNALK, T, Kla20, SOSAT)
Parameter initial guess, θ ∈ R^Np
  E1: (0.5 0.5 0.5 0.5)
  E2: (7200, 0.133119, 1667700, 3.66, 900000, 3, 0.00246, 0.0498, 0.0000211, 300) a
  E3: (0.012, 0.016, 0.012, 0.028, 0.08, 0.32, 0.26, 0.26, 0.32, 0.28, 0.28, 0.072, 0.024, 3.6, 4.8, 1.2, 0.36, 0.26, 0.12, 0.12, 0.06, 0.088, 0.06, 0.06, 0.1, 0.14, 0.04, 0.4, 0.08, 0.08, 4.0, 0.004, 0.2, 0.2, 0.04, 0.04, 0.32, 0.32, 0.056, 0.112, 0.2, 8.0, 400, 41853) a
Differential Eq. System
  E1: DAE converted to ODE
  E2: DAE converted to ODE
  E3: DAE converted to ODE
a Defined by authors
7.3. Results and Discussion
What follows is a numerical analysis of the different nonlinear discrete ill-posed problems
E1, E2, E3 referenced in Section 7.2. The link between identifiability problems and ill-
conditioning is established and general problems concerning the solution of OED for ill-
posed PE using conventional alphabetic design criteria are discussed. The application of
different regularization techniques and the solution of the corresponding modified OED
problems are examined.
[Figure 7.1: flow diagram. Elements shown: starting information (case study E1 | E2 | E3, model, θ, initial design); ill-conditioning analysis by the sensitivity method and identifiability analysis by the QR method; regularization technique Reg (None | SsS | TSVD | Tikh); design criterion crit (A | D | E); solution of the OED (arg min of Ψ over u); optimal design; sensitivity matrix and parameter covariance matrix, each as original and regularized matrix, at the initial and at the optimal design.]
Figure 7.1.: Main procedure and nomenclature of Section 7.3. (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
The optimal design ucrit is obtained from the solution of the OED (see Eq. 2.35) for the
different design criteria crit = A, D, E according to Eqs. 2.36-2.38 and the different
regularization techniques Reg = None, SsS, TSVD, Tikh (see Table 3.1). The regularization
is carried out for the initial design uIG, and the SVs of the regularized sensitivity
matrix SReg(uIG) are computed. They are then compared with the SVs obtained from the
solution of the regularized OED, i.e., with those of the regularized sensitivity matrix
SReg(ucrit) evaluated at the optimal design ucrit. The ill-conditioning analysis is
conducted by using the sensitivity method in Section 4.4.1, whereas the identifiability
diagnosis is performed by using the three techniques described in Section 4.4.2, namely
the variance, SVD, and QR methods.
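To illustrate why small singular values of the sensitivity matrix matter for the alphabetic design criteria, the following minimal sketch evaluates A-, D-, and E-type criteria from a singular value spectrum. It assumes the common textbook definitions on the parameter covariance matrix C = (SᵀS)⁻¹ with unit measurement variance (A: trace, D: determinant, E: largest eigenvalue), so that the eigenvalues of C are 1/ςi². Eqs. 2.36-2.38 are not reproduced here and may use different scalings; both spectra below are hypothetical and are not taken from the case studies.

```python
import math


def alphabetic_criteria(singular_values):
    """A-, D-, and E-type criteria of C = (S^T S)^-1 from the singular values of S.

    Assumption: unit measurement variance, so the eigenvalues of C are 1 / sigma_i**2.
    """
    covariance_eigenvalues = [1.0 / s**2 for s in singular_values]
    return {
        "A": sum(covariance_eigenvalues),        # trace of C
        "D": math.prod(covariance_eigenvalues),  # determinant of C
        "E": max(covariance_eigenvalues),        # largest eigenvalue of C
    }


# Hypothetical spectra: identical except for the smallest singular value.
well_conditioned = [10.0, 5.0, 2.0, 1.0]
ill_conditioned = [10.0, 5.0, 2.0, 0.01]

well_crit = alphabetic_criteria(well_conditioned)
ill_crit = alphabetic_criteria(ill_conditioned)

print("well-conditioned:", well_crit)
print("ill-conditioned: ", ill_crit)
# Share of the A criterion caused by the smallest singular value alone.
print("E/A (ill-conditioned):", ill_crit["E"] / ill_crit["A"])
```

In the ill-conditioned case, the single small singular value accounts for almost the entire A and E criteria, which is the kind of dominance that the regularization techniques are meant to remove.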
The remainder of this section is organized according to the flow diagram depicted in
Figure 7.1. In Section 7.3.2, the original state of ill-conditioning and identifiability
for each case study E1, E2, E3 at the initial design uIG is established using the
sensitivity matrix S(uIG). In Section 7.3.3, the new state of conditioning and
identifiability after the application of OED without regularization is shown at the
optimal design ucrit using S(ucrit). In Section 7.3.4, the application of the different
regularization methods is discussed.
Two new unpublished sections are included at the end of this chapter. Section 7.3.5
addresses the effect of experimental data on the ill-conditioning of the sensitivity
matrix, whereas Section 7.3.6 presents the results of Monte Carlo studies that illustrate
the effect of using ill-posed and regularized optimal designs on new parameter
estimations.
7.3.1. Computational Issues
All numerical computations are performed on an Intel(R) Core(TM)2 (CPU 6600 @ 2.40 GHz)
computer with 4 GB RAM. Parallel programming is not used. Model and parameter sensitivity
equations are integrated by CVODES from SUNDIALS (Hindmarsh et al., 2005). Parameter
estimation and optimal experimental design problems are solved by using MATLAB Release
2013a (The MathWorks Inc., Natick, Massachusetts, United States). Parameter estimations
are solved by using the “lsqnonlin” (nonlinear least squares) function with the
trust-region-reflective algorithm, whereas regularized estimations by Tikhonov² and OED
problems are solved by using the “fmincon” (constrained nonlinear multivariable) function
with the interior-point algorithm. Singular values are computed with the “svd” function,
whereas the eigenvalue system is solved with the “eig” function.
7.3.2. Ill-conditioning and identifiability diagnosis
This section summarizes results from the ill-conditioning analysis for case studies E1, E2
and E3 at the initial design uIG using the sensitivity matrix S(uIG). The classification of
the ill-conditioning and the selection of singular values generating a well-posed problem
are described. The maximum values in Table 7.2 are used in order to determine the ϵ-
threshold and to define the numerical rank rϵ as explained in Section 4.4.1. Notice that
γmax(S) = γmax(S) · σ and κmax(S) = κmax(S). Moreover, in this section for the sake of
simplicity κ(S) and γ(S) are referred to as κ and γ, respectively.
E1 - Fed Batch Fermentation
² With regularized sensitivity matrix $S_{\mathrm{Tikh}} = \begin{bmatrix} S \\ \lambda^{2} L \end{bmatrix}$.
Ill-conditioning analysis. Figure 7.2a contains the singular value spectrum of S(uIG),
spanning from ς4 = 3.1912 × 10^-2 to ς1 = 1.2407 × 10^1. The condition number and
the collinearity index are κ = 3.8879 × 10^2 and γ = 3.1376 × 10^1, respectively, and this
problem is categorized as rank-deficient. Note that only the collinearity measure is
violated, namely γ > γmax, because the singular value ς4 is relatively close to zero;
thus S is ill-conditioned. The effect of including new experimental information on
ill-conditioning is analyzed further in Section 7.3.5.
The ϵ-threshold is defined by γmax, i.e., ϵ = ϵγ according to Eq. 3.3. As can be seen in
Figure 7.2a, ς4 does not fulfill this threshold (ς4 < ϵ); therefore, the numerical rank
is rϵ = 3.
Table 7.2.: Thresholds for condition number (κmax) and collinearity index (γmax). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
Problem   κmax   γmax
E1        1000   10
E2        1000   15
E3        1000   15
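The threshold check described for E1 can be sketched in a few lines. The definitions used here are assumptions, since Section 4.4.1 and Eq. 3.3 are not reproduced in this section: κ = ς1/ςmin, γ = 1/ςmin, ϵκ = ς1/κmax, ϵγ = 1/γmax, and ϵ = max(ϵκ, ϵγ). With these assumed definitions, κ matches the reported value and γ evaluates to about 31.3, close to (but not identical with) the reported 3.1376 × 10^1, which may reflect a scaling that is not captured here. The function and key names are illustrative.

```python
def diagnose(s_max, s_min, kappa_max, gamma_max):
    """Check the smallest singular value against the (assumed) epsilon-thresholds."""
    kappa = s_max / s_min            # condition number
    gamma = 1.0 / s_min              # collinearity index (assumed definition)
    eps_kappa = s_max / kappa_max    # lower bound on singular values implied by kappa_max
    eps_gamma = 1.0 / gamma_max      # lower bound on singular values implied by gamma_max
    eps = max(eps_kappa, eps_gamma)  # the stricter bound defines the threshold
    return {
        "kappa": kappa,
        "gamma": gamma,
        "eps": eps,
        "limiting": "gamma" if eps_gamma >= eps_kappa else "kappa",
        "kappa_ok": kappa <= kappa_max,
        "gamma_ok": gamma <= gamma_max,
        "s_min_ill_conditioned": s_min < eps,
    }


# Values for E1: singular values from the text, thresholds from Table 7.2.
result = diagnose(s_max=1.2407e1, s_min=3.1912e-2, kappa_max=1000.0, gamma_max=10.0)

for key, value in result.items():
    print(key, value)
```

For E1, this reproduces the qualitative statements of the text: the condition number stays below κmax, the collinearity index exceeds γmax, the threshold is set by γmax, and ς4 falls below ϵ.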
Identifiability analysis. When the identifiability is analyzed by the variance method in
Section 4.4.2, the conclusion is that all parameters should be classified as
unidentifiable because their variances are inflated and should be interpreted carefully.
The variances of all parameters are large, as can be seen in Table A.1 of Appendix A.3.
For instance, the variance of the most uncertain parameter θ2 is 74 times the variance of
the most precise parameter θ4, i.e., var(θ2) = 74 var(θ4).
According to the variance-decomposition in Table A.1 of Appendix A.3, obtained by
applying the SVD method in Section 4.4.2, the ill-conditioned singular value ς4
contributes more than 73% to the variance of every parameter. This contribution exceeds
the predefined threshold (πmax = 50%); consequently, this method also suggests that the
model is practically unidentifiable, with all parameters classified as unidentifiable.
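The idea of a variance-decomposition can be illustrated with a small sketch. It assumes a Belsley-type decomposition, in which the variance of parameter θk is proportional to the sum over j of v_kj²/ςj², where v_kj are the entries of the right singular vectors of S, and the proportion attributed to ςj is the corresponding term divided by that sum. Whether Section 4.4.2 uses exactly this form is an assumption; the matrix V and the singular values below are hypothetical and are not the E1 data.

```python
import math


def variance_decomposition(V, singular_values):
    """Return pi[k][j]: share of var(theta_k) attributed to singular value j.

    V[k][j] is the entry of the j-th right singular vector for parameter k.
    """
    proportions = []
    for row in V:
        phi = [v**2 / s**2 for v, s in zip(row, singular_values)]
        total = sum(phi)
        proportions.append([p / total for p in phi])
    return proportions


# Hypothetical two-parameter example: V is a rotation by 30 degrees (orthonormal),
# and the second singular value is small, i.e., ill-conditioned.
angle = math.radians(30.0)
V = [
    [math.cos(angle), -math.sin(angle)],
    [math.sin(angle), math.cos(angle)],
]
singular_values = [10.0, 0.1]
PI_MAX = 0.5  # threshold on the proportion, as in the text (50%)

pi = variance_decomposition(V, singular_values)
# Parameters whose variance is dominated by the smallest singular value.
flagged = [k for k, row in enumerate(pi) if row[-1] > PI_MAX]

print(pi)
print("flagged parameter indices:", flagged)
```

In this hypothetical case both parameters are flagged, which mirrors the situation described for E1, where one ill-conditioned singular value dominates the variance of every parameter.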
By using the results of the ill-conditioning analysis and applying the QR method for
identifiability in Section 4.4.2, only the parameter θ2 is found to be unidentifiable,
and the model is therefore considered practically unidentifiable.
E2 - Biochemical network
Ill-conditioning analysis. Figure 7.2b contains the singular value spectrum of S(uIG),
spanning from ς10 = 2.0383 × 10^-6 to ς1 = 6.4266 × 10^1, with κ = 3.1529 × 10^7 and
γ = 4.9061 × 10^5. With these large ill-conditioning and collinearity measures (see
Section 3.2.1), the matrix S(uIG) is categorized as ill-conditioned (κ > κmax and
γ > γmax), and thus this problem is a rank-deficient ill-posed problem. Notice that there
is a well-determined gap in the SVs of S(uIG) between the large (ς1-ς7) and small
(ς8-ς10) singular values, see Figure 7.2b. In this case, the condition γmax is the
limiting condition for the ϵ-threshold, i.e., ϵ = ϵγ (see Eq. 3.3), and only the first
seven singular values are well-conditioned because ςi ≥ ϵ for i = 1, ..., 7.
[Figure 7.2: three log-scale plots of the singular values ςi against the singular value index i: (a) i = 1 to 4, (b) i = 1 to 10, (c) i = 1 to 44. Each panel marks the lower bounds ϵκ (from κ) and ϵγ (from γ) and separates well-conditioned from ill-conditioned ςi.]
Figure 7.2.: Singular value spectrum (SVs) of the sensitivity matrix evaluated at the initial design S(uIG) for problem (a) E1, (b) E2 and (c) E3. Each singular value less than the ϵ-threshold is considered ill-conditioned. (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
Identifiability analysis. According to the parameter variances obtained by the variance
method in Section 4.4.2, the most precise parameter is θ9 with σ_9^2 = 0.15, which is the
only parameter with a variance less than 1 in this case study, and the most imprecise
parameter is θ10 with σ_10^2 = 2.4066 × 10^11 (see Table A.2 of Appendix A.3 for more
variance details). Furthermore, other parameters, i.e., θ5 and θ6, also have variances
larger than 2.6 × 10^7. With these inflated variances any parameter might be considered
identifiable regardless of the value of the variance-threshold ρ.
The effect of the ill-conditioned singular values can be better observed with the SVD
method in Section 4.4.2, which is based on the variance decomposition. Therein, the ill-conditioned
singular values ς8, · · · , ς10 contribute more than 50% to the variance of all parameters
except θ4 and θ9. Thus, these eight parameters are preliminarily selected as practically
unidentifiable.
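The variance-decomposition screening described above can be sketched as follows. This is a minimal illustration, not the implementation of Section 4.4.2: it assumes the usual decomposition in which the variance of θj is proportional to the sum over k of V[j,k]²/ςk² (V holding the right singular vectors of S), and the function names, the small test matrix and the threshold value are hypothetical.

```python
import numpy as np

def variance_decomposition(S):
    """Return singular values and proportions pi[j, k] (parameter j, singular value k)."""
    _, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # var(theta_j) is proportional to sum_k V[j, k]**2 / sv[k]**2
    phi = (Vt.T ** 2) / sv ** 2
    return sv, phi / phi.sum(axis=1, keepdims=True)

def flag_unidentifiable(S, eps, pi_max=0.5):
    """Indices of parameters whose variance is dominated by ill-conditioned singular values."""
    sv, pi = variance_decomposition(S)
    ill = sv < eps                      # ill-conditioned part of the spectrum
    share = pi[:, ill].sum(axis=1)      # variance share coming from that part
    return np.flatnonzero(share > pi_max)

# Hypothetical 4 x 3 sensitivity matrix: columns 0 and 2 are nearly collinear.
S_demo = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1e-6],
                   [0.0, 0.0, 0.0]])
print(flag_unidentifiable(S_demo, eps=1e-3))
```

In this toy case the two nearly collinear parameters are flagged, while the parameter with an independent sensitivity column is not.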
A further analysis is performed using the QR method in Section 4.4.2, which is the
basis of the SsS regularization (see Section 3.3.1). This method finds seven practically
identifiable parameters because rϵ = 7 for S(uIG). The identifiable parameters are θ1,
θ3, θ5, θ9, θ7, θ4, θ2. In all analyses the model is considered practically unidentifiable.
E3 - ASM3
Ill-conditioning analysis. Figure 7.2c contains the singular value spectrum of S(uIG),
spanning from ς44 = 2.8427 × 10⁻¹⁶ to ς1 = 4.3461. Compared to E1 and E2, problem
E3 shows the largest condition number and collinearity index, with κ = 1.5289 × 10¹⁶
and γ = 3.5178 × 10¹⁵. The singular values gradually decay to zero, which indicates an
ill-determined rank problem. Hence, it has the highest severity of ill-posedness. Only the
first seven singular values (i.e., ςi for i = 1, · · · , 7) correspond to a well-conditioned matrix
(rϵ = 7). The selection of these well-conditioned singular values is based on ϵ = ϵγ.
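As a rough sketch of how these diagnostics follow from a spectrum: the values quoted for E3 are consistent with κ = ς1/ςNθ and γ = 1/ςNθ. The threshold definitions ϵκ = ς1/κmax and ϵγ = 1/γmax used below are an assumption (Eq. 3.3 and Table 7.2 are not reproduced here), and the bounds kappa_max = 1000 and gamma_max = 15 are purely illustrative.

```python
import numpy as np

def conditioning_summary(sv, kappa_max, gamma_max):
    """Condition number, collinearity index, epsilon-threshold and numerical rank."""
    sv = np.sort(np.asarray(sv, dtype=float))[::-1]   # descending order
    kappa = sv[0] / sv[-1]          # condition number
    gamma = 1.0 / sv[-1]            # collinearity index
    eps_kappa = sv[0] / kappa_max   # assumed form: depends on the largest singular value
    eps_gamma = 1.0 / gamma_max     # assumed form: constant for all spectra
    eps = max(eps_kappa, eps_gamma)
    r_eps = int(np.sum(sv >= eps))  # number of well-conditioned singular values
    return kappa, gamma, eps, r_eps

# Only the two end points of the E3 spectrum are taken from the text;
# the middle value is a hypothetical filler.
kappa, gamma, eps, r_eps = conditioning_summary(
    [4.3461, 1e-3, 2.8427e-16], kappa_max=1000.0, gamma_max=15.0)
print(f"kappa = {kappa:.4e}, gamma = {gamma:.4e}, eps = {eps:.3e}, r_eps = {r_eps}")
```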
Identifiability analysis. As expected, this model is categorized as practically unidentifiable
by all techniques of Section 4.4.2. The variance of the most precise parameter is
σ₁₄² = 1.23 × 10⁶ (see Table A.3 of Appendix A.3). All parameter variances are highly
influenced³ by the ill-conditioned singular values and therefore none of them is reliable.
Moreover, the parameter variances are inflated, masking the individual identifiability of the
parameters and making a preliminary parameter selection impossible. Thus, a more elaborate
analysis is applied to determine the identifiable parameters according to the QR method
in Section 4.4.2. With this method seven parameters, θ23, θ44, θ15, θ12, θ16, θ14 and θ8, are
considered identifiable. Further effects of this severe ill-posedness on computations of OED
criteria without regularization are discussed in Section 7.3.3.
As in E1 (Section 7.3.2), the presence of small singular values in E3 (defined by
ϵ = ϵγ) strongly reduces the number of well-conditioned singular values and is an indication
of near-linear dependencies rather than of numerical instabilities (see Section 3.2.1).
Moreover, if those small singular values are considered well-conditioned, the serious
consequences are overestimation of parameter variances, erroneous uncertainty quantification
and improper results of the identifiability analysis.
³ In Table A.3 of Appendix A.3, 21 parameters have individual variance-decomposition proportions greater than πmax = 0.5 corresponding to the most ill-conditioned singular values π40, · · · , π44. Nonetheless, the complete set of ill-conditioned singular values π8, · · · , π44 contributes almost totally to the variance of the 44 parameters.
7.3.3. Optimal design without regularization
In this section the results of the optimal design without regularization are discussed. For
all problems E1-E3, the application of the A-, D- and E-criteria yields the expected
characteristic changes in the spectrum according to Section 2.7.2. Generally, the A- and E-criteria
focus on the lower part of the SVs, whereas the D-design mostly raises the top section
of the spectrum. The values in Table 7.2 are used to define the
ϵ-threshold.
E1 - Fed Batch Fermentation
For E1, the A- and E-designs reach almost the same optimal experimental solution and thus almost
the same SVs (see Figure 7.3a), which show a lift of the bottom section. This indicates
a variance reduction for all parameters because of the increase of the smallest singular
values (i.e., ς4(S(uA)) = 1.6735 × 10⁻¹ and ς4(S(uE)) = 1.6775 × 10⁻¹). For instance, the
variance of the most uncertain parameter θ2 is substantially reduced with respect to the
most precise parameter θ4 (i.e., σ₂² = 8.6 σ₄²). The A- and E-designs also yield the spectra with the
smallest collinearity index, γ(S(uA)) = 5.9756 and γ(S(uE)) = 5.9611, respectively. The
ill-conditioned singular values are completely removed from the problem according to the
ϵ-threshold (see in Figure 7.3a that neither the spectrum of S(uA) nor that of S(uE) is intersected
by the bound ϵ). In contrast, for the D-design the ill-conditioning remains, with only three
singular values being well-conditioned (see Figure 7.3b). Nevertheless, the lift of the SVs
observed in this figure (especially in the bottom section) promotes parameter variance
reduction. All parameters show a variance reduction of at least 50% (e.g., the most precise
parameter θ1 shows an 88% variance reduction). Unfortunately, the most uncertain parameter
θ2 retains a large variance (σ₂² = 43 σ₁²) and is still considered unidentifiable.
E2 - Biochemical network
Figures 7.4a-b show the change in the SVs of the A- and E-optimal designs for E2, respectively.
In Figure 7.4a it can be observed that the A-design generates a lift of the whole spectrum,
although the main elevation is seen in the bottom section. This indicates a variance
reduction for all parameters. For the E-design in Figure 7.4b only the bottom section of the
spectrum is lifted. Indeed, the A-design yields the best results in terms of condition number
and collinearity index, with κ(S(uA)) = 3.0198 × 10⁶ and γ(S(uA)) = 1.7995 × 10⁴, which
correspond to ς10 = 5.5570 × 10⁻⁵ and ς1 = 1.6781 × 10². However, those values still indicate
ill-conditioning of S(uA), as κ > κmax and γ > γmax, and thus there exist unidentifiable
parameters. The number of well-conditioned singular values remains seven (ς1 − ς7), the
same as for the initial design. For the A-design, κmax is the limiting condition of the
ϵ-threshold (see Eq. 3.3), i.e., ϵ = ϵκ. In Figure 7.4a, the A-criterion (average variance,
see Eq. 2.36) is reduced from ΨA(uIG) = 2.4071 × 10¹⁰ to ΨA(uA) = 3.2393 × 10⁷. For
the A-design, the most precise parameter θ1 has a variance of 2.6323 × 10⁻³ whilst the
Figure 7.3.: Change in the singular value spectrum (SVs) of the sensitivity matrix for the OED without regularization. Results are shown for the sensitivity matrix at initial design S(uIG) and at optimal A-design S(uA) and E-design (S(uE)) for problem E1. (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
[Plot: singular value ςi (log scale) versus singular value index i; panel (a) shows the spectra None-IG, None-A and None-E, panel (b) the spectra None-IG and None-D, each with the lower bound ϵ = ϵγ.]
most imprecise parameter θ10 has 3.2383 × 10⁸. Moreover, parameters θ5 and θ6 also have
enormous variances larger than 5 × 10⁴. For θ3, θ4, θ5, θ6, θ7, θ8 and θ10 the ill-conditioned
singular values ς8 − ς10 contribute more than 65% of their variance.
For the E-optimal design the results are not better. This is mainly because of the considerable
(numerical) difficulty in minimizing the large value of the design criterion generated by
the nearness to zero of the smallest singular values. Note that ΨE(uIG) = 1/ς₁₀² = 2.4069 × 10¹¹.
Accordingly, the E-criterion could not be reduced further than ΨE(uE) = 3.1409 × 10⁹,
and the SVs of S(uE) vary from ς10 = 1.7843 × 10⁻⁵ to ς1 = 6.7548 × 10¹, still indicating
an ill-conditioned matrix S(uE). With this design eight well-conditioned singular values
are found. After the optimal design, the most precise parameter θ1 has a variance of
1.0304 × 10⁻¹ whilst the most imprecise parameter θ10 has 3.1407 × 10⁹ according to Eq.
3.5. Moreover, parameters θ5 and θ6 also have enormous variances larger than 1 × 10⁵. For
θ5, θ6 and θ10 the ill-conditioned singular values ς9 − ς10 contribute more than 99.9%
of their variance. In general, there is an improvement in parameter variance because of
the increase of the small singular values. However, the E-design also does not produce
a well-posed problem, i.e., a problem free of identifiability problems.
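The criterion values quoted above can be checked directly from the singular values. The sketch below assumes the forms suggested by the text, namely ΨE = 1/ςNθ² and ΨA as the average of 1/ςi² (the "average variance"); Eqs. 2.36 and 2.38 themselves are not reproduced here, so the function names and exact scaling are assumptions. The D-criterion is left out because its definition is not given in this section.

```python
import numpy as np

def a_criterion(sv):
    """Assumed A-criterion: average of the variances 1/sv_i**2."""
    sv = np.asarray(sv, dtype=float)
    return np.mean(1.0 / sv ** 2)

def e_criterion(sv):
    """Assumed E-criterion: largest variance, 1/sv_min**2."""
    sv = np.asarray(sv, dtype=float)
    return 1.0 / np.min(sv) ** 2

# Smallest singular value of S(uIG) for E2 as quoted in the text;
# the two larger values are hypothetical fillers.
sv_demo = [6.0, 1.0, 2.0383e-6]
print(f"Psi_E = {e_criterion(sv_demo):.4e}")   # close to the quoted 2.4069e11
print(f"Psi_A = {a_criterion(sv_demo):.4e}")   # dominated by the smallest singular value
```

Because both criteria are dominated by 1/ςNθ², a single singular value near zero is enough to inflate them by many orders of magnitude.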
Figure 7.4.: Change in the singular value spectrum (SVs) of the sensitivity matrix for the OED without regularization. Results are shown for the sensitivity matrix at initial design S(uIG) and at optimal design for problem E2 with: (a) A-criterion S(uA), and (b) E-criterion S(uE). Note that the shown lower bound ϵ is computed for S(ucrit). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
[Plot: singular value ςi (log scale) versus singular value index i; panel (a) shows the spectra None-IG and None-A, panel (b) the spectra None-IG and None-E, each with the lower bound ϵ = ϵκ.]
E3 - ASM3
The severe ill-posedness of problem E3 (see Section 7.3.2) makes a reliable computation of an
OED impossible. The computation of the eigensystem of F(uIG) and C(uIG) does not
give reasonable results. For F(uIG) one negative eigenvalue is obtained, whereas eight
negative eigenvalues are found for C(uIG). The reasons are numerical errors (related to
the machine precision) originating from matrix operations (matrix product in Eq. 2.10 and
inversion in Eq. 4.3) and from the solution of the ill-conditioned matrix eigensystem. The
computed negative eigenvalues are clearly inconsistent with the positive semi-definiteness
of the matrices F and C.
Still, while a D-criterion value cannot be computed because of an error in the applied
algorithm, A- and E-criterion values can be obtained. To illustrate the magnitude of
ill-conditioned OED criteria, the singular values ςi for i = 1, · · · , Nθ of the sensitivity matrix
are used to compute the A- and E-criteria according to Eqs. 2.36 and 2.38, respectively. Those
values read ΨA(uIG) = 2.8 × 10²⁹ and ΨE(uIG) = 1.2 × 10³¹, respectively. It has to be noted
that the same criteria have values of ΨA(uIG) = 1.8 × 10¹⁶ and ΨE(uIG) = 8.8 × 10¹⁷ when
the eigenvalues ςi for i = 1, · · · , Nθ of C(uIG) are taken into consideration. In general,
criteria computed from the singular values of S are by definition more reliable. They
also capture better the high singularity of the matrix C (e.g., presence of singular values
near the machine precision, i.e., 10 × 10⁻¹⁶). It is important to note that under these
circumstances a gradient-based minimization is impossible. Any results that are eventually
computed are not reliable, and it is not advisable to draw conclusions about the identifiability of
parameters for this kind of severely ill-conditioned problem.
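The loss of accuracy when working with F = SᵀS instead of S can be reproduced on a small synthetic example. The sketch below does not use the ASM3 model: it builds a random matrix with a prescribed, hypothetical spectrum and compares the smallest singular value from the SVD of S with the smallest eigenvalue of the explicitly formed product.

```python
import numpy as np

rng = np.random.default_rng(0)
n_theta = 8
true_sv = np.logspace(0, -10, n_theta)            # prescribed spectrum, smallest value 1e-10

# Random orthonormal factors; S = U diag(true_sv) V^T has the prescribed singular values.
U, _ = np.linalg.qr(rng.standard_normal((20, n_theta)))
V, _ = np.linalg.qr(rng.standard_normal((n_theta, n_theta)))
S = U @ np.diag(true_sv) @ V.T

sv = np.linalg.svd(S, compute_uv=False)           # route 1: singular values of S
eig_F = np.linalg.eigvalsh(S.T @ S)               # route 2: eigenvalues of F (ascending)

print("smallest singular value of S :", sv[-1])   # close to 1e-10
print("smallest eigenvalue of F     :", eig_F[0]) # should be 1e-20, but is rounding noise
print("E-criterion from S           :", 1.0 / sv[-1] ** 2)
```

The exact smallest eigenvalue of F is 10⁻²⁰, far below the rounding level of the matrix product, so the value returned by the eigensolver carries no information and may even be negative. This is the mechanism the text refers to; the magnitudes in this toy case are not those of E3.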
To summarize, for all studied problems the ill-conditioning of S cannot be overcome by
solving the optimal design problem without regularization. Although the ill-conditioning
is sometimes improved (reduction of the parameter variance), in none of the problems is this
sufficient to guarantee a stable PE and practical identifiability.
7.3.4. Optimal design with regularization
In this section the solution of the regularized OED problem is discussed. The common feature
of all regularization techniques is the generation of a new, better-conditioned sensitivity
matrix at the initial design SReg(uIG), Reg = SsS, TSVD, Tikh, i.e., the ill-conditioning
is partially or completely removed depending on the selected value of the regularization
parameter. Details about the effect of the regularization parameter λ in Tikhonov will
be given in Section 7.3.4. The difference between the original SVs of the sensitivity matrix
without regularization, i.e., SNone(uIG), and the new regularized sensitivity matrix
SReg(uIG) is shown. Notice that ill-conditioned singular values are removed by either
parameter space reduction (Reg=SsS), singular value truncation (Reg=TSVD) or addition
of a well-conditioned matrix (Reg=Tikh). For the determination of the well-conditioned
singular values of S (via the numerical rank rϵ, see Section 2.2.1) the thresholds in Table
7.2 are used.
Subset Selection (Reg=SsS)
For the SsS (see Section 3.3.1) the following features are observed:
1. For all studied examples E1-3, at the initial design uIG, only a subset of the parameter
space (determined by the rank of S(uIG)) is identifiable. The new reduced problems
(i.e., regularized problems) promote a well-conditioned sensitivity matrix SSsS(uIG),
that means all singular values in its spectrum are well-conditioned. Thus, even for
the strongly ill-conditioned problem E3, the OED could be successfully computed
for all considered design criteria.
For problems E2 and E3, the SVs of the regularized matrices SSsS(uIG) and
SSsS(ucrit) are shown in Figures 7.5a and 7.5b, respectively. For totally linearly
independent parameters each singular value of S would match exactly one parameter.
As this is not the case, each singular value is related to several parameters. Accordingly,
the SVs of the regularized matrix SSsS(uIG) is not identical to the SVs of the
original matrix S(uIG). Moreover, it can be seen in Figures 7.5a and 7.5b that
those singular values which belong to the reduced matrix SSsS take equal or smaller
Figure 7.5.: Singular value spectrum (SVs) of the original and the regularized sensitivity matrix after applying Reg=SsS, S and SSsS, respectively. Results are shown for the initial uIG and optimal ucrit experimental designs for problem: (a) E2 where ucrit = uA, and (b) E3 where ucrit = uE. The solid-black curve shows the SVs of the original sensitivity matrix without regularization at initial design S(uIG). The black-cross and gray-cross markers show the SVs of the regularized (reduced) matrix at initial and optimal designs SSsS(uIG) and SSsS(ucrit), respectively. Note that the lower bound ϵκ is computed for SSsS(ucrit). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
[Plot: singular value ςi (log scale) versus singular value index i; panel (a) shows E2 with the lower bound ϵ = ϵκ, panel (b) shows a section (indices 1 to 20) of the complete SVs from ς1 to ς44 for E3 with the lower bound ϵ = ϵγ.]
values than those coming from the original matrix S (interlacing inequality for sin-
gular values [117, 24]). Nevertheless, the tendency (for rank-deficient problems) to
conserve the largest singular values associated with the well-conditioned parameters
is evident (see Figure 7.5a for Problem E2).
2. The regularized OED improves only the active parameter subset and thus, only the
associated singular values. In Figure 7.5 it can be seen how the reduced SVs is lifted
(shifted to higher values) from SSsS(uIG) to SSsS(ucrit).
Despite the improvements in the ill-conditioning of the reduced matrix SSsS(ucrit),
the ill-conditioning of the original matrix S(ucrit) is not always improved by the optimal
design; e.g., for problem E1 (data not shown here), all optimal designs generate
SVs with a higher condition number and collinearity index. For the D-design the
following values are obtained: κ(S(uD)) = 8.4 × 10² > κ(S(uIG)) and γ(S(uD)) =
3.6 × 10¹ > γ(S(uIG)).
For example E2, at the A-optimal design the original matrix S(uA) continues to be
ill-conditioned with γ(S(uA)) = 1.9 × 10⁴ > γmax. Nevertheless, the increase in
ς1 and ς10 of S(uA) (Figure 7.5a) produces a reduced condition number κ(S(uA)) =
3.2 × 10⁶ < κ(S(uIG)) and also smaller parameter variances, e.g., σ₅² = 5.19 × 10⁴,
σ₆² = 5.26 × 10⁴ and σ₁₀² = 3.2383 × 10⁸. But still κ(S(uA)) > κmax and the rank
of the problem is still seven. Accordingly, the ill-conditioning is improved but the
problem still remains ill-posed.
3. In some cases A and E-optimal designs could improve the problem rank and attract
new parameters to the identifiable region.
This is shown for problem E3 with the E-design in Figure 7.5b, where the problem
rank improves from 7 to 9. The improvement of the SVs of SSsS (from uIG to uE)
moves ς8 and ς9 into the feasible region defined by ϵ = ϵγ. The condition number
and collinearity index are slightly reduced: κ(S(uE)) = 1.4365 × 10¹⁶ < κ(S(uIG)) =
1.5289 × 10¹⁶ and γ(S(uE)) = 2.3338 × 10¹⁵ < γ(S(uIG)) = 3.5178 × 10¹⁵. However,
the matrix S(uE) conserves its severe ill-conditioning. The same increase in the
active parameter subset dimension is observed for the A-optimal design.
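The parameter space reduction and the interlacing property mentioned in the list above can be illustrated with a small sketch. It is not the SsS algorithm of Section 3.3.1: the greedy column pivoting on the leading right singular vectors is a simplified stand-in for a pivoted QR factorization, and the function name, test matrix and threshold are hypothetical.

```python
import numpy as np

def subset_selection(S, eps):
    """Pick r_eps columns (parameters) of S by greedy column pivoting."""
    _, sv, Vt = np.linalg.svd(S, full_matrices=False)
    r = int(np.sum(sv >= eps))                 # numerical rank r_eps
    A = Vt[:r, :].copy()                       # leading right singular vectors (as rows)
    available = np.ones(A.shape[1], dtype=bool)
    selected = []
    for _ in range(r):
        norms = np.where(available, np.linalg.norm(A, axis=0), -1.0)
        j = int(np.argmax(norms))              # most independent remaining parameter
        selected.append(j)
        available[j] = False
        q = A[:, j] / np.linalg.norm(A[:, j])
        A -= np.outer(q, q @ A)                # remove the direction already covered
    return sorted(selected), sv

# Hypothetical sensitivity matrix: columns 0 and 2 are nearly collinear.
S_demo = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1e-6],
                   [0.0, 0.0, 0.0]])
active, sv_full = subset_selection(S_demo, eps=1e-3)
sv_reduced = np.linalg.svd(S_demo[:, active], compute_uv=False)
print("active parameters:", active)
print("original SVs     :", sv_full)
print("reduced SVs      :", sv_reduced)
```

The reduced matrix keeps only one of the two nearly collinear parameters, and each of its singular values is no larger than the corresponding singular value of the original matrix, which is the interlacing behavior visible in Figure 7.5.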
Truncated Singular Value Decomposition (Reg=TSVD)
For the TSVD (see Section 3.3.2) the following features were observed:
1. In contrast to the indirect manipulation of the SVs in SsS, in TSVD the SVs is
directly manipulated (truncated) and the remaining singular values do not change. The
truncated spectrum is then used to construct (approximate) a new matrix STSVD.
2. The TSVD produces a well-conditioned but rank-deficient matrix STSVD(uIG). The
covariance matrix CTSVD(uIG) (see Table 3.1) is then also rank-deficient and thus
the D-optimal criterion cannot be computed. Hence, only A- and E-designs were
studied for TSVD.
3. In Figures 7.6a-b, the SVs of the A- and E-designs for problems E2 and E3,
respectively, are depicted. From the comparison of Figures 7.5 and 7.6 it can be
observed that very similar results were obtained for both TSVD- and SsS-based
optimal designs. This is true for the problem rank (number of identifiable parameters/
singular values) as well as for the influence of the different design criteria (see also
Figure 2.2). However, both the computed design criterion and the optimal design
vector differ in their values according to the regularization technique applied. This
is certainly due to the differences in the number of considered parameters (whole
parameter space for TSVD and reduced space for SsS) and the differences in the
values of the initial SVs (compare Figures 7.5 and 7.6). It should also be
noted that the design criteria in Eqs. 2.36-2.38 are scaled differently, with Nθ and rϵ
for TSVD and SsS, respectively.
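The first two features in the list above can be seen in a few lines. The following is a minimal sketch under the assumption that STSVD is rebuilt from the rϵ leading singular triplets; the function name and the toy matrix are hypothetical, and the regularized covariance of Table 3.1 is not reproduced.

```python
import numpy as np

def tsvd_matrix(S, eps):
    """Rebuild S from its well-conditioned singular triplets only."""
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    r = int(np.sum(sv >= eps))                    # numerical rank r_eps
    S_tsvd = (U[:, :r] * sv[:r]) @ Vt[:r, :]      # truncated reconstruction
    return S_tsvd, r

# Hypothetical sensitivity matrix with one ill-conditioned singular value.
S_demo = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1e-6],
                   [0.0, 0.0, 0.0]])
S_tsvd, r = tsvd_matrix(S_demo, eps=1e-3)
sv_old = np.linalg.svd(S_demo, compute_uv=False)
sv_new = np.linalg.svd(S_tsvd, compute_uv=False)
print("kept singular values   :", sv_new[:r])     # unchanged with respect to S
print("discarded (now ~ zero) :", sv_new[r:])
print("det of S_tsvd^T S_tsvd :", np.linalg.det(S_tsvd.T @ S_tsvd))
```

The retained singular values are unchanged, while the full-size information matrix built from the truncated matrix is singular. This is why a determinant-based criterion is not usable in this setting.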
Figure 7.6.: Singular value spectrum (SVs) of the original and the regularized sensitivity matrix after applying Reg=TSVD, S and STSVD, respectively. Results are shown at initial uIG and optimal ucrit experimental designs, respectively, for problem: (a) E2 where ucrit = uA, and (b) E3 where ucrit = uE. The solid-black curve shows the SVs of the original sensitivity matrix without regularization at initial design S(uIG). The black-cross and gray-cross markers show the SVs of the regularized (approximated) matrix at the initial and optimized designs STSVD(uIG) and STSVD(ucrit), respectively. Note that the lower bound ϵκ is computed for STSVD(ucrit). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
[Plot: singular value ςi (log scale) versus singular value index i; panel (a) shows E2 with the lower bound ϵ = ϵκ, panel (b) shows a section (indices 1 to 20) of the complete SVs from ς1 to ς44 for E3 with the lower bound ϵ = ϵγ.]
Tikhonov Regularization (Reg=Tikh)
In this section the effect of weak and strong regularization, controlled by the regularization
parameter λ (see Eq. 3.10) in Tikhonov regularization (see Section 3.3.3), is addressed.
The weak regularization is defined by λ₁ = 0.001 and the strong regularization
by λ₂ = 0.1. The new features of the regularized sensitivity matrix⁴ STikh are also
described. After applying this regularization, the following features are observed:
⁴ STikh = [S; λ²L], i.e., S stacked on top of λ²L.
1. This regularization yields an increase (lift) of the SVs of the matrix S(uIG) at the
initial design while maintaining the original parameter space dimension and the number
of singular values. For the stronger regularization parameter (i.e., λ₂) the
transformation of the SVs is also stronger (compare Figures 7.7a and b). The lift of the SVs
transforms the bottom section of the SVs by fixing the small
singular values to λ². Accordingly, singular values smaller than the respective
λ² are replaced by λ², i.e., ςi ≈ λ² for all i with ςi < λ². However, the
larger singular values may also be shifted.
For Problem E2 with λ₁, the effect of the regularization is very small and the SVs
of STikh_λ₁ remains almost identical to that shown in Figure 7.2b. This may be expected,
taking into account that the smallest singular value of S, ς10 = 2.0383 × 10⁻⁶, is
larger than λ₁² = 1.0 × 10⁻⁶. Nevertheless, a reduction of 10% in the condition
number and collinearity index is achieved. On the other hand, choosing a stronger
regularization with λ₂, the last three singular values of S are increased to values
around λ₂² = 1.0 × 10⁻² in STikh_λ₂(uIG) (see Figure 7.7). Therefore, κ(STikh_λ₂(uIG))
and γ(STikh_λ₂(uIG)) are drastically decreased to 6.4 × 10³ and 1.0 × 10², respectively.
Despite this improvement, the number of well-conditioned singular values is seven
(because ςi ≈ λ₂² < ϵ = 6.67 × 10⁻² for i = 8, · · · , 10), demonstrating that the problem
still remains ill-posed.
For Problem E3 with λ₁, the smallest nine singular values (i.e., ςi ≈ λ₁² for
i = 36, · · · , 44) are increased (see Figure 7.7a). However, the regularization is not
strong enough. Hence the matrix CTikh_1 still has three negative eigenvalues. For the
stronger regularization with λ₂, the smallest thirty-five singular values (i.e., ςi ≈ λ₂²
for i = 10, · · · , 44) are increased (see Figure 7.7b). The new regularized matrix
CTikh_2 has positive eigenvalues and the new condition number and collinearity index
are κ(STikh_λ₂(uIG)) = 4.3461 × 10² and γ(STikh_λ₂(uIG)) = 1.0 × 10². Again the
improvement of the ill-conditioning is not enough to turn the problem into a well-conditioned
one (the same seven well-conditioned singular values); however, its transformation
is notable. It must be noted that values of λ² larger than ϵ, which completely
remove the ill-conditioning, might be selected. Nonetheless, those values strongly
affect the problem fitting, producing large residuals. Thus they are not considered in
this study.
2. For problems E1 and E2, A- and D-optimal designs are successfully computed, see
e.g. Figure 7.8. Moreover, the influences of the different criteria are as described in
Section 2.5, see e.g. the results for E2 for the A-design in Figure 7.8a.
For Problem E3 with λ₁, the D-design could not be computed as its criterion is
infinite at uIG. In contrast, for the stronger regularization with λ₂ the D-design could
be computed and the criterion is reduced from ΨD(uIG) = 2.2 × 10³ to ΨD(uD) =
1.6 × 10³.
3. Generally, the computed E-designs are not always reliable, especially for strong
regularization with λ₂. Here the smallest singular values of STikh(ucrit) are repeatedly
corrected in each optimization iteration (i.e., ςi ≈ λ² for all i with ςi < λ²). Accordingly, if
the optimal design does not promote a SVs with a smallest singular value ςNθ larger
than λ², this singular value is always approximated by λ². This situation generates
an almost constant E-criterion value, equal to 1/(λ²)², and makes the optimization
Figure 7.7.: Singular value spectrum (SVs) of the original and the regularized sensitivity matrix after applying Reg=Tikh, S and STikh, respectively, for problem E3 with: (a) λ₁ = 0.001 (weak regularization), and (b) λ₂ = 0.1 (strong regularization). The solid-black curve shows the SVs of the original sensitivity matrix without regularization at the initial design S(uIG). The black-cross markers show the SVs of the regularized matrix at the initial design STikh(uIG). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
[Plot, panels (a) and (b): singular value ςi (log scale) versus singular value index i (1 to 44); the lower bound ϵ = ϵγ separates the well-conditioned from the ill-conditioned singular values.]
unreliable.
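The lifting of the small singular values described in the list above follows from the stacked form of STikh. As a minimal sketch, assuming L is the identity matrix (the L actually used is defined in Section 3.3.3, which is not reproduced here), the singular values of [S; λ²L] are sqrt(ςi² + λ⁴), so values well below λ² are replaced by approximately λ² while values well above it are essentially unchanged. The function name and the two-parameter test matrix are hypothetical.

```python
import numpy as np

def tikhonov_matrix(S, lam, L=None):
    """Stack S on top of lam**2 * L (L defaults to the identity, an assumption)."""
    n_theta = S.shape[1]
    L = np.eye(n_theta) if L is None else L
    return np.vstack([S, lam ** 2 * L])

# Toy matrix whose smallest singular value equals the one quoted for E2.
S_demo = np.diag([1.0, 2.0383e-6])
sv = np.linalg.svd(S_demo, compute_uv=False)

for lam in (0.001, 0.1):                           # weak and strong regularization
    sv_tikh = np.linalg.svd(tikhonov_matrix(S_demo, lam), compute_uv=False)
    print(f"lambda = {lam}: smallest SV {sv[-1]:.4e} -> {sv_tikh[-1]:.4e}, "
          f"E-criterion {1.0 / sv_tikh[-1] ** 2:.3e}")

sv_weak = np.linalg.svd(tikhonov_matrix(S_demo, 0.001), compute_uv=False)
sv_strong = np.linalg.svd(tikhonov_matrix(S_demo, 0.1), compute_uv=False)
```

With λ₁ = 0.001 the smallest singular value grows by roughly 11%, which matches the reported reduction of about 10% in κ and γ for E2. With λ₂ = 0.1 it is pinned near λ₂² = 10⁻², so the E-criterion saturates near 1/(λ₂²)² = 10⁴ whatever the design does to the original small singular value.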
7.3.5. Influence of the available measurement information on ill-posedness
(New)
This section summarizes unpublished results about the influence of the available
measurement information on the conditioning of the sensitivity matrix (as a measure of the
ill-posedness of the PE) for problem E1. For this purpose, a simple adaptation of E1 by adding
more experimental data is considered. In the following, the original sensitivity matrix S
is considered as the base case. The problem formulation is changed and three different
experimental data sets are considered, i.e., E1a, E1b and E1c (see Table 7.1). Firstly, the
state variable x2 is considered as an additional measured variable, so that the new experimental
data vector reads ym = (x1, x2)ᵀ. Secondly, the sampling times are changed to increase
the number of experimental points to (2 : 2 : 20), (0.5 : 0.5 : 20), and (0.25 : 0.25 : 20),
with a total number of experimental points Ny · Nm · Ne equal to 20, 80 and 160 for problems
E1a (case reported in Asprey and Macchietto, 2000), E1b and E1c, respectively.
The SVs of S evaluated at uIG for problems E1, E1a, E1b and E1c are shown in Figure 7.9.
It can be seen that, by adding more measurement information, the SVs reaches larger values;
in particular, the singular value ςNθ is shifted to higher values. The ϵ-threshold for E1 and
E1a is ϵ = max{ϵκ, ϵγ} = ϵγ, whereas for E1b and E1c the ϵ-threshold is ϵ = ϵκ. Note,
Figure 7.8.: Singular value spectrum (SVs) of the original and the regularized sensitivity matrix after applying Reg=Tikh, S and STikh, respectively. Results are shown at initial and optimal experimental design, uIG and ucrit, respectively, for problem E2 with: (a) ucrit = uA and (b) ucrit = uE. The solid-black curve shows the SVs of the original sensitivity matrix without regularization at the initial design S(uIG). The black-cross and gray-cross markers show the SVs of the regularized matrix at initial design STikh(uIG) and at optimal design STikh(ucrit), respectively. Note that the lower bound ϵκ is computed for the SVs of STikh(ucrit). (Figure taken from publication III - Lopez et al. (2015) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier).
[Plot: singular value ςi (log scale) versus singular value index i; spectra None-IG and Tikh (λ = 0.1) at the initial design together with the A-optimal design (lower bound ϵ = ϵγ) and the E-optimal design (lower bound ϵ = ϵκ).]
Figure 7.9.: Comparison between the singular value spectrum (SVs) of the sensitivity matrix S at the initial design uIG for example E1 with different experimental data sets. E1a-c are adaptations of E1 including x1 and x2 as measurable variables and increasing the number of experimental points to 20, 80 and 160, respectively.
[Plot: singular value ςi (log scale) versus singular value index i (1 to 4) for E1, E1a, E1b and E1c, with the bounds ϵγ, ϵκ|E1, ϵκ|E1a and ϵκ|E1c.]
that ϵκ is a function of ς1. Therefore it changes for each spectrum, whilst ϵγ is constant
for all cases. The SVs for E1a, E1b and E1c lie above the bounds ϵγ and ϵκ, which
indicates well-conditioned problems with reduced condition number κ(S) and collinearity
index γ(S). As a result, E1a, b and c are well-posed. It can also be observed in Figure 7.9
that the influence of the additional measured variable x2 in E1a-c is larger than
the influence of an increase in sampling points, because it changes the slope (condition
number) of the spectrum of E1a. Nevertheless, the addition of more experimental points
(although only generating a slight parallel lift of the spectra) also reduces the parameter
variance of problems E1b and E1c.
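That additional measurements can only raise the spectrum can be checked on a generic example: appending rows to S adds a positive semi-definite term to SᵀS, so no singular value can decrease. The sketch below uses random matrices as hypothetical stand-ins for the sensitivities; it does not reproduce the E1 model or the data sets E1a-c.

```python
import numpy as np

rng = np.random.default_rng(1)
n_theta = 4
S_base = rng.standard_normal((10, n_theta))    # stand-in for the original design
S_extra = rng.standard_normal((30, n_theta))   # stand-in for additional sampling points
S_more = np.vstack([S_base, S_extra])          # more measurement information

sv_base = np.linalg.svd(S_base, compute_uv=False)
sv_more = np.linalg.svd(S_more, compute_uv=False)
print("base spectrum    :", sv_base)
print("extended spectrum:", sv_more)
print("collinearity index:", 1.0 / sv_base[-1], "->", 1.0 / sv_more[-1])
```

Whether the lift is large enough to pass the ϵ-threshold depends on how much independent information the new rows carry, which is consistent with the larger effect reported for the additional measured variable x2.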
Figure 7.10.: Comparison between the singular value spectrum (SVs) of the sensitivity matrix S(ucrit) at the optimal design ucrit with crit=A,D,E for the well-posed case E1c (no regularization). Spectrum labeled S is evaluated at the initial design uIG.
[Plot: singular value ςi (log scale) versus singular value index i (1 to 4) for E1c-IG, E1c-A, E1c-D and E1c-E, with the bounds ϵγ, ϵκ|E1c-IG, ϵκ|E1c-A/E and ϵκ|E1c-D.]
Finally, the A-, D- and E-optimal designs for E1 and its adaptations E1a-c are computed.
Note that the singular value spectra in Figure 7.9 correspond to the initial guesses of the
corresponding OED problems. In Figure 7.10, the spectra are shown for S(ucrit) after
performing A-, D- and E-optimal designs for the adaptation E1c. For all adaptations E1a-c,
the A-, D- and E-optimal designs promote well-conditioned sensitivity matrices S(ucrit) with
reduced collinearity indices, i.e., γ(S(ucrit)) < γ(S), despite an increase in the condition
number that does not exceed κmax, i.e., κ(S) < κ(S(ucrit)) < κmax. In contrast, the
ill-conditioned case E1 retained the ill-conditioning of the sensitivity matrix after conducting
the D-design (data not shown here).
Moreover, for all evaluated problems E1, E1a, E1b and E1c, similar effects were observed
regarding the SVs of S(ucrit) when applying the different design criteria. According to
Figure 2.2, the A- and E-designs lifted the bottom section of the SVs. However, the top
section was also elevated for E1a-c. Generally, A- and E-designs produced almost identical
SVs (see, for instance, Figure 7.10 for E1c). They also encouraged SVs with the smallest
collinearity index. In all D-optimal designs the largest increase was observed for the top
section of the SVs, see, for instance, Figure 7.10 for E1c. The D-optimal SVs had the
largest condition number κ (without exceeding κmax). For instance, the biggest increase
in κ was computed for the D-optimal design for E1c as 148%.
7.3.6. Monte Carlo study (New)
This section is intended to demonstrate numerically the weaknesses of optimal designs coming
from ill-posed parameter estimations. All results shown here are obtained by applying
the Monte Carlo method (second version) described in Section 4.2.2 to problem E2.
The sets of synthetic data are sampled from N(Ym, Cy), the number of replications is 5000
(L = 5000) and the assumed standard deviation of the measurement error is σy² = 0.01. With
these results the estimator performance in terms of precision (Section 4.3.1) and accuracy
(Section 4.3.2) as well as ill-conditioning by Monte Carlo (Section 4.4.1) are presented.
Moreover, an identifiability analysis is conducted using the variance method in Section
4.4.2.
Study of the initial design
This section statistically investigates the estimator behavior computed from parameter
estimations using perturbed experimental data generated with the experimental conditions
of the initial design uIG without optimization, u = uIG. Figure 7.11a summarizes the
statistical results of the (normalized) parameter estimates in the form of box plots. The
cost function (residual) norm and the true parameter values are also shown therein.
Estimator analysis The performance of the estimator Θj of the parameter θj for j =
1, · · · , Nθ is analyzed as follows. In this scenario the estimators of the parameters θ1, θ2, θ7
and θ9 are quite precise, with relative standard deviations⁵ of less than 8%, and accurate, with
relative biases⁶ of less than 1% (see the comparatively short box plots in Figure 7.11a). However,
parameters θ5, θ6, θ8 and θ10 have comparatively tall box plots with mean values (solid
black points in Figure 7.11a) far away from the true values. That means those parameters
are the most uncertain in the system. Concretely, in terms of accuracy, θ5, θ6, θ8 and
θ10 have large relative biases of 60%, 61%, 283% and 2777%, respectively, whereas in terms
of precision the same parameters have relative standard deviations of 99%, 99%,
903% and 197%, respectively. Finally, in order to consider the effects of variance and bias
of the estimator Θ simultaneously, the empirical MSE is computed. The value of the MSE
is 5.75 × 10^4, with a variance contribution of 85% and a bias contribution of 15%.
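The performance measures used in this analysis (relative standard deviation with respect to the mean, relative bias with respect to the true value, and the split of the empirical MSE into variance and bias contributions) can be sketched for a single scalar estimator as follows. This is only a minimal illustration: the toy estimator (a sample mean of noisy replicates), the seed and all names are assumptions and are unrelated to problem E2.

```python
import random
import statistics


def estimator_metrics(estimates, true_value):
    """Relative std (w.r.t. the mean), relative bias (w.r.t. the truth), MSE split."""
    mean = statistics.fmean(estimates)
    variance = statistics.pvariance(estimates, mu=mean)
    bias = mean - true_value
    mse = variance + bias ** 2  # empirical MSE = variance + squared bias
    return {
        "rel_std": variance ** 0.5 / abs(mean),
        "rel_bias": abs(bias) / abs(true_value),
        "mse": mse,
        "variance_share": variance / mse,
        "bias_share": bias ** 2 / mse,
    }


# Hypothetical toy problem: estimate theta from n_obs noisy readings y = theta + e.
rng = random.Random(0)
theta_true, sigma, n_obs, n_rep = 2.0, 0.1, 10, 5000
estimates = [
    statistics.fmean([theta_true + rng.gauss(0.0, sigma) for _ in range(n_obs)])
    for _ in range(n_rep)
]
metrics = estimator_metrics(estimates, theta_true)
print(metrics)
```

For a parameter vector the same quantities would be evaluated per parameter; how the individual contributions are aggregated into one MSE value is not specified here and is left open in the sketch.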
Ill-conditioning and identifiability diagnosis Using the results in Figure 7.11a and Section
7.3.6, it is easy to conclude that the parameter estimation of E2 at uIG is unstable due
to the large variability in the estimators of some parameters, i.e., θ5, θ6, θ8 and θ10. This
confirms the findings obtained with the sensitivity method in Section 7.3.2. However, although
these four parameters reflect the ill-conditioning (being unidentifiable), they are not its
only consequence. Other parameters such as θ3 and θ4 also have large relative standard
deviations of 57% and 60%, respectively. If a maximum acceptable parameter variability of
20% is considered, those parameters are also unidentifiable. In total six parameters are
practically unidentifiable and the model is therefore unidentifiable. It is important to
point out that the sensitivity method for ill-conditioning and the QR method for
identifiability give fairly good qualitative results compared to the Monte Carlo studies, but
due to the effect of inflated parameter variance their quantitative results are not reliable.
⁵ with respect to the mean of the parameter distribution
⁶ with respect to the true value
[Figure 7.11: box plots, panels (a) and (b); x-axis labels: r1max, Ks, K2, KM1, r3max, KM2, Ksynmax, KIB, Yxs, KIA, ResNorm; legend: θ1 to θ10, CF norm, mean value, true value; y-axis: parameter values on a logarithmic scale from 10^-2 to 10^4]
Figure 7.11.: Monte Carlo problem E2: Box plots of normalized parameter estimates and corresponding cost function norm obtained at (a) initial design uIG and (b) E-optimal design uE without regularization.
Study of the optimal design without regularization
This section statistically investigates the estimator behavior computed from parameter
estimations using perturbed experimental data generated with the experimental conditions
of the E-optimal design uE without regularization, u = uE. Figure 7.11b summarizes the
statistical results of the (normalized) parameter estimates in the form of box plots. The
cost function (residual) norm and the true parameter values are also shown there.
Estimator analysis The experimental conditions of the E-optimal design promote estimates
with reduced parameter variance for the most ill-conditioned parameters θ5, θ6, θ8 and
θ10, and not only for the most imprecise parameter θ10, see Figure 7.11b. This is expected
from the explanation in Section 2.7.2. The reduced relative standard deviations of
parameters θ5, θ6, θ8 and θ10 are now 47%, 46%, 39% and 69%, respectively. This seems to
be appropriate behavior for an optimal design.
However, the cost is a loss of precision in parameters such as θ1, θ2, θ7 and θ9, which had
less variability in the scenario before conducting the optimal design (see Section 7.3.6).
The new relative standard deviations of θ1, θ2, θ7 and θ9 are 12%, 41%, 14%, and 15%,
respectively. Moreover, the bias of these parameters is larger than before, reaching up to
7% (compared to the maximum of 1% with uIG). The most affected parameter in this design
was θ4, whose relative standard deviation increased to 187% and whose accuracy deteriorated
to a relative bias of 106%.
On the other hand, the MSE of this scenario is 2.61 × 10^3, with a variance contribution
of 40% and a bias contribution of 60%. This MSE reduction is basically generated by
the parameter variance control of the most uncertain parameters θ8 and θ10, which also
encourages an improvement in their accuracy. However, it is important to note that the
large variability of θ8 and θ10 at uIG is caused by their low sensitivities. That
means those parameters did not have enough influence on the measured outputs at the
initial design. This situation is not changed by the E-design and those parameters
again rank last in terms of sensitivity. It does not make sense to increase the precision of
parameters which by definition do not have enough effect in the model at the expense
of those which may really be identified with the available data. This is
a consequence of conducting optimal experimental designs under ill-posedness.
Ill-conditioning and identifiability diagnosis According to the Monte Carlo results in Figure
7.11b, this scenario still preserves the ill-conditioning, although it is significantly improved
after the E-optimal design. As mentioned immediately above, the parameter variability
has been reduced; however, the reduction is not sufficient to classify the problem as
identifiable (the variability of every parameter should be less than 20%). The parameters which
do not fulfill the maximum acceptable parameter variability of 20% are θ3, θ4, θ5, θ6, θ8 and θ10.
Consequently, they are categorized as unstable and also practically unidentifiable.
Study of the optimal design with regularization
In this section the regularized A-optimal design computed by using subset selection
(Reg=SsS) in OED (results are shown in Figure 7.5a) is implemented in two Monte Carlo
studies:
• Study I: Monte Carlo with parameter estimations without regularization, i.e.,
Reg=None but with experimental conditions of the A-optimal design when Reg=SsS.
• Study II: Monte Carlo with parameter estimations with regularization, i.e., Reg=SsS
but with experimental conditions of the A-optimal design when Reg=SsS.
Figure 7.12 summarizes the results of the (normalized) parameter estimates and the cost
function (residual) norm in the form of box plots. The true parameter values are also shown.
It will be shown that optimal designs which do not completely overcome the
ill-conditioning issues of the original problem do not perform well in parameter estimations
without regularization. However, if the parameter estimations are again regularized, the
parameter variance control imposed by fixing the ill-conditioned parameters does not
affect the most independent parameters of the system. Special attention should be paid to
those parameters showing correlated behavior, at least to monitor their new variance, which
should not be larger than in the scenario with the initial design.
[Figure 7.12: box plots, panels (a) and (b); x-axis labels: θ1 to θ10, ResNorm; legend: θ1 to θ10, CF norm, mean value, true value; y-axis: parameter values on a logarithmic scale from 10^-2 to 10^2]
Figure 7.12.: Monte Carlo problem E2: Box plots of normalized parameter estimates and corresponding cost function norm obtained at regularized A-optimal design uA with Reg=SsS by solving parameter estimations with (a) Reg=None (Study I) and (b) Reg=SsS (Study II).
Estimator analysis Study I yields estimators with large parameter variance and bias.
This time six parameters (θ3, θ4, θ5, θ6, θ8 and θ10) have a parameter variance larger than
85%, whereas in the scenario with the experimental conditions without optimization,
i.e., at uIG, only four parameters exceeded this percentage. The only parameters
with a slight parameter variance reduction are θ8 and θ10, with relative standard deviations
of 752% and 86%, respectively. Regarding bias, the only parameter showing an improvement is
θ10, with a new relative bias of 1870% (which is not promising). The other parameters retain
an accuracy similar to the case with the initial design, although θ8 is more inaccurate
(relative bias of 516%). The new value of the MSE is 3.12 × 10^4, with a variance contribution
of 87% and a bias contribution of 13%. This overall performance measure indicates that
the new optimal design does not have any effect on the still ill-conditioned parameter
estimations. Therefore, the effort invested in computing and implementing this kind of optimal
experimental design is meaningless if the new estimations are not numerically stabilized
by regularization.
Study II illustrates this new scenario. The results in Figure 7.12b display the
complete parameter variance control achieved by the regularization. The formerly problematic
parameters (due to large variance) θ5, θ6, θ8 and θ10 at the initial design (see Section
7.3.6) are in this case the most precise parameters (except θ8, which has a relative standard
deviation of 23%) due to their fixation by regularization. Parameter θ10 is fixed at its
initial guess in all replications. Nonetheless, parameters θ1 and θ2 become more imprecise,
possibly due to remaining ill-conditioning not detected by the ϵ-threshold. In terms of accuracy,
the relative bias of all parameters is reduced except for θ1 and θ2, with 3% and 5%. The
apparently positive effect of the regularization by SsS on parameter bias might be related to
the low sensitivity and (possibly) high correlation of the selected unidentifiable parameters.
The combination of these factors makes it possible to fix the problematic parameters without
introducing extra bias in the remaining parameters. In other cases, a selection of
the parameters to be fixed that does not consider these aspects might lead to highly biased
estimations.
Ill-conditioning and identifiability diagnosis According to the Monte Carlo results in Figure
7.12a, Study I completely preserves the ill-conditioning without any significant improvement
after using the A-optimal design. Eight parameters (θ1, θ2, θ3, θ4, θ5, θ6, θ8 and θ10)
are considered unstable and unidentifiable, exceeding the predefined relative standard
deviation limit of 20%. In contrast, the ill-conditioning in the regularized Monte Carlo
study (Study II) seems to be reduced. However, parameters θ1, θ2, θ3, θ4 and θ8 are considered
slightly unstable and still unidentifiable because they have relative standard deviations
between 23% and 41%.
7.3.7. Summary and conclusions
In model-based parameter estimation and experimental design the formulation of well-
posed problems is crucial for the numerical computation of stable and unique solutions.
Thus, it is strongly advisable to perform the relatively simple local analysis of the ill-
conditioning of the sensitivity matrix (S). Moreover, if an ill-posed problem is identified,
its problem type, either rank deficient or of ill-determined rank, and problem severity
should be assessed. Important indicators suggested in this contribution are the singular
value spectrum, condition number, and collinearity index of the sensitivity matrix.
There exists a relationship between those singular values in the SVs of S and commonly
used metrics for parameter identifiability and OED, namely the eigenvalues of F (FIM)
and the eigenvalues of the parameter covariance matrix (C). This information is partic-
ularly useful when dealing with ill-posed experimental design problems, as it is highly
recommendable to do the numerical analysis and implementation based on the singular
values of S. Additionally, a graphical interpretation of the influence of the alphabetic ex-
perimental design criteria applied to the SVs has been presented. It turns out that A-
and E-optimal criteria mainly improve the smallest singular values of S while D-optimal
criterion improves the largest singular values. Thus, the potential of an experimental de-
sign for improving the parameter precision can be analyzed similarly to the well-known
graphical interpretation of the influence of the alphabetic design criteria applied to the
parameter variances stored in C. However, for ill-posed problems with ill-conditioned S
and consequently ill-conditioned F and C, the direct application of the alphabetic design
criteria may lead to numerical instabilities, especially in the eigensystem of F and mean-
ingless designs for the next parameter estimation. Whereas the computation of the SVs is
numerically stable also for ill-posed problems, small singular values of S (especially those
near zero) have a large influence on the design criterion evaluated by F (FIM) and C. They
can produce huge criterion values, which then complicate or even impede an appropriate
optimization. Thus, a control of the smallest singular values of S is needed (e.g., maximiz-
ing the smallest singular values) and by this, a control of the most uncertain parameters
(those related to the largest eigenvalues of C).
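The link between the singular values of S and the alphabetic criteria can be made concrete with a small sketch. It assumes F = SᵀS for a suitably weighted S and C = F⁻¹, so that the eigenvalues of F are the squared singular values and the eigenvalues of C their reciprocals; the spectra and function names are hypothetical and only serve to illustrate how lifting the bottom or the top of the SVs changes the criteria and the condition number.

```python
import math


def design_criteria(singular_values):
    """A-, D- and E-criteria of C = F^-1, expressed in the singular values of S."""
    eig_f = [s ** 2 for s in singular_values]  # eigenvalues of F (assumed F = S^T S)
    eig_c = [1.0 / lam for lam in eig_f]       # eigenvalues of C = F^-1
    return {
        "A": sum(eig_c),        # trace of C
        "D": math.prod(eig_c),  # determinant of C
        "E": max(eig_c),        # largest eigenvalue of C
        "kappa": max(singular_values) / min(singular_values),
    }


# Hypothetical spectrum with one singular value close to zero.
base = design_criteria([10.0, 1.0, 1e-3])
# Lifting the bottom of the spectrum (the effect attributed to A- and E-designs).
lifted_bottom = design_criteria([10.0, 1.0, 1e-2])
# Lifting the top of the spectrum (the effect attributed to D-designs).
lifted_top = design_criteria([100.0, 1.0, 1e-3])

for name, crit in [("base", base), ("bottom", lifted_bottom), ("top", lifted_top)]:
    print(name, crit)
```

In this toy spectrum both modifications reduce the D-criterion by the same factor, but only lifting the bottom reduces the E-criterion and the condition number; lifting the top leaves the E-criterion unchanged and increases κ.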
As a solution to this, different regularization techniques, namely SsS, TSVD and
Tikhonov, were presented together with details on their implementation. To illustrate
their application to parameter identification and OED problems, three different case stud-
ies of increasing complexity and ill-posedness were considered. The following particular
features, strengths and weaknesses of each technique were identified.
Each regularization technique implies a transformation of the original problem. Hence,
the regularized optimal design certainly improves the ill-conditioning of the regularized
problem but not necessarily the ill-conditioning of the original one. Moreover, the methods
ensure that a solution is obtained but provide no information on whether the obtained solution
is still useful in the original context.
Particular attention should be paid to ill-determined rank problems. Because of the
nearness to zero of the singular values, a strong regularization is needed. Unfortunately,
this pushes the regularized problem away from the original problem and increases the bias
in the solution.
Each regularization technique modifies the problem to improve its ill-conditioning,
which is evidenced in the SVs of the matrix S of the regularized problem.
In SsS, the problem is transformed by excluding the unidentifiable parameters and by this
the related (ill-conditioned) singular values. In TSVD the ill-conditioned singular values
are substituted by zero and the new problem only considers singular values with large
magnitude in the original problem. Finally, the effect of Tikhonov regularization is to fix the
ill-conditioned singular values to the magnitude of the squared regularization parameter
(λ²), whilst the large singular values are not largely modified.
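The truncation described for TSVD can be sketched on a list of singular values. The sketch assumes that the ϵ-threshold is applied to the ratio of each singular value to the largest one; this convention, the spectrum and the function names are illustrative assumptions and may differ from the implementation used in this work.

```python
def tsvd_spectrum(singular_values, eps):
    """Replace by zero every singular value whose ratio to the largest is below eps."""
    s_max = max(singular_values)
    return [s if s / s_max >= eps else 0.0 for s in singular_values]


def effective_condition_number(spectrum):
    """Condition number computed from the retained (nonzero) singular values only."""
    kept = [s for s in spectrum if s > 0.0]
    return max(kept) / min(kept)


# Hypothetical ill-conditioned spectrum of a sensitivity matrix.
svs = [50.0, 4.0, 0.3, 2e-4, 1e-7]
truncated = tsvd_spectrum(svs, eps=1e-3)

print(truncated)
print(effective_condition_number(svs), effective_condition_number(truncated))
```

The number of retained singular values plays the role of the dimension of the regularized problem; a different threshold changes this number abruptly, which is the source of the discontinuity in the design criterion discussed in this summary.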
The selection of the regularization parameter, i.e., ϵ-threshold for SsS and TSVD, and
the scalar parameter λ in Tikhonov, is problem dependent and should be done carefully.
Firstly, the ill-conditioning of the problem should be improved and, secondly, the problem
should not be completely changed. For instance, in Tikhonov, the regularization parameter
λ should improve the ill-conditioning in the sensitivity matrix (which should also control
the solution norm of the PE) without greatly affecting the essence of the original problem
(controlling the norm of the residual in the PE). In that sense, the value of this parameter
(according to the construction of the regularized sensitivity matrix) should be as large as
the first singular value which is considered to cause the ill-conditioning.
When applying Tikhonov regularization, a computed E-optimal design is not viable
if this new design does not promote SVs with the smallest singular value larger than
λ². Here, the regularization parameter continues fixing the small singular values (i.e.,
ς_Nθ ≈ λ²) and makes an optimization impossible (fixing the E-criterion value to 1/(λ²)²
in each iteration).
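The stalling of the E-criterion can be illustrated with a deliberately simplified model. The sketch below mimics the described behavior by holding every singular value below λ² at λ² (a simple floor); this floor is an assumption made for illustration only and is not the construction of the regularized sensitivity matrix used in this work. The spectra are hypothetical.

```python
import math


def e_criterion_with_floor(singular_values, lam):
    """E-criterion 1/s_min^2 after holding small singular values at lam**2.

    The floor max(s, lam**2) is an illustrative stand-in for the effect of
    Tikhonov regularization on the small singular values.
    """
    floor = lam ** 2
    s_min = min(max(s, floor) for s in singular_values)
    return 1.0 / s_min ** 2


lam = 0.1  # hypothetical regularization parameter, so lam**2 = 0.01
# Hypothetical spectra visited by successive design iterations.
iterates = [
    [5.0, 1.0, 1e-4],
    [6.0, 1.2, 1e-3],
    [6.0, 1.5, 5e-3],  # smallest value still below lam**2: criterion unchanged
    [6.0, 1.5, 5e-2],  # smallest value above lam**2: criterion finally drops
]
values = [e_criterion_with_floor(s, lam) for s in iterates]
for spectrum, value in zip(iterates, values):
    print(spectrum, value)
```

As long as the smallest singular value stays below λ², the criterion returns the same value for every candidate design, so an optimizer receives no information on how to improve it.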
For SsS and TSVD there exists the tendency to conserve the largest singular values which
are associated to the well-conditioned parameters. From an applications point of view, the
SsS seems the most natural approach, as the regularization acts in the original parameter
space. It preserves the physical meaning of parameters and provides useful information on
the number of identifiable parameters as well as on the ranking of parameters regarding
their linear independence and sensitivity.
However, it has to be noted, that a change in the dimension of the identifiable parameter
subset (in SsS) or the number of well-conditioned singular values (in TSVD) introduces
a discontinuity in the evaluation of the design criterion. This problem is similar to the
inherent non-differentiability in the definition of the E-optimal criterion, where a possible
switching in the smallest eigenvalue introduces this behavior. Generally it might be
worthwhile to note that an increment in the SVs (promoted by the OED) does not
automatically mean a reduction in the ill-posedness of the problem. On the contrary, large
values in the top section of the SVs (without an adequate increase in the bottom section)
may lead to a higher condition number, which then gives rise to ill-estimated parameters of
an unstable PE. This behavior was observed here especially for the D-design.
Notice that no improvement in identifiability is expected by introducing regularization.
It only enables the reliable estimation of the most identifiable parameters. To improve
identifiability, the experimental data should be increased in quantity and quality. New
observable variables could be included, and the frequency and precision of the current
measurements could also be improved.
8. Chromatography system: Overcoming
deficiency of scarce experimental data in
online estimation
8.1. Abstract
¹ The application of OED techniques has proved to be an important strategy for model
selection and parameter precision improvement. Once the model structure is defined (e.g.,
by modeler heuristics, using model discrimination based on OED, etc.), the next task is to
recover the model parameters from the available experimental data by using parameter
estimation techniques. When the first parameter vector is estimated and its estimator
performance is assessed, the next interest is to improve or refine its quality. The quality is
called accuracy if the estimator is unbiased, otherwise it is called precision. The sequential
redesign of experiments aims at reducing the uncertainty of the estimated vector and
in nonlinear models it compensates uncertainties arising, for instance from parameter
initial guesses. In the context of dynamic processes the potential of exploiting the new
experimental information as soon as it is available has recently attracted the attention of
different research groups. In that sense, the online redesign of experiments for improving
parameter precision plays an important role. Several applications of the online redesign
to nonlinear differential equations can be found in literature [39, 43, 62, 136, 102, 107].
However, all listed applications have been only theoretically implemented. Only very few
reports on the demonstration with experimental implementation have been published so
far, i.e., parameter identification of a chromatography system [8], a temperature controlled
tank [133] and a dynamic model of the Baxter Research Robot from Rethink Robotics [128].
In those experiment-based studies, the need to handle the arising ill-conditioning and
identifiability problems has been detected. Usually the assumption is that well-conditioned
PE and OED are solved at each time instant and therefore all parameters are identifiable.
That is not true, especially at the beginning of the experiment when the data is scarce. The
consequences of ill-conditioned PE and OED are commented on in Section 3.2.1. They generate
practical problems such as instabilities and poor convergence in the parameter estimation,
and ineffective and/or meaningless experiment designs [8, 78]. Therefore, adequate actions
need to be taken to ensure the robustness and reliability of the algorithm for changing
experimental information.
¹ The content of this chapter is reprinted (adapted) with permission from (T. Barz, D. C. Lopez C., M. N. Cruz B., S. Korkel, and S. F. Walter. Real-time adaptive input design for the determination of competitive adsorption isotherms in liquid chromatography. Computers & Chemical Engineering, 94:104-116, 2016). Copyright (2016) Elsevier. (Publication IV in Appendix A.2).
In the available literature on the adaptive redesign of experiments ill-posed problems
had not been directly addressed. Instead, preliminary studies were carried out to test for
parameter identifiability and if necessary reduce the number of parameters to be estimated,
in order to guarantee parameter identifiability for all scenarios [21, 62]. Only in Barz et al.
2013 [8] and Yakut et al. [133, 132] the ill-posedness of the PE and corresponding OED
was explicitly mitigated by the application of the identifiable parameter subset selection
(SsS). It should be noted that an experimental study considering structural analysis
of the parameter estimation and numerical regularization was missing in the literature
until the publication of the peer-reviewed article in Ref. [8], to which the author of this
thesis contributed. As a new experimental implementation and an extension of the
findings of the aforementioned work, this chapter presents the online redesign of experiments
handling ill-conditioning and identifiability issues applied to a more complex liquid
chromatography system to determine competitive adsorption isotherm parameters. The
solution of regularized PE and OED problems as displayed in the consolidated framework
of Figure 4.1 is considered. Furthermore, the Tikhonov regularization is tested for the first
time in the context of online redesign of experiments. Special attention is paid to handle
scarceness of experimental data and therefore the parameter instability at the beginning of
the experiment. Although some results correspond to the actual experimental implemen-
tation, other theoretical aspects about the effect of several regularization techniques in the
online experimental redesign and selection of the corresponding regularization parameters
are also investigated.
It is important to point out that this chapter also contains new information which has
not been published or used to prepare a new peer-reviewed article. This content is
marked here as "(New)".
8.2. HPLC chromatography process
Chromatography is a mass transfer process involving adsorption. It is a technique in
analytical chemistry used for mixture separation. The principle in chromatography is
to separate a sample into its constituent parts because of the difference in the relative
affinities of different molecules for the mobile phase and the stationary phase used in the
separation. A sample is placed on a stationary phase, which is either a solid or a liquid,
and then the mobile phase, a gas or a liquid, is allowed to pass through the system. The
components of the sample will be separated based on their varying physical and chemical
properties, imparting different affinities for the two phases. Depending on the affinities,
one component will migrate through the column faster than the other. Because molecules
of the same compound will generally move in groups, the compounds are separated into
distinct bands within the column. When the mobile phase is a liquid the chromatography
is called liquid chromatography. The mobile phase also called eluent should be selected
based on its polarity relative to the sample and the stationary phase. Most applications
are reported from pharmaceutical industry for the separation of fine chemicals at the
preparative scale, e.g. enantiomers, proteins and peptides [47].
High Performance Liquid Chromatography (HPLC) was developed as a method to
solve some of the shortcomings of standard liquid chromatography, e.g., slow separation
time and size of the column packing. It relies on pumps to pass a pressurized liquid
solvent containing the sample mixture through a column filled with a solid adsorbent
material. The use of high pressures in a narrow column allowed for a more effective
separation to be accomplished in much less time than was required for previous forms of
liquid chromatography. The typical hardware employed in HPLC includes a sampler, a
pump, a column and a detector. It has the capability for feedback control, including the
pumps, the mixing of concentration gradients as well as online concentration analysis.
Recently, the model-based optimization, monitoring and control of chromatography processes
has gained attention, especially for the simulated moving bed (SMB) process, see
e.g. [91, 119]. Central to the modeling of these processes are the thermodynamics of the
phase equilibrium as well as dispersion and mass transfer phenomena. The online estimation
of the corresponding parameters in a protein separation case study was already the
subject of recent research co-authored by the author of this thesis [8].
8.2.1. Experiments
Experimental set-up
All online experiments were performed on the HPLC system located in the laboratory of the
chair of process dynamics and operation of the Technische Universität Berlin in Germany.
The HPLC system is shown in Fig. 8.1. Its specific components are a Smartline Manager
5050 with Low Pressure Gradient module operated in combination with a Smartline
Pump 1050 (both from Knauer GmbH, Berlin, Germany), a Vertex Plus Column and the
Smartline ultraviolet (UV) Detector. The Vertex Plus Column 125 x 4.6 mm Eurospher
II 100-5 C18, with a length of 125 mm, an inner diameter of 4.6 mm and a total porosity
of 0.68 [67], is a preparative chromatography column for reversed phase applications in
the analytical as well as preparative range. The Smartline UV Detector 2500 is equipped with
a deuterium lamp and operated at a wavelength of 284 nm; the analog output signal is
scaled to a maximum value of 1 AU. All instruments and the column are from KNAUER,
Wissenschaftliche Geräte GmbH, Berlin, Germany. A schematic flow sheet is given in Fig.
8.2.
The Smartline Manager and Pump constitute a computer-controlled gradient system
which can deliver step concentration gradients from four feeds A, B, C and D. Stepwise
concentration changes are delivered at a constant flow rate to the column. The
concentration values (hereafter also referred to as input actions) are generated in the MATLAB
programming environment (MathWorks, Natick, MA). At the column outlet, the UV Detector
continuously measures the sum of all concentrations. These concentrations are
processed in MATLAB.
Figure 8.1.: Experimental set up of the chromatography system. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)
[Flow sheet labels: feed A, feed B, feed C, feed D; Smartline Manager and Pump (flow-ratio, const. flow); static mixer; Vertex Plus Column; Smartline UV Detector (concentration); product; computer with Matlab programming environment (input actions, measurement data)]
Figure 8.2.: Schematic flow sheet of the chromatography system in Fig. 8.1. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)
The communication between the Smartline Pump and Manager and MATLAB is established
over a standard RS-232 serial port using the MATLAB Instrument Control Toolbox. The
continuous measurement from the Smartline UV Detector is first processed in the Freelance
Controller AC 700F using the detector's analog output signal and the Analog Input/Output
Module AX 722F (ABB, Zurich, Switzerland). The communication between the Freelance
controller and the MATLAB software is established using the Open Process Control (OPC)
software interface.
Measurement standard deviations are σy,i = 0.1 K, with each i representing an individual
diagonal element of the measurement standard deviation matrix in Eq. 2.44.
Operation
Two different feeding strategies (FS-1 and FS-2) have been considered, see Table 8.1. They
differ in the way the feeds A, B, C, D (see Figures 8.1 and 8.2) were prepared. For both
FS-1 and FS-2, feed A contains eluent only. For FS-1, feed B contains eluent and a
mixture of all three benzoates. For FS-2, the benzoates are individually prepared in feeds B,
C, D. Moreover, in FS-1 the ratio of the three species concentrations is kept constant. In
FS-2 the three species concentrations are individually defined; thus, the number of degrees
of freedom is three times higher than in FS-1.
Table 8.1.: Different feeding strategies (FS) for the chromatography system. Abbreviations: ethyB = ethyl benzoate, propB = propyl benzoate, butyB = butyl benzoate. (Table taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)

feeding     channel          feed channel concentrations [mol/l]     sum concentration
strategy    (ratio %)        eluent   ethyB    propB    butyB        range [mol/l]
FS-1        A (0···100)      yes      -        -        -            0.0 ··· 0.636
            B (0···100)      yes      0.212    0.212    0.212
            C                -        -        -        -
            D                -        -        -        -
FS-2        A (0···100)      yes      -        -        -            0.0 ··· 0.636
            B (0···33.3)     yes      0.636    -        -
            C (0···33.3)     yes      -        0.636    -
            D (0···33.3)     yes      -        -        0.636
It has to be noted that in FS-2 the maximal individual feed concentration delivered
by the manager and pump (i.e. the concentration of ethyB, propB, butyB) was restricted to
0.212 mol/L. Accordingly, the maximal sum concentration is 0.636 mol/L, the same value
as for FS-1. This value has been found critical for the miscibility of the benzoates in the
eluent. The specific chemicals used in this case study are:
• eluent: prepared from 80% methanol CH4O (CAS registry number: 67-56-1) and
20% demineralized water H2O,
• ethyB: ethyl benzoate C9H10O2 (CAS registry number: 93-89-0),
• propB: propyl benzoate C10H12O2 (CAS registry number: 2315-68-6),
• butyB: butyl benzoate C11H14O2 (CAS registry number: 136-60-7).
All online experiments were carried out at a constant temperature of 23 °C and a feed flow
rate of 1.5 mL/min.
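The relation between channel ratios and delivered feed concentrations in Table 8.1 can be sketched as follows. The sketch assumes ideal mixing, i.e. the delivered concentration of each species is the ratio-weighted sum of the channel concentrations; the function and variable names are hypothetical and do not correspond to the instrument software.

```python
# Channel concentrations (ethyB, propB, butyB) in mol/l, taken from Table 8.1.
STOCK = {
    "FS-1": {"A": (0.0, 0.0, 0.0), "B": (0.212, 0.212, 0.212)},
    "FS-2": {
        "A": (0.0, 0.0, 0.0),
        "B": (0.636, 0.0, 0.0),
        "C": (0.0, 0.636, 0.0),
        "D": (0.0, 0.0, 0.636),
    },
}


def mixed_feed(strategy, ratios):
    """Species concentrations after mixing; ratios are flow fractions summing to 1."""
    if abs(sum(ratios.values()) - 1.0) > 1e-9:
        raise ValueError("channel ratios must sum to 1")
    conc = [0.0, 0.0, 0.0]
    for channel, fraction in ratios.items():
        for k, c_stock in enumerate(STOCK[strategy][channel]):
            conc[k] += fraction * c_stock  # ideal mixing (assumption)
    return tuple(conc)


# FS-1: one degree of freedom, the three species always share the same concentration.
fs1_max = mixed_feed("FS-1", {"A": 0.0, "B": 1.0})
# FS-2: three degrees of freedom, each species is set by its own channel ratio.
fs2_max = mixed_feed("FS-2", {"A": 0.0, "B": 1 / 3, "C": 1 / 3, "D": 1 / 3})
fs2_single = mixed_feed("FS-2", {"A": 0.8, "B": 0.2, "C": 0.0, "D": 0.0})

print(fs1_max, fs2_max, fs2_single)
```

With the ratio of channels B, C and D limited to one third each, FS-2 reaches the same maximal individual concentration of 0.212 mol/l and the same maximal sum concentration of 0.636 mol/l as FS-1.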
8.2.2. Modeling
The HPLC process model comprises three units, namely the manager and the pump, the
chromatography column and the UV sensor. These units and the variables for the PE and
OED problems are shown in Fig. 8.3. The desired concentration c_i^feed is computed by
solving the OED problem online on the personal computer (see 8.3). The manager and the
pump are used to realize this concentration. The output concentration of the consolidated
manager and pump unit is referred to as c_i^in, which is the inlet concentration to the
chromatography column unit. The model of each unit is given in the next sections.
[Block diagram: Manager & Pump, Chromatography column; variable labels not recoverable from the extracted text]
Figure 8.3.: Units of the process model, input/output variables and unknown model parameters. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)
Manager and pump
The experimental data used for the identification of the model for the
manager and pump is shown in Fig. 8.4. A PT2T0 step response model (see the transfer
function in the Laplace domain in Eq. 8.1) was selected to describe the dynamic response
of the chromatography column inlet concentration $c_i^{in}$ to step changes in the
feed concentrations $c_i^{feed}$.
$$G_i(s) = \frac{K \, e^{-T_0 s}}{(1 + T_1 s)(1 + T_2 s)}, \qquad \forall \, i \in \{ethyB, propB, butyB\} \tag{8.1}$$
The same coefficients have been assumed for all components $i$. The dead time $e^{-T_0 s}$ has
been approximated by a PTn element of the form $G_i(s) = 1/(1 + T_n s)^n$.
The time-domain implementation of the model in Eq. 8.1 yielded a sparse system of 180
ordinary differential equations of the form:
$$\frac{dc_{P,i,j}}{dt} = \frac{c_{P,i,j-1} - c_{P,i,j}}{\tau_j}, \qquad \forall \, i \in \{ethyB, propB, butyB\}; \; j \in \{1, \cdots, N_n\} \tag{8.2}$$
where $c_{P,i,j}$ for $j = 0$ is defined by the feed concentration to the manager and pump
unit, such that $c_{P,i,0} = c_i^{feed}$, and $c_{P,i,j}$ for $j = N_n$ is defined by the outlet concentration
of the system, which is the inlet concentration of the chromatography column, such that
$c_{P,i,N_n} = c_i^{in}$.
Figure 8.4.: Responses of the manager and pump outlet concentrations (equal to the column inlet concentrations $c_i^{in}$) for arbitrary steps in the feed concentration $c_i^{feed}$. The flow is kept constant at 1.5 mL/min. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)
For a volumetric flow of 1.5 mL/min the following parameter values were used:
$\tau_j = 4.72551 \times 10^{-3}$ min for $j = 1, \cdots, N_n - 2$; $\tau_{N_n - 1} = 1.0481 \times 10^{-1}$ min;
$\tau_{N_n} = 1.0635 \times 10^{-3}$ min and $N_n = 60$.
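The cascade of first-order lags in Eq. 8.2 can be illustrated with a short Python sketch for one component. It is not the implementation used in this work (which relied on CVODES); it assumes a unit gain K = 1, a constant feed concentration and an initially empty chain, and it uses a simple implicit Euler scheme chosen here only because the smallest time constant makes the system stiff. All function and variable names are hypothetical.

```python
import numpy as np

# Time constants reported in the text for a flow of 1.5 mL/min (in minutes).
N_LAGS = 60
TAU_DEAD_TIME = 4.72551e-3   # tau_j for j = 1, ..., N_n - 2
TAU_SECOND_LAST = 1.0481e-1  # tau_{N_n - 1}
TAU_LAST = 1.0635e-3         # tau_{N_n}


def lag_time_constants():
    """Return the vector tau_1, ..., tau_{N_n} of Eq. 8.2."""
    tau = np.full(N_LAGS, TAU_DEAD_TIME)
    tau[-2] = TAU_SECOND_LAST
    tau[-1] = TAU_LAST
    return tau


def simulate_lag_chain(c_feed, tau, t_end, dt):
    """Integrate dc_j/dt = (c_{j-1} - c_j) / tau_j with implicit Euler.

    c_feed is a constant feed concentration (c_{P,i,0}); the chain starts
    empty (assumption). Returns the outlet concentration c_{P,i,N_n} at
    every time step, including t = 0.
    """
    n_steps = int(round(t_end / dt))
    c = np.zeros(len(tau))
    outlet = np.zeros(n_steps + 1)
    for n in range(n_steps):
        upstream = c_feed
        for j, tau_j in enumerate(tau):
            ratio = dt / tau_j
            # The system is lower bidiagonal, so one forward sweep solves
            # the implicit Euler equations exactly.
            c[j] = (c[j] + ratio * upstream) / (1.0 + ratio)
            upstream = c[j]
        outlet[n + 1] = c[-1]
    return outlet


tau = lag_time_constants()
# Step of 0.212 mol/L in the feed, simulated for 3 min.
outlet = simulate_lag_chain(c_feed=0.212, tau=tau, t_end=3.0, dt=0.01)
```

With these values the sum of all time constants is about 0.38 min, so the outlet has settled at the feed value well before the end of the 3 min window.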
Chromatography column
The equilibrium dispersive model is well established in liquid chromatography ([77, 47]).
This model yields convection-diffusion partial differential equations (PDEs) that are
convection dominated. The corresponding differential mass balances for each component
$i \in \{ethyB, propB, butyB\}$ read:
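$$\frac{\partial c_i(t,z)}{\partial t} + \frac{1-\varepsilon}{\varepsilon} \frac{\partial q_i(c)}{\partial t} + u \frac{\partial c_i(t,z)}{\partial z} = D_{ax,i} \frac{\partial^2 c_i(t,z)}{\partial z^2} \tag{8.3}$$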
where $c$ and $q$ are the concentrations in the liquid and solid phase, respectively, $u$ is the
interstitial velocity of the liquid phase in the column, $z$ is the spatial coordinate in the axial
direction and $\varepsilon$ is the total porosity. $D_{ax,i}$ is the axial dispersion coefficient, which is
assumed to be equal for all components. An equilibrium relationship between the liquid and
solid phase is assumed and defined by the nonlinear isotherm of the competitive Langmuir type:
$$q_i(c) = \frac{H_i \, c_i(t,z)}{1 + \sum_j K_j \, c_j(t,z)} \tag{8.4}$$
with $H_i$ being the Henry constants and $K_i$ thermodynamic coefficients; the sum runs over all
components $j \in \{ethyB, propB, butyB\}$. The following
initial conditions at $t = 0$ assume a clean column at the beginning of each experiment:
$$c_i(0, z) = 0 \tag{8.5}$$
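The Danckwerts inflow and outflow boundary conditions are given as:
$$D_{ax,i} \left. \frac{\partial c_i}{\partial z} \right|_{z=0} = u \left( c_i|_{z=0} - c_i^{in} \right), \qquad D_{ax,i} \left. \frac{\partial c_i}{\partial z} \right|_{z=L} = 0 \tag{8.6}$$
where $L$ is the column length.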
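As an illustration of the competitive Langmuir isotherm in Eq. 8.4, the following Python sketch evaluates the solid phase loading for a single-component and a three-component liquid phase. The function name is hypothetical. The numerical values are the 'true' values of Table 8.2, assigned to the Henry constants and thermodynamic coefficients under the assumption that the table follows the parameter ordering of Eq. 8.9; they serve only as an example.

```python
import numpy as np


def langmuir_loading(c, henry, k_eq):
    """Competitive Langmuir isotherm: q_i = H_i c_i / (1 + sum_j K_j c_j)."""
    c = np.asarray(c, dtype=float)
    henry = np.asarray(henry, dtype=float)
    k_eq = np.asarray(k_eq, dtype=float)
    return henry * c / (1.0 + np.dot(k_eq, c))


# Illustrative values in the order ethyB, propB, butyB
# (assumed mapping theta = [H, K, H, K, H, K] onto Table 8.2).
HENRY = np.array([5.9950, 2.9082, 0.4685])
K_EQ = np.array([4.0995, 0.6396, 0.4819])

# ethyB alone at 0.212 mol/L versus all three components at 0.212 mol/L each.
q_single = langmuir_loading([0.212, 0.0, 0.0], HENRY, K_EQ)
q_mixture = langmuir_loading([0.212, 0.212, 0.212], HENRY, K_EQ)
```

Because all components share the same denominator, the loading of ethyB in the mixture is lower than in the single-component case, which is the competitive effect the isotherm is meant to capture.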
The PDEs in Eq. 8.3 were discretized by a high-resolution finite volume scheme with
flux limitation [61]. To accurately capture the discontinuous concentration shock fronts in
the axial direction, $N_{Els} = 60$ elements were chosen, which proved to give
sound results.
For 3 components the complete equation system (Eqs. 8.1, 8.3) consists of
$N_{eq} = (N_n + N_{Els}) \cdot 3 = 360$ differential equations.
UV sensor
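The detector's output signal $s_{UV}$ is given in arbitrary units (AU) and transformed to
concentration values $y^m$ given in mol/L using the calibration curve in Eq. 8.7.
$$y^m = a \exp\left[ -\left( \frac{\tilde{s}_{UV} - b}{c} \right)^2 \right] - a \exp\left[ -\left( \frac{b}{c} \right)^2 \right], \qquad \text{with } \tilde{s}_{UV} = s_{UV} + d \tag{8.7}$$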
Note that $y^m$ is the sum concentration of all components given in mol/L, see also Section
8.2.1. Thus, a measurement of individual components is not possible, and in Eq. 8.7 an
average set of calibration parameters $a, b, c, d$ was used, with $a = 8.0917 \times 10^{-1}$, $b = 1.2277$,
$c = 5.5536 \times 10^{-1}$, $d = 3.47 \times 10^{-2}$. The errors of the output signal are assumed to be
Gaussian and uncorrelated with a standard deviation $\sigma = 0.01$ mol/L.
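The calibration curve of Eq. 8.7 can be written as a small Python function. This is a sketch only: the function name is hypothetical, and it assumes that the offset d is added to the raw detector signal before the Gaussian-shaped curve is evaluated, which is how the relation following Eq. 8.7 is read here.

```python
import math

# Average calibration parameters reported in the text.
A, B, C, D = 8.0917e-1, 1.2277, 5.5536e-1, 3.47e-2


def uv_signal_to_concentration(s_uv, a=A, b=B, c=C, d=D):
    """Convert a UV signal in AU into a sum concentration in mol/L (Eq. 8.7)."""
    s_shifted = s_uv + d  # assumed reading of the offset relation in Eq. 8.7
    peak_term = a * math.exp(-((s_shifted - b) / c) ** 2)
    baseline_term = a * math.exp(-(b / c) ** 2)
    return peak_term - baseline_term


# A shifted signal of zero maps to zero concentration by construction.
y_zero = uv_signal_to_concentration(-D)
y_low = uv_signal_to_concentration(0.2)
y_mid = uv_signal_to_concentration(0.5)
```

The second exponential term only fixes the baseline, so that a shifted signal of zero corresponds to a concentration of zero.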
8.3. Results
8.3.1. Assignment of variables
The variables of the discretized model of the HPLC chromatography system described in
Section 8.2.2 are assigned to the general problem formulation in Eq. 2.1 as shown in the
following.
The predicted response variables were:
$$y := \sum_i c_i|_{z=L}, \qquad \text{with } i \in \{ethyB, propB, butyB\} \tag{8.8}$$
In Eq. 8.8, $y$ is the sum of all individual concentrations at the column outlet. It corresponds
to the measured concentrations $y^m$ in Eq. 8.7.
The unknown model parameters θ are related to the column model in Section 8.2.2 and
taken from Eq. 8.4:
$$\theta := \left[ H_{ethyB}, K_{ethyB}, H_{propB}, K_{propB}, H_{butyB}, K_{butyB} \right]^T \tag{8.9}$$
The experiment design variables (input actions) $u$ are the desired feed concentrations which
are delivered by the manager and pump, see Eq. 8.1.
$$u := \left[ c_{ethyB}^{feed}, c_{propB}^{feed}, c_{butyB}^{feed} \right]^T \tag{8.10}$$
8.3.2. Parameters of the time horizon schemes
The parameters of the finite time horizon schemes (see Section 2.7.5) used for the
experimental implementation of the online redesign of experiments were:
• measurement sampling time: $t_k - t_{k-1} = 5$ s,
• prediction horizon: $t_{k+h} - t_k = 10$ min,
• control grid: 20 s,
• control horizon: 3 min.
The control horizon was defined to reduce the number of future input actions and thereby
the computation times.
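The effect of the control horizon on the problem size can be made explicit with a few lines of Python. The sketch assumes piecewise-constant inputs on the control grid with one value per component and interval; this parameterization, and how the inputs are treated beyond the control horizon, are assumptions not stated in the list above.

```python
# Horizon settings listed above, converted to seconds.
SAMPLING_TIME_S = 5
PREDICTION_HORIZON_S = 10 * 60
CONTROL_GRID_S = 20
CONTROL_HORIZON_S = 3 * 60
N_INPUTS = 3  # feed concentrations of ethyB, propB and butyB

# Number of predicted measurement instants within the prediction horizon.
n_predicted_samples = PREDICTION_HORIZON_S // SAMPLING_TIME_S

# Decision variables with the control horizon (assumed piecewise-constant inputs).
n_control_intervals = CONTROL_HORIZON_S // CONTROL_GRID_S
n_decision_variables = n_control_intervals * N_INPUTS

# For comparison: free inputs over the whole prediction horizon.
n_decision_variables_full = (PREDICTION_HORIZON_S // CONTROL_GRID_S) * N_INPUTS
```

Under these assumptions the control horizon reduces the number of design variables from 90 to 27 per OED problem.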
8.3.3. Computational Issues
All computations were performed on an Intel(R) Core(TM)2 (CPU 6600 @ 2.40 GHz)
computer with 4 GB RAM. Parallel programming was not used. The PE and OED problems
were solved by single shooting using the MATLAB Optimization Toolbox solvers
lsqnonlin (trust-region-reflective algorithm) and fmincon (sqp algorithm), respectively. The
maximum number of iterations of both solvers was restricted to one in order to limit the
computational effort, and the solvers were restarted at each sampling time point using the
last results as initial values.
Model and parameter sensitivity equations were integrated using CVODES with a sparse
direct solver from the SundialsTB toolbox [55]. The computed parameter sensitivities were
used to accurately calculate the gradients of the PE problem and the Fisher Information
Matrix (FIM). The gradients of the OED problem were supplied using finite differences. The
Jacobian and directional derivatives of the model equations were generated using Tapenade
[54].
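How parameter sensitivities enter the PE gradient and the FIM can be sketched in Python as follows. The sketch assumes a least-squares cost of the form 0.5 * sum((y_m - y)^2) / sigma^2 with uncorrelated errors of equal standard deviation, and a log-determinant (D-optimality) criterion; the exact formulations of this work are those of Chapter 2, and all names below are hypothetical.

```python
import numpy as np


def fisher_information(sensitivities, sigma):
    """FIM = S^T S / sigma^2 for uncorrelated errors with equal sigma."""
    s = np.asarray(sensitivities, dtype=float)
    return s.T @ s / sigma**2


def least_squares_gradient(sensitivities, residuals, sigma):
    """Gradient of 0.5 * sum((y_m - y)^2) / sigma^2 with respect to theta.

    residuals = y_m - y(theta); sensitivities[k, p] = dy_k / dtheta_p.
    """
    s = np.asarray(sensitivities, dtype=float)
    r = np.asarray(residuals, dtype=float)
    return -s.T @ r / sigma**2


def d_criterion(fim):
    """Log-determinant of the FIM; -inf if the FIM is singular."""
    sign, logdet = np.linalg.slogdet(fim)
    return logdet if sign > 0 else -np.inf


# Tiny example with two measurements and two parameters.
S_example = [[1.0, 0.0], [0.0, 2.0]]
fim_example = fisher_information(S_example, sigma=0.5)
grad_example = least_squares_gradient(S_example, [1.0, 1.0], sigma=0.5)
```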
8.3.4. Online Base Case: PE without regularization (New)
This section summarizes unpublished results on the uncertainty of parameter estimates and
on the stability and ill-conditioning of parameter estimations without regularization in the course
of the online experimentation. Hereinafter, the online parameter determination without
regularization (Reg=None) is considered the Base Case.
The quantity and quality of experimental data are important for successfully
conducting a parameter estimation. Success means achieving a stable parameter estimation
and then computing a meaningful and accurate parameter estimate. In online
experimentation, the scarcity of informative data, especially at the beginning of the run, imposes
additional constraints on the online parameter determination. In nonlinear estimation these
limitations produce ill-conditioned matrices (sensitivity matrix and Fisher information matrix),
which lead to numerical issues such as instability and convergence problems. Moreover,
if an online (or offline) adaptive input design is also implemented, ill-conditioning
promotes large parameter variability (see Section 3.2.1), i.e., unreasonably large
parameter estimates and variances. The optimal experimental design (see
Section 2.7) based on this parameter information is thus unreliable.
In the following, the results of online parameter estimation using online experimental
data are shown. Special attention is paid to characterizing the relationship between
large parameter variability, parameter estimation instability, and an ill-conditioned
sensitivity matrix. Furthermore, it will be demonstrated that even when a parameter
estimation converges, its final result (the estimated parameter vector) might be meaningless
(poor fit) and inaccurate (large bias).
Figure 8.5.: Online Base Case (Reg=None): Model fitting using the parameter estimate θ at $t_k = 30$ (shown in Table 8.2) for the online parameter estimation without regularization. The markers show the measured sum outlet concentrations whereas the solid line shows the corresponding simulated concentrations. (Axes: time $t_k$ [min] vs. concentration [mol/l]; series: $c^{exp}$, $c^{sim}$.)
Figure 8.5 displays the system dynamics predicted by the HPLC process model (see
Section 8.2.2) using the parameter vector θ estimated after 30 min of online
PE/experimentation. The model fit suggests a low-quality parameter estimate because
the error between the predictions (solid line) and the experimental data (markers) is large.
However, a poor fit alone is not conclusive evidence of parameter estimation deficiencies.
Therefore, the estimator was evaluated according to the estimator analysis in Section 2.5.
The relative bias was used as an accuracy metric (see Section 2.5.2). It was computed with
respect to the 'true' parameter values, which were obtained from offline estimations and
correspond to the best available estimate, i.e., the one with the smallest cost function
(residual) value.

Table 8.2.: Parameter estimates for the online parameter estimation using different regularization strategies. 'True values' refers to the best estimates obtained from offline estimation without iteration limits.

Parameter           Regularization   θ1      θ2      θ3      θ4      θ5      θ6
Initial guess       -                1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
Online estimation   NONE             3.0150  9.7200  3.0070  0.9165  0.1226  0.1811
@ t = 30 min        SsS              5.9797  4.0984  2.9081  0.6427  0.4727  0.4712
                    Tikh             5.9817  4.0969  2.9077  0.6432  0.4737  0.4683
'True' values       -                5.9950  4.0995  2.9082  0.6396  0.4685  0.4819
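The accuracy comparison based on Table 8.2 can be reproduced with the following Python sketch. It assumes that the percentage relative bias of a parameter is the absolute deviation from the 'true' value divided by the absolute 'true' value; the exact definition used in this work is given in Section 2.5.2, and the names below are hypothetical.

```python
import numpy as np

# Values copied from Table 8.2 (estimates at t = 30 min and 'true' values).
THETA_TRUE = np.array([5.9950, 4.0995, 2.9082, 0.6396, 0.4685, 0.4819])
THETA_HAT = {
    "None": np.array([3.0150, 9.7200, 3.0070, 0.9165, 0.1226, 0.1811]),
    "SsS": np.array([5.9797, 4.0984, 2.9081, 0.6427, 0.4727, 0.4712]),
    "Tikh": np.array([5.9817, 4.0969, 2.9077, 0.6432, 0.4737, 0.4683]),
}


def relative_bias_percent(theta_hat, theta_true):
    """Element-wise relative bias in percent (assumed definition)."""
    return 100.0 * np.abs(theta_hat - theta_true) / np.abs(theta_true)


bias = {name: relative_bias_percent(theta, THETA_TRUE)
        for name, theta in THETA_HAT.items()}
mean_bias = {name: float(values.mean()) for name, values in bias.items()}
```

Under this definition the mean relative bias at t = 30 min is below 1% for both regularized estimates, whereas the unregularized estimate deviates by more than 100% in θ2.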
Table 8.2 shows the corresponding values. The percentage relative bias shown in
Figure 8.6 was calculated for each element of the parameter vector $\theta_k$ estimated without
regularization at each time instant $t_k$. Several points can be highlighted from these results.
First, the parameter estimation was indeed unstable, especially while the experimental
information was insufficient. At the beginning of the experiment, the limited
number of measurements generated ill-conditioned sensitivity matrices and large step
directions $\upsilon_k$ (see Eq. 2.6). Therefore, the parameters jumped to large values and were
totally inaccurate (i.e., large relative bias in Figure 8.6). For almost two thirds of the
experiment (until around 20 min) the estimates remained meaningless, with large bias.
However, as more data was collected the estimates were refined, and they slowly reached
an acceptable convergence from t = 20 min onward.
Figure 8.6.: Online Base Case (Reg=None): Relative Bias (%) of each estimated parameter (w.r.t. its corresponding true parameter value in Table 8.2) computed at each sampling time after solving parameter estimations without regularization. (Axes: time $t_k$ [min] vs. relative bias (%) on a logarithmic scale; curves for θ1 to θ6; the start of the stabilization is marked.)

At this point it is important to point out that the convergence reached by the online
parameter estimation implied neither accurate final parameters (see Figure 8.6) nor a
good model fit. These facts were expected given the ill-conditioning and the
duration of the instability of the online parameter estimation. In order to analyze the
ill-conditioning in the online experimentation, the singular value spectrum $SVs_{t_k}$ of each
sensitivity matrix $S_k^{-}$ at instant $k$ is shown in Figure 8.7. The figure clearly shows the
ascending evolution of the singular values (SVs) from the beginning to the end of the
experiment. Note that the addition of new data lifted the spectrum at each instant $t_k$ and
that from around $t_k = 20$ min, when the experimental data was sufficiently informative,
the SVs exceeded a typical threshold (i.e., ϵ = 6.7 for $\kappa_{max} = 1000$ and $\gamma_{max} = 15$) that
assures a well-conditioned problem (see Section 3.2.1). Although the apparent
"overcoming" of the ill-conditioning in that late stage of the experiment encouraged
convergence, it did not remove the large uncertainty of the previously unstable parameter
estimates. This could presumably be attributed to a remaining ill-conditioning which cannot
be detected with the magnitude of the considered regularization parameter ϵ. Consequently,
not only the need for regularizing the PE and OED became evident but also the
requirement of a tuned regularization parameter. Regularization is intended here to remove
the ill-conditioning and thereby control the instability and parameter variability, smoothing
the way to a final accurate (or at least precise) parameter vector.
Figure 8.7.: Online Base Case (Reg=None): Singular value spectrum ($SVs_{t_k}$) of each sensitivity matrix $S_k^{-}$ computed at each sampling time $t_k$ after solving parameter estimations without regularization. The horizontal solid plane labeled ϵ = 6.7 (see Eq. 3.3 with $\gamma_{max} = 15$ and $\kappa_{max} = 1000$) is the typical threshold used to select the ill-conditioned singular values in the sensitivity-based ill-conditioning analysis of Section 4.4.1.
8.3.5. Online Regularized Case: PE with regularization (New)
The usefulness of different regularization strategies was studied in order to increase the
robustness of the online parameter estimation against wrong initial parameter estimates and
scarce measurement information. The aim was to control the instability of the parameter
estimation generated by ill-conditioned sensitivity matrices at the beginning of the
experiment and thereby to provide a smooth transition from the initial parameter guess to the
final estimate at the end of the experiment. Under regularization, the bias in the parameter
estimates should ideally decrease monotonically, with fast and accurate convergence to the
true parameter values. It has to be noted that deviations from the true parameter values can
be caused not only by erroneous initial parameter guesses but also by the ill-conditioning of
the estimation, which leads to instability problems. Such deviations in the course of the
parameter estimation also affect the experiment redesign, possibly yielding ineffective or
meaningless designs. This makes the effective implementation of the selected regularization
technique crucial. To this end, the choice of an appropriate regularization parameter, namely
the ϵ-threshold for SsS (see Section 3.3.1) and λ for Tikhonov (see Section 3.3.3), plays an
important role. Therefore, the effects and selection of regularization parameters for SsS and
Tikhonov regularization are treated in the following (as new unpublished results in this
thesis). Moreover, the performance of each regularization (using the previously tuned
regularization parameter) in the online context is discussed.
Regularization parameter selection
This section summarizes unpublished results on the selection of an appropriate
regularization parameter for the stabilization of ill-posed parameter estimations.
Because the ill-posedness analyzed in this thesis is generated by ill-conditioned matrices,
regularization parameters are tuned according to the severity of the ill-conditioning in the
case study.
Subset selection (Reg=SsS) In the regularization by parameter subset selection (see
Section 3.3.1), the regularization parameter is the ϵ-threshold used to select the ill-conditioned
singular values (see Section 4.4.1) and to compute the number of identifiable parameters (see
Section 4.4.2).
As mentioned in Section 3.2.1, the ϵ-threshold can be computed by Eq. 3.3 as a function of
$\kappa_{max}$ and $\gamma_{max}$. In this case study the condition numbers from $t_k = 7$ min on were less
than $\kappa_{max} = 1000$; therefore, $\gamma_{max}$ was effectively the regularization
parameter. Accordingly, nine values of $\gamma_{max}$, i.e., 1000, 500, 200, 50, 20, 15, 1, 0.2, 0.1, were tested.
For each $\gamma_{max}$ an online parameter estimation was run "in silico". Results are shown in
Figure 8.8 in terms of the percentage relative bias with respect to the true values in Table 8.2,
where each point in a curve corresponds to the average over the elements of $\theta_k$ estimated
at each instant $t_k$. Notice that ϵ and $\gamma_{max}$ have an inverse relationship according to Eq.
3.3; thus small values of ϵ (large values of $\gamma_{max}$) yielded weak regularization, whereas
large values of ϵ yielded strong regularization.

Figure 8.8.: Online Case SsS (Reg=SsS): Relative Bias (%) of the estimated parameter vector $\theta_k$ (w.r.t. the true parameter vector in Table 8.2) computed at each sampling time $t_k$ with $k = 1, \cdots, N_m$ after solving parameter estimations by using subset selection as regularization for several values of the regularization parameter ϵ-threshold. Regularization parameters with the best performance in terms of the accuracy of the estimated parameter vector at the end of the experiment t = 30 min are enclosed in the red box. "None" refers to parameter estimations without regularization (Reg=None). Note that the ϵ-threshold is always defined by $\gamma_{max}$ according to Eq. 3.3. (Axes: time $t_k$ [min] vs. relative bias (%) on a logarithmic scale. Legend, from weak to strong regularization: None; $\gamma_{max}$ = 1000, ϵ = 0.1; $\gamma_{max}$ = 500, ϵ = 0.2; $\gamma_{max}$ = 200, ϵ = 0.5; $\gamma_{max}$ = 50, ϵ = 2; $\gamma_{max}$ = 20, ϵ = 5; $\gamma_{max}$ = 15, ϵ = 6.7; $\gamma_{max}$ = 1, ϵ = 100; $\gamma_{max}$ = 0.2, ϵ = 500.)
Figure 8.8 shows that regularization parameters leading to relatively
weak regularization, i.e., ϵ = 0.2, 0.5, 2, generated the most accurate final estimated
parameter vector (at $t_k = 30$ min), with a relative bias around 1%. Another conclusion of this
comparison is that there exists a well-defined interval of the regularization parameter in
which the online estimation is robust and accurate. Values of ϵ outside of this interval
produced either severely unstable estimations (for weak regularization, e.g., ϵ = 0.1) or
parameter estimations without degrees of freedom (DoF) (for strong regularization, e.g.,
ϵ = 500, 1000). In the former case, all or many of the parameters are considered identifiable
and are estimated, so the ill-conditioning problems remain. In the latter case, all
parameters are considered unidentifiable and are fixed to the initial guess; accordingly,
their relative bias is around 100%.
A more useful view of the overall performance of each regularization parameter is
given in Figure 8.9, in which each point represents the average of the corresponding curve in
Figure 8.8. Consequently, the contribution of the whole experiment is condensed into a single
number. As a result, ϵ = 2 had the best performance and is selected as the appropriate
regularization parameter for Reg=SsS. Nevertheless, it is clear that there exists an interval of
adequate ϵ, which could lie between 0.5 and 2.
Figure 8.9.: Online Case SsS (Reg=SsS): Mean Relative Bias (%) for the whole experiment duration of the $N_m$ estimated parameter vectors $\theta_k$ with $k = 1, \cdots, N_m$ (w.r.t. the true parameter vector in Table 8.2) obtained after solving the $N_m$ parameter estimations by using subset selection as regularization for several values of the regularization parameter ϵ-threshold. Regularization parameters with the best global performance in terms of the accuracy of the estimated parameter vectors during the whole experiment are enclosed in the red box. (Axes: $\gamma_{max}$ = 1000, 500, 200, 50, 20, 15, 1, 0.2, 0.1, i.e., ϵ = 0.1 to 1000 from weak to strong regularization, vs. mean relative bias (%) on a logarithmic scale.)
A special remark should be made here regarding the feasible values of ϵ. They are
delimited by the largest and smallest singular values of the sensitivity matrix,
i.e., $\varsigma_1$ and $\varsigma_{N_\theta}$, respectively. If $\epsilon > \varsigma_1$, the estimation in Eq. 3.8 does not take place
because all parameters are considered unidentifiable and are therefore fixed. On the other hand,
if $\epsilon < \varsigma_{N_\theta}$, all parameters are assumed identifiable and the estimation runs as if no
regularization were applied.
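The role of the ϵ-threshold described above can be illustrated with a simplified Python sketch that only counts the singular values of a sensitivity matrix that are not below the threshold. It is not the subset selection algorithm of Section 4.4, which also decides which parameters are estimated; the function name, the treatment of equality and the example matrix are assumptions made for illustration.

```python
import numpy as np


def count_identifiable(sensitivity, epsilon):
    """Count singular values >= epsilon and return them with the count.

    Simplified illustration: the count is taken as the number of
    identifiable parameters; the choice of which parameters to fix is
    not covered here.
    """
    singular_values = np.linalg.svd(
        np.asarray(sensitivity, dtype=float), compute_uv=False
    )
    n_identifiable = int(np.sum(singular_values >= epsilon))
    return n_identifiable, singular_values


# Hypothetical sensitivity matrix with singular values 100, 3 and 0.05.
S_demo = np.diag([100.0, 3.0, 0.05])

n_strong, _ = count_identifiable(S_demo, epsilon=500.0)  # above the largest SV
n_moderate, _ = count_identifiable(S_demo, epsilon=2.0)
n_weak, _ = count_identifiable(S_demo, epsilon=0.01)     # below the smallest SV
```

For this hypothetical matrix the three thresholds leave zero, two and three parameters free, respectively, which mirrors the two limiting cases described above and the intermediate case in between.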
This feature can be seen more clearly in Figure 8.10, which displays the singular value spectra
$SVs_{t_k}$ of the corresponding sensitivity matrices $S_k^{-}$ computed at selected instants
$t_k$ = 3, 5, 10, 15, 20, 25, 30 min spanning the whole experiment. The figure also includes
three values of the ϵ-threshold of different magnitude, ϵ = 0.1, 2, 500. The large
value ϵ = 500 lies above all SVs, which means that every singular value $\varsigma_i$ was considered
ill-conditioned according to the ill-conditioning analysis outlined in Section 4.4.1. The
regularized PE using this regularization parameter fixed all parameters to their initial
guess; thus the percentage relative bias in Figure 8.8 remained constant at around 100%. This
kind of effect is named strong regularization. The opposite case occurred with the small
value ϵ = 0.1. In that case the majority of the singular values (or even all $\varsigma_i$) were above
the threshold, yielding a weak regularization in which all parameters were assumed
identifiable. As a result, there was no control of the ill-conditioning: the parameter
estimation remained unstable for almost the first half of the experiment and never
recovered, leaving the parameters with a large bias at the end of the experiment (see Figure 8.8).
The third value corresponds to the best regularization parameter found, i.e., ϵ = 2. In this
scenario, a strong regularization is applied to the PE at the beginning of the experiment (e.g.,
at $t_k$ = 3, 5 min), a moderate one while there was still not enough information
(e.g., at $t_k$ = 10, 15 min), and no regularization was active once the problem no longer had
small singular values. In the online redesign of experiments, these would be the characteristics
of a suitable regularization parameter.

Figure 8.10.: Online Case SsS (Reg=SsS): Singular value spectrum $SVs_{t_k}$ of the sensitivity matrix $S_k^{-}$ computed at each sampling time $t_k$ = 3, 5, 10, 15, 20, 25, 30 after solving parameter estimations by using subset selection as regularization for several values of the regularization parameter, i.e., ϵ = 0.1, 2, 500. Large values of ϵ result in strong regularization of the PE; otherwise the regularization is weak. (Axes: singular value index $i$ = 1 to 6 vs. singular value $\varsigma_i$ on a logarithmic scale.)
Tikhonov regularization (Reg=Tikh) In the Tikhonov regularization explained in
Section 3.3.3, the regularization parameter is the scalar λ used to balance the contribution of
the penalization term in the cost function of the PE in Eq. 3.10. Tikhonov regularization can
incorporate a priori information about the model parameters as known parameter values in the
predefined vector $\theta_R$ and as parameter variances in the matrix $L$ or in λ. Here, a priori
information about the parameter variances was introduced in λ as follows:
$$\lambda = \frac{1}{\sigma_\theta}, \tag{8.11}$$
where $\sigma_\theta$ denotes the parameter variance known beforehand. If the current parameter
vector is considered precise, $\sigma_\theta$ is small, thus λ is large and the regularized
parameter estimation in Eq. 3.10 is highly penalized. This is named a strong regularization.
On the contrary, if the current parameter vector is not reliable, $\sigma_\theta$ is large, λ is small
and the penalization term may be negligible with respect to the least-squares criterion.
This is a weak or even nonexistent regularization. If individual parameter variances are
known beforehand, they can also be included in the diagonal elements of $L$ (which is not
the case here).
Nine a priori values were tested, i.e., $\sigma_\theta$ =
0.001, 0.002, 0.01, 0.02, 0.15, 0.2, 1, 2, 10. For each $\sigma_\theta$ (or λ) an online parameter
estimation was run "in silico". Results are shown in Figure 8.11 in terms of the percentage relative
bias with respect to the true values in Table 8.2, where each point in a curve corresponds
to the average over the elements of $\theta_k$ estimated at each instant $t_k$.
Figure 8.11.: Online Case Tikh (Reg=Tikh) for λ regularization parameter selection: Relative Bias (%) of the estimated parameter vector $\theta_k$ (w.r.t. the true parameter vector in Table 8.2) computed at each sampling time $t_k$ with $k = 1, \cdots, N_m$ after solving parameter estimations by using Tikhonov regularization (Reg=Tikh) for several values of the regularization parameter λ in the course of the online experiment. Regularization parameters with the best performance in terms of the accuracy of the estimated parameter vector at the end of the experiment t = 30 min are enclosed in the red box. "None" refers to parameter estimations without regularization (Reg=None). Note that λ is here defined by $\sigma_\theta$ according to Eq. 8.11. (Axes: time [min] vs. relative bias (%) on a logarithmic scale. Legend, from weak to strong regularization: None; $\sigma_\theta$ = 10, λ = 0.1; $\sigma_\theta$ = 2, λ = 0.5; $\sigma_\theta$ = 1, λ = 1; $\sigma_\theta$ = 0.2, λ = 5; $\sigma_\theta$ = 0.15, λ = 6.7; $\sigma_\theta$ = 0.02, λ = 50; $\sigma_\theta$ = 0.01, λ = 100; $\sigma_\theta$ = 0.002, λ = 500; $\sigma_\theta$ = 0.001, λ = 1000.)
An analysis similar to that for Reg=SsS was applied here. Figure 8.11 shows that
regularization parameters leading to moderate regularization, i.e.,
λ = 1, 5, 6.7, 50, generated the most accurate final estimated parameter vector (at
$t_k = 30$ min), with a relative bias around 1%. For this regularization a wider interval of the
regularization parameter in which the online estimation was robust and accurate was found, i.e.,
1 ≤ λ ≤ 50. Values of λ outside of this interval produced either severely unstable
estimations (for weak regularization, e.g., λ = 0.1, 0.5) or parameter estimations
without DoF (for strong regularization, e.g., λ = 100, 500, 1000).
A more useful view of the overall performance of each regularization parameter is
given in Figure 8.12, in which each point represents the average of the corresponding curve in
Figure 8.11. Consequently, the contribution of the whole experiment is condensed into a single
number. As a result, λ = 50 (equivalent to $\sigma_\theta = 0.02$) had the best performance and is
selected as the appropriate regularization parameter for Reg=Tikh. Nevertheless, it
is also clear that there exists an interval of adequate λ, which could lie between 5 and 50.
Figure 8.12.: Online Case Tikh (Reg=Tikh) for λ regularization parameter selection: Mean Relative Bias (%) for the whole experiment duration of the $N_m$ estimated parameter vectors $\theta_k$ with $k = 1, \cdots, N_m$ (w.r.t. the true parameter vector in Table 8.2) obtained after solving $N_m$ parameter estimations by using Tikhonov regularization for several values of the regularization parameter λ. Regularization parameters with the best global performance in terms of the accuracy of the estimated parameter vectors during the whole experiment are enclosed in the red box. "None" refers to parameter estimations without regularization (Reg=None). Note that λ is defined by $\sigma_\theta$ according to Eq. 8.11. (Axes: a priori $\sigma_\theta$ = 10, 2, 1, 0.2, 0.15, 0.02, 0.01, 0.002, 0.001, i.e., λ = 0.1 to 1000 from weak to strong regularization, vs. mean relative bias (%) on a logarithmic scale.)
It is important to point out that Tikhonov regularization overcomes the ill-conditioning
by replacing the small singular values (those smaller than the regularization parameter) with
approximately the value of the regularization parameter [78]. In doing so, the dimension of
the problem remains that of the original, but the regularized sensitivity matrix is numerically
transformed into a well-conditioned matrix. Although this transformation is most pronounced
at the lower end of the spectrum, i.e., for the ill-conditioned SVs, it also affects the large
singular values, as shown in Figure 8.13. For instance, comparing $SVs^{Tikh}_{t_k=30}$ with
$SVs_{t_k=30}$ shows that the regularized spectrum $SVs^{Tikh}_{t_k=30}$ did not contain any singular
value smaller than 50 and that the whole spectrum was lifted at least slightly with respect to
the original $SVs_{t_k=30}$.
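The lifting of the spectrum can be checked numerically with a short Python sketch. It assumes the standard augmented form of Tikhonov regularization with L equal to the identity, in which the regularized sensitivity matrix is the original matrix stacked on top of λ times the identity; whether this matches Eq. 3.10 term by term is an assumption. In that form every singular value ς becomes sqrt(ς² + λ²), so small values are raised to about λ and large ones are only slightly increased. The matrix below is random and purely illustrative.

```python
import numpy as np


def tikhonov_singular_values(sensitivity, lam):
    """Singular values of the augmented matrix [S; lam * I] (assumes L = I)."""
    s = np.asarray(sensitivity, dtype=float)
    n_theta = s.shape[1]
    augmented = np.vstack([s, lam * np.eye(n_theta)])
    return np.linalg.svd(augmented, compute_uv=False)


# Hypothetical, badly scaled sensitivity matrix (40 measurements, 6 parameters).
rng = np.random.default_rng(0)
column_scales = np.array([300.0, 100.0, 30.0, 10.0, 1.0, 0.1])
S_demo = rng.normal(size=(40, 6)) * column_scales

LAMBDA = 50.0
original_svs = np.linalg.svd(S_demo, compute_uv=False)
regularized_svs = tikhonov_singular_values(S_demo, LAMBDA)
```

Under this assumption the behavior described above follows directly: no regularized singular value is smaller than λ, and every singular value is increased, the large ones by a small relative amount.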
A final remark similar to the one made for the selection of the regularization parameter for SsS
in Section 8.3.5 applies here. A large λ strongly regularizes
the parameter estimation because many (or possibly all) singular values may lie below
this value. In that case, all of them are replaced by approximately the value of λ,
the problem becomes biased and the parameter estimates are fixed to the initial guess,
as in Reg=SsS. This is expected considering that subset selection is the
extreme case of Tikhonov regularization in which the a priori parameter information is reliable,
i.e., has a small parameter variance. On the other hand, a small λ has the same effect as a small ϵ
in Reg=SsS. Consider the Tikhonov cost function in Eq. 3.10: if λ is small, the weight
of the penalization term is also small and the original least-squares criterion guides the
estimation, bringing along all its structural problems. The regularization is then weak or even
nonexistent.
Figure 8.13.: Online Case Tikh (Reg=Tikh) for λ regularization parameter selection: Singular value spectrum $SVs^{Tikh}_{t_k}$ and $SVs_{t_k}$ of the regularized and original sensitivity matrices $S_k^{-,Tikh}$ and $S_k^{-}$ computed at each sampling time $t_k$ = 5, 10, 15, 20, 25, 30 after solving parameter estimations by using Tikhonov as regularization for regularization parameter λ = 50. Notice that singular values $\varsigma_i \leq \lambda$ are approximated to values around λ. (Axes: singular value index $i$ = 1 to 6 vs. singular value $\varsigma_i$ from 0 to 350; series: regularized S and original S at each $t_k$; the level λ = 50 is marked.)
8.3.6. Online Redesign of Experiments
The results of the optimal input design are presented for FS-1 and FS-2 in Fig. 8.14 and
8.15, respectively.
The initial parameter guesses were set to one, which meant relatively large deviations
from the true parameter values (compare with the values in Table 8.2). It can be seen that
during the first 10 - 15 min of the experiment run, relatively large deviations from the
finally estimated values existed (see Figs. 8.14e and 8.15e). Moreover, the available
measurements did not provide sufficient information for the identifiability of the whole
parameter space (see Figs. 8.14d and 8.15d).
However, the Tikhonov regularization was able to realize a relatively smooth transition
to the final parameter values, indicating robustness and stable convergence. The same
applies to the repeated solution of the ED problem, where only the identifiable parameter
subset was considered in the problem formulation.
Figure 8.14.: D-optimal adaptive input design for feeding strategy FS-1. Subfigures a, b show the measured sum and predicted individual outlet concentrations. Subfigure c shows the input design, i.e. the inlet concentrations. Subfigure d shows the results of the identifiability analysis for the parameters θ1, · · · , θ6. If a parameter was identifiable and selected by the subset selection (SsS) algorithm, this parameter was active and its activity was indicated by a dot. If a dot was missing, the parameter was not active and not identifiable. Subfigure e shows the course of the parameter estimates. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)
8.3.7. Validation of the parameter estimates by Frontal Analysis
The presented determination of adsorption isotherm parameters by parameter estimation
is usually referred to as the inverse or indirect method. In the inverse method (see Section
3.1) model predictions are fitted to experimental data for the determination of the parameters
of a preselected isotherm model. An alternative, well-established method is the so-called
Frontal Analysis (FA), which is a direct chromatographic method for the determination of
adsorption isotherms [47]. In the following, FA is used to validate the estimated Langmuir
isotherm model parameters.
In contrast to the inverse method, FA is model independent, i.e. the shape of the
determined isotherm is not predefined by a selected isotherm model structure. However, for
the determination of multi-component adsorption isotherms (especially for more than two
components), FA also requires a large amount of experimental data. To limit the
corresponding laboratory work, the validation is carried out for single component adsorption
only, i.e. the competitive isotherm behavior is not considered.
Figure 8.15.: D-optimal adaptive input design for feeding strategy FS-2. Subfigures a, b show the measured sum and predicted individual outlet concentrations. Subfigure c shows the input design, i.e. the inlet concentrations. Subfigure d shows the results of the identifiability analysis for the parameters θ1, ..., θ6. If a parameter was identifiable and selected by the subset selection (SsS) algorithm, this parameter was active and its activity was indicated by a dot. If a dot was missing, the parameter was not active and not identifiable. Subfigure e shows the course of the parameter estimates. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)
In FA, a solution of the studied component, at a known, constant concentration, is
percolated through the column. Successive step changes of increasing concentration are
performed at the column inlet and the breakthrough curves (transient concentration pro-
files) at the column outlet are determined. For each new inlet concentration the mass of
the solute adsorbed at equilibrium is determined from the integral of the breakthrough
curve [47]. In doing so, the adsorption isotherms are directly determined from the available
measurement data, a mass balance equation and geometric data of the column.
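The mass balance behind FA can be sketched numerically. The following Python snippet is only an illustration under simplifying assumptions, not the procedure used in the thesis: it considers a single step from a solute-free column to a constant inlet concentration, a constant volumetric flow rate, and a known liquid hold-up volume and adsorbent volume. All names (adsorbed_loading, v_holdup, v_adsorbent) and the synthetic breakthrough curve are hypothetical.

```python
import math

def trapezoid(y, x):
    """Trapezoidal rule for samples y taken at increasing positions x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))

def adsorbed_loading(t, c_out, c_in, flow_rate, v_holdup, v_adsorbent):
    """Equilibrium loading q for one concentration step from 0 to c_in.

    The solute that entered but did not leave the column, minus the solute
    stored in the liquid hold-up volume, is attributed to the adsorbent.
    """
    deficit = [c_in - c for c in c_out]            # inlet minus outlet concentration
    mass_retained = flow_rate * trapezoid(deficit, t)
    mass_in_liquid = c_in * v_holdup
    return (mass_retained - mass_in_liquid) / v_adsorbent

# Synthetic sigmoidal breakthrough curve (hypothetical data, arbitrary units).
t = [0.1 * k for k in range(401)]                  # time grid from 0 to 40
c_in = 2.0
c_out = [c_in / (1.0 + math.exp(-(tk - 15.0) / 0.5)) for tk in t]

q = adsorbed_loading(t, c_out, c_in, flow_rate=1.0, v_holdup=3.0, v_adsorbent=1.5)
print(f"loading q = {q:.3f} at liquid concentration c = {c_in}")
```

Repeating this calculation for each successive concentration step would give one equilibrium point (c, q) per step, i.e. the kind of points FA contributes to the isotherm plot.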
The results of the validation are shown in Fig. 8.16 for both feeding strategies, FS-1
and FS-2. It can be seen that butyl benzoate shows the strongest adsorption, while
propyl and methyl benzoate are adsorbed more weakly. Moreover, the model predictions for
the parameter estimates obtained using D-optimal designs are close to the calculated
equilibrium points from FA. Larger deviations exist for the parameter estimates obtained
from heuristic designs. These results are confirmed by the analysis of the accuracy of the
estimated parameter values, i.e. their standard deviations, for different input designs in
Table 8.3.
Figure 8.16.: Validation of the parameter estimates for single component adsorption. Adsorption isotherms obtained by FA are shown by calculated equilibrium points. Predicted adsorption isotherms using the Langmuir model are shown by lines. Predictions are made using parameter estimates from D-optimal designs and standard input designs (uniform and pulse). (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)
8.3.8. Optimal input designs vs standard input designs
In the following, the optimal input designs are compared to standard input designs in
terms of the achieved parameter precision (see Section 2.5.1). The standard input designs
were generated for FS-1 and FS-2 using the same control grid and concentration range as
for the optimal input design. The following standard designs were considered:
• a sum of sinusoids (sine),
• rectangular pulses (pulse),
• uniformly distributed random signals (uniform).
Table 8.3 shows the results for the D-optimal criterion and the corresponding standard
deviations of the individual parameter values.
As expected, compared to most standard input designs, the D-optimal experiments
can significantly improve the parameter precision. Moreover, the individual feeding of
concentrations (FS-2) provides more informative measurement data than the feeding with
constant and equal concentration ratio. Comparing the results for the D-optimal design
for FS-2 with the widely used standard pulse experiments for FS-1, a reduction by a factor
of about five in the D-optimal criterion is achieved, i.e. from 0.0005713 to 0.0001100.
Figures 8.17 and 8.18 show the results for different input designs for FS-2.
Table 8.3.: Parameter accuracy for experimental data obtained from different input designs and feeding strategies FS-1 and FS-2. Input designs marked by a star were realized experimentally, all others are 'in silico' experiments. (Table taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)
Input design          D-optimal    Standard deviation
                      criterion    θ1      θ2      θ3      θ4      θ5      θ6
FS-1: sine            0.0009295    0.0732  0.0539  0.0272  0.0911  0.1088  0.0516
FS-1: pulse a*        0.0005713    0.0463  0.0266  0.0349  0.0650  0.0647  0.0804
FS-1: pulse b*        0.0005221    0.0342  0.0457  0.0262  0.0785  0.0648  0.0636
FS-1: uniform a*      0.0003479    0.0431  0.0316  0.0174  0.0470  0.0585  0.0349
FS-1: uniform b*      0.0003476    0.0313  0.0171  0.0426  0.0589  0.0330  0.0491
FS-1: D-optimal a*    0.0001732    0.0283  0.0217  0.0130  0.0285  0.0396  0.0264
FS-1: D-optimal b*    0.0001638    0.0133  0.0245  0.0186  0.0235  0.0323  0.0370
FS-2: sine            0.0003988    0.0496  0.0340  0.0228  0.0363  0.0353  0.0274
FS-2: uniform*        0.0002389    0.0182  0.0283  0.0356  0.0203  0.0269  0.0301
FS-2: D-optimal*      0.0001100    0.0119  0.0183  0.0238  0.0134  0.0156  0.0187
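The quoted reduction factor can be checked directly from the D-optimal criterion column of Table 8.3. The short Python sketch below only redoes this arithmetic; the dictionary keys are shortened labels chosen here for illustration.

```python
# D-optimal criterion values copied from Table 8.3 (smaller is better).
d_criterion = {
    "FS-1: sine": 0.0009295,
    "FS-1: pulse a": 0.0005713,
    "FS-1: pulse b": 0.0005221,
    "FS-1: uniform a": 0.0003479,
    "FS-1: uniform b": 0.0003476,
    "FS-1: D-optimal a": 0.0001732,
    "FS-1: D-optimal b": 0.0001638,
    "FS-2: sine": 0.0003988,
    "FS-2: uniform": 0.0002389,
    "FS-2: D-optimal": 0.0001100,
}

# Ratio of every design to the best (smallest) criterion value.
best = min(d_criterion.values())
ratios = {name: value / best for name, value in d_criterion.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:20s} {ratio:5.2f} x best")
```

The entry for the first FS-1 pulse experiment is the ratio referred to in the text as a factor of about five.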
Figure 8.17.: Input design (subfigure c) and outlet concentrations (subfigures a, b) for feeding strategy FS-2 and a standard input design generated by a sum of sinusoids; 'in silico' experiment. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)
Figure 8.18.: Input design (subfigure c) and outlet concentrations (subfigures a, b) for feeding strategy FS-2 and a standard input design generated by uniform sampling; real experiment. (Figure taken from publication IV - Barz et al. (2016) in Appendix A.2 - reprinted from Computers & Chemical Engineering with permission from Elsevier)
8.4. Conclusions
This chapter presented the numerical results of the experimental realization of an
adaptive optimal input design for parameter determination in liquid chromatography.
Competitive Langmuir isotherm parameters of a three-component mixture were identified
by fitting model predictions to measured concentrations at the column outlet for a given
experiment.
The accurate and precise determination of the parameters was secured by treating
instabilities (ill-conditioning issues) through regularized online parameter
estimations and optimal experiment designs. The quality of the estimated parameters was
validated by comparing the predicted isotherms with results from Frontal Analysis (FA).
In liquid chromatography, the optimal adaptive experimental input design proved to be
significantly more efficient regarding the experimental effort than other established
methods, see e.g. Frontal Analysis [77]. This is especially true if mixtures with more
than two components need to be analyzed. Moreover, for the presented case study, all
adsorption parameters could be identified using a non-selective concentration UV detector
which only provided the sum concentration of all components. This is a big advantage,
as conventional established methods require individual concentration measurements,
which are normally not available, see e.g. [77].
It was also discussed that numerical regularization is crucial in order to stabilize the
parameter estimation, i.e. to minimize the variations in the parameter estimates during
the online estimation, and to compute meaningful experiment design criteria. This is
especially true at the beginning of an experiment, where measurement information is scarce
and the sensitivity matrix is ill-conditioned. However, a careful tuning of the corresponding
regularization parameters is required. In this direction, some guidelines to select the
regularization parameters for subset selection and Tikhonov were given.
Although the regularization parameters are application dependent, it was shown here that
the analysis of the singular value spectrum (specifically the largest and smallest singular
values) assists in selecting these parameters. Moreover, the effect of weak and strong
regularizations in the context of online parameter determination by using subset selection
and Tikhonov was addressed. It was also shown that there is not just a single regularization
parameter value that adequately removes the ill-conditioning; rather, there exists
an interval in which suitable values can be found.
In this case study, the Tikhonov regularization outperformed the regularization based on
parameter subset selection in terms of the smoothness, stability and convergence of the
parameter estimates during the online estimation. That is because Tikhonov regularization
is based on a smooth weighting between the initial parameter guess and the best available
estimate (result from the last iteration) as well as between identifiable and
non-identifiable parameters. In contrast, subset selection completely excludes
non-identifiable parameters from the problem and thus these parameters cannot be improved.
Accordingly, if the initial guess of these non-identifiable parameters is far away from the
true value and also has a large parameter variance, it may contribute more strongly to a
meaningless experimental redesign for the remaining active parameters (although for
D-design that fact is not so crucial [78]). However, parameter selection
methods such as subset selection are attractive from a practical viewpoint, as they provide
useful information for the monitoring of adaptive design strategies, such as the importance
of individual parameters and the correlations between them.
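The difference between the two regularizations can be illustrated on a small synthetic linear problem. The Python sketch below is not the implementation used in the thesis: the sensitivity matrix, the prior guess theta_ref, the regularization parameter lam and the identity weighting matrix L are all assumptions made for illustration. Two columns are nearly collinear, so the second parameter is practically unidentifiable.

```python
import numpy as np

# Hypothetical linear(ized) problem: columns 0 and 1 are nearly collinear.
t = np.linspace(0.0, 1.0, 20)
S = np.column_stack([t, t + 1e-4 * np.sin(7 * t), np.ones_like(t)])
theta_true = np.array([1.0, 2.0, 0.5])
rng = np.random.default_rng(0)
y = S @ theta_true + 1e-3 * rng.standard_normal(t.size)

theta_ref = np.array([0.8, 1.0, 1.0])   # prior guess (assumed)
lam = 1e-2                               # regularization parameter (assumed)
L = np.eye(3)                            # weighting matrix (assumed identity)

# Tikhonov: minimize ||y - S theta||^2 + lam^2 ||L (theta - theta_ref)||^2,
# written as an ordinary least-squares problem on an augmented system.
A = np.vstack([S, lam * L])
b = np.concatenate([y, lam * (L @ theta_ref)])
theta_tik = np.linalg.lstsq(A, b, rcond=None)[0]

# Subset selection: parameter 1 is frozen at its guess, the others are re-fitted.
active = [0, 2]
theta_sub = theta_ref.copy()
rhs = y - S[:, 1] * theta_ref[1]
theta_sub[active] = np.linalg.lstsq(S[:, active], rhs, rcond=None)[0]

print("true     :", theta_true)
print("Tikhonov :", np.round(theta_tik, 3))
print("subset   :", np.round(theta_sub, 3))
```

In this constructed example the frozen parameter stays at its poor guess, and the correlated active parameter absorbs the resulting error, whereas the Tikhonov solution distributes the correction over both parameters. The numbers are specific to this toy problem and say nothing quantitative about the chromatography case study.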
It is important to highlight that the application of regularization only alleviates
the ill-conditioning of the estimation numerically. Parameter identifiability itself is
only improved by the optimization-based experimental redesign (optimal experimental
design), the collection of new informative experimental data and a proper selection of the
initial guess, provided the model structure is reliable. Nevertheless, without regularization
the optimal experimental design could yield inefficient designs, which also affects the
identifiability.
9. Summary and Outlook
9.1. Summary
The development of high-quality and validated models of process systems is justified not
only by the research interest in increasing the knowledge about the process, but mainly by
the economic benefits generated by model-based product and process design, simulation,
control and optimization. Physical, chemical or biological laws are used to build these
much desired mechanistic models, whose task is to represent the investigated process.
These laws largely determine the model structure, which
invariably contains unknown parameters to be determined.
In the identification of these models, a parameter estimation problem is solved using, e.g.,
maximum likelihood or least-squares estimation. The parameter estimation is an
inverse problem in which unknown parameters have to be inferred from measurement data.
Lack of informative experimental data, highly noisy and correlated measurements,
inadequate initial guesses, and correlated and insensitive parameters, among others, are
common challenges that hamper an accurate model identification and lead to ill-conditioning
and identifiability issues. When that happens, the estimation becomes unstable, possibly
with convergence and numerical problems. In that case, the problem is said to be ill-posed
due to the presence of ill-conditioned matrices.
Although ill-conditioning analysis and identifiability diagnosis have been widely but
separately discussed in the literature, they had not yet been considered as required
components of the same framework for process model development. Their application had
rather been a matter of special cases, situations or complicated models. However, it was
shown throughout the thesis that ill-posed problems arising from ill-conditioned matrices
are not special cases but can be a normal situation at any stage of the model development.
In this thesis, a consolidated computational framework to systematically evaluate
ill-posedness in model-based parameter estimation and experimental design was proposed
and successfully tested. It included the ill-conditioning and identifiability analysis as well
as the implementation of numerical regularization. The computational framework also
considered two major paradigms to evaluate the quality of parameter estimates, namely
the Sensitivity and Monte Carlo methods. The framework may be used as a whole
or in parts, to work either on parameter estimation or on optimal experimental design
for a previously selected model structure. It is also conceived to treat adaptive designs
and online parameterizations. Moreover, it should be noted that the framework can
also be applied to well-posed problems. In fact, it can be applied to general problems
regardless of their ill-posedness state. The framework could easily be used to determine
whether other sources of experimental information (including type of measurements,
measurement error, sampling frequency, etc.) indeed improve the conditioning and
identifiability of a process model. This situation was evidenced in the case study of the
Lithium battery.
The application of the framework proved to work efficiently in detecting and dealing
with strongly over-parameterized models (e.g., the case study of Bioethanol). After
successful termination, the number of considered parameters was reduced to a relatively
small subset of the original parameter space in order to regularize the ill-posed problem.
Thus, the most influential parameters for the selected operating conditions were identified
and their uncertainty was significantly decreased.
Ill-conditioning analysis to diagnose identifiability problems. Throughout this thesis,
the necessity of formulating well-posed problems for the numerical computation of stable
and unique solutions in model-based parameter estimation and experimental
design was presented. Thus, it is strongly advisable to perform the relatively simple local
analysis of the ill-conditioning of the sensitivity matrix. Moreover, if an ill-posed problem
is identified, its ill-posedness type (either rank deficient or of ill-determined rank) and
severity should be assessed. Important indicators suggested in this thesis are the singular
value spectrum, condition number, and collinearity index of the sensitivity matrix.
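The three indicators can be computed from a sensitivity matrix in a few lines. The following Python sketch is a minimal illustration: the function name and the test matrices are hypothetical, and the collinearity index is taken in its common definition as the reciprocal of the smallest singular value of the column-normalized sensitivity matrix, which may differ in scaling from the definition used in the thesis.

```python
import numpy as np

def ill_conditioning_indicators(S):
    """Return singular values, condition number and collinearity index of S."""
    sigma = np.linalg.svd(S, compute_uv=False)      # sorted in descending order
    cond = sigma[0] / sigma[-1]
    # Scale every column to unit Euclidean length before measuring collinearity.
    S_norm = S / np.linalg.norm(S, axis=0)
    gamma = 1.0 / np.linalg.svd(S_norm, compute_uv=False)[-1]
    return sigma, cond, gamma

# Well-conditioned reference: orthonormal columns.
S_good = np.eye(4)[:, :3]
sigma_good, cond_good, gamma_good = ill_conditioning_indicators(S_good)

# Ill-conditioned example: two nearly collinear columns (synthetic).
t = np.linspace(0.0, 1.0, 20)
S_bad = np.column_stack([t, t + 1e-4 * np.sin(7 * t), np.ones_like(t)])
sigma_bad, cond_bad, gamma_bad = ill_conditioning_indicators(S_bad)

print("good:", sigma_good, cond_good, gamma_good)
print("bad :", sigma_bad, cond_bad, gamma_bad)
```

A gap in the singular value spectrum points towards a rank-deficient problem, while a gradual decay towards zero points towards ill-determined rank; which thresholds count as "large" for the condition number or collinearity index is problem dependent.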
In this thesis, the connection between the eigenvalues of the Fisher-information matrix
and the parameter covariance matrix and the singular values of the sensitivity matrix was
exploited to identify the presence and sources of ill-conditioning. The rationale was that,
if certain parameters were unidentifiable, some columns of the sensitivity matrix were
linearly dependent, which implied that the Fisher and parameter covariance matrices were
singular or nearly singular [31, 60, 78, 120]. Columns of the sensitivity matrix that are
nearly linearly dependent are a typical source of poor estimator performance [12, 78]. Thus,
computing the numerical rank of the sensitivity matrix indicates whether the sensitivity
matrix is ill-conditioned. By using orthogonal decompositions (i.e., SVD and QRP
decomposition), the influence of ill-conditioned singular values on parameter variances can
be analyzed. These procedures ultimately result in parameter rankings that are used to
determine unidentifiable parameters under different sources of experimental information.
These rankings can be used to provide recommendations for regularization and design of
priors [78].
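The connection referred to above can be written compactly. The following relations are a sketch under two assumptions that are made here for illustration: the sensitivity matrix S is already scaled by the measurement standard deviations (unit-variance noise), and it has full column rank so that the inverse exists. The symbols U, V, F, C and v_{ji} are notation chosen for this sketch.

```latex
\begin{align}
S &= U \Sigma V^{\top}, &
\Sigma &= \operatorname{diag}(\varsigma_1, \dots, \varsigma_{N_\theta}), \\
F &= S^{\top} S = V \Sigma^{2} V^{\top}
  &&\Rightarrow\quad \operatorname{eig}_i(F) = \varsigma_i^{2}, \\
C &\approx F^{-1} = V \Sigma^{-2} V^{\top}
  &&\Rightarrow\quad \operatorname{eig}_i(C) = \varsigma_i^{-2}, \\
\operatorname{Var}(\theta_j) &\approx \sum_{i=1}^{N_\theta} \frac{v_{ji}^{2}}{\varsigma_i^{2}} .
\end{align}
```

The last line makes explicit why a single small singular value inflates the variance of every parameter j whose entry v_{ji} in the corresponding right singular vector is not negligible.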
Sensitivity method vs Monte Carlo method. It was demonstrated here that, in the presence
of ill-posedness, local identifiability methods based on sensitivities (i.e., SVD, variance
and QR methods) can detect not only the presence of identifiability problems but also the
number and identity of the unidentifiable parameters. This provides an advantage over the
more rigorous but also more expensive Monte Carlo method. However, the computed variances,
and therefore the confidence intervals/regions, are not reliable in the sensitivity method
because the variances are inflated by the presence of ill-conditioned singular values of the
sensitivity matrix. For a better quantification of the parameter uncertainty, the Monte
Carlo method is highly recommended.
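The two paradigms can be contrasted on a toy problem. The Python sketch below uses a hypothetical straight-line model with assumed noise level and sample count; it is a well-posed linear case, chosen so that both methods can be executed in a few lines, and is not one of the case studies of the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 25)
S = np.column_stack([np.ones_like(t), t])       # sensitivities of y = th0 + th1 * t
theta_true = np.array([2.0, -0.3])
noise_std = 0.1                                  # assumed measurement noise

# Sensitivity (linearized) method: covariance = noise_std^2 * (S^T S)^-1
cov_lin = noise_std**2 * np.linalg.inv(S.T @ S)
std_lin = np.sqrt(np.diag(cov_lin))

# Monte Carlo method: repeat the estimation for many synthetic noise realizations.
n_runs = 2000
estimates = np.empty((n_runs, 2))
for k in range(n_runs):
    y = S @ theta_true + noise_std * rng.standard_normal(t.size)
    estimates[k] = np.linalg.lstsq(S, y, rcond=None)[0]
std_mc = estimates.std(axis=0, ddof=1)

print("std (sensitivity):", std_lin)
print("std (Monte Carlo):", std_mc)
```

For this well-posed linear example the two standard deviations agree closely. The statement in the text concerns ill-posed problems, where the linearized variances become unreliable and the Monte Carlo result is the more trustworthy one, at the price of many repeated estimations.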
Influence of ill-posed problems in optimal experimental design. For optimal experimental
designs derived from ill-posed parameter estimations, it is highly recommended to base
the numerical analysis and implementation (computation of the alphabetic criteria)
on the singular values of the sensitivity matrix. The direct application of the alphabetic
design criteria based on the eigensystem of the Fisher-information/covariance matrices may
lead to numerical instabilities and meaningless designs for the next parameter estimation.
Although the computation of the singular values is numerically more stable than that of the
eigenvalues, also for ill-posed problems, small singular values of the sensitivity matrix
(especially those near zero) will have a large influence on the design criteria. They can
produce huge criterion values (especially for A- and E-designs), which then complicate
or even impede an appropriate optimization. Thus, a further action to control the smallest
singular values of the sensitivity matrix (e.g., the regularization of the sensitivity matrix),
which directly control the most uncertain parameters, is needed.
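A possible way to evaluate the alphabetic criteria from singular values is sketched below in Python. The formulation is an assumption made for illustration: the criteria are written in terms of the parameter covariance (to be minimized), unit measurement variance is assumed, and the D-criterion is returned as a log-determinant. The thesis may use a different normalization of the criteria.

```python
import numpy as np

def alphabetic_criteria(S):
    """A-, D- and E-criteria of the covariance (S^T S)^-1 from the SVD of S."""
    sigma = np.linalg.svd(S, compute_uv=False)
    inv_sq = 1.0 / sigma**2                  # eigenvalues of the covariance matrix
    return {
        "A": float(inv_sq.sum()),                     # trace of the covariance
        "logD": float(-2.0 * np.log(sigma).sum()),    # log-determinant of the covariance
        "E": float(inv_sq.max()),                     # largest covariance eigenvalue
    }

# Small well-conditioned example with singular values 2 and 0.5.
S = np.array([[2.0, 0.0],
              [0.0, 0.5],
              [0.0, 0.0]])
crit = alphabetic_criteria(S)
print(crit)

# The same example with a nearly vanishing singular value: A and E explode.
S_ill = S.copy()
S_ill[1, 1] = 1e-6
crit_ill = alphabetic_criteria(S_ill)
print(crit_ill)
```

The second call shows the effect described in the text: a singular value close to zero dominates the A- and E-criteria through the term 1/ς², while the matrix S^T S is never formed or inverted explicitly.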
Additionally, a graphical interpretation of the influence of the alphabetic experimental
design criteria on the singular value spectrum was presented. It turned out
that the A- and E-optimal criteria mainly improved (increased) the smallest singular values
of the sensitivity matrix, while the D-optimal criterion improved the largest singular values.
Thus, the potential of an experimental design to improve the parameter precision and
the ill-conditioning could be analyzed in a similar way to the well-known graphical
interpretation of the influence of the alphabetic design criteria on the parameter variances.
It is important to point out that during this research it was possible to establish that an
increment in the singular value spectrum (promoted by the optimal experimental design)
does not automatically mean an improvement in the ill-posedness of the problem, despite
the parameter variance reduction. On the contrary, computed optimal designs might
generate singular value spectra with large values in the top section, which may also lead
to large condition numbers if an adequate increase in the bottom section is not achieved.
In that case, the next parameter estimation will again be ill-conditioned and its estimates
unstable.
Numerical regularization. In this regard, the thesis showed that each regularization
technique modified the problem to improve (reduce) its ill-conditioning. This
modification was directly evidenced in the singular value spectrum (specifically in the zone
of the ill-conditioned singular values) of the sensitivity matrix of the regularized problem.
In the regularization techniques treated here, the ill-conditioned singular values were
controlled either by elimination or by transformation. In the identifiable subset selection,
the problem was transformed by excluding the unidentifiable parameters and thereby the
related ill-conditioned singular values. In truncated singular value decomposition (TSVD),
the ill-conditioned singular values were substituted by zero and the new problem only
considered the largest singular values of the original problem. Finally, the effect of Tikhonov
regularization on the singular value spectrum was to fix the ill-conditioned singular values
to the magnitude of the squared regularization parameter (λ²), whilst the large singular
values were not modified significantly.
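The qualitative effect of TSVD and Tikhonov regularization on a singular value spectrum can be reproduced with a few lines of Python. The sketch below uses a hypothetical spectrum, a hypothetical TSVD threshold eps and a hypothetical penalty weight alpha with identity weighting; alpha simply denotes the weight in the augmented matrix [S; alpha*I], and how it maps onto the regularization parameter λ of the thesis depends on how the penalty term is scaled there.

```python
import numpy as np

sigma = np.array([1e1, 1e0, 1e-3, 1e-6])   # hypothetical singular values of S
eps = 1e-2                                  # TSVD threshold (assumed)
alpha = 1e-2                                # Tikhonov penalty weight (assumed)

# TSVD: singular values below the threshold are replaced by zero.
sigma_tsvd = np.where(sigma >= eps, sigma, 0.0)

# Tikhonov with identity weighting: the augmented matrix [S; alpha * I]
# has singular values sqrt(sigma_i^2 + alpha^2).
sigma_tik = np.sqrt(sigma**2 + alpha**2)

# Numerical check of the Tikhonov statement on an explicit matrix.
S = np.diag(sigma)
S_aug = np.vstack([S, alpha * np.eye(sigma.size)])
sigma_aug = np.linalg.svd(S_aug, compute_uv=False)

print("original:", sigma)
print("TSVD    :", sigma_tsvd)
print("Tikhonov:", sigma_tik)
```

In this scaling the two large singular values remain practically unchanged, the two small ones are lifted to roughly the level of the penalty weight, and the number of singular values is preserved; TSVD instead removes the small ones from the problem.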
Subset selection and TSVD tend to conserve the largest singular
values, which are associated with the well-conditioned parameters. From an application
point of view, subset selection seems the most natural approach, as the regularization
acts in the original parameter space. It preserves the physical meaning of the parameters
and provides useful information on the number of identifiable parameters as well as on
the ranking of parameters regarding their linear independence and sensitivity. This is
especially attractive from a practical viewpoint, as it supports the
monitoring of adaptive design strategies, e.g. by showing the importance of individual
parameters and the linear dependence between them.
However, it has to be noted that a change in the dimension of the identifiable parameter
subset (in subset selection) or in the number of well-conditioned singular values (in TSVD)
introduces a discontinuity in the evaluation of the design criterion. This problem is similar
to the inherent non-differentiability in the definition of the E-optimal criterion, where a
possible switching in the smallest eigenvalue introduces this behavior.
It was shown that the Tikhonov regularization yielded an increment (lift) of the singular
value spectrum of the sensitivity matrix while keeping the original parameter space
dimension, and therefore the number of singular values, constant. The lift of the singular
value spectrum transformed the bottom section of the spectrum
by fixing the small singular values to λ². Accordingly, singular values smaller than the
respective regularization parameter are replaced by λ².
In terms of the optimal experimental design under regularization, the regularized
optimal design could sometimes improve the ill-conditioning of the regularized
problem, but it rarely reduced the ill-conditioning of the original one. Moreover, although
the regularizations ensured that a solution was obtained, they did not provide information
on whether the obtained solution was still useful in the original context. Therefore, an
ill-conditioning analysis after conducting the optimal design of ill-posed problems is highly
recommended.
When applying Tikhonov regularization, a computed E-optimal design was not viable if
the new design did not promote a singular value spectrum with the smallest singular value
larger than λ². In that case, the regularization parameter continued fixing the small
singular values (i.e., ς_Nθ ≈ λ²) and made an optimization impossible, since the
E-criterion value was fixed to 1/(λ²)² in each iteration.
In this thesis, some guidelines to select the regularization parameters for subset selection
and TSVD (i.e., the ϵ-threshold) and for Tikhonov (i.e., λ) were given. Although these
parameters are application-dependent, it was shown here that the analysis of the singular
value spectrum of the sensitivity matrix can assist in their selection. For instance, the
regularization parameters are bounded by the largest and smallest singular values.
Regularization parameters larger than the largest singular value regularize the problem
completely, leaving it without degrees of freedom, whereas values smaller than the smallest
singular value yield a regularization-free problem. Therefore, the regularization parameter
candidates should be taken from the interval spanned by these singular values.
A further hint for selecting the regularization parameter was that a proper regularization
parameter should be as large as the first singular value considered to cause the
ill-conditioning. Accordingly, there was not just one nominal regularization parameter value
that adequately removed the ill-conditioning; instead, there was an interval in which
appropriate values could be found.
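These guidelines can be turned into a small search procedure. The Python sketch below is one possible reading, under assumptions made for illustration only: the spectrum is hypothetical, the threshold is applied as in subset selection or TSVD (singular values above the threshold are kept), and the maximum acceptable condition number kappa_max is a hypothetical tolerance that the thesis does not prescribe.

```python
import numpy as np

sigma = np.array([3.2e1, 4.1e0, 2.5e-2, 7.0e-5])   # hypothetical spectrum, descending

def numerical_rank(sigma, eps):
    """Number of singular values kept for a given threshold eps."""
    return int(np.sum(sigma > eps))

# Candidates are taken from the interval spanned by the extreme singular values.
candidates = np.logspace(np.log10(sigma[-1]), np.log10(sigma[0]), 7)
for eps in candidates:
    print(f"eps = {eps:9.2e} -> retained singular values: {numerical_rank(sigma, eps)}")

# Outside the interval the choice is degenerate.
rank_above = numerical_rank(sigma, 2.0 * sigma[0])    # everything is regularized away
rank_below = numerical_rank(sigma, 0.5 * sigma[-1])   # nothing is regularized

# Hint from the text: start at the first singular value that causes ill-conditioning,
# here identified through an assumed maximum acceptable condition number.
kappa_max = 1e4
offending = np.flatnonzero(sigma[0] / sigma > kappa_max)
eps_start = sigma[offending[0]] if offending.size else None
print("suggested starting value:", eps_start)
```

Any threshold between the suggested starting value and the next larger singular value retains the same set of singular values, which illustrates why an interval of suitable values exists rather than a single one.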
In the case of online estimation, the Tikhonov regularization outperformed the parameter
subset selection in terms of smoothness when the parameter estimates were updated
and the re-designs were adapted. This behavior was expected not only in online approaches
but also in offline approaches when prior information is available. That was because
Tikhonov regularization by definition smoothly weights a priori information (initial
parameter guess) and the best available estimate (result from the last iteration) in the
penalty term. In doing so, the unstable (unidentifiable) parameters are gently controlled
(depending on the regularization parameter value and the L matrix) but not fixed/excluded
in the parameter estimation.
In contrast, subset selection completely fixed/excluded the unidentifiable parameters in
the estimation and thus these parameters could not be improved. This was especially
concerning in early estimations, when large errors in the estimated parameters are usual.
If the unidentifiable parameters are kept at very wrong values, this can introduce a
possibly large bias in the estimates of the identifiable parameters when parameter
correlations exist. The same can occur with the Tikhonov regularization; nevertheless, this
technique is less sensitive to it due to the mentioned smooth weighting in its formulation.
The application of regularization only alleviated the ill-conditioning of the estimation
numerically. Parameter identifiability itself (when the model structure is
reliable) was only improved by the collection of sufficiently informative and precise
experimental data, a proper selection of initial guesses and appropriate optimal
experimental designs. Nevertheless, without regularization the available experimental
information could not be satisfactorily exploited in the parameter estimation, and the
optimal experimental design could yield inefficient experimental conditions, which also
affects the parameter identifiability.
9.2. Outlook
Implement the computational framework to systematically evaluate ill-posedness
in model-based parameter estimation and experimental design in a general and visible
simulation/optimization platform such as MOSAIC (a modeling environment based on the
internet standards XML and MathML developed at the chair of process dynamics and
operation, Technische Universitaet Berlin) [72]. This would expand the usability of the
framework, allowing more users to independently apply the consolidated framework to their
own tasks of process model development.
The selection of the regularization parameter is still a heuristic task. Although
some guidelines to start the search and conduct the selection were presented here, a more
systematic approach is still missing to reduce the time and effort of obtaining this
parameter. An in-depth study of the information given by the parameter variance
decomposition, along with the analysis of the singular value spectrum of the sensitivity
matrix, could provide better insight into suitable values of this parameter. In the case of
Tikhonov regularization, the inclusion of the regularization parameter in the optimal
experimental design as a new decision variable (see Ref. 4 for the linear case) is a strategy
that remains to be tested.
In some case studies of this thesis (see for instance Chapter 7), it was assumed that no
a priori information on the parameters (i.e., nominal values and variances) was available
for the Tikhonov regularization. For instance, the predefined vector θR was set to zero and
no prior individual parameter variances were considered to be known in the estimation.
Accordingly, the benefit of including a priori information in the Tikhonov scheme
could still be exploited, as was done in the online estimation and redesign of
Chapter 8. That could be done by including either the best available information on the
parameters in θR, or the known parameter precision in the matrix L as the inverse of the
parameter covariance, or both simultaneously. In that sense, the formulation
of the Bayesian approach [105] for parameter estimation could also be considered.
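One way to write such a prior-informed Tikhonov problem is sketched below. The notation is an assumption made for illustration and is not taken from the thesis: y denotes the measurements, f(θ) the model predictions, Σ_y the measurement covariance and Σ_θ the prior parameter covariance; the scaling of the penalty by λ² may differ from the formulation used in the thesis.

```latex
\begin{equation}
\hat{\theta} = \arg\min_{\theta} \;
  \bigl(y - f(\theta)\bigr)^{\top} \Sigma_y^{-1} \bigl(y - f(\theta)\bigr)
  + \lambda^{2} \, (\theta - \theta_R)^{\top} L^{\top} L \, (\theta - \theta_R),
\qquad L^{\top} L = \Sigma_\theta^{-1}.
\end{equation}
```

With λ = 1, Gaussian measurement noise and a Gaussian prior with mean θR and covariance Σ_θ, this objective coincides with the maximum a posteriori estimate, which is the link to the Bayesian formulation mentioned in the text.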
Test the adapted experimental design criterion (i.e., Mean Squared Error, MSE) for
nonlinear ill-posed problems proposed in Ref. [49], which considers both the estimator
variance and its bias. This technique is computationally demanding as it leads to a
bi-level optimization problem, where at each design iteration the inverse problem must be
solved for quite a large number of training parameter vectors (as true values) and noise
realizations. The implementation of this technique requires special numerical methods and
large computation times; it is therefore not suitable for online applications. However, it
would still be interesting to compare its performance to that obtained with the alphabetic
design criteria, which are based only on the parameter variance.
Employ hybrid approaches for parameter estimation and optimal experimental design
by combining stochastic optimization techniques (e.g., simulated annealing and genetic
algorithms) [111] and gradient-based methods (e.g., Levenberg-Marquardt and interior
point algorithms). Initial guesses obtained from stochastic optimization could then be
provided to the gradient-based algorithm. In doing so, the benefit of the global search of
the stochastic techniques could be combined with the fast convergence of the gradient-based
methods.
Apply more global techniques to determine identifiable and active parameter subspaces
[32] in order to capture the variability of the outputs more reliably over the full range of
parameters. Those techniques are based on an augmented sensitivity matrix made of the
sensitivities at random parameter values sampled from a specific probability density.
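The idea of an augmented sensitivity matrix can be sketched as follows. The Python snippet is a minimal illustration and not the method of Ref. [32]: the exponential model, the uniform sampling density, the parameter bounds and the number of samples are all hypothetical choices, and the cited techniques may apply additional scaling or weighting.

```python
import numpy as np

def sensitivity(theta, t):
    """Analytic sensitivities of the hypothetical model y = th0 * exp(-th1 * t)."""
    e = np.exp(-theta[1] * t)
    return np.column_stack([e, -theta[0] * t * e])

rng = np.random.default_rng(2)
t = np.linspace(0.0, 5.0, 30)

# Random parameter vectors drawn from an assumed uniform density.
samples = rng.uniform(low=[0.5, 0.1], high=[2.0, 1.0], size=(50, 2))

# Stack the local sensitivity matrices of all samples into one augmented matrix.
S_aug = np.vstack([sensitivity(theta, t) for theta in samples])

# The singular value spectrum of the augmented matrix summarizes identifiability
# over the sampled parameter range instead of at a single nominal point.
sigma = np.linalg.svd(S_aug, compute_uv=False)
print("augmented shape:", S_aug.shape)
print("singular values:", sigma)
```

The same indicators used for the local analysis (spectrum, condition number, numerical rank) can then be evaluated on S_aug.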
Finally, for large-scale nonlinear models, an implementation tailored
to high-performance computing architectures [135] could be developed to accelerate analysis
times. For those cases, the use of automatic differentiation to construct the sensitivity
equations could also be tested.
A. Appendix
A.1. Own publications and presentations
A.1.1. Articles in Journals
1. T. Barz, D. C. Lopez C., M. N. Cruz B., S. Korkel, S. F. Walter. Real-time adaptive
input design for the determination of competitive adsorption isotherms in liquid
chromatography. Computers & Chemical Engineering, 94:104-116, 2016
2. T. Barz, C. Zauner, D. Lager, D. C. Lopez C., F. Hengstberger, M. N. Cruz B., K.
Marx. Experimental analysis and numerical modeling of a shell and tube heat storage
unit with phase change materials. Industrial & Engineering Chemistry Research,
55(29):8154-8164, 2016
3. D. C. Lopez C., G. Wozny, A. Flores-Tlacuahuac, R. Vasquez-Medrano, and V. M.
Zavala. A computational framework for identifiability and ill-conditioning analy-
sis of lithium-ion battery models. Industrial & Engineering Chemistry Research,
55(11):3026-3042, 2016
4. D. C. Lopez C., T. Barz, S. Korkel, and G. Wozny. Nonlinear ill-posed problem
analysis in model-based parameter estimation and experimental design. Computers
& Chemical Engineering, 77:24-42, 2015
5. D. Muller, E. Esche, D. C. Lopez C., and G. Wozny. An algorithm for the iden-
tification and estimation of relevant parameters for optimization under uncertainty.
Computers & Chemical Engineering, 71:94-103, 2014
6. N. Yakut, T. Barz, D. C. Lopez C., and G. Wozny. Online Redesign Technique for
Closed-loop System Identification. AIDIC Conference Series, 11:421-430, 2013
7. T. Barz, D. C. Lopez Cardenas, H. Arellano-Garcia, and G. Wozny. Experimental
evaluation of an approach to online redesign of experiments for parameter determi-
nation. AIChE Journal, 59:1981-1995, 2013
8. D. C. Lopez C., T. Barz, M. Penuela, A. Villegas, S. Ochoa, and G. Wozny. Model-
based identifiable parameter determination applied to a simultaneous saccharification
and fermentation process model for bio-ethanol production. Biotechnology Progress,
29(4):1064-1082, 2013
9. D. C. Lopez C., L. J. Hoyos, C. Mahecha, H. Arellano-Garcia, and G. Wozny. Opti-
mization Model of Crude Oil Distillation Units for Optimal Crude Oil Blending and
Operating Conditions. Industrial & Engineering Chemistry Research, 52(36):12993-
13005, 2013
A.1.2. Oral Presentations and Posters
(∗ presenting author) († with publication in Proceedings)
Oral Presentations
1. D. Muller∗ †, E. Esche, D. C. Lopez C., and G. Wozny. Systematic parameter
selection for optimization under uncertainty. 8th International Conference on Foun-
dations of Computer-Aided Process Design - FOCAPD 2014, Washington, USA,
2014
2. D. C. Lopez C.∗, T. Barz, and G. Wozny. Strategies for dealing with ill-posed prob-
lems in on-line optimum experiment design. AIChE Annual Meeting, San Francisco,
USA, 2013
3. N. Yakut†, D. C. Lopez C.∗, T. Barz, H. Arellano-Garcia, and G. Wozny. Online
model-based redesign of experiments for parameter estimation applied to closed-loop
controller tuning. 11th International Conference on Chemical & Process Engineering
(ICheap11), Milan-Italy, 2-5 June 2013
4. D. C. Lopez C.∗, T. Barz, and H. Arellano-Garcia. Online model-based experimental
design. AIChE Annual Meeting, Pittsburgh, USA, 2012
5. D. C. Lopez C.∗ †, T. Barz, H. Arellano-Garcia, G. Wozny, A. Villegas, and S. Ochoa.
Subset selection for improved parameter identification in a Bio-Ethanol production
process. 19th International Conference of Process Engineering and Chemical Plant
Design, Cracow, Poland, 2012
6. D. C. Lopez∗, T. Barz, H. Arellano-Garcia, and G. Wozny. An improved approach
to parameter subset selection for parameter estimation in online applications. 25th
European Conference on Operational Research (EURO-2012), Vilnius, Lithuania,
2012
Poster
1. D. C. Lopez∗, M. N. Cruz, T. Barz, and P. Neubauer. Parameter estimation,
ill-conditioning and identifiability analysis of the anaerobic digestion model No 1
(ADM1) for biogas production. Society for Industrial and Applied Mathematics
(SIAM) Annual Meeting, Boston, USA, 2016
2. E. Anane∗, C. Reitz, F. V. Ebert, M. N. Cruz, D. C. Lopez, S. Junne, and P.
Neubauer. Modelling dissolved oxygen and glucose gradients in pulse-based fed batch
culture of Escherichia coli. 4th BioProScale Symposium “Bioprocess intensification
through Process Analytical Technology (PAT) and Quality by Design (QbD)”, Berlin,
Germany, 2016
3. D. C. Lopez†, L. J. Hoyos, A. Uribe, S. M. Chaparro, H. Arellano-Garcia∗, and G.
Wozny. Improvement of Crude Oil Refinery Gross Margin using a NLP Model of
a Crude Distillation Unit System. 22nd European Symposium on Computer Aided
Process Engineering (ESCAPE-22), London, UK, 2012
A.1.3. Proceedings
1. D. Muller, E. Esche, D. C. Lopez C., and G. Wozny. Systematic parameter selection
for optimization under uncertainty. Computer Aided Chemical Engineering, 717-722,
2014
2. N. Yakut, D. C. Lopez Cardenas, T. Barz, H. Arellano-Garcia, and G. Wozny. Online
Model-Based Redesign of Experiments for Parameter Estimation Applied to Closed-
loop Controller Tuning. Chemical Engineering Transactions, 32:1195-1200, 2013
3. D. C. Lopez, T. Barz, H. Arellano-Garcia, G. Wozny, A. Villegas, and S. Ochoa.
Subset selection for improved parameter identification in a Bio-Ethanol production
process. Technical Transactions - Mechanics, 109(5):137-147, 2012
4. D. C. Lopez, L. J. Hoyos, A. Uribe, S. Chaparro, H. Arellano-Garcia, and G. Wozny.
Improvement of Crude Oil Refinery Gross Margin using a NLP Model of a Crude
Distillation Unit System. Computer Aided Chemical Engineering, 22nd European
Symposium on Computer Aided Process Engineering, 30:987-991, 2012
A.2. Own publications used for the cumulative thesis
This thesis is based on the following publications. The order is chronological. Please note
that the full text papers are only available via the publisher or via personal communication
with the author.
Publication I:
D. C. Lopez C., G. Wozny, A. Flores-Tlacuahuac, R. Vasquez-Medrano, and V. M. Zavala.
A computational framework for identifiability and ill-conditioning analysis of lithium-
ion battery models. Industrial & Engineering Chemistry Research, 55(11):3026-3042, 2016
Publication II:
D. C. Lopez C., T. Barz, M. Penuela, A. Villegas, S. Ochoa, and G. Wozny. Model-
based identifiable parameter determination applied to a simultaneous saccharification
and fermentation process model for bio-ethanol production. Biotechnology Progress,
29(4):1064-1082, 2013
Publication III:
D. C. Lopez C., T. Barz, S. Korkel, and G. Wozny. Nonlinear ill-posed problem analysis
in model-based parameter estimation and experimental design. Computers & Chemical
Engineering, 77:24-42, 2015
Publication IV:
T. Barz, D. C. Lopez C., M. N. Cruz B., S. Korkel, and S. F. Walter. Real-time
adaptive input design for the determination of competitive adsorption isotherms in liquid
chromatography. Computers & Chemical Engineering, 94:104-116, 2016
A.3. Bio-processes: Parameter variance and
variance-decomposition
In this appendix, the variance-decomposition used for the identifiability analysis in
Section 4.4.2 is summarized. It was computed according to Section 3.2.2. The decomposition
was performed before OED, i.e., it was evaluated at the initial design uIG on the
sensitivity matrix S(uIG). Double-underlined variance-decomposition proportions indicate
parameters preliminarily selected as unidentifiable according to the SVD method in Section 4.4.2.
For Examples E1 and E2, the selection criterion is an individual variance-decomposition
proportion πi greater than 0.5. For Example E3, the criterion is that the sum of the πi
associated with the ill-conditioned singular values (i.e., ς8 to ς44) is greater than 0.5.
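The selection rule described above can be illustrated numerically. The following is a minimal sketch, not the implementation used in this thesis: it assumes the common SVD-based definition in which the variance of parameter k is proportional to the sum over j of v_kj^2 / ς_j^2, and it uses a small made-up sensitivity matrix. The function name `variance_decomposition` and the example matrix are hypothetical.

```python
import numpy as np

def variance_decomposition(S):
    """Return the singular values of S and the variance-decomposition proportions.

    pi[k, j] is the share of the variance of parameter k that is
    associated with singular value j; each row of pi sums to 1.
    """
    # Thin SVD: S = U diag(s) V^T, singular values in descending order
    _, s, Vt = np.linalg.svd(S, full_matrices=False)
    V = Vt.T
    # phi[k, j] = v_kj^2 / s_j^2 is the contribution of singular value j
    phi = (V ** 2) / (s ** 2)
    pi = phi / phi.sum(axis=1, keepdims=True)
    return s, pi

# Hypothetical 6x3 sensitivity matrix whose last two columns are nearly collinear
rng = np.random.default_rng(0)
S = rng.normal(size=(6, 3))
S[:, 2] = S[:, 1] + 1e-4 * rng.normal(size=6)

s, pi = variance_decomposition(S)
# Parameters whose proportion for the smallest singular value exceeds 0.5
flagged = np.where(pi[:, -1] > 0.5)[0]
```

In this toy case the two nearly collinear parameters (indices 1 and 2) are flagged, because almost all of their variance is tied to the smallest singular value.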
Table A.1.: Variance-decomposition: E1 - Fed Batch Fermentation. (Figure taken from publication III - Lopez et al. (2015) - reprinted from Computers & Chemical Engineering with permission from Elsevier).
Parameter   Parameter variance   Variance-decomposition proportion associated with singular value
1 2,361E+01 0,0% 0,6% 7,5% 91,9%
2 7,706E+02 0,0% 0,0% 0,7% 99,3%
3 2,038E+02 0,0% 0,0% 7,9% 92,1%
4 1,047E+01 0,0% 0,2% 26,2% 73,5%
Table A.2.: Variance-decomposition: E2 - Biochemical network. (Figure taken from publication III - Lopez et al. (2015) - reprinted from Computers & Chemical Engineering with permission from Elsevier).
Parameter   Parameter variance   Variance-decomposition proportion associated with singular value
1 7,41E+00 0,0% 0,0% 0,0% 0,1% 0,3% 0,4% 1,6% 7,1% 2,6% 87,9% 97,5%
2 1,26E+03 0,0% 0,0% 0,0% 0,0% 0,0% 0,5% 0,9% 7,0% 2,4% 89,2% 98,6%
3 1,33E+03 0,0% 0,0% 0,0% 0,0% 0,0% 0,2% 0,3% 0,0% 0,0% 99,5% 99,5%
4 9,67E+00 0,0% 0,0% 0,0% 0,3% 0,1% 49,6% 47,6% 1,9% 0,1% 0,4% 2,4%
5 2,67E+07 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 35,2% 64,8% 100,0%
6 2,71E+07 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 35,2% 64,8% 100,0%
7 3,06E+03 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 41,2% 1,7% 57,1% 100,0%
8 3,20E+04 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 41,1% 1,6% 57,2% 100,0%
9 1,50E-01 0,0% 0,1% 1,2% 15,6% 52,9% 0,6% 27,3% 0,0% 0,8% 1,4% 2,2%
10 2,41E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 100,0% 100,0%
Table A.3.: Variance-decomposition: E3 - ASM3. (Figure taken from publication III - Lopez et al. (2015) - reprinted from Computers & Chemical Engineering with permission from Elsevier).
Parameter   Parameter variance   Variance-decomposition proportion associated with singular value
1 1,9E+09 0,0% 0,0% 0,0% 0,0% 0,0% 0,4% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 2,6% 1,5% 12,1% 33,0% 49,4% 0,7% 0,0% 100,0%
2 1,4E+14 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 2,0% 0,8% 11,6% 28,1% 56,3% 1,2% 0,0% 100,0%
3 2,5E+09 0,1% 0,0% 0,0% 0,1% 0,0% 0,4% 1,1% 0,5% 0,0% 0,4% 0,1% 2,0% 0,3% 0,8% 1,1% 8,0% 34,0% 50,6% 0,2% 0,0% 100,0%
4 5,6E+08 0,1% 0,0% 4,6% 0,9% 0,5% 0,7% 0,1% 0,0% 2,2% 1,3% 0,0% 1,3% 0,9% 8,0% 1,4% 8,0% 8,9% 3,9% 56,3% 0,0% 100,0%
5 1,3E+13 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,3% 0,4% 0,3% 3,7% 0,0% 3,1% 0,2% 6,5% 4,2% 16,5% 64,6% 0,2% 100,0%
6 8,8E+07 2,6% 0,0% 0,8% 0,2% 2,2% 0,0% 0,1% 0,1% 0,3% 6,1% 4,2% 9,8% 6,4% 11,0% 1,3% 6,2% 5,9% 32,4% 9,9% 0,0% 100,0%
7 2,1E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,9% 0,7% 2,7% 2,0% 0,7% 0,3% 0,6% 52,2% 0,0% 3,4% 36,7% 0,0% 100,0%
8 1,9E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 1,0% 0,6% 2,6% 1,8% 0,5% 0,4% 0,6% 51,0% 0,1% 2,8% 38,6% 0,1% 100,0%
9 1,4E+08 0,1% 0,0% 4,8% 0,1% 0,0% 16,2% 4,0% 0,5% 0,3% 0,0% 0,7% 2,2% 2,7% 7,7% 30,2% 15,0% 6,4% 0,0% 2,1% 0,1% 100,0%
10 2,4E+12 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 1,0% 0,6% 2,6% 1,8% 0,5% 0,3% 0,6% 52,4% 0,1% 2,8% 37,2% 0,0% 100,0%
11 2,5E+12 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,9% 0,7% 2,6% 2,0% 0,6% 0,3% 0,6% 49,8% 0,0% 3,8% 38,7% 0,0% 100,0%
12 4,6E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,3% 0,8% 11,2% 12,5% 1,0% 0,3% 0,5% 9,0% 31,6% 6,2% 26,0% 0,5% 100,0%
13 4,5E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,2% 0,8% 11,2% 12,5% 1,0% 0,3% 0,5% 9,0% 31,7% 6,2% 26,0% 0,5% 100,0%
14 1,2E+06 0,9% 0,1% 0,3% 0,4% 0,6% 0,8% 0,2% 0,1% 0,2% 2,1% 1,9% 8,1% 0,1% 1,5% 3,0% 1,2% 46,9% 5,3% 25,6% 0,2% 100,0%
15 8,0E+08 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 0,0% 0,0% 0,7% 0,2% 11,5% 43,7% 28,4% 11,4% 3,8% 0,0% 100,0%
16 1,0E+10 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 1,4% 0,4% 0,5% 8,6% 1,2% 5,1% 82,4% 0,4% 100,0%
17 4,4E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,3% 0,5% 11,4% 15,5% 2,6% 0,1% 0,2% 11,3% 23,8% 1,0% 32,5% 0,6% 100,0%
18 4,9E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,3% 0,4% 10,6% 14,0% 2,2% 0,2% 0,1% 8,7% 17,4% 1,9% 43,5% 0,6% 100,0%
19 1,3E+10 0,8% 0,1% 0,2% 0,1% 0,2% 0,7% 3,9% 0,0% 0,9% 4,5% 3,1% 15,9% 7,1% 3,6% 11,5% 39,7% 4,2% 0,5% 2,9% 0,1% 100,0%
20 4,0E+10 0,0% 0,2% 0,3% 0,1% 1,0% 0,2% 0,2% 0,0% 1,5% 0,1% 1,3% 0,3% 7,5% 21,0% 4,7% 0,0% 1,1% 0,4% 59,4% 0,6% 100,0%
21 9,6E+12 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,4% 1,5% 0,5% 0,7% 0,7% 0,8% 1,0% 23,7% 67,6% 3,2% 0,0% 100,0%
22 1,4E+13 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,2% 0,1% 0,5% 0,9% 0,1% 0,7% 16,2% 7,3% 73,6% 0,2% 100,0%
23 3,8E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,9% 0,6% 2,6% 1,9% 0,6% 0,3% 0,6% 51,7% 0,0% 3,3% 37,5% 0,0% 100,0%
24 3,6E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 1,0% 0,7% 2,7% 1,9% 0,6% 0,3% 0,6% 51,1% 0,0% 3,0% 38,1% 0,0% 100,0%
25 6,7E+10 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 5,1% 0,4% 0,0% 0,0% 0,0% 0,0% 2,1% 8,2% 6,4% 12,2% 13,2% 22,0% 29,9% 0,2% 100,0%
26 6,4E+10 0,0% 0,0% 0,0% 0,1% 0,4% 0,0% 1,7% 1,5% 0,0% 1,0% 0,6% 2,9% 5,7% 13,7% 3,4% 1,2% 18,3% 26,6% 22,6% 0,3% 100,0%
27 2,3E+14 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,3% 0,4% 1,3% 35,0% 61,4% 1,5% 0,0% 100,0%
28 7,1E+07 1,6% 0,0% 0,3% 0,1% 0,0% 0,2% 1,0% 0,2% 0,1% 0,3% 0,1% 0,0% 0,0% 0,6% 6,6% 45,8% 20,2% 16,9% 4,4% 0,0% 100,0%
29 1,8E+06 0,2% 12,3% 0,5% 5,1% 6,8% 1,2% 5,8% 2,8% 0,0% 0,7% 0,3% 8,6% 0,1% 2,8% 1,1% 2,8% 25,8% 0,2% 18,7% 0,3% 100,0%
30 3,2E+07 0,1% 4,5% 2,5% 1,5% 0,2% 6,3% 12,6% 7,7% 2,5% 2,8% 6,0% 0,0% 0,3% 5,6% 13,0% 5,6% 0,8% 8,9% 1,7% 0,3% 100,0%
31 4,5E+13 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 0,0% 0,0% 0,8% 0,1% 12,0% 42,3% 29,7% 11,2% 3,7% 0,0% 100,0%
32 8,8E+17 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 99,7% 0,3% 100,0%
33 1,7E+10 0,0% 0,0% 0,0% 1,8% 1,9% 1,9% 5,3% 0,4% 2,4% 0,5% 3,3% 13,2% 21,9% 4,6% 1,3% 32,7% 0,0% 8,6% 0,1% 0,0% 100,0%
34 2,0E+11 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 12,7% 0,6% 5,8% 3,7% 22,5% 5,3% 0,7% 2,0% 22,6% 4,2% 1,2% 18,6% 0,0% 100,0%
35 1,2E+15 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 0,1% 0,0% 1,4% 0,1% 0,5% 97,4% 0,3% 100,0%
36 2,2E+09 0,1% 0,0% 0,0% 0,0% 0,0% 0,1% 0,0% 0,0% 0,1% 0,1% 0,3% 0,9% 1,1% 2,1% 0,0% 1,1% 0,4% 5,8% 87,3% 0,2% 100,0%
37 1,4E+07 0,0% 0,0% 0,5% 0,0% 0,0% 4,3% 26,7% 0,1% 0,2% 2,2% 6,6% 16,7% 0,6% 0,0% 3,8% 0,1% 3,9% 19,2% 0,4% 0,0% 100,0%
38 3,0E+08 0,1% 2,6% 2,1% 1,3% 0,9% 4,8% 10,4% 5,9% 3,4% 5,5% 6,0% 0,0% 0,0% 4,2% 13,3% 4,5% 1,2% 17,4% 1,1% 0,2% 100,0%
39 8,9E+14 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 16,4% 25,5% 57,9% 0,2% 100,0%
40 8,8E+13 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,1% 0,4% 1,8% 5,6% 14,7% 0,0% 10,4% 66,7% 0,3% 100,0%
41 1,0E+13 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 1,2% 0,0% 3,7% 10,7% 1,2% 1,6% 3,4% 49,7% 23,9% 4,5% 0,0% 100,0%
42 1,2E+31 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 100,0% 100,0%
43 4,4E+29 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 0,0% 100,0% 100,0%
A.4. Implication of structural properties of the sensitivity matrix
on Fisher-information matrix
Here it is shown how structural problems of the sensitivity matrix S are directly
reflected in the Fisher-information matrix F. This is illustrated by demonstrating that
a full-rank sensitivity matrix implies a nonsingular Fisher matrix. Therefore, structural
problems of F may be detected by analyzing the structural properties of S.
Facts and implications:
• Let S ∈ R^{(Ny·Nm·Ne)×Nθ} have rank Nθ, i.e., r(S) = Nθ. Since Ny·Nm·Ne ≥ Nθ,
S is of full rank and has Nθ linearly independent columns (see Definition
A.5.4 and its implications).
• Then the Nθ singular values of S are all positive, i.e., ςi > 0, i = 1, · · · , Nθ (see
Definition A.5.9).
• Taking into account that F is a symmetric matrix formed by the cross-product of S
(see Eq. 2.13 and Proposition A.5.3) and the relationship between the eigendecomposition
and the SVD (see Proposition A.5.9), the eigenvalues of F are also all positive,
i.e., λi(F) = ςi(S)^2 > 0, i = 1, · · · , Nθ.
• Hence zero is not an eigenvalue of F, and F is therefore nonsingular according
to Theorem A.5.6.
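The chain of implications above can be checked numerically on a small example. This is only an illustrative sketch: it assumes the unweighted form F = S^T S (any measurement weighting used in Eq. 2.13 is omitted here), and the matrix S is randomly generated rather than taken from one of the case studies.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.normal(size=(8, 3))      # hypothetical full-rank sensitivity matrix
F = S.T @ S                      # assumed unweighted Fisher-information matrix

# Singular values of S (descending) and eigenvalues of F (sorted to descending)
sing = np.linalg.svd(S, compute_uv=False)
eig = np.linalg.eigvalsh(F)[::-1]    # eigvalsh returns ascending order

rank_S = np.linalg.matrix_rank(S)
F_is_nonsingular = eig.min() > 0     # zero is not an eigenvalue of F
```

Here `eig` agrees with `sing**2` up to rounding, so a full-rank S yields strictly positive eigenvalues of F.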
A.5. Matrices notions
In this section, the notation and the most important linear algebra definitions used throughout
this thesis are stated. All matrices and vectors in this work are assumed to be
real (R); however, most definitions are stated for the more general case of complex
numbers (C).
A.5.1. Matrices and vectors
Definition A.1 Matrix.
A matrix is simply a rectangular array of real or complex numbers. An m × n matrix A is
an array having m rows and n columns, i.e., A ∈ Km×n, where K = R or C:
\[
A = [a_{ij}] =
\begin{pmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{m1} & \cdots & a_{mn}
\end{pmatrix}.
\]
If m = n, then A is called a square matrix of degree n.
A column matrix is usually called a vector. The sets of all n × 1 real and complex
column matrices (or vectors) are denoted by Rn and Cn, respectively. The notation
(a1, a2, · · · , an)T will be used to express the column matrix
\[
\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}
\tag{A.1}
\]
in a more compact form. The superscript T denotes the matrix transpose, which is defined
later, and a1, a2, · · · , an are scalars, i.e., elements of R.
Definition A.2 Linear combination.
If u1, u2, u3, · · · , um are vectors in Cn and if a1, a2, a3, · · · , am are scalars, that is, elements
of R, then the vector a1u1 + a2u2 + a3u3 + · · · + amum is called a linear combination of
u1, u2, u3, · · · , um.
Definition A.3 Column space of a matrix.
Suppose that A is an m × n matrix with columns A1, A2, A3, · · · , An. Then the column space
of A, written C(A), is the subset of Cm containing all linear combinations of the columns
of A,
C(A) = ⟨A1, A2, A3, · · · , An⟩ (A.2)
Definition A.4 Row space of a matrix.
Suppose A is an m × n matrix. Then the row space of A, R(A), is the column space of AT,
i.e., R(A) = C(AT).
Definition A.5 Linear independence of vectors.
Let u1, u2, u3, · · · , un ∈ V be given vectors, where V denotes a not necessarily finite-dimensional
vector space over an arbitrary field F. Then u1, u2, u3, · · · , un are said to be
linearly independent (or, simply, independent) if and only if the only linear combination
a1u1 + a2u2 + a3u3 + · · · + anun = 0 (A.3)
with a1, a2, a3, · · · , an ∈ F is the trivial combination a1 = 0, a2 = 0, a3 = 0, · · · , an = 0. If
Eq. A.3 has a solution where some ai ≠ 0, it is said that u1, u2, u3, · · · , un are linearly
dependent (or, simply, dependent or collinear). A finite subset S of V is likewise said to be
independent if the vectors contained in S are independent. Notice that any finite
set of vectors in V containing 0 is dependent.
Another way to view Definition A.5.1 is to consider the vectors u1, u2, u3, · · · , un
as the columns of a matrix U. They are then said to be linearly dependent or collinear
if there exists a vector A = (a1, a2, a3, · · · , an)T with ∥A∥ ≠ 0 such that UA = 0. If this
equation holds approximately, the columns uj , j = 1, · · · , n are said to be nearly linearly
dependent or nearly collinear [112].
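Near collinearity in this sense can be detected with the SVD: over all unit vectors a, the norm of Ua is minimized by the right singular vector belonging to the smallest singular value, and the minimum equals that singular value. The sketch below uses a small made-up matrix and is only meant to illustrate the definition.

```python
import numpy as np

# Hypothetical matrix: the third column is almost the sum of the first two
U = np.array([[1.0, 2.0, 3.0001],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 3.0]])

_, s, Vt = np.linalg.svd(U, full_matrices=False)
a = Vt[-1]                          # unit vector that minimizes ||U a||
residual = np.linalg.norm(U @ a)    # equals the smallest singular value of U
```

A residual that is tiny compared with the largest singular value indicates that U a = 0 holds approximately, i.e., that the columns are nearly collinear.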
Definition A.6 Orthogonal set of vectors.
Suppose that S = {u1, u2, u3, · · · , un} is a set of vectors from Cm. Then S is an orthogonal
set if every pair of different vectors from S is orthogonal, that is, ⟨ui, uj⟩ = 0 whenever i ≠ j.
Remember that orthogonal sets of nonzero vectors are linearly independent.
Definition A.7 Orthonormal set of vectors.
Suppose that S = {u1, u2, u3, · · · , un} is an orthogonal set of vectors such that ∥ui∥ = 1
for all 1 ≤ i ≤ n. Then the set of vectors S is an orthonormal set.
Definition A.8 Dimension.
Suppose that V is a vector space and {v1, v2, v3, · · · , vt} is a basis of V. Then the dimension
of V is the number of elements in the basis of V, i.e., dim(V) = t. If V has no
finite basis, V is said to have infinite dimension.
A.5.2. Matrix operations
In this subsection, the sum of two matrices, the matrix-scalar product, the matrix-vector
product and the matrix product, along with the transpose, inverse and differentiation of a
matrix, are summarized.
Let A ∈ Rm×n, B ∈ Rm×n, C ∈ Rm×n, D ∈ Rn×p, E ∈ Rm×p, F ∈ Rn×m, G ∈ Rn×n
and H ∈ Rn×n be eight matrices of different dimensions, x ∈ Rn and y ∈ Rm two vectors,
and α ∈ R a scalar.
Definition A.9 Matrix addition.
A matrix sum A + B of two matrices A and B is defined to be the matrix C such that
C = A + B with cij = aij + bij , ∀ i = 1, · · · ,m, j = 1, · · · , n.
Definition A.10 Matrix-scalar product.
A matrix-scalar product αA of the matrix A by a real number (or scalar) α is defined to
be the matrix C such that C = αA with cij = αaij , ∀ i = 1, · · · ,m, j = 1, · · · , n.
Definition A.11 Matrix-vector product.
A matrix-vector product Ax of the matrix A by the (column) vector x is defined to be the
vector y such that y = Ax with yi = ∑_{k=1}^{n} aik xk, ∀ i = 1, · · · ,m.
Definition A.12 Matrix product.
A matrix product AD of two matrices A and D is only defined when the number of columns
of A equals the number of rows of D, and is the matrix E such that E = AD with
eij = ∑_{k=1}^{n} aik dkj , ∀ i = 1, · · · ,m, j = 1, · · · , p.
Definition A.13 Matrix transpose.
The transpose AT of the matrix A is defined to be the matrix F such that F = AT with
fij = aji, ∀ i = 1, · · · , n, j = 1, · · · ,m.
Definition A.14 Complex conjugate of a matrix.
Suppose A ∈ Cm×n. Then the conjugate of A, written Ā, is an m × n matrix defined by
(Ā)ij = āij, where āij is the complex conjugate of aij ∈ A.
Definition A.15 Adjoint of a matrix.
Suppose A ∈ Cm×n. Then its adjoint is A∗ = (Ā)T. The matrix A∗ is obtained from A
by taking the transpose and then taking the complex conjugate of each entry (i.e., negating
their imaginary parts but not their real parts).
The adjoint of a matrix with only real entries is equal to the transpose of that matrix.
A.5.3. Some special matrices
Definition A.16 Symmetric matrix.
Suppose G ∈ Cn×n. Then the matrix G is symmetric if it is equal to its transpose, i.e.,
G = GT.
The subspace of symmetric matrices, in the space Rn×n of all square matrices of degree
n, which are not necessarily symmetric, is here denoted as Sym(n).
When any matrix is multiplied by its own transpose, the product (the so-called cross-product)
has the interesting property of being symmetric. This result is summarized in the next proposition.
Proposition A.1 The cross-product of a matrix is symmetric.
Suppose A ∈ Rm×n, then AAT and ATA are symmetric.
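The proposition follows directly from the standard transposition rule for products, (AB)^T = B^T A^T, together with (A^T)^T = A. A short derivation, added here for completeness:

```latex
\begin{align*}
(A^T A)^T &= A^T (A^T)^T = A^T A, \\
(A A^T)^T &= (A^T)^T A^T = A A^T .
\end{align*}
```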
Definition A.17 Positive, negative definite and indefinite matrices.
Suppose A ∈ Rn×n is symmetric and let q(x) = x^T A x be its associated quadratic form.
Then A is said to be positive-definite if and only if q(x) > 0 whenever x ≠ 0. Similarly,
A is said to be negative-definite if and only if q(x) < 0 whenever x ≠ 0. If q takes
both positive and negative values, A is said to be indefinite.
A symmetric real matrix A is called positive semi-definite if its quadratic form q satisfies
q(x) ≥ 0 for all x ∈ Rn. In the subspace Sym(n), the subsets of positive definite, positive
semi-definite and negative definite matrices are denoted as PD(n), PSD(n) and ND(n),
respectively.
The generalization of a symmetric matrix in the space of the complex numbers is the
Hermitian matrix:
Definition A.18 Hermitian matrix.
Suppose A ∈ Cn×n. Then A is Hermitian (or self-adjoint) if A = A∗.
Definition A.19 Unitary matrix.
Suppose U ∈ Cn×n such that UU∗ = In. Then it is said U is unitary.
If a matrix A has only real number entries (it is a real matrix) then the defining prop-
erty of being unitary simplifies to AAT = In. The matrix A is then called orthogonal.
Moreover, unitary matrices have easily computed inverses. They also have columns that
form orthonormal sets.
Definition A.20 Diagonal matrix.
Suppose D ∈ Cm×n such that D = (dij). Then D is called diagonal if and only if dij = 0
whenever i ≠ j.
Definition A.21 Upper-triangular matrix.
Suppose A ∈ Cn×n is a square matrix such that A = (aij). Then A is called upper-triangular
if aij = 0 whenever i > j.
Definition A.22 Permutation matrix.
A permutation matrix is a matrix P ∈ Rn×n such that there are row swap matrices
S1, · · · , Sk ∈ Rn×n for which P = S1 · · · Sk. (Remember that a row swap matrix is
by definition an elementary matrix obtained by interchanging two rows of In.)
A.5.4. Matrix rank
Definition A.23 Suppose that A ∈ Cm×n. Then the rank of A is the dimension of the
column space of A, r(A) = dim(C(A)).
Taking into account the definition of the column space of a matrix in Definition A.5.1 and
of the dimension of a vector space in Definition A.5.1, it is easy to conclude that the rank
of A is the number of linearly independent column vectors in A. It should be noted that
if m < n, then the maximum rank of A is m. In contrast, if m > n, then the maximum
rank of A is n. If A = 0, then the rank of A is 0. If A has at least one non-zero element, its
rank is at least one. When the rank of A attains its maximum, min(m, n), the matrix is said
to be of full rank; for m ≥ n this means that all column vectors of A are linearly independent.
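As a brief numerical illustration of these statements, the rank can be computed with NumPy, which determines it from the singular values. The matrices below are made-up examples.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],    # twice the first row
              [1.0, 0.0, 1.0]])

rank_A = np.linalg.matrix_rank(A)               # number of independent columns
is_full_rank = rank_A == min(A.shape)           # False: A is rank deficient
rank_zero = np.linalg.matrix_rank(np.zeros((2, 3)))   # the zero matrix has rank 0
```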
A.5.5. Eigenvalues and eigenvectors
Before defining the factorization of a matrix into a canonical form by means of the
eigendecomposition, it is important to introduce the basic theory of eigenvalues and eigenvectors
of a matrix and some of their properties.
Definition A.24 Eigenvalues and eigenvectors of a matrix.
Suppose A ∈ Cn×n, x ∈ Cn with x ≠ 0 and λ ∈ C. Then λ is called an eigenvalue of A
if Ax = λx. The vector x is called an eigenvector corresponding to λ. The pair (λ, x) is
called an eigenpair of A.
The next theorem establishes the connection between the eigenvalues of a matrix and
whether or not the matrix is nonsingular.
Theorem A.1 A singular matrix has a zero eigenvalue.
Suppose A ∈ Cn×n. Then A is singular if and only if λ = 0 is an eigenvalue of A.
The eigenvalues are not scale-invariant; therefore, when a square matrix is scaled by a
scalar, its eigenvalues are scaled as well, as stated in the next theorem.
Theorem A.2 Eigenvalues of a scaled matrix.
Suppose A ∈ Cn×n, (λ, x) an eigenpair of A and α ∈ C. Then (αλ, x) is an eigenpair of
αA.
The effect of inverting and transposing a matrix on its eigenvalues is described in the
next two theorems.
Theorem A.3 Eigenvalues of the inverse of a matrix.
Suppose A ∈ Cn×n is a nonsingular matrix and (λ, x) an eigenpair of A. Then (λ−1, x) is
an eigenpair of A−1.
Theorem A.4 Eigenvalues of the transpose of a matrix.
Suppose A ∈ Cn×n and λ is an eigenvalue of A. Then λ is also an eigenvalue of AT (the corresponding eigenvectors generally differ).
Other interesting properties of eigenvalues and eigenvectors arise when the
matrix is Hermitian (see Definition A.5.3). In that case, the eigenvalues of the matrix are
real numbers and eigenvectors belonging to different eigenvalues are orthogonal. Remember
that for a matrix A whose entries are all real numbers, being Hermitian is identical to being
symmetric (see Definition A.5.3). This is especially important here because the Fisher-
information matrix (see 2.10) and the parameter covariance matrix (see 2.22) belong to
the space of symmetric matrices Sym(Nθ). The mentioned properties are justified by the
following two theorems:
Theorem A.5 A Hermitian matrix has real eigenvalues.
Suppose A ∈ Cn×n is a Hermitian matrix and λ is an eigenvalue of A. Then λ ∈ R.
Theorem A.6 A Hermitian matrix has orthogonal eigenvectors.
Suppose A ∈ Cn×n is a Hermitian matrix and x and y are two eigenvectors of A for
different eigenvalues. Then x and y are orthogonal vectors.
In terms of eigenvalues, the subsets PD(n), ND(n) and PSD(n) of positive definite,
negative definite and positive semi-definite matrices, respectively, may
be characterized as follows:
Proposition A.2 Definiteness of a symmetric matrix by using eigenvalues.
Suppose A ∈ Rn×n is a symmetric matrix. Then the associated quadratic form q(x) =
x^T A x is positive definite (or A ∈ PD(n)) if and only if all the eigenvalues of A are
positive, and negative definite (or A ∈ ND(n)) if and only if all the eigenvalues of A are
negative. Moreover, A is positive semi-definite (or A ∈ PSD(n)) if and only if every
eigenvalue of A is non-negative.
In other words, a symmetric matrix can be characterized as positive definite or positive
semi-definite, i.e., as a member of PD(n) or PSD(n), respectively, by analyzing only its
smallest eigenvalue, according to the next lemma:
Lemma A.1 Definiteness of a symmetric matrix by using eigenvalues 2.
Suppose A ∈ Rn×n is a symmetric matrix with smallest eigenvalue λmin(A). Then
A ∈ PD(n)⇔ λmin(A) > 0 (A.4)
A ∈ PSD(n)⇔ λmin(A) ≥ 0 (A.5)
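Lemma A.1 translates into a simple numerical test. The sketch below is an illustration only; the function name `classify` and the tolerance `tol`, which absorbs rounding errors in the computed eigenvalues, are assumptions and not part of the thesis.

```python
import numpy as np

def classify(A, tol=1e-12):
    """Classify a symmetric matrix by its smallest eigenvalue (Lemma A.1)."""
    lam_min = np.linalg.eigvalsh(A).min()   # eigvalsh assumes a symmetric matrix
    if lam_min > tol:
        return "PD"        # smallest eigenvalue strictly positive
    if lam_min >= -tol:
        return "PSD"       # smallest eigenvalue zero up to rounding
    return "not PSD"       # at least one negative eigenvalue

label_pd = classify(np.array([[2.0, 0.0], [0.0, 1.0]]))
label_psd = classify(np.array([[1.0, 1.0], [1.0, 1.0]]))   # eigenvalues 0 and 2
label_other = classify(np.array([[1.0, 0.0], [0.0, -1.0]]))
```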
A.5.6. Inverse of a matrix
Definition A.25 Matrix inverse.
Let G ∈ Rn×n and H ∈ Rn×n be square matrices of degree n. Suppose that G and
H have the property that GH = HG = In, where In is the identity matrix of degree n.
Then G is said to be the inverse of H and is denoted by H−1 (and H is the inverse of G and is
denoted by G−1). A matrix with an inverse is said to be invertible.
Proposition A.3 If the square matrix G of degree n is nonsingular, there exists a square
matrix H of degree n such that HG = In which is called the left inverse of G.
Theorem A.7 Let G ∈ Rn×n. Then the following statements hold:
1. G is nonsingular if and only if G has a left inverse;
2. if G has a left inverse H, then G is invertible and H is the unique inverse of G (that
is, if HG = In, then GH = In and so H = G−1);
3. G is nonsingular if and only if it is invertible.
The next theorem relates all equivalent statements which define what it means for a
matrix to be nonsingular. Some of those statements will be used in this thesis in order to
connect ill-conditioned matrices with identifiability and ill-posed parameter estimations.
Theorem A.8 The invertible matrix theorem.
Suppose G ∈ Rn×n. Then the following statements are equivalent:
1. G is nonsingular;
2. G is an invertible matrix;
3. the rank of G is n, i.e., G is of full-rank;
4. G is row-equivalent to the identity matrix In;
5. G is column-equivalent to the identity matrix In;
6. G has n pivot positions;
7. the equation Gx = 0 has only the trivial solution;
8. the columns of G form a linearly independent set;
9. the linear transformation x→ Gx is one-to-one;
10. the equation Gx = b has at least one solution for each b ∈ Rn;
11. the columns of G span Rn;
12. the linear transformation x→ Gx is a surjection;
13. there is an n × n matrix H such that HG = In;
14. there is an n × n matrix H such that GH = In;
15. the transpose matrix GT is invertible;
16. the columns of G form a basis for Rn;
17. the column space of G is equal to Rn, C(G) = Rn;
18. the dimension of the column space of G is n;
19. the null space of G is {0};
20. the dimension of the null space of G is 0;
21. the determinant of G is not zero, det(G) ≠ 0;
22. λ = 0 is not an eigenvalue of G;
23. the orthogonal complement of the column space of G is {0};
24. the orthogonal complement of the null space of G is Rn;
25. the row space of G is equal to Rn, R(G) = Rn.
Moreover, the following properties hold for an invertible matrix G:
• G−1 is invertible;
• (G−1)−1 = G;
• (kG)−1 = k−1G−1 for any nonzero scalar k;
• (GT)−1 = (G−1)T;
• for any invertible G,H ∈ Rn×n, (GH)−1 = H−1G−1. More generally, if G1, · · · , Gk
are invertible n-by-n matrices, then
\( (G_1 G_2 \cdots G_{k-1} G_k)^{-1} = G_k^{-1} G_{k-1}^{-1} \cdots G_2^{-1} G_1^{-1} \);
• det(G−1) = (det(G))−1.
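The listed properties can be verified numerically for a concrete case. The following sketch uses two small invertible matrices chosen purely for illustration; it is a sanity check, not a proof.

```python
import numpy as np

G = np.array([[2.0, 1.0],
              [1.0, 3.0]])    # det(G) = 5, hence invertible
H = np.array([[1.0, 2.0],
              [0.0, 1.0]])    # det(H) = 1, hence invertible
k = 4.0                       # arbitrary nonzero scalar

G_inv = np.linalg.inv(G)
H_inv = np.linalg.inv(H)

# Each entry compares the left- and right-hand side of one property
checks = {
    "inverse of inverse": np.allclose(np.linalg.inv(G_inv), G),
    "scaled matrix": np.allclose(np.linalg.inv(k * G), G_inv / k),
    "transpose": np.allclose(np.linalg.inv(G.T), G_inv.T),
    "product": np.allclose(np.linalg.inv(G @ H), H_inv @ G_inv),
    "determinant": np.isclose(np.linalg.det(G_inv), 1.0 / np.linalg.det(G)),
}
```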
A.5.7. Matrix decompositions
In this section, several methods to factorize a matrix into a product of matrices are
described. The most widely applied decomposition for square matrices, namely the
eigendecomposition, is defined first. Then the most accurate rank-revealing factorization,
i.e., the singular value decomposition, is presented. Finally, the QR-factorization with
column pivoting is defined as another important matrix decomposition and as a more
economical alternative for revealing the rank of a matrix.
A.5.8. Eigendecomposition
The decomposition of a matrix into matrices composed of its eigenvectors and eigenvalues
is called eigendecomposition. This decomposition is also known as matrix diagonalization.
The definition of the eigendecomposition reads as follows:
Definition A.26 Eigendecomposition.
Suppose A ∈ Cn×n is a square matrix with eigenvalues λ1, λ2, · · · , λn and n corresponding
linearly independent eigenvectors qi (i = 1, · · · , n). Then A can be factorized as
A = QΛQ−1 (A.6)
where Q ∈ Cn×n is the square matrix whose i-th column is the eigenvector qi of A and
Λ ∈ Cn×n is the diagonal matrix whose diagonal elements are the corresponding eigenvalues,
i.e., Λii = λi.
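As a small illustration of Eq. A.6, the factorization can be computed and checked with NumPy. The 2 × 2 matrix is a made-up example with two distinct real eigenvalues, so its eigenvectors are linearly independent.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])    # eigenvalues 2 and 5

lam, Q = np.linalg.eig(A)     # columns of Q are the eigenvectors q_i
Lambda = np.diag(lam)         # diagonal matrix of eigenvalues

# Reassemble A from its eigendecomposition, A = Q Lambda Q^{-1}
A_rec = Q @ Lambda @ np.linalg.inv(Q)
```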
A.5.9. Singular value decomposition (SVD)
This factorization is known as a proper orthogonal decomposition or rank-revealing
factorization [52, 75]. As a proper orthogonal decomposition, the SVD aims at obtaining
low-dimensional approximations of a high-dimensional problem. Moreover, owing to its
intimate relationship with matrix rank, the SVD is the most accurate
factorization for revealing this property of a matrix. In addition, the SVD allows analyzing the
difficulties associated with the ill-conditioning of a matrix because it is also related to its
condition number.
The definition of the SVD reads as follows:
Definition A.27 The singular value decomposition (SVD).
For any $A \in \mathbb{R}^{m \times n}$ ($m \ge n$) there are orthogonal matrices $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ such that
\[ A = U \begin{pmatrix} S_v \\ 0 \end{pmatrix} V^T = \sum_{i=1}^{r} \varsigma_i u_i v_i^T \tag{A.7} \]
where $S_v \in \mathbb{R}^{n \times n}$ is the diagonal matrix $S_v = \mathrm{diag}(\varsigma_1, \varsigma_2, \dots, \varsigma_n)$ with nonnegative diagonal elements $\varsigma_1 \ge \varsigma_2 \ge \dots \ge \varsigma_r > \varsigma_{r+1} = \dots = \varsigma_n = 0$, which are the singular values of A. The number $r \le \min(m, n)$ is equal to the rank of A, and the triplet $(U, S_v, V)$ is called the singular value decomposition (SVD) of A. The columns $v_i$ of V ($i = 1, \dots, n$) and $u_j$ of U ($j = 1, \dots, m$) are the right and left singular vectors of A, respectively.
The set of singular values of A is referred to here as the singular value spectrum (SVs). It is important to notice that the SVD in Eq. A.7 is not scale invariant: if A is scaled by a positive scalar $\alpha$, the SVs of the scaled matrix $\alpha A$ is also scaled by $\alpha$ (i.e., $\alpha A = \sum_{i=1}^{r} (\alpha \varsigma_i) u_i v_i^T$, where $\alpha \varsigma_i$ is the i-th singular value of $\alpha A$). In this thesis the SVs is represented in semi-logarithmic plots. Therein, the effect of $\alpha$ is seen as a spectrum shift, such that the SVs of $\alpha A$ is a curve parallel to the SVs of A.
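The rank-one expansion in Eq. A.7 and the scaling behaviour of the singular value spectrum can be checked numerically. The sketch below assumes a randomly generated 5-by-3 test matrix and a scaling factor of 10; both are illustrative choices, not data from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))      # hypothetical full-rank test matrix

# U is 5x5, s holds the 3 singular values in descending order, Vt is V^T (3x3).
U, s, Vt = np.linalg.svd(A, full_matrices=True)
r = int(np.sum(s > 0))               # exact rank of this example

# Sum of rank-one terms  s_i * u_i * v_i^T  from Eq. A.7
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r))

# Scaling A by alpha scales every singular value by alpha ...
alpha = 10.0
s_scaled = np.linalg.svd(alpha * A, compute_uv=False)

# ... which is a constant vertical shift of log10(alpha) on a semi-log plot.
log_shift = np.log10(s_scaled) - np.log10(s)
print(np.allclose(A, A_sum), log_shift)
```

The constant value of `log_shift` is what appears as a parallel curve when the spectra of A and of the scaled matrix are drawn on a logarithmic axis.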
Finally, note that the singular value decomposition can be seen as an extension of the eigendecomposition to rectangular matrices. Moreover, there exist relations between these decompositions, which are detailed in the next proposition.
Proposition A.4 Relation between SVD and eigendecomposition.
Suppose $A \in \mathbb{R}^{m \times n}$ has the SVD $(U, S_v, V)$ described by Eq. A.7. Then the following two relations are true:
\[ A^T A = V (S_v^T S_v) V^T = \sum_{i=1}^{r} \varsigma_i^2 v_i v_i^T \tag{A.8a} \]
\[ A A^T = U (S_v S_v^T) U^T = \sum_{i=1}^{r} \varsigma_i^2 u_i u_i^T \tag{A.8b} \]
where the right-hand sides of these relations are the eigendecompositions of the left-hand sides.
Accordingly:
• The columns of V (right singular vectors of A) are the eigenvectors of $A^T A$.
• The columns of U (left singular vectors of A) are the eigenvectors of $A A^T$.
• The non-zero elements of $S_v$ (non-zero singular values of A) are the square roots of the non-zero eigenvalues of $A^T A$ or $A A^T$:
\[ \varsigma_i(A) = \sqrt{\lambda_i(A^T A)} \tag{A.9a} \]
\[ \varsigma_i(A) = \sqrt{\lambda_i(A A^T)} \tag{A.9b} \]
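The relations in Proposition A.4 can be verified numerically. The following sketch uses a random 6-by-4 matrix (an assumption made purely for illustration) and compares the squared singular values with the eigenvalues of the two Gram matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))              # hypothetical test matrix, m=6, n=4

U, s, Vt = np.linalg.svd(A, full_matrices=True)
V = Vt.T

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order,
# so they are reversed to match the descending order of the singular values.
lam_AtA = np.linalg.eigvalsh(A.T @ A)[::-1]  # n = 4 eigenvalues
lam_AAt = np.linalg.eigvalsh(A @ A.T)[::-1]  # m = 6 eigenvalues, last m-n are ~0

# Eqs. A.9a/A.9b, compared in squared form to avoid sqrt of tiny negative round-off
print(np.allclose(s**2, lam_AtA), np.allclose(s**2, lam_AAt[:4]))

# Columns of V are eigenvectors of A^T A:  (A^T A) v_i = s_i^2 v_i
print(np.allclose(A.T @ A @ V, V * s**2))
```

Note that $A A^T$ has m eigenvalues; only the first n of them are non-zero here, which is why the comparison for Eq. A.9b is restricted to the leading entries.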
A.5.10. QR-factorization with column pivoting (QRP)
Along with the SVD, QRP is another important matrix decomposition. QRP is also a rank-revealing decomposition; it is more practical (and cheaper) although less accurate than the SVD, and it has been used especially for solving rank-deficient least-squares problems [45, 58]. In order to give a complete picture of QRP as a matrix factorization method and as a rank-revealing decomposition, the QR-factorization is defined first.
Definition A.28 QR-factorization.
Every $A \in \mathbb{R}^{m \times n}$ of rank n can be factored as $A = QR$, where $Q \in \mathbb{R}^{m \times n}$ has orthonormal columns and $R \in \mathbb{R}^{n \times n}$ is invertible and upper triangular.
Given the definition of the QR-factorization of a full-rank matrix, the next step is to consider the extended method for a singular or nearly singular matrix A. This provides another method to calculate the numerical rank of A.
Definition A.29 QR-factorization with column pivoting.
Suppose $A \in \mathbb{R}^{m \times n}$ ($m \ge n$). A QR-factorization with column pivoting of A is a factorization $A\Pi = QR$, where $\Pi \in \mathbb{R}^{n \times n}$ is a permutation matrix, $Q \in \mathbb{R}^{m \times n}$ has orthonormal columns and $R \in \mathbb{R}^{n \times n}$ is an upper-triangular matrix whose diagonal elements decrease in magnitude, partitioned as
\[ R = \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix}, \]
where $R_{11} \in \mathbb{R}^{r \times r}$ and $R_{22}$ is (hopefully) small in norm. If, say, $\|R_{22}\|_2 = O(\mu)$, then from the fact that $\varsigma_{r+1} \le \|R_{22}\|_2$ (see Lemma A.2), it is concluded that the original matrix A is guaranteed to have at most numerical rank r.
In order to justify QRP as a rank-revealing factorization of a matrix A, the next lemma
relating the QR-factorization and the numerical rank of A is now established.
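Lemma A.2 QR-factorization and numerical rank.
Suppose $A = QR$ is the QR-factorization of $A \in \mathbb{R}^{m \times n}$ ($m \ge n$), with
\[ R = \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix} \tag{A.10} \]
and $R_{11} \in \mathbb{R}^{r \times r}$. If
\[ \varsigma_{\min}(R_{11}) \gg \|R_{22}\|_2 = O(\mu), \tag{A.11} \]
where $\mu$ is the machine precision, then A has numerical rank r.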
Finally, the rank-revealing QR-factorization with column pivoting (RRQRP) of a matrix A is defined.
Definition A.30 RRQRP factorization.
Suppose that a matrix $A \in \mathbb{R}^{m \times n}$ ($m \ge n$) has numerical rank r ($< n$). If there exists a permutation matrix $\Pi \in \mathbb{R}^{n \times n}$ such that $A\Pi$ has a QR-factorization $A\Pi = QR$, with
\[ R = \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix}, \]
$R_{11} \in \mathbb{R}^{r \times r}$ satisfying the inequality A.11, then the factorization $A\Pi = QR$ is called a rank-revealing QR-factorization with column pivoting of A.
Note that the main point in this definition is to use the column pivoting strategy to
determine the permutation matrix Π which yields a small R22.
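The column pivoting strategy can be sketched in a few lines. The function below is a didactic implementation based on modified Gram-Schmidt with greedy selection of the remaining column of largest residual norm; this choice, the function name `qr_column_pivoting`, the test matrix and the relative threshold are all assumptions made for this illustration. Production codes normally rely on Householder-based library routines instead (for example `scipy.linalg.qr` with `pivoting=True`).

```python
import numpy as np

def qr_column_pivoting(A):
    """Didactic column-pivoted QR (modified Gram-Schmidt): A[:, perm] = Q @ R."""
    V = np.array(A, dtype=float)          # working copy holding residual columns
    m, n = V.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    perm = np.arange(n)                   # column permutation (encodes Pi)
    for k in range(n):
        # Pivot: bring the remaining column with the largest residual norm to position k.
        j = k + int(np.argmax(np.linalg.norm(V[:, k:], axis=0)))
        V[:, [k, j]] = V[:, [j, k]]
        R[:k, [k, j]] = R[:k, [j, k]]
        perm[[k, j]] = perm[[j, k]]
        R[k, k] = np.linalg.norm(V[:, k])
        if R[k, k] > 0.0:
            Q[:, k] = V[:, k] / R[k, k]
        # Remove the new direction from all remaining columns.
        R[k, k + 1:] = Q[:, k] @ V[:, k + 1:]
        V[:, k + 1:] -= np.outer(Q[:, k], R[k, k + 1:])
    return Q, R, perm

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))  # exact rank 2

Q, R, perm = qr_column_pivoting(A)
d = np.abs(np.diag(R))
tol = 1e-10 * d[0]                        # assumed relative threshold
r = int(np.sum(d > tol))                  # estimated numerical rank
R22_norm = np.linalg.norm(R[r:, r:], 2)   # spectral norm of the trailing block
print(r, R22_norm)
```

In this sketch the trailing block $R_{22}$ is at round-off level, so the estimated rank agrees with the rank used to construct A. For rank-deficient inputs the trailing columns of Q produced by Gram-Schmidt are not reliable, which is one reason why Householder-based routines are preferred in practice.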
A.5.11. Numerical rank
The definition in A.5.4 refers to the ordinary rank (also called the exact rank), which is a discontinuous function of the elements of a matrix. In computer arithmetic, however, an exact rank is rarely observed; therefore the numerical rank is introduced here. Mathematically, in terms of the singular values, a matrix has rank r (in the sense of Definition A.5.4) if and only if $\varsigma_r > 0$ and $\varsigma_{r+1} = 0$. Computationally, and in many real-world applications, $\varsigma_{r+1}$ is not exactly zero but close to zero, or just much smaller than $\varsigma_r$. In this scenario, when a matrix has a cluster of small singular values and a well-determined gap between large and small singular values, delimited by an $\epsilon$-threshold, the number of large singular values reveals the numerical rank $r_\epsilon$ of the matrix.
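Definition A.31 Numerical rank of a matrix.
Suppose $A \in \mathbb{R}^{m \times n}$ ($m \ge n$). Then A is said to have $\epsilon$-numerical rank $r_\epsilon$ if
\[ \varsigma_1, \varsigma_2, \dots, \varsigma_{r_\epsilon} \ge \epsilon \ge \varsigma_{r_\epsilon + 1}, \dots, \varsigma_n \tag{A.12} \]
where $\varsigma_1, \dots, \varsigma_n \ge 0$ are the singular values of A and $\epsilon \in \mathbb{R}$ is a scalar defining the cluster of small singular values.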
This definition still implies that the matrix A has $r_\epsilon$ linearly independent columns. Note that a matrix with a single large gap in its singular value spectrum is the appropriate candidate for computing the numerical rank. When a low-rank approximation of a matrix is needed, the concept is still reasonable, but it should be applied with care.
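A minimal sketch of Definition A.31 follows. The helper name `numerical_rank`, the prescribed singular values and the threshold $\epsilon = 10^{-6}$ are assumptions chosen so that the spectrum shows a single clear gap; singular values exactly equal to $\epsilon$ are counted as small here.

```python
import numpy as np

def numerical_rank(A, eps):
    """Return the eps-numerical rank (number of singular values above eps) and the spectrum."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > eps)), s

# Build a hypothetical 6x4 matrix with prescribed singular values and a clear gap.
rng = np.random.default_rng(5)
U, _ = np.linalg.qr(rng.standard_normal((6, 4)))   # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # orthogonal matrix
s_true = np.array([1.0, 0.5, 1e-8, 1e-10])
A = U @ np.diag(s_true) @ V.T

r_eps, s = numerical_rank(A, eps=1e-6)
r_exact = np.linalg.matrix_rank(A)   # NumPy default tolerance is near machine precision
print(r_eps, r_exact)
```

With this threshold the two large singular values define the numerical rank, whereas the default tolerance of `np.linalg.matrix_rank` still counts the two small but non-zero singular values.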
A.5.12. Condition number
The condition number measures how small perturbations in the data affect the solution, that is, how sensitive the solution is to errors in the data. The formal definition of the condition number reads as follows.
Definition A.32 Condition number.
Suppose A is a nonsingular matrix. Then the condition number is
\[ \kappa(A) = \|A\|_2 \, \|A^{-1}\|_2 \tag{A.13} \]
where $\|\cdot\|_2$ is the spectral norm, that is, the matrix norm induced by the Euclidean norm of vectors.
The condition number of A is always greater than or equal to one, and it is invariant when A is multiplied by a nonzero constant [112]. If A is singular then $\kappa(A) = \infty$. In numerical analysis the condition number of a matrix A describes how accurately the solution of the system $Ax = b$ can be approximated. If $\kappa(A)$ is small the problem is well-conditioned, and if $\kappa(A)$ is large the problem is ill-conditioned. Furthermore, $\kappa(A)$ is also recognized as a measure of near-linear dependence (collinearity).
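Another expression for the condition number is
\[ \kappa(A) = \varsigma_{\max} / \varsigma_{\min}, \tag{A.14} \]
where $\varsigma_{\max}$ and $\varsigma_{\min}$ are the maximum and minimum singular values of A. If A is a symmetric positive definite matrix then
\[ \kappa(A) = \lambda_{\max} / \lambda_{\min}, \tag{A.15} \]
where $\lambda_{\max}$ and $\lambda_{\min}$ denote the largest and smallest eigenvalues of A. For a general symmetric matrix the eigenvalues are taken in absolute value.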
In terms of the singular values, an ill-conditioned matrix has a large ratio of the maximum singular value $\varsigma_1$ to the minimum singular value $\varsigma_{N_p}$, that is, a large condition number $\kappa$. The condition number thus checks how small the minimum singular value $\varsigma_{N_p}$ is relative to the maximum singular value $\varsigma_1$. For some matrices this smallness is defined by comparing $\varsigma_{N_p}$ with the standard zero [12] (if $\varsigma_1$ has a moderate value), whilst in other cases a large ratio might occur even though all singular values are far from zero.
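The equivalence of Eqs. A.13 and A.14 and the scale invariance of the condition number can be checked with a short sketch. The nearly singular 2-by-2 matrix and the scaling factor 5 are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical ill-conditioned matrix: the two rows are almost identical.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])

# Eq. A.13: product of the spectral norms of A and A^{-1}
kappa_norm = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)

# Eq. A.14: ratio of the largest to the smallest singular value
s = np.linalg.svd(A, compute_uv=False)
kappa_sv = s[0] / s[-1]

# NumPy's built-in 2-norm condition number, and the same for a scaled matrix
kappa_np = np.linalg.cond(A, 2)
kappa_scaled = np.linalg.cond(5.0 * A, 2)

print(kappa_norm, kappa_sv, kappa_np, kappa_scaled)
```

All four values agree up to round-off, and they are large even though no entry of A is small, which illustrates that ill-conditioning is a property of the ratio of singular values rather than of their absolute size.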
A.5.13. Collinearity index
Definition A.5.1 establishes the notions of exact and near-linear dependence of vectors. In practical cases exact linear dependence is not commonly found; instead, the vectors (or columns of a matrix) may be only near-linearly dependent. An indicator of the degree of near-linear dependence of the columns of a matrix is the collinearity index [18, 17, 46].
Definition A.33 Collinearity index.
Suppose A is a nonsingular matrix. Then the collinearity index is defined as
\[ \gamma(A) = 1 / \varsigma_{\min}, \tag{A.16} \]
where $\varsigma_{\min}$ is the smallest singular value of A.
This definition originates from the fact that, to measure the near collinearity of the matrix A, one looks for the minimal norm of the linear combination $A\beta$ under the constraint $\|\beta\| = 1$. The smallest singular value of A equals this minimal norm $\|A\beta\|$ [11, 18], and its inverse is the collinearity index.
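The minimisation behind the collinearity index can be illustrated numerically. In the sketch below, the 8-by-3 matrix with one nearly dependent column is a hypothetical example; the minimising unit vector is taken as the right singular vector belonging to the smallest singular value.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((8, 3))
# Make the third column almost a linear combination of the first two (assumed example).
A[:, 2] = A[:, 0] + A[:, 1] + 1e-3 * rng.standard_normal(8)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
beta_min = Vt[-1]                    # unit vector minimising ||A beta||
gamma = 1.0 / s[-1]                  # collinearity index, Eq. A.16

# Any other unit vector gives a norm that is at least the smallest singular value.
trial = rng.standard_normal((1000, 3))
trial /= np.linalg.norm(trial, axis=1, keepdims=True)
trial_norms = np.linalg.norm(trial @ A.T, axis=1)

print(gamma, np.linalg.norm(A @ beta_min), trial_norms.min())
```

The large value of `gamma` reflects the near-linear dependence that was built into the columns of this example.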
A.5.14. Covariance Matrix
The linear relationship between the random variables $x_1, x_2, \dots, x_n$ is commonly measured by the covariance. If the random variables are collected in a random vector $x = (x_1, x_2, \dots, x_n)^T$, their covariances are stored in the covariance matrix C, which is defined as follows:
Definition A.34 Covariance matrix.
Suppose $x \in \mathbb{R}^n$ is a random vector of random variables of finite variance. Then the covariance matrix
\[ C := E\left[ (x - E[x]) (x - E[x])^T \right] \tag{A.17} \]
is a matrix whose (i, j)-element $c_{ij} = E[(x_i - E(x_i))(x_j - E(x_j))] = \sigma_{ij}^2$ is the covariance of the i-th random variable $x_i$ with respect to the j-th random variable $x_j$. The diagonal element $c_{jj} = E[(x_j - E(x_j))^2] = \sigma_j^2$ is the variance of the j-th random variable $x_j$.
Note that $E(x_i)$ is the expected value (or mean) of the i-th entry of the vector x. The covariance matrix is also known as the dispersion matrix or variance-covariance matrix.
The variance of an estimator is illustrated graphically in Fig. A.1, where the probability distributions of the estimator $\Theta_1$ (with small variance) and of the estimator $\Theta_2$ (with large variance) are shown. Note that the distribution of each estimator is centered at the expected value $E[\Theta_i]$, i.e., the mean of the probability distribution of $\Theta_i$, which is (hopefully) close to the true parameter value $\theta^*$. In the figure, the estimator $\Theta_1$ is more precise than $\Theta_2$ because its estimates from several sample data sets will lie on the narrower probability distribution.
Figure A.1.: Probability distribution of the estimators Θ1 and Θ2
Properties
For the covariance matrix C defined by Eq. A.17 the following basic properties apply:
• C is symmetric positive semi-definite, i.e., $C \in PSD(n) \subset Sym(n)$;
• The trace of C is nonnegative, i.e., $\mathrm{Tr}(C) = \sum_{j=1}^{n} c_{jj} \ge 0$;
• The eigenvalues of C are all real and nonnegative, and the eigenvectors that belong to distinct eigenvalues are orthogonal;
• The determinant of C is nonnegative, i.e., $\det(C) = \prod_{j=1}^{n} \lambda_j(C) \ge 0$.
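The properties listed above can be checked on a sample covariance matrix. The sketch below assumes 1000 synthetic samples of a three-dimensional random vector; the mixing matrix is a hypothetical choice, and the sample covariance from `np.cov` is used as an estimate of the expectation in Eq. A.17.

```python
import numpy as np

rng = np.random.default_rng(4)
mix = np.array([[1.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 2.0]])             # assumed mixing matrix to correlate the variables
X = rng.standard_normal((1000, 3)) @ mix      # rows are samples, columns are variables

C = np.cov(X, rowvar=False)                   # 3x3 sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)          # real eigenvalues, orthonormal eigenvectors

print(np.allclose(C, C.T))                                 # symmetry
print(eigvals.min() >= 0.0)                                # nonnegative eigenvalues
print(np.isclose(np.trace(C), X.var(axis=0, ddof=1).sum()))  # trace = sum of variances
print(np.isclose(np.linalg.det(C), np.prod(eigvals)))      # det = product of eigenvalues
```

`np.linalg.eigh` is used because it exploits the symmetry of C and returns an orthonormal set of eigenvectors.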
A.5.15. Fisher-information Matrix
Before defining the Fisher-information matrix, it is important to know that the Fisher information describes the (maximum) information that may be extracted from an observable random variable $y \in \mathbb{R}^n$ to estimate the unknown parameter $\theta \in \mathbb{R}^m$ of a specified distribution. The probability density function (pdf) for y given $\theta$, which, viewed as a function of $\theta$, is also the likelihood function for $\theta$, is here $p(y; \theta)$.
Definition A.35 Fisher-information Matrix.
Suppose $y \in \mathbb{R}^n$ is a random vector denoting the available data vector and $\theta \in \mathbb{R}^m$ is the unknown parameter vector to be determined from y. Also assume the probability density function $p(y; \theta)$ and $\hat{\theta}$ as an estimator of $\theta$ based on y. Then the Fisher-information matrix (FIM) is the covariance of the partial derivative with respect to $\theta$ of the natural logarithm of the likelihood function:
\[ F = E[\Delta \Delta^T], \qquad \Delta = \frac{\partial \ln p(y; \theta)}{\partial \theta} \tag{A.18} \]
where $F \in \mathbb{R}^{m \times m}$ is a positive semi-definite symmetric matrix.
Assume a probability density function following a Gauss distribution
\[ p(y; \theta) = (2\pi)^{-n/2} |C_y|^{-1/2} \exp\left[ -\tfrac{1}{2} (y - y_m)^T C_y^{-1} (y - y_m) \right], \tag{A.19} \]
where n denotes the number of measurements (the dimension of y) and $y_m$ is the mean. Under this assumption the Fisher-information matrix reduces to
\[ F = S^T C_y^{-1} S \tag{A.20} \]
where $S = \partial y / \partial \theta$ is the sensitivity matrix and $C_y$ is the measurement covariance matrix (measurement error).
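Eq. A.20 can be evaluated directly once S and $C_y$ are available. The sketch below assumes a hypothetical straight-line model $y_m = \theta_1 + \theta_2 t$ sampled at five time points, with independent measurement errors of standard deviation 0.1; none of these values come from the thesis.

```python
import numpy as np

# Hypothetical linear model y_m = theta_1 + theta_2 * t with n = 5 measurements, m = 2 parameters.
t = np.linspace(0.0, 1.0, 5)
S = np.column_stack([np.ones_like(t), t])   # sensitivity matrix dy/dtheta, shape (5, 2)

sigma = 0.1                                  # assumed measurement standard deviation
C_y = np.diag(np.full(t.size, sigma**2))     # measurement covariance matrix, shape (5, 5)

# Eq. A.20: F = S^T C_y^{-1} S, computed with a linear solve instead of an explicit inverse
F = S.T @ np.linalg.solve(C_y, S)

print(F)
print(np.linalg.eigvalsh(F))                 # eigenvalues of the symmetric matrix F
```

The resulting F has one row and one column per parameter, so its size is set by the number of parameters and not by the number of measurements.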
Bibliography
[1] S. P. Asprey and S. Macchietto. Statistical tools for optimal dynamic model building.
Computers & Chemical Engineering, 24(2):1261–1267, 2000.
[2] S. P. Asprey and S. Macchietto. Designing robust optimal dynamic experiments.
Journal of Process Control, 12(4):545–556, 2002.
[3] Y. Bard. Nonlinear parameter estimation. Academic Press, New York, 1974.
[4] A. Bardow. Optimal experimental design of ill-posed problems: The meter approach.
Computers & Chemical Engineering, 32:115–124, 2008.
[5] A. Bardow and W. Marquardt. Incremental and simultaneous identification of reac-
tion kinetics: methods and comparison. Chemical engineering science, 59(13):2673–
2684, 2004.
[6] T. Barz, H. Arellano-Garcia, and G. Wozny. Handling uncertainty in model-
based optimal experimental design. Industrial & Engineering Chemistry Research,
49(12):5702–5713, 2010.
[7] T. Barz, S. Kuntsche, G. Wozny, and H. Arellano-Garcia. An efficient sparse ap-
proach to sensitivity generation for large-scale dynamic optimization. Computers &
Chemical Engineering, 35(10):2053–2065, 2011.
[8] T. Barz, D. C. Lopez C., H. Arellano-Garcia, and G. Wozny. Experimental evalua-
tion of an approach to online redesign of experiments for parameter determination.
AIChE Journal, 59:1981–1995, 2013.
[9] T. Barz, D. C. Lopez C., M. N. Cruz B., S. Korkel, and S. F. Walter. Real-time
adaptive input design for the determination of competitive adsorption isotherms in
liquid chromatography. Computers & Chemical Engineering, 94:104–116, 2016.
[10] D. M. Bates and D. G. Watts. Nonlinear Regression Analysis and its Applications.
Wiley, Toronto, Canada, 1988.
[11] D. A. Belsley. Conditioning diagnostics: Collinearity and weak data in regression.
John Wiley & Sons, New York, USA, 1991.
[12] D. A. Belsley, E. Kuh, and R. E. Welsch. Regression diagnostics: Identifying influ-
ential data and sources of collinearity. John Wiley & Sons, New York, USA, 1980.
[13] M. Bertero, T. A Poggio, and V. Torre. Ill-posed problems in early vision. Proceedings
of the IEEE, 76(8):869–889, 1988.
[14] E. Bingham and H. Mannila. Random projection in dimensionality reduction: ap-
plications to image and text data. In Proceedings of the seventh ACM SIGKDD
international conference on Knowledge discovery and data mining, pages 245–250.
ACM, 2001.
[15] S. Bitterlich and P. Knabner. Experimental design for outflow experiments based on
a multilevel identification method for material laws. Inverse Problems, 19:1011–30,
2003.
[16] Cruz Bournazou, H Arellano-Garcia, G Wozny, G Lyberatos, and C Kravaris. Asm3
extended for two-step nitrification–denitrification: a model reduction for sequencing
batch reactors. Journal of Chemical Technology and Biotechnology, 87(7):887–896,
2012.
[17] R. Brun, M. Kuhni, H. Siegrist, W. Gujer, and P. Reichert. Practical identifiability
of asm2d parameters - systematic selection and tuning of parameter subsets. Water
Research, 36(16):4113–4127, 2002.
[18] R. Brun, P. Reichert, and H. R. Kunsch. Practical identifiability analysis of large
environmental simulation models. Water Resources Research, 37(4):1015–1030, 2001.
[19] M. Burth, G. C. Verghese, and M. Velez-Reyes. Subset selection for improved pa-
rameter estimation in on-line identification of a synchronous generator. IEEE Trans-
actions on Power Systems, 14(1):218–225, 1999.
[20] G. Buzzi-Ferraris. New trends in building numerical programs. Computers & Chem-
ical Engineering, 35(7):1215–1225, 2011.
[21] G. Calafiore, Marina Indri, and Basilio Bona. Robot dynamic calibration: Optimal
excitation trajectories and experimental parameter estimation. Journal of robotic
systems, 18(2):55–68, 2001.
[22] E. F. Camacho and C. Bordons. Model Predictive Control (Advanced Textbooks in
Control and Signal Processing). Advanced Textbooks in Control and Signal Process-
ing. Springer, 2nd edition, 2004.
[23] C. A. Cardona and O. J. Sanchez. Fuel ethanol production: process design trends
and integration opportunities. Bioresource technology, 98(12):2415–2457, 2007.
[24] D. Carlson. Minimax and interlacing theorems for matrices. Linear Algebra and its
Applications, 54:153–172, 1983.
[25] J. Carrera and S. P. Neuman. Estimation of aquifer parameters under transient
and steady state conditions: 1. maximum likelihood method incorporating prior
information. Water Resources Research, 22(2):199–210, 1986.
[26] P. Chandrakant and V. Bisaria. Simultaneous bioconversion of cellulose and hemi-
cellulose to ethanol. Critical Reviews in Biotechnology, 18:295–331, 1998.
[27] O. T. Chis, J. R. Banga, and E. Balsa-Canto. Structural identifiability of systems
biology models: a critical comparison of methods. PloS one, 6(11):e27755, 2011.
[28] Y. Chu and J. Hahn. Parameter set selection for estimation of nonlinear dynamic
systems. AIChE journal, 53(11):2858–2870, 2007.
[29] Y. Chu and J. Hahn. Parameter set selection via clustering of parameters into
pairwise indistinguishable groups of parameters. Industrial & Engineering Chemistry
Research, 48(13):6000–6009, 2008.
[30] Y. Chu and J. Hahn. Generalization of a parameter set selection procedure based on
orthogonal projections and the d-optimality criterion. AIChE Journal, 58(7):2085–
2096, 2012.
[31] C. Cobelli and J. J. DiStefano. Parameter and structural identifiability concepts
and ambiguities: a critical review and analysis. American Journal of Physiology-
Regulatory, Integrative and Comparative Physiology, 239(1):R7–R24, 1980.
[32] Paul G Constantine, Eric Dow, and Qiqi Wang. Active subspace methods in theory
and practice: applications to kriging surfaces. SIAM Journal on Scientific Comput-
ing, 36(4):A1500–A1524, 2014.
[33] M. Doyle, T. F. Fuller, and J. Newman. Modeling of galvanostatic charge and dis-
charge of the lithium/polymer/insertion cell. Journal of the Electrochemical Society,
140(6):1526–1533, 1993.
[34] M. Doyle, J. Newman, A. S. Gozdz, C. N. Schmutz, and J.M. Tarascon. Comparison
of modeling predictions with experimental data from plastic lithium ion cells. Journal
of the Electrochemical Society, 143(6):1890–1903, 1996.
[35] R. E. T. Drissen, R. H. W. Maas, J. Tramper, and H. H. Beeftink. Modelling ethanol
production from cellulose: separate hydrolysis and fermentation versus simultaneous
saccharification and fermentation. Biocatalysis and Biotransformation, 27(1):27–35,
2009.
[36] R. E. T. Drissen, R. H. W. Maas, M. J. E. C. Van Der Maarel, M. A. Kabel, H. A.
Schols, J. Tramper, and H. H. Beeftink. A generic model for glucose production
from various cellulose sources by a commercial cellulase complex. Biocatalysis and
Biotransformation, 25(6):419–429, 2007.
[37] R. Faber, P. Li, and G. Wozny. Sequential parameter estimation for large-scale
systems with multiple data sets: 1. computational framework. Ind. Eng. Chem. Res.,
42:5850–5860, 2003.
[38] J. A. Fessler. Mean and variance of implicitly defined biased estimators (such as
penalized maximum likelihood): Applications to tomography. IEEE Transactions
on Image Processing, 5:493–506, 1996.
[39] S. Flila, P. Dufour, and H. Hammouri. Optimal input design for on-line identification:
a coupled observer-MPC approach. In International Federation of Automatic Control
(IFAC) World Congress, pages Paper 1722, pp. 11457–11462, Seoul, South Korea,
July 2008.
[40] I. Ford, D. M. Titterington, and C. P. Kitsos. Recent advances in nonlinear experimental
design. Technometrics, 31(1):49–60, 1989.
[41] G. Franceschini and S. Macchietto. Model-based design of experiments for parameter
precision: State of the art. Chemical Engineering Science, 63(19):4846–4872, 2007.
[42] G. Franceschini and S. Macchietto. Model-based design of experiments for parameter
precision: State of the art. Chemical Engineering Science, 63:4846–4872, 2008.
[43] F. Galvanin, M. Barolo, and F. Bezzo. Online model-based redesign of experiments
for parameter estimation in dynamic systems. Industrial & Engineering Chemistry
Research, 48:4415–4427, 2009.
[44] F. Galvanin, M. Barolo, G. Pannocchia, and F. Bezzo. Online model-based redesign
of experiments with erratic models: a disturbance estimation approach. Computers
& Chemical Engineering, 42:138–151, 2012.
[45] G. H. Golub and C. F. Van Loan. Matrix computations, volume 3. Johns Hopkins
University Press, 1996.
[46] A. Grah. Entwicklung und Anwendung modularer Software zur Simulation und Pa-
rameterschatzung in gaskatalytischen Festbettreaktoren. PhD thesis, Martin Luther
University Halle-Wittenberg, 2004.
[47] G. Guiochon, A. Felinger, D. G. Shirazi, and A. M. Katti. Fundamentals of prepara-
tive and nonlinear chromatography. Elsevier Inc., San Diego, USA, 2 edition, 2006.
[48] E. Haber, L. Horesh, and L. Tenorio. Numerical methods for experimental design of
large-scale linear ill-posed inverse problems. Inverse Problems, 24:055012, 2008.
[49] E. Haber, L. Horesh, and L. Tenorio. Numerical methods for the design of large-scale
nonlinear discrete ill-posed inverse problems. Inverse Problems, 26:025002, 2010.
[50] J. Hadamard. Lectures on cauchy’s problem in linear partial differential equations,
1923.
[51] J. M. Hammersley and D. C. Handscomb. Monte carlo methods, volume 1. Springer,
1964.
[52] P. C. Hansen. Rank-deficient and discrete ill-posed problems. Numerical aspects of
linear inversion. SIAM, Philadelphia, USA, 1998.
[53] P. C. Hansen. A matlab package for analysis and solution of discrete ill-posed prob-
lems. Technical report, Technical University of Denmark, www.netlib.org/numeralgo
and www.mathworks.com/matlabcentral/fileexchange, September 2007.
[54] L. Hascoet and V. Pascual. The tapenade automatic differentiation tool. ACM
Transactions on Mathematical Software, 39(3):1–43, 2013.
[55] A. C. Hindmarsh, P. N. Brown, K. E. Grant, S. L. Lee, R. Serban, D. E. Shumaker,
and C. S. Woodward. Sundials: Suite of nonlinear and differential/algebraic equation
solvers. ACM Transactions on Mathematical Software (TOMS), 31(3):363–396, 2005.
[56] A. E. Hoerl and R. W. Kennard. Ridge regression: applications to nonorthogonal
problems. Technometrics, 12(1):69–82, 1970.
[57] A. E. Hoerl and R. W. Kennard. Ridge regression: Biased estimation for nonorthog-
onal problems. Technometrics, 12:55–67, 1970.
[58] Y. P. Hong and C. T. Pan. Rank-revealing qr factorizations and the singular value
decomposition. Mathematics of Computation, 58(197):213–232, 1992.
[59] L. Horesh, E. Haber, and L. Tenorio. Optimal experimental design for the large-scale
nonlinear ill-posed problem of impedance imaging. Large-Scale Inverse Problems and
Quantification of Uncertainty, John Wiley & Sons, Ltd, pages 273–290, 2010.
[60] E. A. Jacobson. A statistical parameter estimation method using singular value
decomposition with application to Avra Valley aquifer in southern Arizona. PhD
thesis, The University of Arizona, 1985.
[61] S. Javeed, S. Qamar, A. Seidel-Morgenstern, and G. Warnecke. Efficient and ac-
curate numerical simulation of nonlinear chromatographic processes. Computers &
Chemical Engineering, 35(11):2294–2305, 2011.
[62] B. Jayasankar, B. Huang, and A. Ben-Zvi. Receding horizon experiment design
with application in sofc parameter estimation. In Mayuresh Kothare, Moses Tade,
Alain Vande Wouwer, and Ilse Smets, editors, 9th International Symposium on Dy-
namics and Control of Process Systems (DYCOPS 2010), pages 527–532, 2010.
[63] T. A. Johansen. On tikhonov regularization, bias and variance in nonlinear system
identification. Automatica, 33:441–446, 1997.
[64] SI Kabanikhin. Definitions and examples of inverse and ill-posed problems. Journal
of Inverse and Ill-Posed Problems, 16(4):317–357, 2008.
[65] D. Kaelin, R. Manser, L. Rieger, J. Eugster, K. Rottermann, and H. Siegrist. Ex-
tension of asm3 for two-step nitrification and denitrification and its calibration and
validation with batch tests and pilot scale data. Water research, 43:1680–1692, 2009.
[66] B. Kaltenbacher, A. Neubauer, and O. Scherzer. Iterative regularization methods for
nonlinear ill-posed problems, volume 6. Walter de Gruyter, 2008.
[67] KNAUER, Wissenschaftliche Gerate GmbH, Support. Personal communication
(2014-05-01).
[68] S. Korkel and H. Arellano-Garcia. Online experimental design for model valida-
tion. In Rita Maria de Brito Alves, Claudio Augusto Oller do Nascimento, and
Evaristo Chalbaud Biscaia Jr., editors, 10th International Symposium on Process
Systems Engineering - PSE 2009, volume 27, pages 1–8, Salvador-Bahia, Brazil,
2009. Elsevier.
[69] S. Korkel, I. Bauer, H. G. Bock, and J. P. Schloder. A sequential approach for
nonlinear optimum experimental design in dae systems. Scientific Computing in
Chemical Engineering II: Simulation, Image Processing, Optimization, and Control,
page 338, 1999.
[70] C. Kravaris, J. Hahn, and Y. Chu. Advances and selected recent developments in
state and parameter estimation. Computers & Chemical Engineering, 51:111–123,
2013.
[71] A. Kremling, S. Fischer, K. Gadkar, F. J. Doyle, T. Sauter, E. Bullinger, F. Allgöwer,
and E. D. Gilles. A benchmark for methods in reverse engineering and model
discrimination: problem formulation and solutions. Genome research, 14:1773–1785,
2004.
[72] S. Kuntsche, T. Barz, R. Kraus, H. Arellano-Garcia, and G. Wozny. Mosaic a
web-based modeling environment for code generation. Computers & Chemical Engi-
neering, 35(11):2257–2273, 2011.
[73] T. Lahmer. Optimal experimental design for nonlinear ill-posed problems applied
to gravity dams. Inverse Problems, 27:125005 (20pp), 2011.
[74] L. Ljung and P. E. Caines. Asymptotic normality of prediction error estimators
for approximate system models. Stochastics, 3(1-4):29–46, 1980.
[75] Y. C. Liang, H. P. Lee, S. P. Lim, W. Z. Lin, K. H. Lee, and C. G. Wu. Proper
orthogonal decomposition and its applications -part i: Theory. Journal of Sound
and Vibration, 252(3):527–544, 2002.
[76] Y. Lin and S. Tanaka. Ethanol fermentation from biomass resources: current state
and prospects. Applied microbiology and biotechnology, 69(6):627–642, 2006.
[77] O. Lisec, P. Hugo, and A. Seidel-Morgenstern. Frontal analysis method to determine
competitive adsorption isotherms. Journal of Chromatography A, 908(1-2):19–34,
2001.
[78] D. C. Lopez C., T. Barz, S. Korkel, and G. Wozny. Nonlinear ill-posed problem
analysis in model-based parameter estimation and experimental design. Computers
& Chemical Engineering, 77:24–42, 2015.
[79] D. C. Lopez C., T. Barz, M. Penuela, A. Villegas, S. Ochoa, and G. Wozny. Model-
based identifiable parameter determination applied to a simultaneous saccharifica-
tion and fermentation process model for bio-ethanol production. Biotechnology
Progress, 29(4):1064–1082, 2013.
[80] D. C. Lopez C., G. Wozny, A. Flores-Tlacuahuac, R. Vasquez-Medrano, and V. M.
Zavala. A computational framework for identifiability and ill-conditioning analy-
sis of lithium-ion battery models. Industrial & Engineering Chemistry Research,
55(11):3026–3042, 2016.
[81] T. Maly and L. R. Petzold. Numerical methods and software for sensitivity analysis
of differential-algebraic systems. Applied Numerical Mathematics, 20(1):57–79, 1996.
[82] D. W. Marquardt. An algorithm for least-squares estimation of nonlinear parameters.
Journal of the Society for Industrial & Applied Mathematics, 11(2):431–441, 1963.
[83] D. W. Marquardt. Generalized inverses, ridge regression, biased linear estimation,
and nonlinear estimation. Technometrics, 12:591–612, 1970.
[84] S Marsili-Libelli, S Guerrizio, and N Checchi. Confidence regions of estimated pa-
rameters for ecological systems. Ecological Modelling, 165(2):127–146, 2003.
[85] E. Martınez-Rosas, R. Vasquez-Medrano, and A. Flores-Tlacuahuac. Modeling and
simulation of lithium-ion batteries. Computers & Chemical Engineering, 35(9):1937–
1948, 2011.
[86] M. D. McKay, R. J. Beckman, and W. J. Conover. Comparison of three methods for
selecting values of input variables in the analysis of output from a computer code.
Technometrics, 21(2):239–245, 1979.
[87] K. A. P. McLean, S. Wu, and K. B. McAuley. Mean-squared-error methods for select-
ing optimal parameter subsets for estimation. Industrial & Engineering Chemistry
Research, 51:6105–6115, 2012.
[88] R. K. Mehra. Optimal input signals for parameter estimation in dynamic systems
-survey and new results. Automatic Control, IEEE Transactions on, 19:753–768,
1974.
[89] D. C. Montgomery and G. C. Runger. Applied statistics and probability for engineers.
John Wiley & Sons, 2010.
[90] J. J. More and D. C. Sorensen. Computing a trust region step. SIAM Journal on
Scientific and Statistical Computing, 4(3):553–572, 1983.
[91] S. Natarajan and J. H. Lee. Repetitive model predictive control applied to a sim-
ulated moving bed chromatography system. Computers & Chemical Engineering,
24(2-7):1127–1133, 2000.
[92] S. Ochoa, A. Yoo, J. U. Repke, G. Wozny, and D. R. Yang. Modeling and parameter
identification of the simultaneous saccharification-fermentation process for ethanol
production. Biotechnology progress, 23(6):1454–1462, 2007.
[93] F. O'Sullivan. A statistical perspective on ill-posed inverse problems. Statistical
science, pages 502–518, 1986.
[94] K. Palmer and K. L. Tsui. A minimum bias latin hypercube design. IIE Transactions,
33(9):793–808, 2001.
[95] M. Penuela V. Desenvolvimento de processo de hidrolise enzimatica e fermentacao
simultaneas para a producao de etanol a partir de bagaco de cana-de-acucar. PhD
thesis, Universidade Federal de Rio de Janeiro, 2007.
[96] M. Penuela V., J. Nascimento C., M. Bezerra, and N. Pereira Jr. Enzymatic hy-
drolysis optimization to ethanol production by simultaneous saccharification and
fermentation. In Applied Biochemistry and Biotecnology, pages 141–153. Springer,
2007.
[97] G. P. Philippidis and C. Hatzis. Biochemical engineering analysis of critical process
factors in the biomass-to-ethanol technology. Biotechnology progress, 13(3):222–231,
1997.
[98] G. P. Philippidis, T. K. Smith, and C. E. Wyman. Study of the enzymatic hydrolysis
of cellulose for production of fuel ethanol by the simultaneous saccharification and
fermentation process. Biotechnology and Bioengineering, 41(9):846–853, 1993.
206
Bibliography
[99] G. P. Philippidis, D. D. Spindler, and C. E. Wyman. Mathematical modeling of
cellulose conversion to ethanol by the simultaneous saccharification and fermentation
process. Applied biochemistry and biotechnology, 34(1):543–556, 1992.
[100] W. H. Press. Numerical recipes 3rd edition: The art of scientific computing. Cam-
bridge university press, 2007.
[101] F. Pukelsheim. Optimal design of experiments. Wiley, New York, 1 edition, 1993.
[102] J. Qian, P. Dufour, and M. Nadri. Observer and model predictive control for on-line
parameter identification in nonlinear systems. In IFAC International Symposium
on Dynamics and Control of Process Systems (DYCOPS), pages 571–576, Mumbai,
India, December 2013.
[103] V. Ramadesigan, V. Boovaragavan, M. Arabandi, K. Chen, H. Tsukamoto, R. Braatz,
and V. Subramanian. Parameter estimation and capacity fade analysis of lithium-
ion batteries using first-principles-based efficient reformulated models. ECS Trans-
actions, 19(16):11–19, 2009.
[104] V. Ramadesigan, K. Chen, N. A. Burns, V. Boovaragavan, R. D. Braatz, and
V. R. Subramanian. Parameter estimation and capacity fade analysis of lithium-
ion batteries using reformulated models. Journal of The Electrochemical Society,
158(9):A1048–A1054, 2011.
[105] C. E. Rasmussen and C. K. I. Williams. Gaussian processes for machine learning.
MIT press, Cambridge, USA, 2006.
[106] J. G. Reid. Structural identifiability in linear time-invariant systems. Automatic
Control, IEEE Transactions on, 22(2):242–246, 1977.
[107] R. Schenkendorf and M. Mangold. Online model selection approach based on
unscented Kalman filtering. Journal of Process Control, 23(1):44–57, 2013.
[108] A. P. Schmidt, M. Bitzer, A. W. Imre, and L. Guzzella. Experiment-driven electro-
chemical modeling and systematic parameterization for a lithium-ion battery cell.
Journal of Power Sources, 195(15):5071–5080, 2010.
[109] J. C. Schöneberger, H. Arellano-Garcia, G. Wozny, S. Körkel, and H. Thielert.
Model-based experimental analysis of a fixed-bed reactor for catalytic SO2 oxidation.
Industrial & Engineering Chemistry Research, 48(11):5165–5176, 2009.
[110] D. C. Sorensen. Newton’s method with a model trust region modification. SIAM
Journal on Numerical Analysis, 19(2):409–426, 1982.
[111] J. C. Spall. Introduction to stochastic search and optimization: estimation,
simulation, and control, volume 65. John Wiley & Sons, 2005.
[112] G. W. Stewart. Collinearity and least squares regression. Statistical Science, pages
68–84, 1987.
[113] J. D. Stigter, D. Vries, and K. J. Keesman. On adaptive optimal input design: a
bioreactor case study. AIChE Journal, 52(9):3290–3296, 2006.
[114] P. Stoica and T. L. Marzetta. Parameter estimation problems with singular
information matrices. IEEE Transactions on Signal Processing, 49(1):87–90, 2001.
[115] V. R. Subramanian, V. Boovaragavan, and V. D. Diwakar. Toward real-time simu-
lation of physics based lithium-ion battery models. Electrochemical and Solid-State
Letters, 10(11):A255–A260, 2007.
[116] V. R. Subramanian, V. Boovaragavan, V. Ramadesigan, and M. Arabandi. Math-
ematical model reformulation for lithium-ion battery simulations: Galvanostatic
boundary conditions. Journal of The Electrochemical Society, 156(4):A260–A271,
2009.
[117] R. C. Thompson. Principal submatrices IX: Interlacing inequalities for singular
values of submatrices. Linear Algebra and its Applications, 5(1):1–12, 1972.
[118] A. N. Tikhonov and V. Y. Arsenin. Solutions of ill-posed problems. V. H. Winston,
Washington, USA, 1977.
[119] A. Toumi and S. Engell. Optimization-based control of a reactive simulated moving
bed process for glucose isomerization. Chemical Engineering Science, 59(18):3777–
3792, 2004.
[120] S. Vajda, H. Rabitz, E. Walter, and Y. Lecourtier. Qualitative and quantitative
identifiability analysis of nonlinear chemical kinetic models. Chemical Engineering
Communications, 83:191–219, 1989.
[121] V. S. Vassiliadis, E. B. Canto, and J. R. Banga. Second-order sensitivities of general
dynamic systems with application to optimal control problems. Chemical Engineer-
ing Science, 54(17):3851–3860, 1999.
[122] M. Velez-Reyes. Decomposed algorithms for parameter estimation. PhD thesis, Mas-
sachusetts Institute of Technology, 1992.
[123] M. Velez-Reyes and G. C. Verghese. Subset selection in identification, and applica-
tion to speed and parameter estimation for induction machines. In Proceedings of
the 4th IEEE Conference on Control Applications, pages 991–997, Albany, 1995.
[124] E. Walter and L. Pronzato. On the identifiability and distinguishability of nonlinear
parametric models. Mathematics and Computers in Simulation, 42:125–134, 1996.
[125] E. Walter and L. Pronzato. Identification of parametric models. Springer, New York,
USA, 1997.
[126] B. A. Walther and J. L. Moore. The concepts of bias, precision and accuracy, and
their use in testing the performance of species richness estimators, with a literature
review of estimator performance. Ecography, 28(6):815–829, 2005.
[127] S. R. Weijers and P. A. Vanrolleghem. A procedure for selecting best identifiable
parameters in calibrating Activated Sludge Model No. 1 to full-scale plant data. Water
Science and Technology, 36:69–79, 1997.
[128] A. D. Wilson, J. A. Schultz, A. Ansari, and T. D. Murphey. Real-time trajectory
synthesis for information maximization using sequential action control and least-
squares estimation. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS),
2015.
[129] S. Wu, K. A. P. McLean, T. J. Harris, and K. B. McAuley. Selection of optimal
parameter set using estimability analysis and MSE-based model-selection criterion.
International Journal of Advanced Mechatronic Systems, 3:188–197, 2011.
[130] W. Wu, D. L. Massart, and S. De Jong. The kernel PCA algorithms for wide data.
Part I: Theory and algorithms. Chemometrics and Intelligent Laboratory Systems,
36(2):165–172, 1997.
[131] P. Xu. Truncated SVD methods for discrete linear ill-posed problems. Geophysical
Journal International, 135:505–514, 1998.
[132] N. Yakut, T. Barz, D. C. Lopez Cardenas, H. Arellano-Garcia, and G. Wozny. Online
model-based redesign of experiments for parameter estimation applied to closed-loop
controller tuning. Chemical Engineering Transactions, 32(June):1195–1200, 2013.
[133] N. Yakut, T. Barz, D. C. Lopez Cardenas, and G. Wozny. Online redesign technique
for closed-loop system identification. In Selected papers of the 11th International
Conference on Chemical and Process Engineering, volume 11 of AIDIC Conference
Series, pages 421–430, Milano, Italy, 2013. AIDIC.
[134] K. Z. Yao, B. M. Shaw, B. Kou, K. B. McAuley, and D. W. Bacon. Modeling
ethylene/butene copolymerization with multi-site catalysts: parameter estimability
and experimental design. Polymer Reaction Engineering, 11:563–588, 2003.
[135] V. M. Zavala, C. D. Laird, and L. T. Biegler. Interior-point decomposition ap-
proaches for parallel solution of large-scale nonlinear parameter estimation problems.
Chemical Engineering Science, 63(19):4834–4845, 2008.
[136] Y. Zhu and B. Huang. Constrained receding-horizon experiment design and parame-
ter estimation in the presence of poor initial conditions. AIChE Journal, 57(10):2808–
2820, 2011.