Laurenz Peleman
Development of an Adequate Modelling Methodology in Sustainable Energy Production: Biodiesel Synthesis and Thin Film Solar Cells Performance
Supervisors: Prof. dr. ir. Joris Thybaut, Prof. dr. Johan Lauwaert
Counsellor: Kenneth Toch
Master's dissertation submitted in order to obtain the academic degree of
Master of Science in Chemical Engineering
Department of Chemical Engineering and Technical Chemistry
Chairman: Prof. dr. ir. Guy Marin
Department of Industrial Technology and Construction
Chairman: Prof. Marc Vanhaelst
Faculty of Engineering and Architecture
Academic year 2014-2015
Preface
As May quietly passes by and this master thesis enters the stage of completion, it is getting
hard to deny that my five-year journey as a student at Ghent University is coming to an end.
During this period, I have been able to further explore the interesting fields of science and
technology while experiencing joyful moments and encountering opportunities which I
could not possibly have imagined at the moment it all started, on that sunny late afternoon
in August 2010 when I had finally bitten the bullet and decided to enroll at the engineering
faculty. Therefore, I would like to reserve this part to acknowledge all those who have,
whether or not directly, assisted and supported me throughout this thesis year and without
whom the past five years would not have been the same.
First of all, I would like to thank prof. dr. ir. Guy Marin and all staff members of the
Laboratory for Chemical Technology for the high-quality education and the
professional environment in which I have been trained as a chemical engineer. Thanks to my
promotors, prof. dr. ir. Joris Thybaut and prof. dr. Johan Lauwaert, for this challenging and
many-sided master thesis subject, for the interesting discussion sessions and the highly
appreciated willingness to clarify whatever was unclear. Special thanks go to my coach,
Kenneth, for his thorough support throughout the entire year, his valuable advice where
needed and his enthusiasm to see ‘challenges, not problems’ when results turned out once
more to be not that satisfactory. More personally, I wish him and his young family the very
best with the nearing welcoming of their third child.
I am very grateful to dr. Samira Khelifi of the Electronics and Information Systems
department for her help and time during the solar cell experiments, and to dr. ing. Evelien
Van de Steene of the Polymer Chemistry and Biomaterials Group, for her ever willing aid
and suggestions to tackle the variety of issues that were encountered during the
measurements for the chemical case study.
Writing this master thesis would not have been as straightforward without a pleasant
ambience, for which I want to thank all fellow-students. A special mention goes to
Alexandra, Brigitte and Tine, with whom I had the honor to share the control room for one
year. The small talk was a pleasant break from the thesis efforts, as well as from the
charming sounds of tube replacements, alarm signals and pressure washers arising from
the room next-door, while the dinner event earlier this year clearly demonstrated that you all
have outstanding culinary talents as well. Furthermore, I cannot get around the
unforgettable moments I have shared with Jenoff, Jens, Jeroen, Joeri, Lysander, Thomas and
Yoshi. Although, at the moment, it is hard to estimate how strongly the outcome of our no-
one-ever-died-because-of-a-healthy-dose-of-surrealism talks during regular, and additional, coffee
breaks will truly contribute to a better world, well, at least we made an attempt.
Finally, I want to thank my friends and family for their heartwarming and continual support
and belief in whatever I try to do. More specifically, I owe nothing but gratitude to my
parents, for having given me the chance to start this study and for creating, time after time,
the optimal conditions for me to accomplish it.
Laurenz Peleman
21 May 2015
FACULTY OF ENGINEERING AND ARCHITECTURE
Department of Chemical Engineering and Technical Chemistry
Laboratory for Chemical Technology
Director: Prof. Dr. Ir. Guy B. Marin
Laboratory for Chemical Technology • Technologiepark 914, B-9052 Gent • www.lct.ugent.be
Secretariat : T +32 (0)9 33 11 756 • F +32 (0)9 33 11 759 • [email protected]
Laboratory for Chemical Technology
Declaration concerning the accessibility of the master thesis
Undersigned, Laurenz Peleman, graduated from Ghent University, academic year 2014-2015, and is author of the master thesis with title: Development of an Adequate Modelling Methodology in Sustainable Energy Production: Biodiesel Synthesis and Thin Film Solar Cells Performance.
The author gives permission to make this master dissertation available for consultation and to copy parts of this master dissertation for personal use.
22 May 2015
Development of an Adequate Modelling Methodology
in Sustainable Energy Production:
Biodiesel Synthesis and Thin Film Solar Cells Performance
Laurenz Peleman
Promotors: prof. dr. ir. Joris Thybaut
prof. dr. Johan Lauwaert
Counsellor: dr. ir. Kenneth Toch
Master’s dissertation submitted in order to obtain the academic degree of Master of Science in
Chemical Engineering
Department of Chemical Engineering and Technical Chemistry
Chairman: prof. dr. ir. Guy Marin
Faculty of Engineering and Architecture
Academic year 2014-2015
Summary
A crucial step in the construction of theoretical models to describe physicochemical
phenomena is the accurate estimation of unknown model parameters. Typically, a statistical
analysis based on ordinary least squares regression is applied for this purpose, as it
has already yielded promising results in the kinetic modeling of chemical reactions.
Nevertheless, the theoretical framework of this technique requires a specific regularity of the
experimental errors which is not necessarily fulfilled for practical measurements. As a
consequence, the accuracy of the estimates is not assured. The robustness and overall
applicability of this methodology are therefore evaluated on two case studies that are rooted in
sustainable energy production: the estimation of kinetic parameters for a transesterification
reaction, relevant for biodiesel production, and the modelling of the electrical characteristics of
thin-layer solar cells. Additionally, alternative parameter estimation procedures were
investigated, for which Matlab routines were written. Their potential was benchmarked by
application to a simple model.
Keywords
Least squares regression; weighted regression; correlated errors; Bayesian estimation; MCMC
sampling; transesterification; kinetic modeling; biodiesel; CIGS solar cells; sustainable energy
Development of an Adequate Modeling
Methodology in Sustainable Energy Production:
Biodiesel Synthesis and
Thin Film Solar Cells Performance
Laurenz Peleman
Supervisors: dr. ir. K. Toch, prof. dr. ir. J.W. Thybaut, prof. dr. J. Lauwaert
Abstract: A crucial step in the construction of theoretical
models to describe physicochemical phenomena is the accurate
estimation of unknown model parameters. Typically, a statistical
analysis based on ordinary least squares regression is
applied for this purpose, as it has already yielded promising
results in the kinetic modeling of chemical reactions.
Nevertheless, the theoretical framework of this technique
requires a specific regularity of the experimental errors which is
not necessarily fulfilled for practical measurements. As a
consequence, the accuracy of the estimates is not assured. The
robustness and overall applicability of this methodology are
therefore evaluated on two case studies that are rooted in
sustainable energy production: the estimation of kinetic
parameters for a transesterification reaction, relevant for
biodiesel production, and the modelling of the electrical
characteristics of thin-layer solar cells. Additionally, alternative
parameter estimation procedures were investigated, for which
Matlab routines were written. Their potential was benchmarked
by application to a simple model.
Keywords: Least squares regression; weighted regression;
correlated errors; Bayesian estimation; MCMC sampling;
transesterification; kinetic modeling; biodiesel; CIGS solar cells;
sustainable energy
I. INTRODUCTION
Classical regression schemes are a powerful tool to obtain
optimal point values for unknown model parameters by fitting
a limited set of experimental data and allow for a complete
statistical assessment of the validity of these results [1, 2]. The
underlying theory relies on regularity conditions on the
experimental error of the data. The noise is assumed to follow a
normal distribution, centered at 0, having a constant variance
and no mutual correlation. Unfortunately, the fulfillment of
these theoretical requirements is not guaranteed in practice,
and as a consequence, the accuracy and performance of the
regression routine is in general not ensured as well.
In this work, this methodology was critically reviewed. The
evaluation is two-fold. At first, the performance of the current
methodology was assessed, by testing 1) its robustness
towards the experimentally used reactor setup for a
transesterification reaction, relevant for biodiesel production
processes, and 2) its applicability in scientific fields beyond
chemical kinetics, by the estimation of electrical properties of
thin-layer solar cells (TLSC’s) from current-voltage (I-V)
measurements. Secondly, three alternative parameter
estimation procedures are benchmarked on a simple linear
model. The first two are extensions of classical regression,
correcting it for violation of the theoretical restrictions. The
third method, the Bayesian approach, starts from a
fundamentally different view on parameter estimation and
allows theoretically for attractive features like optimized
weighing of the data and the inclusion of prior knowledge on
the parameter values.
II. CASE STUDY 1: KINETIC MODELING OF CONTINUOUS-FLOW
DATA ON TRANSESTERIFICATION
A. Assessing the robustness of the methodology
A kinetic model for the transesterification of methanol and
ethyl acetate on the ion-exchanging catalyst Lewatit K2629
has been developed recently [3]. The estimation of the kinetic
parameters was performed on data from a batch reactor setup.
Repeating the estimation based on continuous-flow data was
considered as a valuable way to evaluate its robustness.
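For reference, the model reaction in this case study is the
transesterification of ethyl acetate with methanol on the ion-
exchange resin, which yields methyl acetate and ethanol:
\( \mathrm{CH_3COOC_2H_5 + CH_3OH \rightleftharpoons CH_3COOCH_3 + C_2H_5OH} \)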
B. Procedures
The reactor setup consisted of a double-layer glass tube,
filled up with Raschig rings, with a plug of 2.0 g of freeze-
dried catalyst placed halfway. Thermal oil flows through the
outer shell of the tube, which allows for stable temperature
control of the reaction environment. The feed mixture,
consisting of methanol (MeOH), ethyl acetate (EtOAc) and n-
decane as internal standard, was sent through the setup at 5, 8
and 15 ml/min. The temperature was set at 30, 45 and 60 °C,
while feeds were made up with molar MeOH to
EtOAc ratios of 1:1, 5:1 and 10:1. Measurements at each of
these conditions hence yielded a set of 27 samples. These
were then analyzed by gas chromatography (GC) to determine their exact
composition.
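The conversion of the measured peak surfaces into concentrations
relies on the internal standard and the relative sensitivities
determined during calibration; one common form of this relation,
given here only as an illustration (the exact calibration used in
this work may differ in detail), is
\( C_i = \dfrac{1}{f_i}\,\dfrac{A_i}{A_{IS}}\,C_{IS} \)
with \( A_i \) the peak surface of component \( i \), \( A_{IS} \) and
\( C_{IS} \) the peak surface and known concentration of the internal
standard (n-decane), and \( f_i \) the relative sensitivity of
component \( i \).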
C. Results
The calculated concentrations of the different reactants and
products qualitatively showed the expected trends as a function of
the varied process variables. Conversion of the reactants
increased for increasing temperature and for a higher excess
of MeOH, while lower space time had a negative impact on
the conversion, see Figure 1. Unfortunately, the quantitative
results were more doubtful, as the calculated concentrations
were not consistent with the stoichiometry of the reaction, for
an unidentified reason.
Nevertheless, simulations were run on the collected data in
the simulation software Athena Visual Studio (AVS), using an
adapted version of the original kinetic model that suited the
continuous reactor setup.
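In essence, and assuming ideal plug-flow behavior over the
catalyst bed, such an adaptation replaces the batch balances by
steady-state balances over the catalyst mass of the generic form
\( \dfrac{\mathrm{d}F_i}{\mathrm{d}W} = R_{w,i}, \qquad F_i(W{=}0) = F_{i,0} \)
with \( F_i \) the molar flow rate of component \( i \), \( W \) the
catalyst mass and \( R_{w,i} \) the net specific rate of formation of
component \( i \); the exact formulation used in Chapter 2 may
include additional terms.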
In a first step, the kinetic parameters were set at the values as
reported in the original research. Comparison of the calculated
model predictions and the observations revealed a
considerable misfit. Subsequently, an attempt was made to
estimate new values for the kinetic parameters, i.e., those
which fit the continuous-flow data optimally. Unfortunately,
the routine did not yield significant estimates and hence, a
solid comparison with the reported values is not possible.
Figure 1 Conversion of EtOAc (%) for varying temperature,
molar feed composition and flow rates (♦ 5, ■ 8 and ▲ 15 ml/min)
III. CASE STUDY 2: MODELING OF CURRENT-VOLTAGE
CHARACTERISTICS OF THIN-LAYER SOLAR CELLS
A. Assessing the overall applicability
The proper description of the dark I-V characteristics of
TLSC’s was chosen to evaluate the ability of the current
methodology to model physical phenomena as well.
The application of photovoltaic cells as thin films on both
rigid and flexible carriers is considered as a promising route to
extend the production of solar energy. CIGS-based cells,
made of a copper-indium-gallium-selenium semiconductor,
yield attractive efficiencies, but limited knowledge about the
power loss mechanisms hinders effective scale-up of this
technology. These parasitic pathways causing current leakage
inside the cell are attributed to imperfections in the cell
structure and are therefore often highly localized.
Equivalent electronic networks are available to model this
behavior. The most comprehensive of these contains 8 parameters [4].
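To illustrate the structure of such equivalent-circuit models, the
sketch below evaluates a dark J-V curve for a simplified circuit
with a main junction diode, series resistance, an ohmic shunt and a
space-charge limited current (SCLC) term; it is not the 8-parameter
model of [4], and all parameter values are arbitrary placeholders.

% Illustrative equivalent-circuit sketch; parameter values are placeholders.
J0  = 1e-8;      % saturation current density [A/cm2]
A   = 20;        % exponent of the diode equation [1/V]
Rs  = 1.0;       % series resistance
Rsh = 1e4;       % shunt resistance
k   = 1e-6;      % SCLC prefactor [A/cm2]
m   = 2;         % SCLC exponent [-]
V   = -1:0.01:1;                 % applied voltage sweep [V]
J   = zeros(size(V));
for i = 1:numel(V)
    % the junction voltage V - J*Rs makes the current balance implicit in J
    bal = @(Jv) J0*(exp(A*(V(i) - Jv*Rs)) - 1) ...
              + (V(i) - Jv*Rs)/Rsh ...
              + k*sign(V(i) - Jv*Rs)*abs(V(i) - Jv*Rs)^m ...
              - Jv;
    J(i) = fzero(bal, 0);        % current density at this applied voltage
end
semilogy(V, abs(J)); xlabel('Voltage [V]'); ylabel('|J| [A/cm^2]');

The full model applied in this work additionally accounts for a
shunt tunneling pathway (cf. Figure 2), bringing the total
parameter count to eight.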
B. Procedures
Two pieces of 0.5 cm² surface area were cut out of the same
CIGS mother panel and fixed in the measuring device. The
voltage over and the current through the cell were measured
by two separate wires to minimize the bias of the results due
to the potential drop over the wires. The device was cooled with
liquid nitrogen to ‘freeze in’ the leakage mechanisms, which
allowed for measurements at temperatures ranging from 300
down to 110 K in steps of 10 K. The voltages over the cells
were varied between -1 and 1 V in 0.01 V increments.
C. Results
Parameter estimates were calculated for both cells at all
temperatures while both their significance and physical
meaning were ensured. This allowed for a current pathway
analysis as shown in Figure 2, giving the contributions of
each model term at each temperature. It was observed that not
all leakage effects were contributing equally in both cells,
which demonstrated the small-scale differences in parasitic
effects.
Figure 2 Contribution of the current pathways (main junction, shunt resistance, SCLC and shunt tunneling) for both cells (%)
IV. BENCHMARK ANALYSIS OF ALTERNATIVE PARAMETER
ESTIMATION PROCEDURES
A. Adapted regression schemes and Bayesian estimation
Three alternative parameter estimation techniques were
retained from a literature survey. The first performs a data-
based weighing of the residuals to correct for experimental
error with non-constant variance [5]. The weighing
factors \( w_i \), \( i = 1, \dots, n \), with \( n \) the number of experiments, are
calculated according to (1) after the introduction of the
transformation parameter \( \phi \), which has to be estimated from
the data:
\( w_i \propto |\hat{y}_i|^{2\phi - 2} \)   (1)
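A minimal sketch of such a weighting scheme, applied to the
linear test model used in the benchmark of Section IV, is given
below; the grid search over \( \phi \), the profiled log-likelihood
criterion and the synthetic data are illustrative choices of this
sketch rather than the exact routine used in this work.

% Illustrative data-based weighted regression for y = A*x + B (weights as in (1)).
rng(1);
x = linspace(1, 10, 20)';
ytrue = 5*x + 1;
y = ytrue + 0.1*ytrue.*randn(size(x));     % error variance grows with the response
X = [x, ones(size(x))];
phigrid = -1:0.1:2;  bestLL = -Inf;
for phi = phigrid
    beta = X\y;                            % ordinary least squares starting values
    for it = 1:25                          % iteratively reweighted least squares
        w = abs(X*beta).^(2*phi - 2);      % weighing factors according to (1)
        beta = lscov(X, y, w);             % weighted fit for this value of phi
    end
    r  = y - X*beta;
    LL = -0.5*numel(y)*log(sum(w.*r.^2)) + 0.5*sum(log(w));   % profiled log-likelihood
    if LL > bestLL, bestLL = LL; bestbeta = beta; bestphi = phi; end
end
fprintf('phi = %.2f, A = %.3f, B = %.3f\n', bestphi, bestbeta(1), bestbeta(2));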
Secondly, the violation of the assumption on zero-
correlation was taken into account, which is particularly
relevant for time series experiments [6]. The elected method
captures the error in a first-order autoregressive model, i.e.:
\( \varepsilon(t) = \rho\,\varepsilon(t - 1) + u(t) \)   (2)
which requires the calculation of the autocorrelation factor 𝜌.
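A compact sketch of a two-stage iterative correction of this type
(estimate \( \rho \) from the residuals, quasi-difference the data,
refit, and repeat) is shown below; the data generation and the
number of iterations are assumptions of this illustration, not
necessarily those of the routine benchmarked later.

% Illustrative two-stage iterative correction for AR(1) errors, eq. (2).
rng(2);
n = 50;  x = (1:n)';  X = [x, ones(n,1)];
e = zeros(n,1);  u = randn(n,1);  rho_true = 0.8;
for t = 2:n, e(t) = rho_true*e(t-1) + u(t); end    % AR(1) error series
y = 5*x + 1 + e;
beta = X\y;                                        % stage 1: ordinary least squares
for it = 1:10
    r   = y - X*beta;
    rho = sum(r(2:end).*r(1:end-1)) / sum(r(1:end-1).^2);   % autocorrelation estimate
    ys  = y(2:end) - rho*y(1:end-1);               % quasi-differenced observations
    Xs  = X(2:end,:) - rho*X(1:end-1,:);
    beta = Xs\ys;                                  % stage 2: refit and iterate
end
fprintf('rho = %.2f, A = %.3f, B = %.3f\n', rho, beta(1), beta(2));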
Lastly, Bayesian estimation was considered. For this
approach, all statistical inference on the parameters 𝜷 of the
model 𝑦𝑖𝑗 = 𝑓𝑗(𝒙𝒊, 𝜷) is extracted from the posterior density
function 𝑝(𝜷|𝒚):
\( p(\boldsymbol{\beta} \mid \boldsymbol{y}) \propto \prod_{i=1}^{n} \left| \boldsymbol{v}^{(i)}(\boldsymbol{\beta}) \right|^{-(m+2)/2} \)   (3)
where \( \boldsymbol{v}^{(i)}(\boldsymbol{\beta}) = \big\{ [y_{ij} - f_j(\boldsymbol{x}_i, \boldsymbol{\beta})]\,[y_{il} - f_l(\boldsymbol{x}_i, \boldsymbol{\beta})] \big\}_{j,l = 1, \dots, m} \)
and \( m \) the number of responses for each experiment. Affine
invariant MCMC sampling was used to evaluate (3) efficiently throughout parameter space [7, 8].
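As an illustration of this sampler, the sketch below implements
the basic stretch move of [8] for an ensemble of walkers on a toy
two-parameter Gaussian log-posterior; the ensemble size, the
stretch parameter a = 2 and the target itself are arbitrary
choices of this sketch.

% Illustrative affine invariant ensemble sampler (stretch moves only).
logp = @(b) -0.5*((b(1) - 5)^2/0.1 + (b(2) - 1)^2/0.05);   % toy log-posterior
d = 2;  K = 20;  nsteps = 2000;  a = 2;
walkers = repmat([5; 1], 1, K) + 0.5*randn(d, K);          % initial ensemble
chain = zeros(d, K, nsteps);
for s = 1:nsteps
    for k = 1:K
        j = randi(K-1);  if j >= k, j = j + 1; end         % pick a different walker
        z = ((a - 1)*rand + 1)^2 / a;                      % z ~ g(z) proportional to 1/sqrt(z)
        Y = walkers(:,j) + z*(walkers(:,k) - walkers(:,j));    % stretch move
        if log(rand) < (d - 1)*log(z) + logp(Y) - logp(walkers(:,k))
            walkers(:,k) = Y;                              % accept the proposal
        end
    end
    chain(:,:,s) = walkers;
end
samples = reshape(chain(:,:,501:end), d, []);              % discard burn-in
fprintf('posterior means: A = %.3f, B = %.3f\n', mean(samples(1,:)), mean(samples(2,:)));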
B. Procedures
Routines were encoded in Matlab for all techniques. To
compare their performance with that of classical regression,
parameter estimation was performed on the single-response
linear model 𝑦 = 5𝑥 + 1. An experimental data set was
simulated by adding Gaussian noise 𝜀~𝒩(0, 𝑉) to the exact
response value, with the covariance matrix 𝑉 to be specified.
This way, a specific correlation structure can be imposed
on the ‘experimental’ data in order to test, or deliberately violate,
the theoretical limits of ordinary regression. For each of the
techniques, 10 subsequent runs were performed to evaluate the
consistency of their outcome.
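A minimal sketch of such a data generator is given below; the
AR(1)-type covariance matrix is an illustrative choice, and any
valid covariance matrix V can be imposed in the same way.

% Simulating 'experimental' data for y = 5x + 1 with a prescribed error covariance V.
n = 20;  x = linspace(0, 10, n)';
sigma = 0.5;  rho = 0.7;
V = sigma^2 * rho.^abs((1:n)' - (1:n));      % covariance with serial correlation
noise = chol(V, 'lower')*randn(n, 1);        % correlated Gaussian noise ~ N(0, V)
y = 5*x + 1 + noise;                         % simulated observations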
C. Results
1) Data-based weighing
Each experimental data set was simulated by explicitly
assuming that the error variance scales with the square of the
corresponding response. In principle, optimal weighing
factors correspond to 𝜙 = 0 in equation (1) for this situation.
A comparison of the point estimates resulting from
both the weighted and the classical regression is shown in Figure
3. It follows that, for all runs, the optimal parameter values are
more accurately estimated by the weighing technique.
Nevertheless, the calculated confidence intervals on the
estimates, which are not depicted for clarity, turned out to be
almost equally broad for both methods. Hence, the weighted
estimation is in fact only as informative as the classical
regression. Moreover, the weighing routine was found to
perform considerably worse for a lower quality of the data, as
the optimal fit to highly scattered observations was poor.
Figure 3 Estimates for the model parameters A (left) and B (right)
for 10 separate runs from the weighted regression (filled markers)
and ordinary least squares (open markers). The dotted line
corresponds to the true parameter values.
2) Correction for serial correlation
To evaluate the benefits of accounting for correlation
between observations, the experimental data set was designed
to obey a specified mutual dependence. A comparison
between the results from adapted and ordinary regression is
depicted in Figure 4. Although clearly demonstrating the
considerable impact of correlation on the performance of
ordinary regression, it shows the rather limited gain in
accuracy of the adapted procedure as well. Moreover, when
simulating experimental data with slightly or highly positive
mutual correlation, the outcomes of the corrected and classical
regression were almost identical.
Figure 4 Estimates for the model parameters A (left) and B (right)
for 10 separate runs from explicit correction for serial correlation
(filled markers) and ordinary least squares (open markers). The
dotted line corresponds to the true parameter values.
3) Bayesian estimation with MCMC sampling
In contrast to regression routines, Bayesian estimation
offers statistical inference on unknown model parameters in
the form of a so-called posterior density distribution over
parameter space, rather than point estimates and
corresponding confidence intervals. The marginalized
versions of this distribution are shown in Figure 5. Clearly,
both functions attain their global maximum in the close
vicinity of the actual model parameter values. Nevertheless,
the calculated 95% probability density interval was found to
be broader than its analogue from ordinary least squares
regression.
Figure 5 Marginalized posterior density function
for the model parameters
V. CONCLUSIONS
In this thesis, the currently applied methodology to obtain
statistical inference on unknown model parameters is critically
reviewed. The quality of the data from continuous-flow
transesterification experiments, performed with the aim of
evaluating the robustness of the estimation procedure, did not
allow for a quantitative assessment. Nevertheless, the results
did qualitatively show the expected trends. On the
other hand, the application of the methodology to the
modeling of thin-layer solar cell performance proved to be
successful, opening perspectives for future research in this
field. Lastly, the benchmark analysis of the adapted
regression techniques on the rudimentary model demonstrated
that the offered gains in performance were not always
consistent. Bayesian estimation, in turn, yielded accurate, yet
less precise results than classical regression.
REFERENCES
1. Thybaut, J.W., Kinetic Modeling and Simulation - University Course. 2014, Ghent University.
2. Toch, K., An intrinsic kinetics based methodology for multi-scale modeling of chemical reactions. 2014.
3. Van de Steene, E., Kinetic study of the (trans)esterification catalyzed by gel and macroporous resins. 2014.
4. Williams, B.L., et al., Identifying parasitic current pathways in CIGS solar cells by modelling dark J–V response. Progress in Photovoltaics: Research and Applications, 2015.
5. Pritchard, D.J., J. Downie, and D.W. Bacon, Further Consideration of Heteroscedasticity in Fitting Kinetic Models. Technometrics, 1977. 19(3): p. 227-236.
6. Seber, G.A.F. and C.J. Wild, Nonlinear Regression. 2003: Wiley.
7. Stewart, W.E. and M. Caracotsios, Computer-Aided Modeling of Reactive Systems. 2008: Wiley.
8. Goodman, J. and J. Weare, Ensemble samplers with affine invariance. Communications in Applied Mathematics and Computational Science, 2010. 5(1): p. 65-80.
Table of Contents
Table of Contents i
List of Figures iii
List of Tables vi
List of Symbols vii
Chapter 1 The noble art of model building 1
1.1 Aim of the thesis 3
1.2 Outline of this work 4
1.3 References 4
Chapter 2 Case study: kinetic modeling of continuous-flow transesterification 5
2.1 Biofuels: a promising alternative to fossil feedstocks? 5
2.1.1 Catalytic pathways to biodiesel production 8
2.1.2 Outline of this chapter 11
2.2 Reactor-scale kinetic modeling of the transesterification reaction on macroporous resins 11
2.2.1 Reaction network for catalytic transesterification of methanol and ethyl acetate 11
2.2.2 Model for continuous reactor configuration 13
2.3 Experimental setup and procedures 15
2.3.1 Reactants and catalyst 15
2.3.2 Tubular reactor setup 16
2.4 Discussion of the experimental data 20
2.5 Parameter estimation and statistical analysis 23
2.6 References 26
Chapter 3 Case study: modeling of current-voltage characteristics of thin layer solar cells 27
3.1 Solar energy: towards a bright, sustainable future? 27
3.1.1 General working principle of a photovoltaic device 30
3.1.2 Thin film solar cell technology 33
3.1.3 Outline of this chapter 35
3.2 Model for the dark current-voltage characteristic of CIGS solar cells 36
3.2.1 Ideal versus non-ideal electric behavior of solar cells 36
3.2.2 Modelling parasitic current pathways and non-idealities in a CIGS heterojunction solar
device 37
3.3 Experimental setup and procedures 41
3.3.1 Overview of the statistical analysis 42
3.4 Analysis of the results 44
3.4.1 Results of the statistical assessment 44
3.4.2 Physical interpretation of the results 50
3.5 References 56
Chapter 4 Literature review on alternative parameter estimation techniques 58
4.1 Tackling the heteroscedasticity issue: towards a proper handling of heterogeneous variance
of the experimental error 58
4.1.1 Data-based weighing of the residuals 58
4.1.2 Robust estimation and outlier detection 63
4.2 Accounting for serial correlation of the error 66
4.2.1 Explicit modelling of the serial correlation of the error term 67
4.2.2 Second-order statistical regression 70
4.3 Bayesian statistical assessment 72
4.3.1 A Bayesian view on parameter estimation 72
4.3.2 Bayesian parameter estimation 74
4.3.3 Posterior density distribution for relevant scenarios in kinetic parameter estimation 77
4.3.4 Posterior inference on model parameters 82
4.3.5 Including insights by an informative prior function 87
4.4 References 89
Chapter 5 Benchmark analysis of alternative parameter estimation techniques 91
5.1 Data-based weighted regression 92
5.2 Explicit modelling of serial correlation in the data 96
5.3 Bayesian estimation by MCMC posterior sampling 100
5.4 References 108
Chapter 6 Conclusions and future work 108
Appendix 110
A.1 Matlab routines for alternative techniques to estimate model parameters 110
A.1.1 Data-based weighing 110
A.1.2 Correcting for serial correlation 111
A.1.3 Bayesian estimation with affine invariant MCMC 113
A.2 Lab journal: table of contents 116
List of Figures
Figure 1-1 Reaction network for the catalytic hydrogenation of benzene to
cyclohexane, demonstrating the high degree of complexity of kinetic model building
for even a limited number of reactants and products [1] .................................................. 2
Figure 2-1 EU28 greenhouse gas emissions by sector in 2012 ......................................... 6
Figure 2-2 Evolution of the biodiesel (dark columns) and total biofuel (light columns)
consumption and the share of biodiesel (line) in the EU transport sector [12, 13] ........ 8
Figure 2-3 General scheme for the transesterification of natural fat materials to
produce methyl esters, the actual biodiesel components, with R1,2,3 the carbon chain
of the fatty acid ........................................................................................................................ 9
Figure 2-4 Microscopic structure of gel-type (left) and macroporous resins (right) [22]
.................................................................................................................................................. 10
Figure 2-5 Reaction mechanism for the transesterification of ethyl acetate and
methanol on an ion-exchange resin .................................................................................... 12
Figure 2-6 Original catalyst (left) and the material after dialyzing (right) ................... 16
Figure 2-7 Tubular reactor setup as used during the continuous-flow experiments .. 17
Figure 2-8 Experimentally obtained conversions of ethyl acetate for varying flow
rates, temperatures and initial molar feed composition .................................................. 22
Figure 2-9 Comparison of the observed (markers) and the simulated (line) conversion
of ethyl acetate concentration based on the reported kinetic parameter values for the
studied temperature, feed composition and volumetric flow (♦ 5, ■ 8 and ▲ 15
ml/min) .................................................................................................................................... 24
Figure 3-1 Evolution of the global energy demand by fuel over the last 50 years and
outlooks for the near future [1] ........................................................................................... 28
Figure 3-2 Recent evolution of the total photovoltaic production capacity worldwide
.................................................................................................................................................. 29
Figure 3-3 Transfer of electrons (black) and holes (white) across a p-n
(hetero)junction by means of diffusion ( ) and drift ( ). Recombination
induces the formation of a neutral depletion zone around the interface ...................... 31
Figure 3-4 Current-voltage characteristic of a typical CIGS solar cell in the dark (
) and under illumination ( ) .............................................................................. 32
Figure 3-5 Schematic overview of the multilayer structure of the CIGS thin layer
solar cell .................................................................................................................................. 35
Figure 3-6 Ideal (dashed line) vs. non-ideal (solid) current-voltage profiles, showing
the different non-idealities to be explained ....................................................................... 36
Figure 3-7 Typical parallel current pathways proposed for explaining the non-ideal
behavior of real solar cells and the equivalent electric circuit [22] ................................ 38
Figure 3-8 Experimental setup in real life and schematically, showing the two-wire
configuration .......................................................................................................................... 42
Figure 3-9 Observed current-voltage characteristics of the first cell ............................. 44
Figure 3-10 Best fitting curve when considering resistance of non-ideal contacts only
.................................................................................................................................................. 45
Figure 3-11 Best fitting curve when considering non-ideal contacts and shunt
resistance for the first cell at 290 K ...................................................................................... 46
Figure 3-12 Best fitting curve when considering non-ideal contacts, shunt resistance
and space charge limited current leakage for the first cell at 290 K............................... 46
Figure 3-13 Parity diagram for the first cell at 290 K ....................................................... 48
Figure 3-14 Residual plot for the first cell at 290 K ........................................................... 49
Figure 3-15 Parameter estimates and corresponding 95% individual confidence
intervals for the first cell ....................................................................................................... 51
Figure 3-16 Parameter estimates and corresponding 95% individual confidence
intervals for the second cell ................................................................................................ 52
Figure 3-17 Contribution of the suggested leakage pathways (main junction, shunt
resistance, SCLC and shunt tunneling) for both solar cells, for different temperatures
(first cell: left, second cell: right) ....................................................................... 55
Figure 4-1 Illustrative example of residual plots for a case with constant variance
(left) and strong heteroscedastic experimental errors (right) ......................................... 59
Figure 4-2 Typical lag-plots of the residuals of an uncorrelated (left) and positively,
first-order correlated (right) data set .................................................................................. 68
Figure 4-3 Typical likelihood distribution for the parameters in a linear (left) and
non-linear model (right) with two parameters ................................................................. 76
Figure 4-4 Illustration of the convergence of a MCMC sampling routine towards an
unknown probability distribution (full line) [32] ............................................................. 86
Figure 4-5 MCMC sampling from a one-dimensional target distribution for different
variances of the normal proposal distribution: 0.05 (left), 1 (middle) and 100 (right)
Top: actual distribution (red line) and sample histogram (blocks) Bottom: trace-plot
of the sampled parameter value as function of the iteration number ........................... 87
Figure 5-1 Estimates for the model parameters A (left) and B (right) for different
values of the scaling factor σ2 for 10 subsequent runs. Filled symbols denote the
results of the weighted regression, open markers correspond to classical least squares
estimation. The dotted line corresponds to the true parameter values. Remark the
varying scaling of the y axis................................................................................................. 93
Figure 5-2 Optimal values for the transformation parameter for all three considered
scaling factors and for all runs ............................................................................................ 95
Figure 5-3 Weighted residuals for the ordinary (open symbols) and weighted least
squares estimation (filled markers) .................................................................................... 96
Figure 5-4 Estimates for the model parameters A (left) and B (right) for different
values of the autocorrelation ρ as function of the run number. Filled symbols denote
the results of the Two Stage Iterative regression, open markers correspond to
classical least squares estimation. The dotted line corresponds to the true parameter
values. Remark the varying scaling of the y axis. ............................................................. 98
Figure 5-5 Point estimates for the autocorrelation values for all runs as tinted markers
with the corresponding, true values given by the dotted lines ...................................... 99
Figure 5-6 Residual (left) and lag plot (right) for ordinary (open symbols) and
corrected regression (filled markers) for a run with ρ having a pre-specified value of -
0.99. The solid line in the lag plot is given by e_corr,i = −0.99 e_corr,i−1 .................. 100
Figure 5-7 Results of the Metropolis-Hastings procedure for c = 10, giving the
marginal probabilities from the sampled posterior (above) and the sampled values
throughout the iteration for the model parameters A (left) and B (right) .................. 102
Figure 5-8 Results of the Metropolis-Hastings procedure for c = 1, giving the
marginal probabilities from the sampled posterior (above) and the sampled values
throughout the iteration for the model parameters A (left) and B (right) .................. 102
Figure 5-9 Results from the Adaptive Metropolis algorithm, giving the marginal
probabilities from the sampled posterior (above) and the sampled values throughout
the iteration for the model parameters A (left) and B (right) ........................................ 104
Figure 5-10 Stretch move of walker Xk along the line through walker Xj, yielding
candidate sample Y. All other walkers (grey dots) do not participate. ....................... 105
Figure 5-11 Results from the affine invariant MCMC algorithm, giving the marginal
probabilities from the sampled posterior (above) and the sampled values throughout
the iteration for the model parameters A (left) and B (right) ........................................ 106
List of Tables
Table 2-1 Mass of reactants and internal standard used to make up each of the
studied feed compositions, in gram ................................................................................... 15
Table 2-2 Settings of the FID ................................................................................................ 18
Table 2-3 Retention time for the reactants, products and internal standard ................ 18
Table 2-4 Relative sensitivities of the reactants, products and internal standard........ 20
Table 2-5 Reported kinetic parameters for the transesterification of ethyl acetate and
methanol on Lewatit K2629, as originally estimated from batch-reactor data [22] ..... 23
Table 3-1 Binary correlation matrix of the model parameters for the first cell at 290 K
.................................................................................................................................................. 49
List of Symbols
Roman symbols
𝐴 exponent of the diode equation [1/V]
𝐴𝑖 peak surface from GC analysis for component 𝑖 [cm²]
𝑎𝑖 activity of component 𝑖 [mol/l]
𝐶𝑖 concentration of component 𝑖 [mol/l]
𝑑 Durbin-Watson test criterion
𝐸𝑎 activation energy [J/mol]
𝑒𝒊 residual on the 𝒊’th observation
𝐹𝑖 molar flow rate of component 𝑖 [mol/s]
𝐼 current [A]
𝐽 current density [A/cm²]
𝐽0 saturation current density [A/cm²]
𝐾𝑒𝑞 equivalent equilibrium constant of the catalyzed transesterification
𝐾𝐸𝑡𝑂𝐴𝑐 equilibrium constant for adsorption of ethyl acetate on the catalyst
𝐾𝑀𝑒𝑂𝐴𝑐 equilibrium constant for adsorption of methyl acetate on the catalyst
𝐾𝑆𝑅 equilibrium constant for the surface reaction on the catalyst
𝑘 prefactor of SCLC current density [A/cm²]
𝑘𝐵 Boltzmann constant [J/K]
𝐿(𝜷|𝒚) likelihood function of the model parameters
𝑚 number of responses
- Chapter 3: SCLC exponent [-]
𝑛 number of experiments
- Chapter 3: ideality factor of a diode [-]
𝑝 number of model parameters
𝑝(𝜷|𝒚) posterior density function of the model parameters
𝑝(𝜷) prior density function of the model parameters
𝑝(𝜷, 𝜮) prior density function of the model parameter values and the covariance
matrix of the experimental error
𝑞(𝜷) proposal distribution
𝑅 Universal gas constant [J/mol/K]
𝑅𝑠 series resistance [Ω/cm²]
𝑅𝑠ℎ shunt resistance [Ω/cm²]
𝑅𝑤,𝑖 net specific reaction rate of formation of component 𝑖 [mol/kg/s]
𝑟𝑤 specific reaction rate [mol/s/kgcat]
𝑇 absolute temperature [K]
𝑢 white noise
𝑽 covariance matrix of the experimental errors
𝑉 voltage [V]
V̇ volumetric flow rate [l/s]
𝑾 weighing matrix
𝑊 catalyst weight [g]
𝑤𝑖 weighing factor for the i’th observation
𝑋𝑖 conversion of component 𝑖 [-]
𝑥𝑖 𝑖’th process condition
𝑦𝑖𝑗 j’th observed response for the i’th experiment
ŷi model prediction for responses yi
Greek symbols
𝛽 model parameter
β̂ estimated parameter value
𝜀 experimental error
𝛾𝑖 activity coefficient of component 𝑖 [-]
𝜙 Box-Cox transformation parameter
𝜌 autocorrelation function
𝜮𝒊 covariance matrix of the responses of 𝑖’th experiment
𝜎 standard deviation
List of abbreviations
AR(1) first-order autoregressive model
CIGS copper-indium-gallium-selenium
FID flame ionization detector
MCMC Markov-Chain Monte-Carlo
PTBS Power-Transform-Both-Sides
SCLC space-charge limited current
TLSC thin-layer solar cells
Chapter 1
The noble art of model building
Ever since man’s ability to comprehend the phenomena that occur in his surroundings
reached a sufficient level of sophistication, he has striven for a thorough understanding of
the fundamental mechanisms which underlie them. This intrinsic drive for profound insights
into the complex biological, physical or chemical principles that control the environment has
triggered scientific research and, in turn, enabled the development of revolutionary
technologies. Indeed, the talent to derive quantitative cause-and-consequence relationships between
a particular outcome of the process and the conditions at which it took place, allows for
precise predictions on the behavior of the phenomenon in a broader range of situations. In
particular, the circumstances can then be manipulated, or tuned, in such a way that the
process functions exactly, at least to some extent, as it was stipulated beforehand.
At this point, the need emerges for an exact and accurate expression that fully describes the
dynamics of the process under study. This role is played by the theoretical model, which
tries to capture the effect of all relevant process conditions on its outcome, in a
mathematically closed way. Starting from fundamental principles that are believed to be
applicable to the observed system, a number of parameters will be introduced during the
construction of this relation to reflect the specific characteristics of the studied process.
Although exact values for these parameters are required to allow for an exact model-based
prediction of the process outcome for a specific set of conditions, their direct measurement
may be non-trivial or even practically infeasible, in particular when these parameters
correspond to properties of the system at a very small, micro- to even nanometer scale.
The modeling of chemical reactions is a prime example of a field of study which is confronted
almost continuously with a host of unknown and not directly measurable characteristics
during model building. Being an inevitable part of assessing the performance of a
chemical reactor, the accurate description of the chemical kinetics, i.e. the net rate at which
reactants are consumed and products are formed, is crucial both for the design of well-
behaved and stable reactor setups and for the search for innovative, high-performance
catalysts or improvements in reactor technology that allow the reactions to be operated
more safely, at milder conditions or at a lower cost. Moreover, a more efficient
operation of chemical reactors will result in a lower environmental impact of the production
process and is hence highly attractive from an ecological and sustainable point of view as
well.
Unfortunately, as chemical reactions, and especially catalytic systems, tend to occur step-
wise, i.e. by producing a series of intermediate components, which can undergo multiple
side reactions, a complex reaction network arises, see Figure 1-1. The combination of a
potentially very high number of elementary steps and their strong mutual interplay results
in a global reaction rate which is a highly complex function of both process variables and the
kinetic parameters of the different steps in the network.
Figure 1-1 Reaction network for the catalytic hydrogenation of benzene to cyclohexane,
demonstrating the high degree of complexity of kinetic model building for even a limited
number of reactants and products [1]
Moreover, certainly in catalytic systems where adsorption of some reactants on the catalyst
surface has to occur before the actual reactions take place, the strength of the interaction
between the catalyst and the chemicals and the (un)stabilizing effect of the adsorption on
certain elementary steps, have to be accounted for as well. Since a direct monitoring of these
parameters during the reaction is not possible, different techniques have to be applied to
unravel their actual value indirectly.
An often applied methodology to obtain quantitative values for the unknown kinetic
parameters in a reaction mechanism relies on the regression of the model to a finite set of
experimental data. This way, the observed values for a certain response variable, e.g. the
concentration of a component in the reactor outlet flow, are fitted optimally to the model-
based predictions for that response. Those values for the unknown kinetic parameters that
minimize the objective function of the regression, typically the sum of the squared deviations
between the predicted and measured values, are assumed to be the best possible parameter
estimates. Although intuitively attractive, the validity of this procedure is only assured
under some stringent conditions on the structure of the error on the measured response,
arising from random, unintentional irregularities during the experiment or analysis. First,
assuming that the actual value for the response at conditions 𝒙 is given by the model
function 𝑓(𝒙), then it follows for the observed response value 𝑦:
𝑦 = 𝑓(𝒙) + 𝜀 (1-1)
which introduces the experimental error 𝜀 as an additive term. Moreover, the following
requirements have to be met concerning the mutual dependence of the errors of different
experiments:
1. The random experimental error follows a normal distribution;
2. All experimental errors have an expected value of 0;
3. All experimental errors have the same variance or, stated differently, are
homoscedastic;
4. The experimental errors are uncorrelated.
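In compact notation, and purely as a restatement of the above, the classical least squares
estimate and the accompanying error assumptions can be written as
\( \hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^{n} \big( y_i - f(\boldsymbol{x}_i, \boldsymbol{\beta}) \big)^2, \qquad \varepsilon_i \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2), \quad i = 1, \dots, n. \)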
When these conditions are met, a solid mathematical framework allows for the assessment
of the uncertainty on the values of the best fitting parameter estimates, together with
inference on the significance of the estimated parameters individually, i.e. whether they
differ substantially from 0 and hence contribute significantly to the model, and on the
significance, adequacy and quality of the regression as a whole [2]. This way, besides a set of
best performing values for the unknown parameters in the kinetic model, this regression
methodology allows for a full statistical analysis and evaluation of the outcome as well.
Nevertheless, the need for a set of stringent assumptions to ensure the correctness of the
regression procedure forms a considerable pitfall, as there is no a priori certainty that these
conditions will be fulfilled for a certain set of experimental data. Hence, applying the
methodology unwittingly to observations that violate the requirements will potentially yield
inaccurate results. Additionally, any statistical analysis that relies on it or any quantitative
conclusions that are drawn from it may lack any mathematical foundation as well, even though
the regression output would wrongly suggest otherwise. Logically, this introduces a considerable
degree of uncertainty on the validity of the outcome of the statistical assessment. Therefore,
the reliability of the estimation procedure becomes more a matter of sound judgement by the
researcher, or of his experience with regression techniques. Despite these disadvantages, the
classical approach towards parameter estimation as described above, finds its application in
almost all in-house kinetic modeling efforts.
1.1 Aim of the thesis
In this work, the performance of the currently used methodology towards kinetic parameter
estimation will be critically evaluated and reviewed where necessary.
The evaluation will be two-fold. First, the potential for its overall applicability in
scientific fields beside chemical kinetics will be assessed by implementing the methodology
in a different scientific field. Following the remark made
above on the significant contribution of model building to gains in sustainability of chemical
technology, the modeling of the electrical characteristics of thin-layer solar cells, a truly
physical process which is a promising candidate for the generation of green energy, has been
selected as a suitable case. Strong technological advances in this segment of photovoltaic
technology have too often been based on trial-and-error in the past. Recent breakthroughs in
the fundamental modeling of the physics of the system have resulted in an increased interest
for the theoretical approach as a helpful tool to guide further research. By applying the
modeling methodology that has already proved to be successful in the modeling of complex
chemical processes, it is hoped to contribute to this important evolution.
On the other hand, the robustness of the modeling methodology will be quantified by
comparing the estimated model parameters for a type reaction in both a batch and
continuous-flow configuration. Up to this moment, it is common practice to base kinetic
modeling on experimental data from continuous reactor setups, primarily because it is
believed that following a certain batch reaction in time is detrimental to the mutual independence
of the different data points. By testing the impact of the reactor type on the estimated
model parameters for a transesterification reaction, which is a key step in the production
process of biodiesels, the importance of this concern will be determined.
Besides this twofold evaluation of the currently used methodology, the potential of three
alternative estimation procedures will be benchmarked. By offering small adaptations to the
classical regression framework, these candidate routines try to guard the performance of the
parameter estimation against violation of the theoretical conditions on the experimental
error, as stated above. After implementation of the routines in Matlab, their added value will
be evaluated on a simple linear model, and suggestions will be made about their
incorporation into the current routines.
1.2 Outline of this work
The evaluation of the current statistical methodology by means of the two case studies is
covered in the upcoming two chapters. The outcome of the transesterification experiments
and of the subsequent assessment of the robustness of the estimation procedures will be
discussed in chapter 2. Subsequently, the overall applicability of the methodology will be
evaluated on the modeling of the thin-layer solar cells, which is presented in chapter 3. The
benchmark analysis of alternative estimation techniques is covered in the second part of the
thesis. In chapter 4 the results of the literature survey on alternative techniques towards
model parameter estimation will be presented. Special attention will be paid to the
underlying mathematical theory and the corresponding algorithms, given that current in-
house knowledge about the suggested methods is limited and in view of their final encoding
in Matlab. The results of the actual benchmark analysis from their application on a
rudimentary, linear test model are presented in Chapter 5. Finally, the overall conclusions
and outlooks for further research in this field will be the subject of chapter 6.
1.3 References
1. Tapan, B., Hydrogenation of aromatics: single-event microkinetic (SEMK) methodology and scale-up. 2012.
2. Thybaut, J.W., Kinetic Modeling and Simulation - University Course. 2014, Ghent University.
Chapter 2
Case study: kinetic modeling of
continuous-flow
transesterification
2.1 Biofuels: a promising alternative to fossil feedstocks?
A growing demand for personal transport and a gradual transition towards a globalized
economy over the last decades have increased the needs for transporting people and
goods. A vast majority of current engines is powered through the combustion of
conventional, carbon-rich fuels, i.e. gasoline, diesel and kerosene. Because of these
increasing transport requirements there is a continuous search for new sources of crude
oil. For decades, both the discovery of new oilfields and the expansion of crude
exploitation rates at existing facilities have been able to meet these growing demands.
Although the interest in technology relying on alternative power sources, with electricity
driven engines as the most promising candidate, has grown and resulted in the onset of
commercialization, the strong dependence of the transport sector on fossil feedstocks is
expected to persist for at least the upcoming years.
Over the last decade, global awareness has grown about the downside of this situation.
As for any carbon-based energy source, the combustion of fossil fuels for transport
purposes generates significant emissions of CO2 into the atmosphere, a greenhouse gas
known to contribute strongly to global climate change [1]. In 2012, transport represented
24.3 % of the total greenhouse gas emissions in the European Union, thereby being the
second biggest contributor, preceded only by energy production facilities, as
shown in Figure 2-1 [2]. In contrast to other sectors, e.g. industry, households and
agriculture, which showed an average decrease of 15% in greenhouse gas emissions from
1990 to 2012, the emissions associated with transport have seen a strong growth of 36 %,
despite improved vehicle efficiencies. Although road traffic holds the highest share
in transport emissions, the aviation and maritime sectors show the fastest growth of all.
Figure 2-1 EU28 greenhouse gas emissions by sector in 2012
The supplies of crude oil are known to be finite. Although predictions about the exact
moment of depletion remain ambiguous, even today the gradual diminishing of
high-quality and easily accessible crude oil reserves has already resulted in an increased
use of heavier and less pure feedstocks. This requires more complex cracking and
refining operations and hence higher production costs to obtain the same product quality
[3]. Logically, the future exploitation of less accessible oilfields will require higher
investment and operating costs, which will induce an additional challenge for the
profitability of the process as a whole. Finally, the distinctly unequal spread of crude oil
production facilities over the world has rendered its abundant availability highly
vulnerable to local conflicts and geopolitical motives. Multiple crises in the past have
already demonstrated that long-term stability of crude supply is never assured [4].
A growing concern about these potential issues of a continued dependence of economic
and societal needs on crude oil has driven the search for attractive alternative energy
sources, which preferably show both sustainability and long-term secured supply.
Moreover, the ratification of international protocols to counter climate change by
reducing global emissions of greenhouse gases, e.g., the Kyoto protocol (1997),
which stipulated a decrease of at least 5% by 2012 compared to the situation in 1990 [5],
and the even more stringent engagement of the European Union member states to reduce
their emissions by at least 20% by 2020 [6], has resulted in an active energy policy to
stimulate the research and development of sustainable energy carriers [7].
One of the most promising outcomes of this quest was found in the field of liquid
biofuels, and more specifically in the counterparts of currently used fuels but originating
from biomass instead of fossil feedstocks. Theoretically, such technology combines a
number of advantages that make it very attractive as a competitor to conventional fuels.
Indeed, since biofuel compounds are chemically identical to the components found in
conventional fuels, present-day engines require little or no adaptation, depending on
whether the biofuels are fed in pure state or blended with a conventional fossil fuel.
When produced from materials of vegetable origin, biofuels are intrinsically CO2
neutral, as all carbon dioxide which is released upon combustion of these compounds
was captured out of the atmosphere by photosynthesis during the growth of the original
plants. Hence, when the cycle is completed, no net CO2 has been released into the
environment. In this regard, it is important to remark that multiple steps in the
production process of biofuels are energy-consuming, so that the environmental benefit
of the biofuel pathways is, at least partly, cancelled. The high production costs are often
reported as the major impediment for this technology to expand to a fully mature
alternative to fossil fuels that is capable of being cost-effective and profitable without the
need for governmental support through favourable taxation measures and subsidy
programs [8].
A second advantage of bio-based alternatives is their relatively abundant availability: as
meeting growing demands only requires additional cultivation, biofuels offer the
prospect of a potentially inexhaustible energy source. This picture has proven to be
too optimistic, as the main feedstocks for biofuel production nowadays are found in
high-quality crops, especially sugar-rich or oily vegetables which need high-quality
agricultural land for proper growth. This way, the production of sufficient amounts of
feedstocks will potentially compete with food supply. The need for sufficient agricultural
areas may moreover be met at the expense of forested territories. As a consequence, the
unambiguous transition to such biofuels, belonging to the so-called first generation as they
are produced out of specific crops, raises important ethical questions, which makes their
feasibility as a stand-alone alternative to fossil fuels at least questionable [9].
Nevertheless, over the last decade, the consumption of conventional biofuels has seen a
distinct increase, primarily booming in the early and mid-2000’s and somewhat
stabilizing more recently, as is concluded from Figure 2-2, which shows the recent trends
for the European Union. Due to the variety of fuel products that need to be replaced,
ranging from short-chain gasoline components through kerosene to heavier diesel fractions,
different biomass conversion processes have been developed, each starting from a
specific type of feedstock. Gasoline fractions have been successfully replaced by bio-
ethanol, being produced by the direct fermentation of sugar-rich biomass, primarily
sugarcane, sugar beet and starch [10]. The production of bio-ethanol is a well-established
technology and is still the subject of extensive research efforts to further improve energy
requirements and product yields. Active governmental policies have stimulated the
market expansion over the last years with Brazil, the United States and the European
Union as the largest customers [11].
Figure 2-2 Evolution of the biodiesel (dark columns) and total biofuel (light columns)
consumption and the share of biodiesel (line) in the EU transport sector [12, 13]
However, the largest share of today's biofuel consumption in the European Union is taken by biodiesel, the actual subject of this work. Because diesel fractions consist of much longer hydrocarbon chains than gasoline, different feedstocks and production processes are required. The most relevant feedstocks mentioned in literature are fatty materials, including vegetable oils and animal greases, which are transformed into useful hydrocarbon compounds by a transesterification process, discussed more extensively in the next section. The importance of the biodiesel sector in the field of renewable fuels is depicted in Figure 2-2 as well, which shows its vast and steady dominance in the total biofuel consumption of the transport sector.
2.1.1 Catalytic pathways to biodiesel production
The most attractive biomaterial feedstocks that offer the organic building blocks to produce the relatively long hydrocarbon chains that characterize diesel fractions are fatty substances. Indeed, vegetable oils and greases of animal origin are in fact tri-esters of a glycerol molecule and three fatty acids. Depending on the type of fatty acids that are built in, a different triglyceride, or fat type, arises. Fatty acids are carboxylic acids with an aliphatic carbon chain, typically ranging between 12 and 24 carbon atoms for natural sources. Most vegetable oils consist of unsaturated fatty acid chains, so that at least one double bond is present in the chain. Animal fats, on the other hand, are typically far more saturated, which explains why the latter are found in solid form at room temperature.
The direct use of raw vegetable oils as fuels in ordinary diesel engines is hindered by an
excessive viscosity, an insufficient volatility and a limited stability of the oil, the latter
primarily due to the unsaturation of the chain [14]. Therefore, reforming of the fatty
feedstock into actual diesel-like components prior to its application for fueling is
required.
[Figure 2-2 data: yearly consumption in Mtoe (left axis) and share in energy consumption in % (right axis), 2005-2012]
The preferred industrial pathway towards biodiesel production comprises a transesterification of the fatty compounds with methanol, to obtain glycerol and a methyl ester blend, the latter being the final biodiesel product, see Figure 2-3. As for all transesterification reactions, this process requires base or acid catalysis. Common practice makes use of a homogeneous catalyst, with NaOH and KOH as the most eligible candidates. Unfortunately, the choice for a homogeneous base catalyst strongly suffers from the requirement for subsequent purification. To ensure sufficient biodiesel quality, the catalyst has to be washed out of the product blend, which produces large amounts of wastewater to be treated while simultaneously hindering complete recovery of the catalyst. It goes without saying that any separation step is strongly reflected in higher operating costs as well [15].
Moreover, raw biodiesel feedstock tends to contain a relatively high concentration – more
than 5% – of free fatty acids. These compounds will react with strong base catalysts to
form soaps, which are detrimental to the product quality and consequently require a specific separation step. Trace amounts of water in the feedstock are known to favor the
saponification as well [16]. An esterification of the fatty acids with methanol under acid
catalysis prior to the actual transesterification is an attractive pathway to overcome this
issue.
Figure 2-3 General scheme for the transesterification of natural fat materials to produce
methyl esters, the actual biodiesel components, with 𝑅1,2,3 the carbon chain of the fatty acid
Because of these strong disadvantages associated with homogeneous base catalysis, interest has grown in heterogeneous alternatives that do not show the abovementioned issues. One potential candidate that has received above-average research interest is the use of inert heterogeneous carriers, typically metallic oxides, with base or acid functionalities, which allow the need for purification and separation to be circumvented [17, 18]. It has to be noted that this reduction of the process complexity comes at the expense of a loss in reaction rate [19].
A relatively novel approach towards the heterogeneous catalysis of biodiesel production processes has emerged from the field of ion-exchange resins [20, 21]. Given some major advantages compared to homogeneous catalysis, including a straightforward separation from the product stream, only mild corrosion and a favorable long-term stability allowing for re-use, catalysis by acid cation-exchange resins has been identified as a promising alternative. Additionally, the undesired soap formation does not occur under acid catalysis.
Suitable materials consist of an inert matrix carrying acid functions which are sufficiently loosely bound to allow for interchange with ions in the surrounding solution. Although this technique shows some similarities to sorption of solutes on a solid, ion exchange is intrinsically stoichiometric with respect to the electrolyte concentration of the solution: for any ion that is attached to the material, another ion with the same charge is released. Two types of acid ion-exchange materials are distinguished: gel-type and macroporous resins. The inert carrier is formed by a network of styrene polymers and divinylbenzene crosslinkers, while acid sulfonic functionalities serve as the actual active sites. When submerged in polar solvents, part of the solvent molecules interact with the acid functions and become protonated. Due to solvation effects, more solvent molecules are attracted into the material and swelling takes place. The resulting volume expansion is crucial for the effective functioning of the catalyst as it increases the accessibility of the active sites. In contrast to gel-type materials, macroporous resins have an intrinsically open structure, see Figure 2-4, allowing for an enhanced accessibility of the active sites, even in non-polar solvents. Consequently, the catalytic performance of macroporous resins is also less dependent on the swelling degree of the material.
Figure 2-4 Microscopic structure of gel-type (left) and macroporous resins (right) [22]
The potential application of ion-exchange resins for the catalysis of transesterification
reactions has triggered the expansion of research efforts on this alternative approach, and
a proper understanding of the chemical phenomena taking place on the micro-scale has
been the object of extensive modeling efforts. Recently, a complete kinetic model has
been developed, combining the chemical interplay between the reactants and the resin,
the effect of the morphology of the catalyst and the impact of the swelling degree [22]. In
Section 2.2, the most relevant aspects of this model will be briefly discussed.
2.1.2 Outline of this chapter
The assessment of the robustness of the current in-house modeling methodology will be carried out on the basis of the kinetic model introduced below. The reported values of the corresponding kinetic parameters have been determined from experimental data for the catalytic transesterification of methanol and ethyl acetate in a batch-type reactor setup. For the actual estimation procedure, a classical nonlinear regression has been performed in the statistical software package Athena Visual Studio. By repeating these experiments in a tubular reactor setup under the same process conditions, continuous-flow data have been collected and in turn subjected to a regression procedure. By comparing the outcome of this estimation with the original values from the batch experiments, this case study will provide a quantitative assessment of both the accuracy and the robustness of the modeling methodology towards a change in reactor configuration.
In what follows, the kinetic model under study will first be discussed in the relevant detail. After a thorough description of the experimental setup, the results of the kinetic parameter estimation and the subsequent analysis of the performance of the regression methodology will be given.
2.2 Reactor-scale kinetic modeling of the
transesterification reaction on macroporous resins
2.2.1 Reaction network for catalytic transesterification of methanol and ethyl
acetate
The use of ion-exchange resins rather than the more common, “inert-carrier based” catalysts introduces some additional degrees of complexity in the kinetic modeling effort. Indeed, the coupling of a reactive component to the resin involves an exchange process instead of the purely sorptive adherence encountered in the majority of heterogeneous catalysis. Additionally, the impact of the resin’s swelling on its effectiveness to accelerate the transesterification reaction has to be properly described and translated mathematically, in order to allow for a model that combines accuracy and completeness. The kinetic model was originally developed for the transesterification of ethyl acetate with an excess of methanol, so that the latter acts as the polar solvent required for swelling of the ion-exchange resin.
Basically, the proposed kinetic model is an extension of the typical adsorption-based
behavior which has already been repeatedly described for heterogeneous catalysis.
Because of the strong interaction of the resin with the polar methanol, it is assumed that
the latter will occupy all available active sites of the catalyst upon contact with the
reaction mixture. By reacting with the acid functions, the adsorbed methanol becomes
protonated, and the resin starts to swell. The uptake of solvent will stop once
thermodynamic equilibrium is reached. The reaction mechanism is graphically
represented in Figure 2-5. It is important to remark that all steps in this mechanism, both the actual surface reaction and the different exchange reactions, are explicitly treated as reversible.
The first step in the actual transesterification reaction is the adsorption of the ester, in this
case ethyl acetate, on the active sites of the material. Since all sites are occupied by
solvent molecules, this boils down to an exchange between the protonated methanol and
the ester in the liquid phase inside the resin, depicted as step 1 in the figure. The protonated, and hence activated, ester molecule is then readily available for an Eley-Rideal type surface reaction with a free, i.e. unbound, solvent molecule from the liquid
phase, step 2. The actual transesterification step occurs in the adsorbed species, depicted
as step 3, and was found to be rate-determining. After the release of an ethanol molecule into the solvent, the desired, yet still adsorbed, methyl acetate end-product is formed, see step 4. Eventually, the transesterification is completed by a second exchange step, shown as step 5, in which the protonated methyl acetate is again substituted by a methanol molecule from the liquid phase.
Figure 2-5 Reaction mechanism for the transesterification of ethyl acetate and methanol on an ion-exchange resin
2.2.2 Model for continuous reactor configuration
Since a tubular reactor will be used to perform the continuous-flow experiments,
discussed in detail in Section 2.3.2, the setup is modelled as ideal plug flow. Therefore,
the following mass balance holds for each component at every point along the catalyst
bed in the tube:
$$\frac{dF_i}{dW} = R_{w,i} \qquad (2\text{-}1)$$
where 𝐹𝑖 is the molar flow rate of component 𝑖 through the reactor, 𝑊 is the catalyst mass and 𝑅𝑤,𝑖 is the net specific rate of formation of component 𝑖. Except for the internal standard,
which does not participate in the transesterification reaction, material balances like (2-1)
are applied to the reactants methanol and ethyl acetate, as well as for the products
ethanol and methyl acetate. Assuming a constant volumetric flow, i.e. neglecting any
density changes of the mixture due to reaction, the material balance is rewritten in terms
of the concentration of the components, as:
$$\frac{dC_i}{dW} = \frac{1}{\dot{V}}\, R_{w,i} \qquad (2\text{-}2)$$
where $\dot{V}$ is the total volumetric flow rate through the tube. When adopting these reactor
model equations, it is implicitly assumed that the operation occurs in the intrinsic kinetic
regime, in accordance with the findings on negligible transport limitations as stated in
the original research.
Since the net specific rate of formation 𝑅𝑤,𝑖 strongly depends on the reaction mechanism
underlying the reaction under study, it is in general a complex function of both the
concentrations of the reactive components and of the reaction temperature. The
derivation of a useable analytical expression for this term requires some assumptions on
the chemical kinetics of the system. As was mentioned in Section 2.2.1, the surface reaction denoted as step 3 was identified as the rate-determining step. Consequently, the rate of the global reaction equals that of the surface reaction, as the rates of all other steps in the mechanism depicted in Figure 2-5 instantaneously adjust to it according to what equilibrium requires. Additionally, the frequently applied quasi-steady state approximation is used for all adsorbed species on the catalyst surface. Combining both
assumptions results in a closed relation between the rate 𝑟𝑆𝑅 of the surface reaction and
the activities 𝑎𝑖 of the involved species:
$$r_w \equiv r_{SR,w} = \frac{k_{SR} K_{EtOAc}\left(a_{EtOAc} - \dfrac{1}{K_{eq}}\,\dfrac{a_{EtOH}\, a_{MeOAc}}{a_{MeOH}}\right)}{1 + K_{EtOAc}\,\dfrac{a_{EtOAc}}{a_{MeOH}} + K_{MeOAc}\,\dfrac{a_{MeOAc}}{a_{MeOH}}} \qquad (2\text{-}3)$$
where the rate coefficients and equilibrium constants are in accordance to Figure 2-5. The
equivalent equilibrium constant 𝐾𝑒𝑞 of the global reaction obeys:
$$K_{eq} = \frac{K_{EtOAc}\, K_{SR}}{K_{MeOAc}} \qquad (2\text{-}4)$$
The link between the activities 𝑎𝑖 of the components and their concentrations 𝐶𝑖 is given
by:
$$a_i = \gamma_i \cdot C_i \qquad (2\text{-}5)$$
where the proportionality constant 𝛾𝑖 is the activity coefficient of that particular component. Given the strongly differing polarities of the species in the reaction mixture, the original kinetic model explicitly accounted for the non-ideal thermodynamic behavior of the liquid phase by applying the UNIFAC method, which relies on group contribution theory. This allows the thermodynamic properties of a specific component in a complex mixture to be estimated by splitting up the molecule into its functional groups. Subsequently, by means of theoretical correlations available in literature, the interactions of the different groups are modelled, yielding values for the activity coefficients of the different components after summation. The specific net rate of formation of each
component is then given by:
$$R_{w,i} = \nu_i\, r_w \qquad (2\text{-}6)$$
where 𝜈𝑖 equals −1 for the reactants methanol and ethyl acetate and 1 for the ethanol and
methyl acetate products. Combining equations (2-2) to (2-6) yields a differential equation
for the composition of the mixture along the reactor. This way, for each of the reactants
and the product components, the concentration at the end of the catalyst bed can be
calculated once values for the set of unknown rate coefficients 𝑘𝑆𝑅, 𝐾𝐸𝑡𝑂𝐴𝑐 and 𝐾𝑀𝑒𝑂𝐴𝑐 are
available. The temperature dependence of the rate coefficient of the surface reaction is
given by an Arrhenius relation:
$$k_{SR} = A_{SR} \exp\!\left(-\frac{E_{a,SR}}{R\,T}\right) \qquad (2\text{-}7)$$
which is typically reparametrized for parameter estimation as:
$$k_{SR} = k_{SR,T_m} \exp\!\left(-\frac{E_{a,SR}}{R}\left(\frac{1}{T} - \frac{1}{T_m}\right)\right) \qquad (2\text{-}8)$$
with 𝑇𝑚 the average experimental temperature and 𝑘𝑆𝑅,𝑇𝑚 the rate coefficient at that temperature. For the equilibrium constants, the temperature dependence is neglected, as suggested for the original model. This way, a set of 4 unknown parameters remains, i.e. 𝑘𝑆𝑅,𝑇𝑚, 𝐸𝑎,𝑆𝑅, 𝐾𝐸𝑡𝑂𝐴𝑐 and 𝐾𝑀𝑒𝑂𝐴𝑐, which have to be estimated by an optimal fitting of the collected experimental data.
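To make the foregoing concrete, the sketch below integrates the plug-flow balance (2-2) with the rate expression (2-3) and the reparametrized Arrhenius law (2-8) in Python. It is a minimal illustration rather than the actual AVS implementation: the activity coefficients are set to unity instead of being computed with UNIFAC, the global equilibrium constant 𝐾𝑒𝑞 and the feed concentrations are illustrative values, 𝑇𝑚 is assumed to be 45 °C, and the kinetic parameters are the batch estimates listed later in Table 2-5.

# Minimal sketch of the plug-flow reactor model, equations (2-2) to (2-8).
# Assumptions: ideal liquid phase (activity coefficients of 1 instead of
# UNIFAC), an illustrative K_eq, illustrative feed concentrations and
# T_m = 45 degC; kinetic parameters taken from Table 2-5.
import numpy as np
from scipy.integrate import solve_ivp

R = 8.314        # J mol-1 K-1
Tm = 318.15      # K, assumed average experimental temperature

# (k_SR_Tm [mol kg-1 s-1], Ea_SR [J mol-1], K_EtOAc [-], K_MeOAc [-])
theta = (9.2e-3, 49.7e3, 1.15, 9.04)

def rate(C, T, theta, K_eq=1.0):
    """Net transesterification rate r_w, eq. (2-3), with a_i ~ C_i."""
    k_SR_Tm, Ea, K_EtOAc, K_MeOAc = theta
    C_MeOH, C_EtOAc, C_EtOH, C_MeOAc = C
    k_SR = k_SR_Tm * np.exp(-Ea / R * (1.0 / T - 1.0 / Tm))   # eq. (2-8)
    num = k_SR * K_EtOAc * (C_EtOAc - C_EtOH * C_MeOAc / (K_eq * C_MeOH))
    den = 1.0 + K_EtOAc * C_EtOAc / C_MeOH + K_MeOAc * C_MeOAc / C_MeOH
    return num / den

def pfr_rhs(W, C, T, V_dot, theta):
    """dC_i/dW = nu_i * r_w / V_dot, combining eqs. (2-2) and (2-6)."""
    nu = np.array([-1.0, -1.0, 1.0, 1.0])   # MeOH, EtOAc, EtOH, MeOAc
    return nu * rate(C, T, theta) / V_dot

# Example: illustrative 5:1 MeOH:EtOAc feed at 45 degC and 8 ml/min
# over a bed of 2 g of catalyst (W expressed in kg).
C_feed = np.array([20.0, 4.0, 0.0, 0.0])               # mol/l, illustrative
sol = solve_ivp(pfr_rhs, [0.0, 2.0e-3], C_feed,
                args=(318.15, 8.0e-3 / 60.0, theta))   # V_dot in l/s
X_EtOAc = (C_feed[1] - sol.y[1, -1]) / C_feed[1]
print(f"Simulated ethyl acetate conversion: {X_EtOAc:.3f}")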
2.3 Experimental setup and procedures
2.3.1 Reactants and catalyst
The transesterification reaction was run with ethyl acetate (99+% purity, Chemlab) and methanol (99.8+% purity, Chemlab) as reactants. n-Decane (99+% purity, Acros Organics) was used as internal standard. Experiments have been performed for feed mixtures with an initial methanol to ethyl acetate molar ratio of 1:1, 5:1 and 10:1. Table 2-1 lists the weights of reactants and internal standard that were added for each of these compositions. The amount of n-decane was chosen such that a mass fraction of 5% in the initial mixture was obtained.
Table 2-1 Mass of reactants and internal standard used to make up
each of the studied feed compositions, in gram
1:1 5:1 10:1
methanol 1474 2563 3557
ethyl acetate 4053 1410 978
n-decane 291 209 239
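As a quick consistency check, the masses in Table 2-1 follow directly from the target molar ratio and the 5 wt% n-decane constraint; the short snippet below verifies this for the 1:1 mixture, using standard molar masses and quoting the Table 2-1 masses as given.

# Verify the 1:1 feed of Table 2-1: a 1:1 molar ratio of methanol to
# ethyl acetate combined with a 5 wt% fraction of n-decane.
MW_MeOH, MW_EtOAc = 32.04, 88.11                 # g/mol
m_MeOH, m_EtOAc, m_dec = 1474.0, 4053.0, 291.0   # values from Table 2-1

molar_ratio = (m_MeOH / MW_MeOH) / (m_EtOAc / MW_EtOAc)
w_decane = m_dec / (m_MeOH + m_EtOAc + m_dec)

print(f"MeOH : EtOAc molar ratio = {molar_ratio:.2f}")   # ~1.00
print(f"n-decane mass fraction   = {w_decane:.3f}")      # ~0.050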
The macroporous ion-exchange resin Lewatit K2629, produced by Lanxess, was used to catalyze the transesterification reaction. It consists of small beads with a diameter of about 1 mm, composed of polystyrene with divinylbenzene crosslinkers. The active acid sites are formed by sulfonic functional groups. The resin is thermostable up to 398 K; at higher temperatures a loss in catalytic activity is observed [22].
Because of the strong interaction of the acid sites with polar components, water tends to adsorb instantaneously upon exposure of the catalyst to the air. Therefore, to ensure an optimal activity of the resin, all adsorbates have to be removed prior to any experiments by freeze-drying the beads under vacuum conditions. However, some impurities other than water may be adsorbed on the catalyst surface as well. Because of the sensitivity of the freeze-drier to acid components in particular, any polluting substance on the resin has to be removed in advance. To this end, the catalyst was first put in a dialysis setup for two days. A semipermeable membrane of 15 kDa was used as a container for the resin and suspended in a vessel of water, which was refreshed every day. The visual difference between the clean catalyst and the initial material was remarkable. As is readily seen from Figure 2-6, the color of the initial material was heterogeneous, with some of the beads being deep brown, while after the dialysis all of the catalyst had the same beige tint.
Subsequently, the catalyst was put in the freeze-drier, where the material was cooled down to -73 °C under a pressure of 0.370 bar. Under these conditions, sublimation of the adsorbed water is favored, freeing the active sites. Any water vapor formed is then captured by deposition on a cold disk at the bottom of the installation.
Figure 2-6 Original catalyst (left) and the material after dialyzing (right)
After two days, the dry catalyst was taken out of the vacuum chamber and immediately sealed from the air with parafilm, awaiting its addition to the reactor setup.
2.3.2 Tubular reactor setup
The entire reactor configuration is depicted in Figure 2-7. The reactor core consists of a double-walled glass tube, placed vertically and connected to the feed reservoir at the bottom and to a drain at the top. The inner tube is half filled with Raschig rings, with a tuft of glass wool on top which acts as the support for the catalyst plug. By means of an adjustable volumetric pump, which delivers flow rates ranging from 1 to 20 ml per minute, the reactor is then filled with methanol up to a few centimeters above the glass wool. Subsequently, the parafilm seal of the catalyst holder is broken and 2.00 grams of the resin are weighed and added into the methanol layer. After waiting for a quarter of an hour to allow for swelling of the resin, another layer of glass wool is applied on top of the catalyst plug. The tube is then further filled with Raschig rings and meticulously made leak-proof. Afterwards, the reactor is flushed with methanol.
Figure 2-7 Tubular reactor setup as used during the continuous-flow experiments
The outer jacket of the tube is connected to a heater that uses thermal oil as heating liquid. This way, the temperature of the catalyst bed can be changed as required and kept almost constant during the reaction for each experiment. Experiments were performed at 30, 45 and 60°C, well below the critical temperature for the stability of the catalyst, thereby limiting its deactivation.
Besides temperature, the feed flow rate and the feed composition are also easily manipulated process conditions. Feed mixtures with a molar ratio ranging from 1:1 over 5:1 to 10:1 have been prepared and sent through the reactor at 5, 8 and 15 ml/min. Together with the varying reactor temperature, varying these parameters creates a collection of 27 different experimental conditions, all of which have been investigated. It was observed that, after addition of the polar reactants and the apolar internal standard to the feed reservoir, a thin layer of n-decane floated on the bulk. To break this phase separation and ensure the homogeneity of the feed composition, the feed container was stirred prior to its implementation in the setup. Experiments have been performed successively to minimize time loss. To avoid contamination of subsequent measurements, the reaction had to be run at the new process conditions for a sufficiently long time before a new sample was taken from the reactor effluent. The optimal run time between two samples was determined by means of a stability test of the reactor, i.e. the reactor was fed with a mixture of methanol, ethanol and internal standard of known composition. Samples of the reactor effluent were taken every 30 minutes and injected in a GC analyzer, see below for the technical details and the settings applied during this analysis. This way, the composition of the reactor effluent was retrieved. Comparison of the results over time showed that the concentrations of reaction products stabilized after 60 minutes and therefore the time
between two samples was set at this value. Moreover, as running the reaction for more than 4 hours did not result in any change in the effluent composition, it was concluded that the catalyst stability sufficed for the duration of the experiments.
To determine the concentrations of the different components in the reactor effluent,
which will be used as the response values for the kinetic parameter estimation, the
samples were subjected to a GC analysis. The gas chromatograph is a 6850 Network GC
from Agilent Technologies. The capillary column, 1.6 m long and with an internal
diameter of 0.25 mm, is covered with a 25 µm layer of polydimethylsiloxane. Helium is
used as carrier gas with an initial flow rate of 1.00 ml/min. The actual detection takes
place in a flame ionization detector, with settings as listed in Table 2-2.
Table 2-2 Settings of the FID
Temperature 300 °C
Hydrogen flow 60 ml/min
Air flow 450 ml/min
An autosampler was used to inject 1.2 µl of the samples into the GC analyzer. Between two subsequent injections, the syringe was washed 6 times with hexane and thereafter 6 times with the reaction mixture itself. To limit the loading of the column during the analysis, the split ratio was set at 20:1. By applying a tuned temperature program for the GC analysis, it was ensured that all component peaks were well separated in time, allowing for an accurate integration afterwards. The temperature is first held at 40°C for 10 minutes, followed by a constant increase of 10°C/min for 11 minutes. The
integration of the peaks of the chromatogram was carried out by the EZChrom Elite
client/server software. The retention time for the different components in the reaction
mixture is given in Table 2-3.
Table 2-3 Retention time for the reactants, products and internal standard
Component Retention time [min]
Methanol 5.0
Ethanol 5.3
Methyl acetate 5.9
Ethyl acetate 7.3
n-decane 20.7
To calculate the actual concentrations of the different components from their integrated peak areas, calibration factors are required. Prior to the actual experiments, multiple attempts have been made to determine these values experimentally. To this end, sets of five samples with a well-known composition of all reactants, products and the internal standard were made up and subjected to a GC analysis. Normally, a linear relation should emerge when plotting the known concentrations against the collected peak areas, for each of the components. Unfortunately, for a not yet identified reason no regular plots were obtained, even after repeating this procedure a second and a third time. Therefore, the calibration method of Dietz (1967) for hydrogen flame ionization detectors was used to process the GC results from the actual experiments [23]. Following this procedure, the weight fraction 𝑤𝑖 of the 𝑖’th component in the reaction mixture is linked to the corresponding peak area 𝐴𝑖 by:
$$w_i = \frac{A_i / RS_i}{\sum_j A_j / RS_j} \qquad (2\text{-}9)$$
where 𝑅𝑆𝑖 is the relative sensitivity of the component, for which tabulated values are available for the most common chemical compounds, and where the summation in the denominator runs over all components in the mixture. The relative sensitivities for the components in the reactor effluent stream are given in Table 2-4. Once weight fractions were obtained for all components, the corresponding molar fractions 𝑥𝑖 were calculated by:
$$x_i = \frac{w_i / MW_i}{\sum_j w_j / MW_j} \qquad (2\text{-}10)$$
with 𝑀𝑊𝑖 the molar mass. From these molar fractions, the concentration of each component follows straightforwardly as:
$$C_i = \frac{x_i \, M_{tot}}{V_{tot}} = x_i \cdot C_{tot} \qquad (2\text{-}11)$$
with 𝑀𝑡𝑜𝑡 and 𝑉𝑡𝑜𝑡 the total molar amount and the total volume of the mixture, and where the total molar concentration of all components 𝐶𝑡𝑜𝑡 is yet to be determined. However, inspection of equations (2-2), (2-3) and (2-5) shows that this factor cancels out on both sides of the reactor equation and therefore does not need to be determined to allow for estimation of the kinetic parameters.
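As an illustration of the procedure of equations (2-9) and (2-10), the following sketch converts a set of integrated peak areas into mole fractions, using the relative sensitivities of Table 2-4; the peak areas themselves are purely hypothetical.

# Peak areas -> weight fractions, eq. (2-9) -> mole fractions, eq. (2-10).
MW = {"methanol": 32.04, "ethanol": 46.07, "methyl acetate": 74.08,
      "ethyl acetate": 88.11, "n-decane": 142.28}           # g/mol
RS = {"methanol": 0.23, "ethanol": 0.46, "methyl acetate": 0.20,
      "ethyl acetate": 0.38, "n-decane": 1.00}               # Table 2-4

def mole_fractions(areas):
    """Apply equations (2-9) and (2-10) to integrated GC peak areas."""
    w_raw = {i: areas[i] / RS[i] for i in areas}             # A_i / RS_i
    w = {i: v / sum(w_raw.values()) for i, v in w_raw.items()}
    x_raw = {i: w[i] / MW[i] for i in w}                     # w_i / MW_i
    return {i: v / sum(x_raw.values()) for i, v in x_raw.items()}

areas = {"methanol": 120.0, "ethanol": 15.0, "methyl acetate": 8.0,
         "ethyl acetate": 60.0, "n-decane": 90.0}            # hypothetical
print(mole_fractions(areas))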
Table 2-4 Relative sensitivities of the reactants, products and internal standard
Component Relative sensitivity [-]
Methanol 0.23
Ethanol 0.46
Methyl acetate 0.20
Ethyl acetate 0.38
n-decane 1.00
2.4 Discussion of the experimental data
Once the molar composition of the samples for all 27 experiments was unraveled from
the from the GC data, the quality of the results was assessed by inspection of the mass
balance. Indeed, as was mentioned in the procedures section, care was taken that each
feed contained 5% of n-decane. Since this compound is an inert for the transesterification
process, it is expected that, in normal operation without leakage, the reactor effluent
contains exactly the same amount of internal standard. Moreover, as the reaction is
equimolar, the calculated molar fractions for both products have to be both equal to the
reductions of the initially present reactants. Hence, two measures are at disposal to
evaluate the quality of the data.
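In practice these two checks reduce to simple comparisons of the calculated fractions, as sketched below with hypothetical values for a single sample.

# Two consistency checks on an analyzed effluent sample (hypothetical values):
# (1) the n-decane mass fraction should match the 5 wt% present in the feed,
# (2) the ethanol and methyl acetate mole fractions should be equal.
w_decane_out = 0.043                       # hypothetical GC result
x_EtOH_out, x_MeOAc_out = 0.030, 0.046     # hypothetical GC results

decane_mismatch = abs(w_decane_out - 0.05) / 0.05
product_mismatch = abs(x_EtOH_out - x_MeOAc_out) / max(x_EtOH_out, x_MeOAc_out)
print(f"n-decane mass balance mismatch : {decane_mismatch:.1%}")
print(f"product stoichiometry mismatch : {product_mismatch:.1%}")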
Surprisingly, this closure was never achieved for both measures simultaneously. The relative mismatch of the calculated mass fraction of n-decane in the effluent ranged from 0.4 to 49.6%, while the percentage deviation between the calculated molar fractions of ethanol and methyl acetate varied between 22 and 94%, which is far larger than what can be attributed to random experimental errors. Moreover, the amount of methanol, still a reactant, in the reactor effluent was found to have increased compared to the feed. Since no distinct explanation was found for these disappointing results, the interpretation of their occurrence is not straightforward.
It was noticed during the experiments, especially those at 45°C and 60°C, that some gas bubbles popped up above the heating zone. At least one component appeared to vaporize at these temperatures, although the boiling points of all reactants, products and the standard are higher. Vapor leakage was assumed to be negligible because the bubbles had to pass a relatively long, non-heated zone before they reached the sampling point. Therefore, it was expected that most of the bubbles would have condensed there, given their relatively high boiling points. It is possible that this assumption was slightly too optimistic and that this evaporation did change the composition of the effluent substantially. However, the equally present deviations at 30°C, where no bubble formation was noticed, are not explained this way.
Alternatively, the unexpected observations may arise from ineffective or insufficient reactor stabilization, i.e. part of the reaction mixture of the foregoing experiment was not entirely flushed out and mixed with the product stream of its successor. Although this surely is a plausible argument, the measurements on the reactor stabilization prior to the actual experiments were rather convincing that the chosen sampling period would suffice. Moreover, if this mixing were indeed the culprit of the poor data quality, it would be reasonable to expect the effect to be smaller for the experiments at high volumetric flow rate, which implies a faster flushing of the reactor content. Nevertheless, this trend was not observed.
A third possible explanation was sought in poor mixing of the reactants and the internal standard. Indeed, as was mentioned in the previous section, the feed reservoir had to be stirred prior to the experiments because of phase separation of the n-decane and the polar reactants. Although no demixing was observed afterwards, neither in the feed container nor in the sample vials, it is possible that a separation of the n-decane did occur. However, if this were the case, it would suffice to neglect the corresponding GC peak area and repeat the normalization in (2-9) for the reactants and products only to obtain the required reaction stoichiometry. Unfortunately, even this procedure did not yield the desired results, once more raising doubt about the plausibility of this reasoning.
A last potential culprit was found in the invalidity of the calibration factors. As mentioned above, these values had to be taken from literature, as repeated attempts to determine them experimentally had all failed for a not yet identified reason. In this perspective, it is interesting to remark that the explanation for these unsuccessful attempts to collect qualitative calibration data may well be rooted in the same issue that caused the poor outcome of the transesterification experiments.
Although the quality of the data is doubtful, inspection and comparison of the calculated compositions of the product streams do allow for assessing the impact of the different process variables on the progress of the reaction. A useful variable to describe the degree to which a reaction has occurred is the conversion of a certain component. For the 𝑖’th component, its definition reads:
$$X_i = \frac{C_{i,in} - C_{i,out}}{C_{i,in}} = \frac{x_{i,in} - x_{i,out}}{x_{i,in}} \qquad (2\text{-}12)$$
where the latter equality holds because both the volume and the total amount of moles are expected to be constant during reaction. Here, it was opted to base the calculations on ethyl acetate, which resulted in the plots given in Figure 2-8.
Figure 2-8 Experimentally obtained conversions of ethyl acetate for varying flow rates,
temperatures and initial molar feed composition
[Figure 2-8 data: three panels showing conversion of ethyl acetate [%] versus temperature [°C] for methanol : ethyl acetate ratios of 1:1, 5:1 and 10:1, each at flow rates of 5, 8 and 15 ml/min]
The observed trends all correspond to what is intuitively expected. First, for an increasing excess of methanol, the conversion of ethyl acetate is favored. Indeed, in accordance with the rate equation (2-3), an increase of the amount of methanol augments the reaction rate and hence the degree of consumption of the reactants. The biggest gain in conversion is obtained when going from a 1:1 to a 5:1 initial feed composition; for a 10:1 molar ratio the increase is less distinct. Secondly, for each feed composition, a higher temperature induces an increase in the conversion as well. Indeed, according to the Arrhenius dependence of the rate coefficient, see (2-7), the consumption of reactants is higher at increasing temperature. Finally, a negative effect of the volumetric flow rate is observed. Indeed, the space time is lower for higher flow rates, which results in a less extensive reaction and hence a reduced conversion of the reactants.
2.5 Parameter estimation and statistical analysis
Although the quality of the observed concentrations was found to be doubtful, the data were entered in the simulation software Athena Visual Studio (AVS). The code of the kinetic model for transesterification on ion-exchange resins as described by E. Van de Steene [22], originally developed for batch-reactor setups, served as a starting point for the processing of the continuous-flow data in this work. Logically, the batch reactor model equations were adapted to their plug-flow analogues, see equation (2-2), while the section correcting for non-ideal mixing of the components was adjusted by replacing the settings for the original internal standard n-octane with those for n-decane. As for the original model, the measured ethyl acetate concentrations are used to regress the model and obtain the model parameters. The Bayesian minimization as implemented in AVS was used to perform the optimal fitting.
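The AVS estimation itself is not reproduced here, but an analogous nonlinear regression can be sketched in Python; the snippet below reuses the pfr_rhs function from the reactor-model sketch in Section 2.2.2 and fits the four parameters to a list of experiments that is, for illustration, filled with hypothetical entries.

# Sketch of a least-squares estimation of (k_SR_Tm, Ea_SR, K_EtOAc, K_MeOAc)
# from continuous-flow data, analogous to the regression performed in AVS.
# Requires the pfr_rhs function defined in the earlier reactor-model sketch.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

W_cat = 2.0e-3    # kg of catalyst in the bed

def outlet_EtOAc(theta, T, V_dot, C_feed):
    """Simulate the ethyl acetate concentration at the end of the bed."""
    sol = solve_ivp(pfr_rhs, [0.0, W_cat], C_feed, args=(T, V_dot, theta))
    return sol.y[1, -1]

def residuals(theta, experiments):
    return [outlet_EtOAc(theta, T, V_dot, C_feed) - C_meas
            for (T, V_dot, C_feed, C_meas) in experiments]

# Hypothetical data: (T [K], V_dot [l/s], feed concentrations, measured C_EtOAc)
experiments = [
    (303.15,  5.0e-3 / 60.0, np.array([20.0, 4.0, 0.0, 0.0]), 3.7),
    (318.15,  8.0e-3 / 60.0, np.array([20.0, 4.0, 0.0, 0.0]), 3.5),
    (333.15,  8.0e-3 / 60.0, np.array([20.0, 4.0, 0.0, 0.0]), 3.3),
    (333.15, 15.0e-3 / 60.0, np.array([20.0, 4.0, 0.0, 0.0]), 3.6),
]

theta0 = np.array([9.2e-3, 49.7e3, 1.15, 9.04])   # batch estimates as start
fit = least_squares(residuals, theta0, args=(experiments,), bounds=(0.0, np.inf))
print("estimated parameters:", fit.x)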
In a first step, the ethyl acetate concentrations were simulated for all studied process
conditions, using the reported values of the kinetic parameters as retrieved from batch-
reactor data and listed in Table 2-5. The results of this simulation are depicted in Figure
2-9.
Table 2-5 Reported kinetic parameters for the transesterification of ethyl acetate and
methanol on Lewatit K2629, as originally estimated from batch-reactor data [22]
Kinetic parameter Estimated value
𝒌𝑺𝑹,𝑻𝒎 9.2·10⁻³ mol kg_cat⁻¹ s⁻¹
𝑬𝒂,𝑺𝑹 49.7 kJ mol⁻¹
𝑲𝑴𝒆𝑶𝑨𝒄 9.04 [-]
𝑲𝑬𝒕𝑶𝑨𝒄 1.15 [-]
Figure 2-9 Comparison of the observed (markers) and the simulated (line) conversion of ethyl acetate, based on the reported kinetic parameter values, for the studied temperatures, feed compositions and volumetric flow rates (♦ 5, ■ 8 and ▲ 15 ml/min)
It is striking that the available kinetic parameters yield very poor predictions for the observed conversion. The increase of the observed conversion at higher temperatures is considerably less steep than the model-predicted trend. Moreover, using the original values of the kinetic parameters, the simulated conversion profiles show a much larger mutual spread for varying space time than the experimental values. Therefore, an attempt was made to determine the parameter values which fit the continuous-flow data optimally. Unfortunately, the statistical software was not able to yield statistically significant estimates for the parameters. Nevertheless, it was noticed that most of the misfit disappeared for a higher value of the rate coefficient 𝑘𝑆𝑅 of the surface reaction, obtained either by increasing the pre-exponential factor or by lowering the activation energy.
To conclude, experiments have been performed on a continuous-flow transesterification
setup. The observed composition of the reactor effluent was found to be of doubtful quality,
as the mass balance did not close and stoichiometric requirements were not met. Some
potential explanations have been formulated, although the precise reason for the deviating
results has not been identified. The collected data have been put into the kinetic model, using
the reported kinetic parameter values, which were originally estimated based on batch data.
A strong discrepancy was revealed between the model predictions and the actual
observations on the conversion of the reactant ethyl acetate. Unfortunately, due to the
uncertain reliability of the experimental data, it was not possible to draw conclusions on the
accuracy of the kinetic parameter values from this misfit.
[Figure 2-9 data: three panels showing conversion versus temperature (30, 45, 60 °C) for feed ratios of 10:1, 5:1 and 1:1]
2.6 References
1. Nigam, P.S. and A. Singh, Production of liquid biofuels from renewable resources. Progress in Energy and Combustion Science, 2011. 37(1): p. 52-68.
2. European Commission. Reducing emissions from transport. 25 March 2015; Available from: http://ec.europa.eu/clima/policies/transport/index_en.htm.
3. Van Geem, K.M., Sustainable Chemical Production Processes - University Course. 2014, Ghent University.
4. Oberling, D.F., et al., Investments of oil majors in liquid biofuels: The role of diversification, integration and technological lock-ins. Biomass and Bioenergy, 2012. 46(0): p. 270-281.
5. UN, Kyoto protocol to the united nations framework convention on climate change. 1998, United Nations New York, NY.
6. Directive 2009/28/EC of the European Parliament and of the Council of 23 April 2009 on the promotion of the use of energy from renewable sources and amending and subsequently repealing Directives 2001/77/EC and 2003/30/EC. 2009, European Parliament, Council of the European Union.
7. Rajagopal, D. and D. Zilberman, Review of Environmental, Economic and Policy Aspects of Biofuels. 2007: World Bank Publications.
8. Demirbas, A., Political, economic and environmental impacts of biofuels: A review. Applied Energy, 2009. 86, Supplement 1(0): p. S108-S117.
9. Meher, L.C., D. Vidya Sagar, and S.N. Naik, Technical aspects of biodiesel production by transesterification—a review. Renewable and Sustainable Energy Reviews, 2006. 10(3): p. 248-268.
10. Balat, M. and H. Balat, Recent trends in global production and utilization of bio-ethanol fuel. Applied Energy, 2009. 86(11): p. 2273-2282.
11. Mussatto, S.I., et al., Technological trends, global market, and challenges of bio-ethanol production. Biotechnology Advances, 2010. 28(6): p. 817-830.
12. EU energy in figures - Statistical pocketbook. Collected from editions 2005-2013. Directorate-General for Energy - European Commission.
13. Eurostat, Share of renewable energy in fuel consumption of transport.
14. Demirbaş, A., Biodiesel fuels from vegetable oils via catalytic and non-catalytic supercritical alcohol transesterifications and other methods: a survey. Energy Conversion and Management, 2003. 44(13): p. 2093-2109.
15. Di Serio, M., et al., From homogeneous to heterogeneous catalysts in biodiesel production. Industrial & Engineering Chemistry Research, 2007. 46(20): p. 6379-6384.
16. Ma, F. and M.A. Hanna, Biodiesel production: a review1. Bioresource Technology, 1999. 70(1): p. 1-15.
17. Bournay, L., et al., New heterogeneous process for biodiesel production: A way to improve the quality and the value of the crude glycerin produced by biodiesel plants. Catalysis Today, 2005. 106(1–4): p. 190-192.
18. Di Serio, M., et al., Heterogeneous Catalysts for Biodiesel Production. Energy & Fuels, 2008. 22(1): p. 207-217.
19. Semwal, S., et al., Biodiesel production using heterogeneous catalysts. Bioresource Technology, 2011. 102(3): p. 2151-2161.
20. Tesser, R., et al., Kinetics and modeling of fatty acids esterification on acid exchange resins. Chemical Engineering Journal, 2010. 157(2–3): p. 539-550.
21. Russbueldt, B.M.E. and W.F. Hoelderich, New sulfonic acid ion-exchange resins for the preesterification of different oils and fats with high content of free fatty acids. Applied Catalysis A: General, 2009. 362(1–2): p. 47-57.
22. Van de Steene, E., Kinetic study of the (trans)esterification catalyzed by gel and macroporous resins. 2014.
23. Dietz, W.A., Response Factors for Gas Chromatographic Analyses. Journal of Chromatographic Science, 1967. 5(2): p. 68-71.
Chapter 3
Case study: modeling of current-
voltage characteristics of thin
layer solar cells
3.1 Solar energy: towards a bright, sustainable future?
Over the last decades, the rapid expansion of global industrial capacity to meet the growing demand for industrial products and the evolution towards a highly technology-driven society due to major advances in research and product development have induced a massive increase in the worldwide consumption of energy. Compared to 1965, the annual need for primary energy has tripled, and a further rise in demand at a similar pace is to be expected for the near future, as is shown in Figure 3-1. Up to now, the vast majority of this energy supply is still covered by classical, carbon-based fuels, as more than 80% of the total amount of energy is produced by the combustion of oil, coal and natural gas. Over the past couple of years, the detrimental impact of this carbon-based way of energy generation on climate change has been demonstrated extensively and has created an urgent need for new sources of energy that form sustainable and environmentally friendly alternatives. Additionally, by establishing sufficient production capacity for these renewable energy sources, the dependence of the energy supply on the capricious price evolution of classical energy sources is lowered.
Figure 3-1 Evolution of the global energy demand by fuel over the last 50 years
and outlooks for the near future [1]
Besides the kinetic energy of wind and the hydrodynamic power of falling water, a great opportunity was thereby found in the valorization of the solar energy that is freely available in our surroundings. On a yearly basis, the equivalent of about 10000 times the global energy demand is incident on the earth’s surface as sunlight, yet today only a marginal part of this gargantuan amount of energy is effectively captured [2]. Due to this significant potential of solar energy to supply growing energy demands, the interest in valorization techniques has grown. Typically, three technologies are available to capture the radiant energy of the sun and transform it into a form that is more easily handled, going from heat recuperated by boiler systems, over power generated by means of solar-generated steam, to electricity produced by photovoltaic panels consisting of semiconductor, mostly silicon-rich, materials [3], the latter being the focus of this work. Moreover, due to the intrinsically renewable character of solar energy, governmental action has been undertaken to actively support the expansion of sufficient production capacity to increase the share taken by renewable sources in the energy envelope.
Under the impulse of the binding engagement of the European Commission to raise the share of energy production taken by renewable sources to at least 20% by 2020, in order to gradually reduce Europe’s dependence on conventional energy carriers and lower the negative impact of its energy supply on the environment, the development of photovoltaic technology and the extension of the production capacity were strongly stimulated by an active energy policy. This way, over the last decade Europe has become by far the leading region with regard to installed photovoltaic production capacity, as is depicted in Figure 3-2 [4]. The share of photovoltaic electricity in the entire energy production from renewable sources – wind, solar and water – has increased from 0.1% in 2000 to 9.2% in 2013 [1, 5].
Figure 3-2 Recent evolution of the total photovoltaic production capacity worldwide
At present, photovoltaic technology is still largely dominated by silicon-based devices: classical designs relying on crystalline or polycrystalline silicon wafers account for more than 80% of the photovoltaic market [6]. These solar panels consist of slices of pure, i.e. uncontaminated, silicon with a typical thickness of about 400 µm and a surface area of 100 cm², which form the core of the unit as the photocurrent generation takes place in them. By a series connection of typically 28 to 36 of these cells into modules, dc output voltages of ca. 10 V are realized, which suffices for proper device operation; a further increase in generated voltage and current is obtained by combining multiple modules in series and in parallel. To collect the generated photocurrent, a grid of metallic contacts is imprinted on the cell surface. Polycrystalline silicon wafers are easier to produce than the crystalline variant, at the expense of lower cell performance due to the presence of lattice defects in the semiconductor material. Typical commercial device efficiencies range from 15 to 22%, while lab efficiencies exceeding 40% render prospects for even higher performance in the near future.
Nevertheless, the high initial investments still associated with photovoltaics result in long payback times, which form a major impediment for this technology to become a serious challenger of conventional energy sources. Moreover, rising silicon prices in the past, caused by a market shortage of raw silicon, have demonstrated the strong sensitivity of
the economic feasibility to feedstock prices, urging a reduction of the material use and the synthesis of performant alternatives competitive with silicon [7]. Research has
therefore resulted in drastic increases in the efficiency of classical solar panels based on
silicon, but also in the development of new and promising semiconductor materials as potential replacements for silicon and in the realization of thin-layer solar cells (TLSC), enabling the production of flexible photovoltaic modules. For a long time, knowledge about the underlying physical mechanisms was very limited and the development and production of thin film solar technology relied almost entirely on trial and error and experience rather than on purposeful design, which logically strongly impeded advances in this field. Recent insights into the nature of the microscopic phenomena have led to promising models for the behavior of TLSC technology; the assessment of their overall applicability and statistical relevance is the focus of this chapter. Before taking a closer look at the proposed models for TLSC’s, the current state of the technology itself will be briefly touched upon, however not without first giving a short introduction on the physics behind solar cells.
3.1.1 General working principle of a photovoltaic device
As for all photovoltaic devices, the working principle of CIGS-based thin-layer solar cells relies on the photovoltaic effect which many semiconductor materials show when illuminated by sufficiently energetic light. Part of the incident light is absorbed by the material and the associated photonic energy is transferred to the solid’s electrons. If this energy suffices to overcome the band gap between the valence and the conduction band of the material, the electron is released from its binding orbital and gains sufficient mobility to move through the semiconductor. To prevent the excited electron from dissipating its gained energy as heat and retaking its position in the electronic shell, a potential has to be present over the material, which sweeps the generated, mobile photo-electrons towards the positive terminal where they are collected and fed to the grid. The covalent bond that is broken upon freeing of the electron is now available to bind other electrons in the lattice structure, so that, simultaneously with the electron transport, a positive hole moves through the material [8].
To ensure the separation of the electron-hole pairs, the photovoltaic device thus has to be provided with a built-in driving force. Typical designs rely on the junction principle, a well-known characteristic of the interface between two semiconductor materials that are specifically designed to show a different electronic behavior through the purposeful addition of impurities. Incorporation of these so-called dopant compounds, elements that have a higher or lower number of valence electrons compared to the pure semiconductor and are added in trace amounts to the material, indeed increases the concentration of charge carriers in, and hence the conductivity of, the semiconductor. Impurities of the first type have an excess of valence electrons compared to the number of bonds formed in the semiconductor lattice. If the energy of this additional electron exceeds the Fermi level of the pure, often named intrinsic, material, the Fermi level will increase upon doping, which in turn results in a higher electron concentration at equilibrium. Because of the excess of negative charge carriers, this type of doped material is denoted as n-type. Analogously, the addition of compounds with a relative shortage of valence electrons yields p-type semiconductors, which have a higher hole concentration at equilibrium and hence an excess of positive charge carriers compared to the pure material [9].
If, during the crystallization of the semiconductor material, a layer of p-type semiconductor is allowed to grow on an n-type base, a p-n junction is formed at the
interface. In case the n- and p-type semiconductors are intrinsically different materials,
the interface is called a heterojunction. Upon contact of the layers, a diffusional
interchange of charge carriers will start spontaneously due to the concentration gradients
across the junction; by recombination of electron-hole pairs in this zone of intense charge
interchange, a neutral, almost non-conductive depletion zone starts to form around the
interface. Simultaneously, the transfer of the charge carriers across the interface causes
the build-up of a fixed opposite charge on both sides of the junction, which opposes the
continued flow of majority carriers through the junction, see Figure 3-3 [10]. At
equilibrium, the diffusional transport of majority carriers across the depletion zone,
opposite to the electric field over the depletion zone, is balanced by the drift transfer of
minority carriers along the field.
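Although not elaborated in this text, the standard textbook treatment of a homojunction quantifies this balance through the built-in potential that develops across the depletion zone; with the conventional symbols $N_A$ and $N_D$ for the acceptor and donor concentrations, $n_i$ for the intrinsic carrier concentration, $k_B$ for Boltzmann's constant and $q$ for the elementary charge, it reads:
$$V_{bi} = \frac{k_B T}{q}\,\ln\!\left(\frac{N_A N_D}{n_i^2}\right)$$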
Figure 3-3 Transfer of electrons (black) and holes (white) across a p-n (hetero)junction by means of diffusion and drift. Recombination induces the formation of a neutral depletion zone around the interface
Hence, though strongly hindered, transport of the majority carriers over the depletion zone is not completely absent. When an external forward voltage, often referred to as bias voltage, is applied over this p-n junction, which means that the positive terminal is attached to the p-region and the negative terminal to the n-region, the built-in electric field is, at least partially,
countered, which in turn results in a narrowing of the depletion zone. Majority carriers will overcome this barrier more easily, and a higher current flows through the junction. Analogously, a reverse bias reinforces the internal electric field, which broadens the depletion zone and strongly hinders the flow of current. This rectifying characteristic of the p-n junction depending on the current direction – conducting for positive and almost completely blocking for negative voltages – resembles the working principle of a diode. Hence, in the absence of light, the current-voltage behavior of an ideal photovoltaic cell equals that of a single diode.
When sufficiently energetic light is incident on the device and photo-emission takes
place, a photocurrent is generated which moves oppositely to the “dark” current
resulting from the charge carrier diffusion described above; the total current flowing
through the cell is then equal to the superposition of both, as shown in Figure 3-4. It
follows that, to produce a maximal photocurrent, the dark current has to be minimized,
being a major challenge in the design step for suitable photovoltaic devices.
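In the simplest, ideal description this superposition can be written down explicitly; using the standard single-diode notation (the symbols below are conventional and not introduced in the original text), with $J_0$ the dark saturation current density, $A$ the diode ideality factor and $J_{ph}$ the photogenerated current density, the delivered current density reads:
$$J(V) = J_{ph} - J_0\left[\exp\!\left(\frac{qV}{A\,k_B T}\right) - 1\right]$$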
Figure 3-4 Current-voltage characteristic of a typical CIGS solar cell in the dark and under illumination
Nevertheless, real-life photovoltaic devices do show strong deviations from ideal behavior that highly affect the current flow through the cell. As will be explained in the next paragraph, TLSC’s require a multilayered structure which inevitably introduces non-idealities, especially in the form of imperfect interlayer contacts and resistive losses. Non-uniformities inside the semiconductor material or originating from the production process itself are additional sources of these so-called parasitic current pathways inside the cell. As these phenomena all have a significant impact on the current-voltage characteristics of the solar cell, a proper design of the devices requires that all these processes are accurately identified and adequately modelled.
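As a hedged illustration of how such parasitic pathways are commonly represented, the sketch below evaluates a one-diode equivalent circuit extended with a series resistance Rs and a shunt resistance Rsh; this is a generic textbook model rather than the model assessed later in this chapter, and all numerical values are purely illustrative.

# Sketch of a one-diode equivalent-circuit model with parasitic series (Rs)
# and shunt (Rsh) resistances; all parameter values are illustrative only.
import numpy as np
from scipy.optimize import brentq

q, kB = 1.602e-19, 1.381e-23   # C, J/K

def current_density(V, Jph=0.030, J0=1e-9, A=1.5, Rs=1.0, Rsh=500.0, T=300.0):
    """Solve J = Jph - J0*(exp(q(V+J*Rs)/(A*kB*T)) - 1) - (V+J*Rs)/Rsh
    for the delivered current density J [A/cm^2] at terminal voltage V [V];
    Rs and Rsh are expressed in ohm cm^2."""
    Vt = A * kB * T / q
    f = lambda J: (Jph - J0 * (np.exp((V + J * Rs) / Vt) - 1.0)
                   - (V + J * Rs) / Rsh - J)
    return brentq(f, -1.0, Jph + 0.01)   # f is monotonically decreasing in J

# Illustrative I-V sweep: the delivered current approaches Jph at short
# circuit and drops to zero near the open-circuit voltage.
for V in np.linspace(0.0, 0.7, 8):
    print(f"V = {V:.2f} V   J = {1e3 * current_density(V):.2f} mA/cm^2")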
3.1.2 Thin film solar cell technology
The development of promising alternative semiconductor materials, which show a higher absorbance of the incident light, has been the major breakthrough stimulating the expansion of TLSC’s. Allowing modules of only 10 µm thickness to be produced – 10 times smaller than conventional technology – these materials enable photovoltaic modules to become sufficiently thin to flex, making them suited for deposition on either flexible or stiff substrates, like polymer and metallic foils or glass, respectively. Additionally, this innovation strongly reduces the material costs associated with photovoltaic installations, while its faster production process, relying on deposition and sputtering rather than the slow crystal growth required for silicon-based designs, lowers the manufacturing costs as well [11]. Literature reports three potential candidates for this purpose, being amorphous silicon, cadmium telluride and a complex semiconductor containing copper, indium, gallium and selenium, therefore denoted CIGS. The latter is considered to be the most promising candidate, showing lab-scale efficiencies of over 20% combined with excellent stability and acceptable production costs [12]. Moreover, future outlooks for this technology are highly optimistic, as efficiencies exceeding 30% have come into view [13].
Nevertheless, efficiencies of actual CIGS-based modules have long been intolerably small
compared to the conventional silicon-based photovoltaic technology: module efficiencies
on monocrystalline silicon based already showed efficiencies of over 22.7 %, highly
outperforming the CIGS cells that offered only 12% [14]. One major reason for this strong
difference was found in the initial production process, which required elevated
temperatures and brought down the stability of the cell. Moreover, up-scaling of the
production process from laboratory-scale to industrial amounts were associated with the
introduction of non-uniformities in the multilayer structure of the cell, while a lack of
knowledge about the impact of these non-idealities hindered the correct addressing and
tackling of these problems. Recent advances in the production process and growing
understanding of the underlying mechanisms inside thin-layer cells have resulted in a
significant rise in performance, up to 15.7% in 2014, while maximal efficiencies of
conventional technology have stagnated, which improves its competitiveness towards
established technologies [15]. Moreover, future outlooks for this technology are strongly optimistic, attributing to it a high potential to outperform the currently used silicon-based photovoltaic “workhorses”.
3.1.2.1 Structure of CIGS-based thin-layer solar cells
As an alternative to these conventional devices, the light absorption and photo-electron generation are located in a thin layer of a quaternary semiconductor containing copper, indium, gallium and selenium, with composition CuInₓGa₁₋ₓSe₂, where x is a value between 0 and 1. Having a tetragonal chalcopyrite crystal structure with a slight copper
deficiency, the material is a p-type doped semiconductor, while its high absorption coefficient – literature reports values exceeding 10⁵ cm⁻¹ for λ < 600 nm – gives CIGS-based solar cells the highest efficiency among all thin-layer solar cells. Moreover, as the
composition of CIGS crystals spans a range of materials bounded by pure copper-
indium-selenium or CIS and copper-gallium-selenium, the band gap is tunable between
1.04 and 1.65 eV, depending on the value of 𝑥 [16, 17]. The gain in band gap arising from
the partial substitution of indium by gallium increases the open-circuit voltage
performance of the heterojunction in the cell [18].
The general structure of a CIGS-based thin layer photovoltaic cell shows a multilayer as depicted in Figure 3-5 [19]. The solar cell is deposited on a substrate layer, which consists of soda-lime glass, polymer or metallic foil; the latter two show the flexibility to allow for the coverage of flexible surfaces while being reported as slightly less
performant than the glass carrier. It goes without saying that this variety of allowed
substrate materials strongly contributes to the overall applicability of the TLSC
technology.
On top of the carrier, a conductive, metallic layer is deposited which serves as the back
contact of the cell. Often molybdenum is used, as it combines a low purchase cost with
low-resistance contacts with the substrate by the formation of 𝑀𝑜𝑆𝑒2; moreover,
diffusivity of the metal into the upper layers is low, whilst thermal stability is ensured
due to the high melting point.
On this Mo layer, the light absorbing CIGS crystals are grown; typical deposition
processes often mentioned in literature include co-evaporation and vacuum selenization.
The n-type counterpart of the junction is formed by a layer of CdS, acting as a buffer
layer as well: due to its adequate coverage of the CIGS crystals, it functions as a
passivation and protection of the absorber material during the sputtering process to
deposit the ZnO front contact. Due to its high band gap of 2.4 eV, photo-absorption is
minimized and a maximal transmittance of the incident light to the CIGS layer is assured,
explaining its naming as a window layer. Moreover, CdS has been reported to effectively
remove oxides and elementary metallic particles and to diffuse partially into the CIGS
material, thereby enhancing the interface characteristics [20]. The strong toxicity of
cadmium compounds has driven the research towards other semiconductors, yet up to
now CdS remains the most frequently industrially applied.
The front contact typically shows a bilayer structure of stacked intrinsic and n-doped
ZnO. The latter maximizes current transport due to its high conductivity and low
resistivity, while the high-resistance intrinsic layer minimizes undesired leakage currents
due to pinholes in the CdS window [21]. The combination of a high band gap, exceeding
3 eV, and a transparency over 85% results in only a minimal amount of light retained by
the ZnO layers.
On top of the front contact a metallic grid made out of an aluminum-nickel alloy is
applied, sometimes combined with an antireflective (AR) MgF2 coating to maximize the
amount of incident light captured by the cell.
Figure 3-5 Schematic overview of the multilayer structure of the CIGS thin layer solar cell
(from top to bottom: Al-Ni grid + AR coating; i- and n-ZnO front-contact, 0.25-1 µm; n-type CdS buffer, 50 nm; p-CIGS absorber, 1-3 µm; Mo back-contact, 0.5-1 µm; substrate)
Due to the specific electrical interaction of the materials in the different layers of the solar
cell, the translation of a desired performance of a solar cell design into the actual module
requires control over the structure of the cell. The combination of the very low thickness of the TLSC and its nevertheless complex structure frequently gives rise to imperfections within the crystal structure of the different layers – especially of the semiconductors – or to non-uniformities in their stacking. It goes without saying that any such deviation from ideality will be reflected in a changed electrical behavior of the device, typically resulting in a loss of generated photo-electrical power and hence a decreasing cell efficiency. In what follows, the contribution of several of these undesired leakage currents will be identified and modelled, aiming at a thorough understanding of the
underlying physical mechanisms. Knowing what goes wrong has proved repeatedly to
be a crucial, yet often heavily underrated, step in tackling an issue.
3.1.3 Outline of this chapter
In what follows, first a brief introduction will be given on the candidate mechanisms
which have been reported in literature as potential sources for power loss in a thin-layer
solar cell and on how their electrical characteristics have to be translated effectively into a
useful mathematical model. It will turn out that a number of unknown physical
parameters have to be determined to close the model. Therefore, experimental data have
to be collected, which will subsequently be used as input for a parameter estimation
procedure. The resulting parameter estimates of this model fitting and conclusions about
the performance of the cell will then be subjected to an analysis to ensure their validity,
both mathematically and statistically.
3.2 Model for the dark current-voltage characteristic of
CIGS solar cells
3.2.1 Ideal versus non-ideal electric behavior of solar cells
The current flowing through a CIGS solar device in the dark when a voltage is applied is
ideally following a diode-like characteristic, as was discussed in section 3.1.1. Therefore,
the implicit relation between the voltage 𝑉𝐷 over the ideal cell and the resulting dark
current density 𝐽 throughout the device is given by:
𝐽 = 𝐽0 ∙ [𝑒𝑥𝑝(𝐴𝑉𝐷) − 1] (3-1)
where 𝐽0 denotes the saturation current density of the heterojunction. The parameter 𝐴
reflects the mode of carrier transport through the heterojunction when being rewritten as A = q/(n·k_B·T), where q is the electronic charge, k_B the Boltzmann constant, T the
absolute temperature and n the ideality factor of the diode, expressing whether charge
transport through the junction is dominated by recombination, when n equals 2, or
diffusion, corresponding to an ideality factor of 1. Values exceeding 2 do not have an
interpretation in terms of carrier transport, and are reported as indicators for tunneling
effects [8, 22].
Comparison of a typical measured current-voltage behavior of real-life photovoltaic
devices with predictions for an ideal cell based on equation (3-1), reveals some distinct
deviations located over the entire voltage range, as is observed from Figure 3-6. The
profile is asymmetrical around zero voltage, showing a power-law non-linearity for negative voltages, while at moderate positive voltages an excess current appears, forming a distinct shoulder. At high voltages, the profile tends to flatten as well.
Figure 3-6 Ideal (dashed line) vs. non-ideal (solid) current-voltage profiles,
showing the different non-idealities to be explained
It is important to notice that not all deviations result in an increase of the dark current
flowing through the device. This is an indication that the non-ideal electric behavior of
the cell does not originate solely from parasitic phenomena, i.e. the undesired leakage
pathways that favor the flow of current through or in parallel with the junction and thus
induce a loss of the generated photo-electric power, but other effects come into play as
well. Recent studies have been focusing on linking the deviating electric behavior of the
device to physical phenomena acting in the inner structure of the cell [22-24]; each of
these effects will be briefly introduced in the upcoming paragraphs. To translate the
impact of these different side-effects on the current-voltage relation for the entire
structure into an adequate mathematical form, each mechanism will be represented by an
appropriate elementary electric component, having well-known electric behavior. This
way, the construction of an equivalent electric circuit becomes straightforward, which
will be an important aid for the derivation of the final model equations for the current-
voltage characteristic of the device.
3.2.2 Modelling parasitic current pathways and non-idealities in a CIGS
heterojunction solar device
The need for a regular multilayer structure for a heterojunction solar device at the micrometer-scale thicknesses typical for TLSCs forms a major source of highly undesirable non-uniformities in the stacking during the production
process. Since the presence of such imperfections induces significant changes in the
electric behavior of the device as a whole and therefore often results in a loss of
performance, a thorough understanding of the underlying mechanisms and a precise
insight in their actual impact are indispensable steps in the development of an improved,
ideally defect–free, production process.
The most comprehensive model described in literature up to the moment of writing of this work identifies three potential rivalling pathways through the inner structure of the cell, bypassing the desired passage of the current across the heterojunction.
Figure 3-7 Typical parallel current pathways proposed for explaining the non-ideal
behavior of real solar cells and the equivalent electric circuit [22]
Two contributions have been categorized as shunt currents. Shunting behavior in fact
captures all physical pathways through the multilayer solar cell structure that offer a less
hindered alternative to current passage through the main junction and hence partly
bypass the desired multilayer structure. Because they originate from typically extremely
localized imperfections in the cell structure, shunting effects often show significant and
unpredictable local differences, causing a shunt resistance that can vary by 1 to 2 orders of magnitude between different solar cells of the same type, or even between different spots on the same cell.
As shown in Figure 3-7, a first type of shunt behavior arises from purely resistive current
transport phenomena through the cell. The presence of microscopic pinholes in the
multilayer structure, e.g. due to the unequal coverage of the back contact by the CIGS
absorber, will induce a low-resistance interface between back contact and the window
layers. Moreover, current flow along highly conductive grain boundaries in the CIGS
structure, originating from the tendency of In and Ga to build up at the boundaries
rather than in the bulk of the CIGS crystals, forms an attractive alternative to the
heterojunction [25]. Because these pathways form a low-resistance conductive route
parallel to the heterojunction, their electric behavior is accurately modelled by an Ohmic
relation, showing a linear dependence between the passing current 𝐽𝑂ℎ𝑚𝑖𝑐 and the voltage
𝑉𝑂ℎ𝑚𝑖𝑐 over the pinhole or boundary. Due to the parallel positioning of the Ohmic
resistance and junction diode, it holds that 𝑉𝑂ℎ𝑚𝑖𝑐 = 𝑉𝐷, so that the electric behavior of a
shunt resistance 𝑅𝑠ℎ is given by:
$$J_{Ohmic} = \frac{V_D}{R_{sh}} \tag{3-2}$$
Often, modelling of the shunt leakage currents solely by a purely resistive component
does not suffice to explain the entire deviating current through the cell at low positive
voltages. Therefore, a second contribution to shunt leakage was identified as a multistep,
trap-assisted tunneling mechanism, primarily present in solar cells with a high
concentration of mid-gap defect states and a heavily-doped emitter [26]. It has been
shown that the current leakage due to such tunneling processes obeys a diode-like
relation with its voltage [27]. Hence, analogously to equation (3-1), the contribution of this shunt tunneling diode to the current-voltage profile follows from:
𝐽𝑠ℎ = 𝐽0,𝑠ℎ ∙ [exp(𝐴𝑠ℎ𝑉𝐷) − 1] (3-3)
where, in contrast to the diode representing the main junction in the cell, the factor 𝐴𝑠ℎ does not relate to any transport mechanism, so that in this case the associated ideality factor does not have any physical meaning.
Ultimately, even the incorporation of a tunneling contribution to the shunt does not yield
a proper fit of the modelled and measured J-V characteristics for reverse bias. As
mentioned earlier, experimental data obtained for negative voltages show a distinct
power-law dependence on the voltage over the cell, which is not explained by the
incorporation of an Ohmic resistance alone, since this introduced only a linear relation. In
the search for an appropriate mechanism to explain this trend, the principle known as a
space-charge limited current (SCLC) is frequently suggested as a reliable candidate.
Although originally discovered to explain the inexplicably high current passing through
an insulator material separating two electrodes [28], the SCLC mechanism has been
proposed recently as a potential source of current leakage through the semiconductor
absorber layer [29]. The occurrence of SCLC in a solar cell has been related to the
formation of metal-semiconductor-metal combinations. A major situation in which such
defects are formed is found in a non-uniform coverage of the CIGS absorber by the CdS
window layer, so that the emitter is trapped between the metallic front and back-
contacts. Additionally, diffusion of the aluminum dopant out of the front-contact
through the window layers towards the absorber material has been identified as a
potential source of leakage [30].
The current flowing through a SCL region subjected to a voltage 𝑉𝐷 – once more, this
leakage term is flowing parallel to the heterojunction – follows a power-law relation
obeying the general form:
$$J_{SCLC} = \mathrm{sgn}(V_D) \cdot k \cdot |V_D|^m \tag{3-4}$$
with the parameter 𝑘 depending on the thickness of the semiconductor layer, its conductivity and the presence of carrier traps, and the exponent 𝑚 reflecting whether current leakage is facilitated by deep traps (𝑚 > 2) or not (𝑚 ≅ 2) [24]. The incorporation of this parallel unit in the solar cell model was able to resolve the misfit between model and experiment at reverse bias.
Besides the presence of leakage pathways which cause the generated photo-electric
current to be partially lost, other non-idealities are found in a real-life solar cell that affect its electric behavior and performance, but, in contrast to parasitic currents, not necessarily in a negative manner. One intuitive source of non-ideality is found in the
intrinsic, non-zero resistance of the materials that make up the different layers of the cell
and of the interfaces in between. Primarily, the effect of the non-ideal charge conduction
through the metal contacts at the front and the back of the cell has often been reported to
be sufficiently significant as to be incorporated in the model explicitly. Because the non-
ideality of the contacts is assumed to be almost uniform along the cell base, this effect is
modelled as a series resistance 𝑅𝑆 in the equivalent electric circuit. For an Ohmic
resistance, the current-voltage characteristic is given by the linear relation:
$$J = \frac{V_R}{R_S} \tag{3-5}$$
with 𝑉𝑅 the voltage over the series resistance and 𝐽 the current density through the solar cell.
Because of the series connection of this resistance to the different parallel pathways
discussed above, the presence of the series resistance does affect the current passing
through the junction by altering the voltage in the characteristic diode equation. The
voltage over the diode and the resistance are indeed linked by:
𝑉 = 𝑉𝐷 + 𝑉𝑅 (3-6)
𝑉 being the total externally applied voltage over the entire cell. Sensitivity analyses on
the dark current-voltage characteristics showed that this parameter mainly affects the behavior
for higher voltages; correction for the split potential fall over the cell captures a major
part of the observed flattening.
Taking all these parallel current-leakage mechanisms into account, an equivalent electrical circuit as depicted in Figure 3-7 is found. The current through a thin-layer photovoltaic cell which is subject to an external voltage 𝑉 is then modelled as:
$$J = J_{0,junction}\left[\exp\!\left(A_{junction}(V - J R_S)\right) - 1\right] + J_{0,sh}\left[\exp\!\left(A_{sh}(V - J R_S)\right) - 1\right] + \frac{V - J R_S}{R_{sh}} + \mathrm{sgn}(V - J R_S)\, k\, |V - J R_S|^m \tag{3-7}$$
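Since the current density J appears on both sides of equation (3-7), through the voltage drop J·R_S over the series resistance, the relation has to be solved numerically for every applied voltage. The following Python sketch illustrates one possible way of doing so with a standard bracketing root finder; the parameter names are illustrative and this is not the routine used in this work.

```python
import numpy as np
from scipy.optimize import brentq

def dark_current(V, J0_junc, A_junc, J0_sh, A_sh, R_sh, R_s, k, m):
    """Solve the implicit dark J-V relation of equation (3-7) for J at one voltage V.
    Parameter names are illustrative, not the notation of the estimation software."""
    def rhs(Vd):
        return (J0_junc * (np.exp(A_junc * Vd) - 1.0)    # main junction diode
                + J0_sh * (np.exp(A_sh * Vd) - 1.0)       # shunt tunneling diode
                + Vd / R_sh                                # Ohmic shunt leakage
                + np.sign(Vd) * k * abs(Vd) ** m)          # space-charge limited current

    def implicit(J):
        return J - rhs(V - J * R_s)      # zero when J is consistent with V - J*R_s

    J_no_rs = rhs(V)                     # current if the series resistance were zero
    if J_no_rs == 0.0:
        return 0.0
    lo, hi = sorted((0.0, J_no_rs))      # the solution lies between 0 and J_no_rs
    return brentq(implicit, lo, hi)

# example: sweep the applied voltage from -1 V to 1 V in 10 mV steps (201 points)
V_grid = np.linspace(-1.0, 1.0, 201)
```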
Of course, the magnitudes of the different parasitic parameters are a measure for the presence of the different current losses; a notion of the relative importance of these mechanisms forms an indispensable part of the search for further steps in the improvement and intensification of the thin layer solar cell technology.
3.3 Experimental setup and procedures
The presence of multiple, parallel current pathways through a CIGS solar cell, each of
them associated with at least one unknown physical parameter, requires the collection of
data points covering the entire electrical behavior of the device to the highest possible extent. Therefore, dark current-voltage measurements will have to be performed covering both forward and reverse biases, so as to assess the particularities of the suggested model; for each experiment, the applied voltage ranges between -1 and 1 V, resulting in 201 data points for each studied temperature.
Additionally, not all electrical phenomena influence the observed dark J-V profiles to the same extent. When analyzing the experimental data, the largest contributors to parasitic current transport will tend to dominate the results, and it is therefore plausible that less pronounced leakage terms, although equally relevant for a complete understanding of the present imperfections and the subsequent improvements of the technology, will be
overshadowed. To increase their weight in the observations and hence allow for
unravelling the physics of the systems in much finer detail, the measurements will be
repeated for lower temperatures as well. By cooling the system down, the larger
contributors are “frozen”, enabling the smaller ones to be more strongly distinguished.
Using liquid nitrogen as cooling medium, temperatures down to 110 K will be reached;
measurements will be collected for temperature increments of 10 K, starting from 290 K.
Two CIGS-type solar cells have been studied, both of them being slices cut out of one
mother panel produced by the Swiss EMPA institute. Given that their original mutual
distance amounted to only a few centimeters, a potential difference in electrical behavior of both pieces will demonstrate whether or not there are strongly local irregularities in the original panel.
calculation of specific model parameter values during the estimation procedure, which
will become clear in the following section.
The solar cells were put in the experimental setup shown in Figure 3-8. The samples are
attached to a metallic support, depicted in gray in the scheme, on a small disk, which is
in close thermal contact to a vessel filled with liquid nitrogen, to provide the cooling to
cryogenic regimes. To maintain the desired temperature during the experiment, the disk
is connected to a temperature sensor. The thermal contact between the solar cell and the
support is ensured by using thermal conductive glue.
A two-wire technique is used to register the applied voltage and to measure the current separately. This way, the potential difference across the isolated wiring will not bias the results.
Figure 3-8 Experimental setup in real life and schematically,
showing the two-wire configuration
To minimize the thermal losses, a bell jar is placed over the setup, which simultaneously allows for working under vacuum conditions. The configuration is shielded from light by putting a black blanket on top.
3.3.1 Overview of the statistical analysis
The experimental data were used as input for the parameter estimation procedure in the
statistical modelling software package Athena Visual Studio (AVS), version 14.2. Because of its better-founded theoretical framework compared to ordinary nonlinear least squares regression, the alternative Bayesian procedure as implemented in AVS was used.
already mentioned in Section 2.3.4, this routine relies on the approximate approach
where inference on confidence intervals is obtained by local linearization, rather than
MCMC sampling of the posterior density. Meanwhile, the default AVS settings on prior
density distribution are used, i.e. a uniform prior on all model parameters and Jeffreys’
non-informative prior for the error covariance matrix.
To determine whether an estimate for a certain model parameter is significant, in the sense that it differs significantly from zero and hence participates actively in the model, the test variable 𝑡𝑐𝑎𝑙𝑐 has to be calculated:
$$t_{calc} = \frac{b_i}{s(b_i)} \sim t(n - p) \tag{3-8}$$
where 𝑏𝑖 is the calculated parameter estimate for model parameter 𝛽𝑖 and 𝑠(𝑏𝑖) is the corresponding standard deviation. To reject the null hypothesis 𝛽𝑖 = 0 with a certain probability 𝛼, 𝑡𝑐𝑎𝑙𝑐 is compared to the tabulated t-value with a probability 1 − 𝛼/2:
|𝑡𝑐𝑎𝑙𝑐| > 𝑡(𝑛 − 𝑝, 1 − 𝛼/2) (3-9)
Because the output file of an estimation procedure in AVS reports the confidence
intervals on the estimated parameters of a single-response model as:
$$b_i - s(b_i)\, t\!\left(n - p, 1 - \tfrac{\alpha}{2}\right) \le \beta_i \le b_i + s(b_i)\, t\!\left(n - p, 1 - \tfrac{\alpha}{2}\right) \tag{3-10}$$
meeting criterion (3-9) corresponds to excluding 0 from the confidence interval of all
parameters. Hence, it had to be assured that this condition was fulfilled for the optimal
parameter estimates, and this for all temperatures and both cells.
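As an illustration of this check, the criterion of equations (3-8) and (3-9) can be evaluated directly from the reported point estimates and their standard deviations; the Python sketch below assumes these are available as arrays and is not part of the AVS output itself.

```python
import numpy as np
from scipy import stats

def parameters_significant(b, s_b, n_obs, alpha=0.05):
    """Evaluate criterion (3-9) for every parameter: |t_calc| must exceed the
    tabulated t-value, which is equivalent to 0 lying outside the interval (3-10)."""
    b, s_b = np.asarray(b, float), np.asarray(s_b, float)
    p = b.size                                        # number of estimated parameters
    t_calc = b / s_b                                  # equation (3-8)
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, n_obs - p)
    return np.abs(t_calc) > t_crit                    # True: significantly nonzero
```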
Once the individual confidence intervals were acquired and it was ascertained that all
parameters were significantly different from zero, the reliability of the estimation
procedure as a whole has to be checked. Typically, during this control loop the lack-of-fit between experimental data and model-based predictions is assessed, while the considered model fit is tested to determine to which extent the assumed normality of the residuals is satisfied or rather violated. In principle, for every estimation the lack-of-fit
has to be tested as well; however, since no replicate experiments have been performed,
no conclusions will be drawn.
In agreement with the analysis above to determine whether the estimated model
parameters are significantly different from zero individually, the significance of the
regressed model as a whole will be assessed as well. Indeed, for the parameter estimation
to be meaningful, it is crucial that the resulting model does predict the observed data
points significantly better than a model where all parameters equal zero. In the latter
case, it would indeed not be worth the effort of doing any parameter estimation at all.
To reject with a certainty 1 − 𝛼 the hypothesis that all model parameters are
simultaneously equal to zero, the following F-test has to be passed:
$$F_i = \frac{\displaystyle \sum_{k=1}^{201} I_{calc,i,k}^2 \,/\, p}{\displaystyle \sum_{k=1}^{201} \left(I_{obs,i,k} - I_{calc,i,k}\right)^2 /\, (201 - p)} > F(p, n - p; \alpha) \tag{3-11}$$
where 𝑝 gives the number of included model parameters. Values for 𝐹(𝑝, 𝑛 − 𝑝; 𝛼) are
tabulated and, taking 𝛼 equal to 0.05, amount to about 2.
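A minimal sketch of this global significance test, assuming the observed and calculated currents of one J-V sweep are available as arrays, could look as follows.

```python
import numpy as np
from scipy import stats

def regression_significant(I_obs, I_calc, p, alpha=0.05):
    """F-test of equation (3-11): the regression is significant when the calculated
    F value exceeds the tabulated F(p, n - p; alpha), roughly 2 for alpha = 0.05."""
    I_obs, I_calc = np.asarray(I_obs, float), np.asarray(I_calc, float)
    n = I_obs.size                                    # 201 points per J-V sweep
    F_calc = (np.sum(I_calc**2) / p) / (np.sum((I_obs - I_calc)**2) / (n - p))
    return F_calc, F_calc > stats.f.ppf(1.0 - alpha, p, n - p)
```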
At last, it has to be determined to what extent the model parameters are mutually
correlated. Indeed, in case of a strong dependence between two parameters, the
likelihood of compensation behavior becomes considerable. Hence, it is plausible that
part of the contribution of the first parameter to the final model predictions is taken by
the other one, which is detrimental for the reliability of the results from the estimation. A
measure for the degree of correlation between model parameters 𝑏𝑖 and 𝑏𝑗 is the binary
correlation coefficient given by:
$$\rho_{i,j} = \mathbf{V}(\mathbf{b})_{ij} \Big/ \sqrt{\mathbf{V}(\mathbf{b})_{ii}\,\mathbf{V}(\mathbf{b})_{jj}} \tag{3-12}$$
where 𝑉(𝑏) is the covariance matrix of the parameter estimates 𝑏. Strong correlation
corresponds to |𝜌𝑖𝑗| ≥ 0.95. The AVS software automatically reports the binary correlation matrix in the output file.
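If the covariance matrix of the estimates is available, the binary correlation matrix of equation (3-12) and the strongly correlated pairs can also be extracted with a few lines of Python; this is a generic post-processing sketch, not an AVS feature.

```python
import numpy as np

def binary_correlation(V_b, threshold=0.95):
    """Convert the parameter covariance matrix V(b) into the binary correlation
    matrix of equation (3-12) and flag pairs with |rho_ij| >= threshold."""
    V_b = np.asarray(V_b, dtype=float)
    d = np.sqrt(np.diag(V_b))
    rho = V_b / np.outer(d, d)
    strong = (np.abs(rho) >= threshold) & ~np.eye(len(d), dtype=bool)
    return rho, strong
```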
3.4 Analysis of the results
3.4.1 Results of the statistical assessment
Current-voltage measurements were collected for both cells. The experimental results for
the first one are depicted in Figure 3-9, showing clearly the non-ideality of the electrical
behavior. The profiles obtained from the second cell were slightly different but showed
similar deviations from ideality. It follows immediately that the impact of lowering
temperature is significant, which proves the need for cooling to fully assess the electrical
behavior of the cells. The calculation of the current density from the measured current 𝐼 is
straightforward:
𝐽 = 𝐼/𝐴𝑐𝑒𝑙𝑙 (3-13)
where 𝐽 is the current density through a cell with surface area 𝐴𝑐𝑒𝑙𝑙, which equals 0.5
cm².
Figure 3-9 Observed current-voltage characteristics of the first cell
All model parameters are expected to be temperature dependent, at least to some extent. Due to the lack of generally applicable functions that accurately describe this dependence, it was opted to perform the parameter estimation procedure for each particular
temperature separately. The physical meaning of their temperature dependence will then
be assessed by interpreting the plots of the resulting parameter estimates from the
isothermal fittings. The same modelling methodology was applied for the experimental
data from both cells and for all considered temperatures, giving a total of 38 performed
fitting operations.
The theoretical model given by equation (3-7) is in its most complete form, i.e. all terms
that have been suggested in literature up to the moment of writing are included.
Nevertheless, there is no a priori requirement for all associated leakage pathways to be
present in a particular CIGS cell. In other words, the suggested model will potentially be
too extensive, i.e. having redundant contributions, which explains the need for a proper
assessment of the relevance of each particular contribution.
The most elementary model, comprising only the main junction, is immediately excluded because of the strong deviation from ideality of the measured J-V characteristics.
Including the series resistance associated with the non-ideal contacts as the first power
loss mechanism resulted in a flattening of the simulated J-V characteristic for high
positive voltages. As is depicted in Figure 3-10 for the experiment on the first cell at 290
K, the fit to the observations in this range is remarkably good, while at lower positive
voltages the deviation becomes significant. At negative voltages, the misfit is even more
pronounced.
Figure 3-10 Best fitting curve when considering resistance of non-ideal contacts only
When adding the shunt resistance as a first candidate parasitic contribution, optimal
curve fits similar to the one as shown in Figure 3-11 are obtained. It is seen that the
inclusion of a parallel resistance improves the match between observed and predicted
values significantly for lower positive voltages, without undermining the fit for higher
positive values. Meanwhile, a major part of the misfit between model and experiments in
the negative range is overcome. Nevertheless, some small deviations remain, which
suggests the need for a further extension of the model.
Figure 3-11 Best fitting curve when considering non-ideal contacts and shunt resistance
for the first cell at 290 K
The incorporation of a space charge limited current pathway was able to resolve the remaining deviations of the model predictions for this specific situation, as is clear from Figure 3-12.
Additionally, again for this particular experiment, it was found that the inclusion of a
shunt tunneling diode, the last candidate parasitic pathway, did not improve the fit of
the observed responses significantly. Moreover, when attempting to fit the model with a
shunt tunneling contribution while removing the SCLC term, it followed that the final fit
throughout the entire voltage range was worse. Therefore, for this particular experiment, it was found that the simplest adequate model includes only two parasitic terms, besides the non-ideal resistance of the contacts.
Figure 3-12 Best fitting curve when considering non-ideal contacts, shunt resistance and
space charge limited current leakage for the first cell at 290 K
It is important to stress that the exclusion of the tunneling contribution for this particular
experiment does not hold for all observations. The identification of the most suited
model function is repeated for each experiment. During these steps, it is important to keep an eye both on the quality of the estimates, by checking the visual fit of the predicted model, and on the statistical validity of the obtained parameters and the regression as a whole.
Unfortunately, while assessing the significance of the different parameters, the simulation software did not succeed in calculating the individual confidence intervals of all model parameters simultaneously, especially for lower temperatures. As a result, only a limited set of parameters could be fully estimated in each run. Therefore, an iterative procedure has been adopted, in which the estimated parameter values were used as fixed values in a new minimization, wherein the previously indeterminable parameters were assessed. Once point estimates and confidence intervals were obtained for these parameters as well, this procedure was repeated until no additional gain in the minimization was observed. By following this method, a considerable improvement of the final fit of the model to the experimental data was realized.
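In spirit, this alternating scheme can be mimicked with any least squares routine by repeatedly estimating one subset of parameters while the remaining ones are frozen at their current values; the Python sketch below is such a generic re-implementation, assuming a user-supplied residual function, and is not the AVS procedure itself.

```python
import numpy as np
from scipy.optimize import least_squares

def alternating_fit(residual_fn, theta0, groups, max_cycles=10, rel_tol=1e-8):
    """Alternately estimate subsets of parameters (index lists in `groups`) while the
    other parameters are kept fixed, until the sum of squares stops improving."""
    theta = np.array(theta0, dtype=float)
    best = np.sum(residual_fn(theta) ** 2)
    for _ in range(max_cycles):
        for idx in groups:                            # e.g. [[0, 1], [2, 3, 4, 5]]
            def sub_res(sub, idx=idx):
                full = theta.copy()
                full[idx] = sub
                return residual_fn(full)
            theta[idx] = least_squares(sub_res, theta[idx]).x
        cost = np.sum(residual_fn(theta) ** 2)
        if best - cost < rel_tol * best:              # no further gain in minimization
            break
        best = cost
    return theta
```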
When assessing the significance of the regression as a whole in accordance with equation (3-11), the number of model parameters amounts to 6 or 8, depending on whether the shunt tunneling diode term is considered or not. Calculation of the left-hand side of the expression yields values exceeding 10⁶, for each temperature and both cells, which clearly
demonstrates that the model as a whole is indeed significant.
A strong indication for the quality of the model fitting is the construction of a parity
diagram, plotting the observed with respect to the predicted response values. In the ideal
case of a perfect fit, all points are located at the first bisector. Hence, for a model
estimation procedure to be performant, the deviation for the different experiments has to
be as low as possible. Figure 3-13 shows the parity plot for the exemplary experiment.
For all J-V measurements, the points are lying almost perfectly on the bisector. Hence, the
goodness of the fit which was already observed in Figure 3-12 is confirmed. The square
of the multiple correlation coefficient given by:
$$R_i^2 = \frac{\sum_{k=1}^{201} I_{calc,i,k}^2}{\sum_{k=1}^{201} I_{obs,i,k}^2} \tag{3-14}$$
denotes the fraction of the observed values for the current through the cell which is
captured by the model for the 𝑖’th experiment. Therefore, a higher value will, in general,
point at a better quality of the parameter estimates. For each estimation procedure, i.e. for
each temperature and both cells, this indicator amounted to more than 0.99.
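A parity check and the indicator of equation (3-14) are easily reproduced from the observed and calculated currents; the following sketch, using matplotlib for the plot, is purely illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def parity_plot_and_r2(I_obs, I_calc):
    """Plot calculated vs. observed currents with the first bisector as perfect-fit
    reference and return R_i^2 as defined in equation (3-14)."""
    I_obs, I_calc = np.asarray(I_obs, float), np.asarray(I_calc, float)
    r2 = np.sum(I_calc**2) / np.sum(I_obs**2)
    lims = [min(I_obs.min(), I_calc.min()), max(I_obs.max(), I_calc.max())]
    plt.plot(I_obs, I_calc, "o", markersize=3)
    plt.plot(lims, lims, "k--", label="first bisector")
    plt.xlabel("Observed current [mA]")
    plt.ylabel("Calculated current [mA]")
    plt.legend()
    return r2
```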
Figure 3-13 Parity diagram for the first cell at 290 K
The interpretation of the residual plot for each estimation step yields some useful
inference on the validity of the statistical theory that underlies the fitting routines, i.e. a
Gaussian distribution for the experimental error, with a constant variance for each data
point and a zero correlation between the different measurements at each set of
experimental conditions. In ideal situations, when plotting the misfit between
observations and model predictions, a random scatter should result, with values contained in a finite and symmetric band around the horizontal axis.
The residual plot for the type experiment is depicted in Figure 3-14. For negative and
small positive voltages, the residuals are nicely located around the voltage axis; however,
at higher voltages a significant increase of the residuals is seen. Meanwhile, a trend
seems to emerge in this region. This behavior is probably explained by the quasi-
continuous measurement of the different points of the current-voltage characteristics at
each temperature. Hence, the misfit between the experimental data and the model
predicted responses is a trending function as well. Again, it follows that the actual fit is
accurate, as no residual amounts to more than 1% of the actually measured current.
Similar residual profiles have been observed for the other experimental conditions.
Figure 3-14 Residual plot for the first cell at 290 K
At last, the correlational structure of the set of estimated model parameters is assessed.
For the type experiment, where only six model parameters had to be considered, the
result is shown in Table 3-1. For this particular temperature and cell, the only strongly correlated parameters are those associated with the main junction. This is not surprising,
as a high correlation between the pre-exponential factor and parameters in the exponent
is a commonly encountered phenomenon for the kinetic modelling of chemical reactions
as well, where a similar Arrhenius dependence exists for the rate coefficients.
Table 3-1 Binary correlation matrix of the model parameters for the first cell at 290 K
𝑱𝟎𝟏 𝑨𝟏 𝑹𝒔𝒉 𝑹𝒔 𝒌 𝒎
𝑱𝟎𝟏 1 -0.999 -0.087 -0.917 -0.007 0.226
𝑨𝟏 -0.999 1 0.082 0.929 0.004 -0.217
𝑹𝒔𝒉 -0.087 0.082 1 0.039 0.843 -0.898
𝑹𝒔 -0.917 0.929 0.039 1 -0.019 -0.139
𝒌 -0.007 0.004 0.843 -0.019 1 -0.595
𝒎 0.226 -0.217 -0.898 -0.139 -0.595 1
However, since the software was not able to estimate all model parameters simultaneously at all temperatures, no complete binary correlation matrix was generated for those cases either. Therefore, a similar analysis could not be performed for all situations, so that general conclusions on the correlational structure could not be drawn.
3.4.2 Physical interpretation of the results
The point estimates for the model parameters obtained from the Bayesian estimation
routine in Athena Visual Studio are depicted as a function of temperature in Figure 3-15
and Figure 3-16 for the first and second cell respectively.
The pre-exponential factors 𝐽01 and 𝐽02 corresponding to the main junction and the shunt
tunneling diode term respectively are depicted in a semi log plot with respect to the
inverse temperature. As is concluded from Figure 3-15a and Figure 3-16a, the pre-exponential factors strongly decrease for lower temperatures. Moreover, a linear relation
emerges, which allows for the calculation of an Arrhenius type of temperature
dependence for both cells, according to:
$$J_{0i} = J_{0,0i} \exp\!\left(-\frac{q E_i}{k_B T}\right), \quad i = 1, 2 \tag{3-15}$$
with 𝑞 the elementary electric charge and 𝑘𝐵 the Boltzmann constant. For the first cell,
the contribution of the shunt tunneling diode turned out not to be significant, and has
therefore not been reported. For the main junction term, it is found that 𝐽0,01 = 483.185 mA while 𝐸1 = 624 mJ/C. The associated exponential factor is represented in terms
of the ideality constant 𝑛1 in Figure 3-15b. It is seen that, for all studied temperatures, this
parameter takes a value between 1 and 2, which obeys the criterion for diffusion and
recombination controlled electron transport across the cell.
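The Arrhenius-type dependence of equation (3-15) follows from a linear fit of ln(J₀) against 1/T; a minimal sketch of this step, assuming the isothermal point estimates are collected in arrays, is given below (note that J/C is dimensionally a volt, so the reported mJ/C values correspond to millivolts).

```python
import numpy as np

Q = 1.602176634e-19    # elementary charge [C]
KB = 1.380649e-23      # Boltzmann constant [J/K]

def arrhenius_fit(T, J0):
    """Fit ln(J0) vs. 1/T to equation (3-15): ln J0 = ln J00 - (q*E/kB) * (1/T).
    Returns the pre-exponential factor J00 (same unit as J0) and E in J/C (= V)."""
    T, J0 = np.asarray(T, float), np.asarray(J0, float)
    slope, intercept = np.polyfit(1.0 / T, np.log(J0), 1)
    return np.exp(intercept), -slope * KB / Q
```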
For the second cell, a similar decrease of the main junction factor is observed, although less steep than for the first one: here 𝐽0,01 = 2.74·10⁻⁴ mA and 𝐸1 = 437 mJ/C were calculated. Except for the three lowest temperatures, the constraint on the value of the
ideality factor is again met, see Figure 3-16b. On the other hand, the optimal pre-
exponential factor is considerably smaller than for the first cell. In contrast to the first cell,
for the second one the shunt tunneling contribution does become significant for lower
temperatures. The corresponding values are given by the filled blocks in Figure 3-16a. To
guide the eye, the blank symbols extend the Arrhenius dependence of the shunt diode
pre-exponential to higher temperatures. Following the notation introduced in equation (3-15), it was calculated that 𝐽0,02 = 2.74·10⁻⁴ mA and 𝐸2 = 187 mJ/C. As has already been
mentioned in the theoretical discussion of the model in Section 3.2.2, the value of the
corresponding ideality factor does not have any physical interpretation and is therefore
not explicitly shown. Where the shunt diode term was significant, values for this ideality
parameter amounted to almost 10.
The optimal estimates for the shunt resistance are depicted in Figure 3-15 and Figure 3-
16c. In contrast to the parameters for the main junction and shunt tunneling diode terms,
there is no common trend observed for both cells. For the first one, the estimated shunt resistance is at first remarkably smaller than for the second cell: the difference amounts to a factor of 1000.
Figure 3-15 Parameter estimates and corresponding 95% individual confidence intervals for
the first cell
[panels: (a) J01 [A/cm²] vs. 1/T [1/K]; (b) n1 [-] vs. T [K]; (c) Rsh [Ω/cm²] vs. T [K]; (d) Rs [Ω/cm²] vs. T [K]; (e) k [mA/V^m/cm²] vs. T [K]; (f) m [-] vs. T [K]]
Figure 3-16 Parameter estimates and corresponding 95% individual confidence intervals for the second cell
[panels: (a) J0 [A/cm²] vs. 1/T [1/K], logarithmic scale; (b) n1 [-] vs. T [K]; (c) Rsh [Ω/cm²] vs. T [K]; (d) Rs [Ω/cm²] vs. T [K]; (e) k [mA/V^m/cm²] vs. T [K]; (f) m [-] vs. T [K]]
Moreover, for the first cell, the increase of the resistance follows a linear
trend for lowering temperatures. For the second cell, on the other hand, the increase of the simulated shunt resistance is rather exponential for decreasing temperatures. To ensure that this
remarkable difference in the temperature dependence of the resistance of the second cell
was not caused by a modeling mistake, the estimation routine was run again, starting
from the estimated values for the first cell. Since the obtained fit was poor, the reason for the deviation most likely has a physical origin.
A similar exponential increase for decreasing temperatures is observed for the simulated
series resistance, depicted in Figure 3-15 and Figure 3-16d. In contrast to the shunt
contribution, the estimations for the series resistance are remarkably similar for both
cells.
A stronger difference between the cells is observed for their estimated space-charge
limited current. While for the first cell the value for the multiplicative parameter 𝑘 is
almost constant for the entire considered temperature range, the second cell shows an
almost linear decrease for lower temperatures. Given the strong difference in the estimated parameter values for both cells, a similar strategy as for the shunt resistance was followed. Once more, using the estimated parameter values from the first cell in the second resulted in an inferior fit, demonstrating again that the reasons have to be sought in the physics of the system. Additionally, the resulting point values are a factor
50 higher for the first cell compared to the second one. For the exponent parameter 𝑚 to
be physically meaningful, the estimated parameters have to be located around 2 for all
temperatures. From the inspection of Figure 3-15 and Figure 3-16f, it follows that this
criterion is fulfilled, except for the experiment at 300K for the first cell. However, the
relatively wide confidence interval on this estimate, comprising the desired value of 2 as
well, points at the statistical uncertainty about its accuracy. It is interesting to notice that
the confidence intervals on the point estimates are remarkably higher for this parameter
compared to the others.
Based on these estimated parameter values, the contribution of the different suggested
parasitic current pathways to the global electric behavior of the cell can be determined.
This way, it is possible to perform a sort of electric path analysis, analogously to the often
performed reaction path analysis in the kinetic modelling of chemical reaction systems.
Therein, the importance of each potential step in a complex reaction mechanism is
assessed for varying process conditions. When doing a similar assessment for the
different parasitic pathways in a solar cell, temperature and voltage will be the most
relevant experimental variables.
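Given a set of fitted parameter values, the share of each parallel pathway at a given junction voltage follows directly from evaluating the individual terms of equation (3-7); the sketch below (valid for a non-zero junction voltage, with illustrative parameter names) computes these percentage contributions.

```python
import numpy as np

def pathway_shares(Vd, J0_junc, A_junc, J0_sh, A_sh, R_sh, k, m):
    """Percentage contribution of each parallel pathway to the total dark current at a
    junction voltage Vd = V - J*R_s; in the dark all terms carry the sign of Vd."""
    terms = {
        "main junction":    J0_junc * (np.exp(A_junc * Vd) - 1.0),
        "shunt tunneling":  J0_sh * (np.exp(A_sh * Vd) - 1.0),
        "shunt resistance": Vd / R_sh,
        "SCLC":             np.sign(Vd) * k * abs(Vd) ** m,
    }
    total = sum(terms.values())          # non-zero as long as Vd != 0
    return {name: 100.0 * value / total for name, value in terms.items()}
```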
The contribution of the different current pathways is graphically represented in Figure 3-
17, giving the share in the total current through the cell in percentage terms. It follows
immediately that both the temperature and the voltage across the cell have a strong
impact on how strongly each mechanism takes part in the conduction of the electric
current.
In the negative and lower positive voltage range the current through the cell is
dominated by the shunt resistance and SCLC. Depending on the temperature, both
mechanisms are competing for the lead role: at 290 K the current through the shunt resistance is always higher than the SCLC, while at lower temperatures the latter grows in importance and becomes the strongest current transport mechanism for reverse biases.
Nevertheless, for all temperatures, the shunt resistance almost completely represents the
current at lower positive and negative voltages. Only for higher positive voltages, the
main junction comes into play and the contributions of the parasitic current pathways
start to decrease strongly. It follows that for lower temperatures, the onset for significant
current through the main junction is shifted towards higher voltages. The contribution of
the shunt tunneling diode, which was significantly estimated for the second cell at lower temperatures, is visible only at higher voltages as well.
Moreover, by showing the graphs for both cells together, it follows that the electrical
behavior of the cells is fundamentally different. Although similar qualitative trends are
seen regarding the importance of the different pathways, strong deviations exist between their absolute contributions for both cells. Compared to the first cell, the SCLC mechanism is present in the second one to a much lower extent, favoring current through the shunt
resistance. Since both cells are cut out of the same mother panel, this demonstrates the
strongly local character of the imperfections in the cell structure which cause the
undesired power loss in the device.
Figure 3-17 Contribution of the suggested leakage pathways for both solar cells,
for different temperatures. First cell: left, second cell: right
[legend: Main junction, Shunt resistance, SCLC, Shunt tunneling; rows: 290 K, 200 K, 110 K; x-axis: Voltage [V], y-axis: contribution 0–100%]
3.5 References
1. BP Energy Outlook 2035. 2015.
2. Zhu, J. and Y. Cui, Photovoltaics: More solar cells for less. Nat Mater, 2010. 9(3): p.
183-184.
3. Integration of Renewable Energy in Europe. 2010, DNV GL - Energy.
4. Directive 2009/28/EC of the European Parliament and of the Council of 23 April 2009 on
the promotion of the use of energy from renewable sources and amending and
subsequently repealing Directives 2001/77/EC and 2003/30/EC. 2009, European
Parliament, Council of the European Union.
5. EPIA, Global market outlook for Photovoltaics 2013-2018. European Photovoltaic
Industry Association, 2014.
6. Razykov, T.M., et al., Solar photovoltaic electricity: Current status and future prospects.
Solar Energy, 2011. 85(8): p. 1580-1608.
7. Saga, T., Advances in crystalline silicon solar cell technology for industrial mass
production. NPG Asia Mater, 2010. 2: p. 96-102.
8. Nelson, J., The Physics of Solar Cells. 2003: Imperial College Press.
9. Reyniers, M.-F., Algemene Scheikunde. 2010.
10. Luque, A. and S. Hegedus, Handbook of Photovoltaic Science and Engineering. 2011:
Wiley.
11. El Chaar, L., L.A. lamont, and N. El Zein, Review of photovoltaic technologies.
Renewable and Sustainable Energy Reviews, 2011. 15(5): p. 2165-2175.
12. Jackson, P., et al., New world record efficiency for Cu(In,Ga)Se2 thin-film solar cells
beyond 20%. Progress in Photovoltaics: Research and Applications, 2011. 19(7): p.
893-897.
13. Rockett, A.A., Current status and opportunities in chalcopyrite solar cells. Current
Opinion in Solid State and Materials Science, 2010. 14(6): p. 143-148.
14. Green, M.A., et al., Solar cell efficiency tables (version 15). Progress in Photovoltaics:
Research and Applications, 2000. 8(1): p. 187-195.
15. Green, M.A., et al., Solar cell efficiency tables (Version 45). Progress in Photovoltaics:
Research and Applications, 2015. 23(1): p. 1-9.
16. Neisser, A., et al., Effect of Ga incorporation in sequentially prepared CuInS2 thin film
absorbers. Solar Energy Materials and Solar Cells, 2001. 67(1–4): p. 97-104.
17. Kaigawa, R., et al., Improved performance of thin film solar cells based on Cu(In,Ga)S2.
Thin Solid Films, 2002. 415(1–2): p. 266-271.
18. Jager-Waldau, A., Progress in chalcopyrite compound semiconductor research for
photovoltaic applications and transfer of results into actual solar cell production. Solar
Energy Materials and Solar Cells, 2011. 95(6): p. 1509-1517.
19. Decock, K., Defect related phenomena in chalcopyrite based solar cells. 2012.
20. Schock, H.-W. and R. Noufi, CIGS-based solar cells for the next millennium. Progress
in Photovoltaics: Research and Applications, 2000. 8(1): p. 151-160.
21. Chopra, K.L., P.D. Paulson, and V. Dutta, Thin-film solar cells: an overview. Progress
in Photovoltaics: Research and Applications, 2004. 12(2-3): p. 69-92.
22. Williams, B.L., et al., Identifying parasitic current pathways in CIGS solar cells by
modelling dark J–V response. Progress in Photovoltaics: Research and Applications,
2015: p. n/a-n/a.
23. Hengel, I., et al., Current transport in CuInS2:Ga/Cds/Zno – solar cells. Thin Solid
Films, 2000. 361–362(0): p. 458-462.
24. Pallarès, J., et al., A compact equivalent circuit for the dark current-voltage
characteristics of nonideal solar cells. Journal of Applied Physics, 2006. 100(8): p.
084513.
25. Bosio, A., et al., Polycrystalline CdTe thin films for photovoltaic applications. Progress
in Crystal Growth and Characterization of Materials, 2006. 52(4): p. 247-279.
26. Kaminski, A., et al. Conduction processes in silicon solar cells. in Photovoltaic
Specialists Conference, 1996., Conference Record of the Twenty Fifth IEEE. 1996.
27. Rau, U., et al., Electronic loss mechanisms in chalcopyrite based heterojunction solar
cells. Thin Solid Films, 2000. 361–362(0): p. 298-302.
28. Rose, A., Space-Charge-Limited Currents in Solids. Physical Review, 1955. 97(6): p.
1538-1544.
29. Dongaonkar, S., et al., Universality of non-Ohmic shunt leakage in thin-film solar cells.
Journal of Applied Physics, 2010. 108(12): p. 124509.
30. Liao, Y.-K., et al., A look into the origin of shunt leakage current of Cu(In,Ga)Se2 solar
cells via experimental and simulation methods. Solar Energy Materials and Solar Cells,
2013. 117(0): p. 145-151.
Chapter 4
Literature review on alternative
parameter estimation techniques
In this chapter three potential adaptations of the currently used, classical nonlinear
regression method, extracted from a literature survey on statistical techniques to estimate
unknown model parameters, will be discussed.
The first two methods originate from the rather strong requirements on the
covariance structure of the experimental errors under which classical nonlinear
regression is guaranteed to perform well. The underlying idea of these techniques is to
encode a correction system in the classical regression procedure to account explicitly for
a potential violation of the theoretical conditions. This way, the stringent theoretical
framework will be loosened and the range of applicability of the classical regression
methods will be highly extended. Logically, the adapted procedure will yield more
reliable results than the original one. The methods introduced in Section 4.1 will account
for the possibility of a non-constant error variance, or heteroscedasticity, while Section
4.2 will focus on the correction for the occurrence of a serial correlation in a set of time
series observations.
At last, the potential of the Bayesian approach to parameter estimation will be explored
in section 4.3. Starting from a fundamentally different approach to the problem of model
parameter estimation, the Bayesian framework combines some very attractive features,
including an efficient exploration of high-dimensional probability distributions, an
automatic weighing of experimental errors and the possibility to include knowledge on
the model parameter prior to any experiment. It goes without saying that, if proven to be
sufficiently performant, Bayesian parameter estimation will be a serious challenger of
classical regression schemes.
4.1 Tackling the heteroscedasticity issue: towards a
proper handling of heterogeneous variance of the
experimental error
4.1.1 Data-based weighing of the residuals
One of the fundamental assumptions that make up the mathematical framework of
ordinary least squares estimation is the homogeneity of the variance of the random error
terms which are associated with the experimental data. Only when this criterion is met,
the efficiency of the regression is assured. Moreover, in the case of nonlinear models both
the point estimates of the parameters and the confidence intervals resulting from the
regression are potentially biased by illicitly neglecting the presence of heteroscedasticity
[1]. The variance of the experimental error on a certain observation is a measure for its
precision, and, hence, of the reliability of the information it contains. A higher variance
reflects a more pronounced uncertainty about the actual response value. When assuming
a constant variance for all observations, all experimental data will contribute equally to
the objective function of the regression, irrespective of the information they contain,
which is a doubtful practice.
The presence of an inhomogeneous error variance is clearly reflected in the residual plot,
as illustrated in Figure 4-1. Where a random scatter is expected when the constant
variance criterion is met, distinct trends, like the formation of clusters or strong
fluctuations in residual order of magnitude, will be observed. Different causes for the
departure from the assumption of a constant error variance have been identified for
kinetic modelling. Reaction rates are known to depend mainly on temperature and on the concentration of the involved species. Depending on the response
variable to be measured and the analytical devices to collect the experimental data, this
will potentially bias the uncertainty on the results. Often higher variances are expected
for more ‘severe’ reaction conditions.
Figure 4-1 Illustrative example of residual plots for a case with constant variance (left) and
strong heteroscedastic experimental errors (right)
An additional source of heteroscedasticity arises from erroneous steps in the statistical
procedure itself, e.g., due to an explicit linearization of a nonlinear kinetic model by the
user. The statistical validity of the parameter estimates obtained for this linearized model
is doubtful, as not the actual dependent variables but rather functions of them are
regressed. In general, the additivity of the modelled prediction and its residual is not
invariant to any transformation, so that the assumption of a Gaussian error term on the
actual response does not necessarily hold for its transformed counterpart as well. The
detrimental effect of such transformations, when applied improvidently, on the precision
of the final parameter estimates and on the validity of the resulting model predictions has already been demonstrated [2].
The presence of heteroscedasticity is tackled straightforwardly by considering a suited
weighing of the collected data. If 𝑛 independent single-response experiments have been
carried out and variance heterogeneity is allowed, the covariance matrix of the
experimental errors 𝑽 is a diagonal matrix with the heterogeneous error variances as
diagonal elements, hence:
$$\mathbf{V} = \begin{bmatrix} \sigma_{11}^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_{nn}^2 \end{bmatrix} = \sigma^2 \begin{bmatrix} 1/w_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1/w_n \end{bmatrix} = \sigma^2 \mathbf{W}^{-1} \tag{4-1}$$
where the weight matrix 𝑾 is introduced, 𝜎2 being a multiplicative constant. The
likelihood function of the 𝑝 regression parameters 𝜷 of the general nonlinear model:
𝑦 = 𝑓(𝒙, 𝜷) + 𝜖 (4-2)
is then given by:
$$L(\boldsymbol{\beta}\,|\,\mathbf{y}) = \frac{1}{\sqrt{(2\pi)^n |\mathbf{V}|}} \exp\!\left\{-\frac{1}{2}\sum_{i=1}^{n}\frac{\left[y_i - f(\boldsymbol{x}_i,\boldsymbol{\beta})\right]^2}{\sigma_{ii}^2}\right\} = \frac{\prod_{i=1}^{n}\sqrt{w_i}}{\left(\sqrt{2\pi}\,\sigma\right)^n} \exp\!\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^{n} w_i\left[y_i - f(\boldsymbol{x}_i,\boldsymbol{\beta})\right]^2\right\} \tag{4-3}$$
with n the number of collected data points, assuming the errors are normally distributed.
From this, it follows that the heterogeneity of the error variance is bypassed if the error terms appearing in ordinary least squares regression are scaled by a factor inversely proportional to the error variances, so 𝑤𝑖 = 1/𝜎𝑖𝑖². The interpretation of this
scaling as a weighing operation follows naturally, as an observation with the lower
variance, or equivalently, a higher precision, will have a stronger contribution to the
ultimate sum of squared residuals. Since knowledge about these values is often limited or even completely absent, it is not possible to determine or predict 𝑤𝑖 exactly.
Alternative pathways towards an adequate and robust determination of the suited
weighing factors have to be explored.
An often encountered practice to implement a weighted least squares methodology relies on the estimation of the error variance from the observations of replicate experiments, i.e., by collecting a set of 𝑚 response values 𝑦𝑖,𝑗, 𝑗 = 1..𝑚, under identical conditions 𝒙𝒊. The underlying reasoning is that the sample variance $\hat{\sigma}_i^2$ associated with the experimental conditions 𝒙𝒊, given by:
$$\hat{\sigma}_i^2 = \frac{\sum_{j=1}^{m}(y_{i,j} - \bar{y}_i)^2}{m - 1} \qquad \text{(4-4)}$$
where $\bar{y}_i$ denotes the mean response value of the 𝑚 replicate experiments, is an adequate and unbiased estimator for the unknown error variance 𝜎𝑖𝑖2 of the i'th observation [3]. This methodology is far from optimal: an analysis based on sample variances is very inefficient unless the number of replicate experiments is very large. Approximations that rest on a limited number of replications have been reported to be wildly unstable, and hence introduce an additional level of variability in the regression procedure [4]. The precision of the resulting parameter estimates is therefore often observed to be inferior to the outcome of an unweighted regression procedure. Moreover, due to the lack of a solid theoretical basis, the quality and adequacy of the parameter estimates obtained by this regression is not ascertained, which makes their physical meaning uncertain [5].
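As an illustration only, the fragment below sketches this replicate-based weighting in Python: sample variances are computed per experimental condition as in equation (4-4) and used as weights in a weighted least squares fit. The model function, the data and the replicate structure are hypothetical.

# Illustrative sketch (not from this work): weighted least squares with weights
# estimated from replicate experiments, w_i = 1 / sigma_i^2, cf. equation (4-4).
# Model, data and replicate structure are hypothetical.
import numpy as np
from scipy.optimize import least_squares

def model(x, beta):
    # hypothetical first-order response: y = beta0 * (1 - exp(-beta1 * x))
    return beta[0] * (1.0 - np.exp(-beta[1] * x))

# x[i]: conditions of experiment i, y_rep[i]: m replicate responses at x[i]
x = np.array([1.0, 2.0, 4.0, 8.0])
y_rep = np.array([[0.42, 0.45, 0.40],
                  [0.68, 0.72, 0.66],
                  [0.90, 0.95, 0.88],
                  [0.99, 1.05, 0.97]])

y_bar = y_rep.mean(axis=1)                  # mean response per condition
sigma2_hat = y_rep.var(axis=1, ddof=1)      # sample variances, equation (4-4)
w = 1.0 / sigma2_hat                        # weights w_i = 1 / sigma_i^2

def weighted_residuals(beta):
    # sqrt(w_i) * (y_i - f(x_i, beta)), so the SSQ equals sum of w_i * e_i^2
    return np.sqrt(w) * (y_bar - model(x, beta))

fit = least_squares(weighted_residuals, x0=[1.0, 0.5])
print(fit.x)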
One of the onsets for a more mathematically founded technique to correct for variance heterogeneity was developed by Box and Hill (1974) [6]. The method finds its origin in the assumption that the error variance is a monotonic function of the expected value of the observation. It explicitly proposes the existence of a power law transformation of the data, given as a function of the a priori unknown transformation parameter 𝜙, which does exhibit a constant variance for all experiments. In general, the statement reads:
$$y_i^{(\phi)} = \begin{cases} \dfrac{|y_i|^{\phi} - 1}{\phi}, & \phi \neq 0 \\ \log(|y_i|), & \phi = 0 \end{cases}, \qquad y_i \neq 0 \qquad \text{(4-5)}$$
when accounting for the suggestion of Pritchard, Downie and Bacon (1977) to expand the
methodology to allow for negative response values [2]. Based on this transformation, a
closed expression for the most suited weighing factors is derived:
$$w_i \propto |\hat{y}_i|^{2\phi - 2} \qquad \text{(4-6)}$$

where $\hat{y}_i$ denotes the predicted response value for the i'th observation, based on the calculated parameter estimates.
It follows that the most suited weighing factors and the parameter estimates are mutually dependent; hence, the finally calculated values of these target variables have to be consistent with each other. This forms the basis of the iteratively reweighted least squares method, an iterative approach in which the weights and the model parameters are repeatedly recalculated in a two-stage routine [4]. The corresponding algorithm, of which a minimal sketch follows the listing, reads:
1. Calculate a preliminary estimate $\hat{\boldsymbol{\beta}}^{*}$ of the model parameters by ordinary least squares minimization;
2. Calculate the weights $w_i^{*}$ in accordance with equation (4-6);
3. Determine an updated estimate $\hat{\boldsymbol{\beta}}^{*}$ by minimizing the weighted sum of squares $\sum_{i=1}^{n} w_i^{*}\,[y_i - f(\boldsymbol{x}_i, \boldsymbol{\beta})]^2$;
4. Recalculate the weights $w_i^{*}$ based on the updated model parameters;
5. Repeat steps 3 and 4 $N-1$ times.
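A minimal Python sketch of such an iteratively reweighted least squares loop is given here, assuming a single-response model and a fixed, known value of 𝜙 in the weight expression of equation (4-6); the model function, the data and the number of cycles are invented.

# Illustrative sketch of iteratively reweighted least squares (IRLS) with
# weights w_i proportional to |y_hat_i|^(2*phi - 2), cf. equation (4-6).
# Data, model and phi are hypothetical.
import numpy as np
from scipy.optimize import least_squares

def model(x, beta):
    return beta[0] * x / (1.0 + beta[1] * x)        # hypothetical rate expression

x = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
y = np.array([0.31, 0.52, 0.74, 0.92, 1.01])
phi = 0.5                                            # assumed transformation parameter

# Step 1: preliminary ordinary least squares estimate
beta_hat = least_squares(lambda b: y - model(x, b), x0=[1.0, 0.5]).x

n_cycles = 5                                         # N - 1 reweighting cycles
for _ in range(n_cycles):
    # Steps 2 and 4: weights from the current predictions
    w = np.abs(model(x, beta_hat)) ** (2.0 * phi - 2.0)
    # Step 3: weighted least squares update
    beta_hat = least_squares(
        lambda b: np.sqrt(w) * (y - model(x, b)), x0=beta_hat).x

print(beta_hat)

In line with the remark below that most of the gain is realized in the first iterations, the number of cycles in such a loop is usually kept small.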
A clear value for the optimal number of iterations 𝑁 is not provided in the literature. It is often suggested that major gains in the performance of the weighted regression are found in the first cycles only, and that little improvement is gained by further iterations. For any number of cycles, the asymptotic behavior of the obtained model parameters is well described under the condition that the starting estimate $\hat{\boldsymbol{\beta}}^{*}$ is $\sqrt{n}$-consistent. When this criterion is met, the final parameter estimator $\hat{\boldsymbol{\beta}}$ is asymptotically normally distributed with mean 𝜷 and covariance matrix:
$$\hat{\boldsymbol{V}} = \sigma^2\left[\sum_{i=1}^{n}\hat{w}_i\,\nabla f(\boldsymbol{x}_i,\hat{\boldsymbol{\beta}})\cdot\left[\nabla f(\boldsymbol{x}_i,\hat{\boldsymbol{\beta}})\right]^{T}\right]^{-1} \qquad \text{(4-7)}$$

with $\nabla f(\boldsymbol{x}_i,\hat{\boldsymbol{\beta}})$ the $p \times 1$ gradient vector of the model function evaluated at the parameter estimator, hence $\left[\nabla f(\boldsymbol{x}_i,\hat{\boldsymbol{\beta}})\right]_k = \left.\dfrac{\partial f(\boldsymbol{x}_i,\boldsymbol{\beta})}{\partial \beta_k}\right|_{\boldsymbol{\beta}=\hat{\boldsymbol{\beta}}}$.
Theoretically, the ideal estimation corresponds to iterating until convergence towards a
self-consistent set of weights and model parameters is achieved, which corresponds
to 𝑁 = ∞. In this case, it is suggested not to run the iterative procedure, but to maximize
directly the joint log-likelihood given by:
$$\log[L(\boldsymbol{\beta},\sigma,\phi|\boldsymbol{y})] = -\frac{n}{2}\log(2\pi\sigma^2) + (\phi-1)\sum_{i=1}^{n}\log(y_i) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left[y_i^{(\phi)} - f^{(\phi)}(\boldsymbol{x}_i,\boldsymbol{\beta})\right]^2 \qquad \text{(4-8)}$$
to obtain the optimal parameter estimate $\hat{\boldsymbol{\beta}}$ from its mode. Hence, an explicit transformation of both the response data and the model predictions appears, which is why this method is referred to as Power Transformation Both Sides (PTBS) [7, 8]. This is an important remark, since, due to this strict dependence of the response values on the previously unknown value of 𝜙, the implementation of this regression may not be possible in certain statistical software packages.
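A minimal sketch of such a direct maximization is given below for a hypothetical single-response model; the standard deviation is parameterized through its logarithm purely for numerical convenience, and the data, model and starting values are invented.

# Illustrative sketch (hypothetical model and data): direct maximization of the
# joint PTBS log-likelihood of equation (4-8) over (beta, phi, sigma).
import numpy as np
from scipy.optimize import minimize

def model(x, beta):
    return beta[0] * x / (1.0 + beta[1] * x)         # hypothetical rate expression

def power_transform(z, phi):
    # power transformation of equation (4-5); z is assumed nonzero
    return (np.abs(z) ** phi - 1.0) / phi if phi != 0 else np.log(np.abs(z))

x = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
y = np.array([0.31, 0.52, 0.74, 0.92, 1.01])

def neg_log_likelihood(theta):
    beta, phi, log_sigma = theta[:2], theta[2], theta[3]
    sigma2 = np.exp(2.0 * log_sigma)
    resid = power_transform(y, phi) - power_transform(model(x, beta), phi)
    n = y.size
    logL = (-0.5 * n * np.log(2.0 * np.pi * sigma2)
            + (phi - 1.0) * np.sum(np.log(y))
            - 0.5 * np.sum(resid ** 2) / sigma2)
    return -logL

start = np.array([1.0, 0.5, 1.0, np.log(0.05)])      # beta0, beta1, phi, log(sigma)
result = minimize(neg_log_likelihood, start, method="Nelder-Mead")
print(result.x)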
Alternatively, Pritchard et al. suggested estimating the parameters by means of a direct Bayesian estimation of the (𝑝 + 1)-dimensional vector (𝜷, 𝜙), especially in the case of highly nonlinear model functions. Taking the Jeffreys prior for the standard deviation 𝜎 and a uniform distribution for the parameter set (𝜷, 𝜙), the joint prior distribution is given by:

$$p(\sigma, \boldsymbol{\beta}, \phi) \propto \sigma^{-1} \qquad \text{(4-9)}$$
The corresponding joint posterior density function is then obtained by integrating out 𝜎,
leading to the expression:
$$p(\boldsymbol{\beta},\phi|\boldsymbol{y}) \propto \sqrt{\prod_{i=1}^{n} w_i}\;\left[\sum_{i=1}^{n} w_i\,[y_i - f(\boldsymbol{x}_i,\boldsymbol{\beta})]^2\right]^{-n/2} \qquad \text{(4-10)}$$
All knowledge about the unknown parameters is described by this distribution. The modal values $(\hat{\boldsymbol{\beta}}, \hat{\phi})$, obtained by maximization of this posterior density function, will eventually serve as point estimates of the parameters. However, in accordance with the Bayesian view on regression, the true statistical inference on both 𝜷 and 𝜙 is found in their highest probability intervals rather than in point estimates; see section 4.3 for more details. Assuming that the vector 𝜸 = [𝜷, 𝜙] approximately obeys a multivariate normal distribution, the associated covariance matrix $\tilde{\boldsymbol{V}}$ reads:
$$\tilde{\boldsymbol{V}} = \{\tilde{V}_{ij}\} = \left\{-\left.\frac{\partial^2 \log\left[p(\boldsymbol{\beta},\phi|\boldsymbol{y})\right]}{\partial\gamma_i\,\partial\gamma_j}\right|_{\hat{\boldsymbol{\gamma}}}\right\}^{-1} \qquad \text{(4-11)}$$
which is the inverse of the Hessian matrix of the negative log-posterior, evaluated at the modal parameter values. Closed analytical expressions are available for linear models.
As for ordinary least squares regression, the model adequacy has to be assessed by
examining the residual plots, by ensuring the physical meaning of the obtained
parameter estimates – care has to be taken that not only the modal values are checked,
but rather the entire probability interval – and by performing a lack-of-fit test.
Comparisons of the Bayesian approach and the PTBS method showed both an increased performance compared to ordinary, unweighted least squares and a high mutual resemblance of the finally calculated parameter estimates [9].
One major drawback of this type of variance modelling is the high sensitivity of the weighing factors, and thus of the final parameter estimates, to the presence of bad experimental data points. At worst, if the weighing procedure assigns a relatively high importance to these observations, the outcome of the regression will be
outperformed by the ordinary least squares method. Over the last decades, several
diagnostic techniques have been developed to distinguish outlier points from the reliable
results, and to deal with them appropriately.
4.1.2 Robust estimation and outlier detection
When fitting a model to a set of experimental data, some points have a higher impact on
the regression than others, e.g., when the associated responses strongly differ from those
of the other measurements. Such points often tend to ‘attract’ the regression curve, i.e.
pull it away from what would be the best-fitting line based on the other measurements,
and therefore strongly influence the final estimates of the model parameters. The origin
of these so-called influential points is two-fold. On the one hand, for some specific process
conditions, typically located at the boundaries of the operational range, the phenomenon
under study will probably behave somewhat unexpectedly, i.e. deviating from what is in
line with the other responses. Since such ‘extreme’ behavior contains the highest amount
of information to unravel all subtleties and details of the physicochemical mechanisms
over the entire range of process conditions, the proper inclusion of such unexpected
results in the final regression procedure is beyond dispute [5]. By applying a suited weighing operation, as suggested above, the excessive dominance of such responses on the fitting procedure is moderated.
In contrast to this class of, in fact valuable, influential points, it is also possible that the
strong deviation of some responses finds its origin in serious errors during the
experiment, e.g., by mistakes in the measurement or analysis steps. The corresponding
observations are intrinsically wrong and do not add any valuable information to the
fitting method. Nevertheless, despite their incorrectness, such measurements will bias
the regression in a similar way as the desired influential data, however, now at the
expense of the reliability of the parameter estimates. Because of their detrimental impact
on the quality of the regression, a proper handling of these measurements is required.
The major issue in dealing with outliers is in their identification. Indeed, as such points
will strongly influence the regression curve, it is not ensured that the actual wrong data
point will be further away from the final fit than the other, correct observations.
Therefore, flagging an outlier based on the inspection of the residuals is often ineffective
[10].
Additionally, literature is not unambiguous about the most effective way to deal with
outlying data points. Several statistical tests have been developed to decide whether an
observation has to be seen as an outlier or not [11]. Since most of them require a
considerable amount of replicate experiments or only detect single outliers, their overall
use in an automated scheme is hindered. Another possible methodology is found in the
field of robust regression, a collective term for all adaptations to the least squares routine
to guard it against violation of the fundamental assumptions underlying regression
theory, including the presence of undesired erroneous data [12]. Although the
information included in outliers is of inferior quality, robust fitting routines do include
those observations in the estimation procedure, yet with a lower ‘weight’ for data which
are located far from the final regression curve. The performance of this rather
conservative approach was reported as insufficient, and its inability to provide reliable
confidence intervals for the parameters is seen as a serious shortcoming. More recently, a
new method has been described which combines the strengths of robust regression with
an automatic removal of outliers. Although the removal of data points which do not fit the expected framework is not uncontroversial, the automated nature of the method prevents the infiltration of ad hoc decisions and the intentional biasing of the regression [13].
The first step of the algorithm consists of a robust regression of the model on the
complete data set. This routine differs from ordinary least squares regression since it
explicitly assumes a Lorentzian distribution of the residuals, which is claimed to be less
sensitive to response values that are located further from the 'ideal' baseline. Indeed, the Lorentzian merit function to be minimized for a set of experimental errors 𝜀 is given by:

$$\sum_{i}\ln\left[1 + \frac{\varepsilon_i^2}{2}\right] \qquad \text{(4-12)}$$
which lowers the contribution of strongly deviating data, i.e. observations for which 𝜀𝑖 is
high.
The robust residuals $e_R$ associated with the experimental data set $\boldsymbol{y}$ and the corresponding model predictions $\hat{\boldsymbol{y}}$ are defined as:

$$e_{R,i} = \frac{y_i - \hat{y}_i}{\sigma_R} \qquad \text{(4-13)}$$

where $\sigma_R$ is the robust standard deviation of the residuals (RSDR), given by:

$$\sigma_R = P_{68}\,\frac{n}{n-p} \qquad \text{(4-14)}$$
with 𝑃68 the 68.27 percentile of the absolute value of the actual residuals 𝑒, 𝑛 the number
of experiments and 𝑝 the number of model parameters to be estimated. The final
objective function to be minimized by the regression procedure is given by:
$$\sum_{i}\ln\left[1 + e_{R,i}^2\right] \qquad \text{(4-15)}$$
which is slightly different from the true merit function in equation (4-12) and therefore
more suited for robust estimation purposes. It is important to note that no user-specified
weighing factors have to be included, as this has a negative impact on the regression
quality for robust techniques. The local minimization of the objective function is done
via, e.g., the Levenberg-Marquardt algorithm. Since 𝜎𝑅 changes as the routine converges, the objective function values of two subsequent iterations have to be compared using the same value of 𝜎𝑅; before the improvement of the goodness of fit is determined, it is hence required to recalculate the merit function of the prior iteration with the most recent value of 𝜎𝑅.
Once the regression has converged and preliminary parameter estimates have been
determined, the corresponding true residuals are calculated. Their absolute values are
then ordered from lowest to highest. It is suggested to set the maximum fraction of
outlying data at 30%; therefore, only those data points with the 30% highest residuals are
subjected to the outlier detection analysis. This procedure consists of an iterative cycle,
according to the following algorithm, for all data points:
For all 𝑖 from ⌈0.7𝑛⌉ to 𝑛
1. Calculate the parameter:

$$\alpha_i = \frac{0.01\,[n - (i-1)]}{n} \qquad \text{(4-16)}$$

and the variable:

$$\hat{t} = \frac{|e_i|}{\sigma_R} \qquad \text{(4-17)}$$

2. Determine the two-tailed P value from the t-distribution with 𝑛 − 𝑝 degrees of freedom, corresponding to $\Pr(|t| > \hat{t})$;
3. If 𝑃 < 𝛼𝑖, data point 𝑖 and all observations with higher residuals are outliers and have to be removed, and the iterative cycle is stopped;
Else, include data point 𝑖 in the set of reliable observations and repeat the cycle for observation 𝑖 + 1. If 𝑖 = 𝑛, there are no outliers in the data set (see the sketch below).
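A minimal Python sketch of the two stages is given here for illustration: a robust fit with the Lorentzian merit function of equation (4-15), followed by the detection loop of equations (4-16) and (4-17). The model function and the data, including one deliberately corrupted point, are hypothetical.

# Illustrative sketch: Lorentzian robust fit followed by the iterative outlier
# detection of equations (4-16)-(4-17). Model and data are hypothetical.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import t as t_dist

def model(x, beta):
    return beta[0] * x / (1.0 + beta[1] * x)        # hypothetical model

x = np.array([0.5, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([0.30, 0.52, 0.73, 0.82, 0.91, 0.97, 1.60, 1.03])   # one bad point
p = 2

def rsdr(resid):
    # robust standard deviation of the residuals, equation (4-14)
    return np.percentile(np.abs(resid), 68.27) * resid.size / (resid.size - p)

def robust_objective(beta):
    # for simplicity sigma_R is recomputed at every evaluation, a simplification
    # of the stepwise update described in the text
    resid = y - model(x, beta)
    e_r = resid / rsdr(resid)                       # robust residuals, eq. (4-13)
    return np.sum(np.log1p(e_r ** 2))               # Lorentzian merit, eq. (4-15)

beta_hat = minimize(robust_objective, x0=[1.0, 0.5], method="Nelder-Mead").x

# outlier detection on the 30 % largest absolute residuals
resid = y - model(x, beta_hat)
sigma_r = rsdr(resid)
order = np.argsort(np.abs(resid))                   # lowest to highest
n = y.size
outliers = []
for rank in range(int(np.ceil(0.7 * n)), n):
    i = order[rank]
    alpha_i = 0.01 * (n - rank) / n                 # equation (4-16)
    t_hat = np.abs(resid[i]) / sigma_r              # equation (4-17)
    p_val = 2.0 * t_dist.sf(t_hat, df=n - p)
    if p_val < alpha_i:
        outliers = [order[r] for r in range(rank, n)]
        break
print("flagged outliers:", outliers)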
Once the data set has been purified of the detected outliers, the actual parameter estimation is performed on the remaining data by means of weighted least squares routines, like those introduced in section 4.1.1.
4.2 Accounting for serial correlation of the error
A common procedure to study a chemical reaction in a batch reactor setup is by setting a
number of desired experimental conditions and, subsequently, following the evolution
for each of these systems over time, by periodically taking samples of the reaction
mixture. Since there is no variable feed stream, varying the clock time is the only way to
assess the effect of the residence time of the reaction mixture in the reactor.
Due to the absence of a continuous flow through the reactor, any fluctuation in the system will persist, as ideally no exchange with the environment takes place. This holds in particular for the variables that are required to elucidate the underlying kinetics: any source of experimental error that appears at a certain moment in time will build up in the reactor and influence the future dynamics of the system. Ultimately, the experimental
results at any following moment will show some trend with respect to the past, or, stated
otherwise, it is expected for any data point to be correlated to the result of the foregoing
measurement.
Experimental data showing a distinct and persistent trend over time are said to be serially correlated. The occurrence of serial correlation in time series data is a well-known phenomenon, which is often easily detected [5]. However, tackling the issue is not that straightforward, as it requires the introduction of an additional error model, which in turn causes the nonlinear regression procedure to become more complex. Moreover, since the exact expression for the time dependence of the experimental error is not available, both the diagnostics and the remedies are inevitably restricted to, though well-founded, approximations.
One of the requirements that have to be fulfilled to ensure that an ordinary least squares
regression returns reliable and accurate estimations of the unknown kinetic parameters,
is the mutual independence of the experimental errors associated with the output of the
performed measurements. Only if this criterion is met will the regression procedure return both the best (in terms of the maximization of the likelihood function) and the most efficient (in terms of the amount of required information input) estimates for the model parameters [14]. However, for an increasing degree of mutual correlation of the errors on the experimental output, the improvident application of the ordinary least squares technique to a non-linear model will result in parameter estimates that are strongly biased and inefficient, with estimated variances that deviate from their actual values in an unpredictable manner [1]. To circumvent these potential pitfalls, a measure
for the degree of serial correlation for a certain situation has to be defined, and a suitable
correction procedure has to be applied.
4.2.1 Explicit modelling of the serial correlation of the error term
One possible way to model the time-dependent experimental error 𝜀(𝑡) of a series of
continuously measured data that are mutually correlated over time is by a so-called
autoregressive model of order 𝑞, i.e. 𝐴𝑅(𝑞):
$$\varepsilon(t) = \sum_{i=1}^{q}\rho_i\,\varepsilon(t-i) + u(t) \qquad \text{(4-18)}$$
By this definition, besides the term 𝑢(𝑡), part of the experimental error at a certain moment 𝑡 consists of a contribution of the error terms at foregoing moments, with the order 𝑞 of the autoregression denoting the number of time steps incorporated. By assuming the
time series to be weakly stationary, the prefactor 𝜌𝑖, named the autocorrelation function
of the error, is given by:
𝜌𝑖 ≔ 𝜌(𝑖) = 𝑐𝑜𝑟𝑟[𝜀(𝑡), 𝜀(𝑡 − 𝑖)], ∀𝑡 (4-19)
and depends only on the time shift 𝑖 and not on the absolute clock time passed since the
start of the measurements. Hence, the correlation between the errors on two different
data points only depends on the distance in time that separates them [1]. Because of
equation (4-19), the following restriction on 𝜌𝑖 holds:
0 ≤ |𝜌𝑖| ≤ 1, ∀𝑖 (4-20)
The signal 𝑢(𝑡) in equation (4-18) contains the contribution to the experimental error that originates solely from the moment of sampling itself. To bridge the gap with the experimental error under the classical assumptions for which ordinary least squares is valid, the signal is assumed to behave like white noise, hence obeying:
$$E(u(t)) = 0; \qquad Var(u(t)) = \sigma_u^2; \qquad Cov(u(t), u(s)) = 0, \quad \forall t \neq s$$
The first criterion expresses the unbiasedness of the noise signal by stating that its time
average value is equal to zero; any instantaneous deviation of the value of the measured
variable is hence compensated over time. The requirement on the variance of the signal
limits the uncertainty of the signal to a finite and constant value 𝜎𝑢. The last characteristic
reflects the zero autocorrelation of the white noise at any moment of measuring. Stating
things this way, it follows readily that the white noise contribution is equivalent to the
experimental error under the classical assumptions that are made for the application of
ordinary least squares regression.
Due to its simplicity and its satisfactory capability to tackle the major part of the serial
correlation issue, the 𝐴𝑅(1) model is the most commonly used technique to describe the
relation between the experimental errors at different moments, allowing for the detection of a significant degree of serial correlation either graphically or by calculating a closed-form test criterion. The general definition in equation (4-18) hence simplifies
to:
𝜀(𝑡) = 𝜌𝜀(𝑡 − 1) + 𝑢(𝑡) (4-21)
so that only one autocorrelation function needs to be determined.
By plotting the couples of residuals [𝑒(𝑖), 𝑒(𝑖 − 1)] in a so-called lag-plot obtained by
ordinary least squares regression for all data points of the time series, a potential mutual
correlation of the errors will be readily visible in the form of a distinct trend, whereas a
chaotic scattering of the data points is characteristic for uncorrelated noise, as is shown in
Figure 4-2.
Figure 4-2 Typical lag-plots of the residuals of an uncorrelated (left) and
positively, first-order correlated (right) data set
Besides, a distinct test criterion has been developed to quantify whether the degree of serial correlation is sufficiently high to be considerable. The Durbin-Watson test was originally derived to detect 𝐴𝑅(1) relations between the experimental errors of linear models, but meanwhile its approximate validity for nonlinear models has been described as well. Moreover, signals showing an order of time dependence higher than 1 also fail the test, making it a robust tool to diagnose serial correlation of higher order as well [5].
The starting point of the Durbin-Watson analysis of independence is the test statistic:
$$d = \frac{\sum_{i=2}^{n}[e(i) - e(i-1)]^2}{\sum_{i=1}^{n}[e(i)]^2} \qquad \text{(4-22)}$$
Hence, the higher the positive correlation between the residuals, the lower the value of the test statistic becomes. The thus obtained value serves as a criterion to determine the significance of the serial correlation: the null hypothesis of independence, 𝐻0: 𝜌 = 0, is compared to the alternative hypotheses 𝐻𝑎1: 𝜌 > 0 and 𝐻𝑎2: 𝜌 < 0. The null hypothesis is rejected with a certainty 1 − 𝛼 if 𝑑 < 𝑑𝐿,𝛼, accepted if 𝑑 > 𝑑𝑈,𝛼, and the test is inconclusive if 𝑑𝐿,𝛼 < 𝑑 < 𝑑𝑈,𝛼. Numerical values for the critical numbers 𝑑𝐿,𝛼 and 𝑑𝑈,𝛼 depend on both the number of experimental data points and the number of regression parameters and are tabulated in literature [15, 16]. To circumvent the issue of a region for which no reliable conclusion can be drawn, it is suggested to treat it as part of the rejection zone.
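As an illustration, the fragment below computes the test statistic of equation (4-22) from a set of residuals and compares it to tabulated bounds; the residuals and the critical values 𝑑𝐿,𝛼 and 𝑑𝑈,𝛼 used here are hypothetical placeholders for values that would be looked up in the tables of [15, 16].

# Illustrative sketch: Durbin-Watson statistic of equation (4-22) from OLS residuals.
# Residuals and critical bounds are hypothetical.
import numpy as np

def durbin_watson(residuals):
    e = np.asarray(residuals)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

residuals = np.array([0.12, 0.10, 0.07, 0.01, -0.03, -0.06, -0.08, -0.05, 0.02, 0.06])
d = durbin_watson(residuals)

d_L, d_U = 0.88, 1.32      # hypothetical tabulated bounds for this n, p and alpha
if d < d_L:
    verdict = "reject H0: significant positive serial correlation"
elif d > d_U:
    verdict = "accept H0: no significant serial correlation"
else:
    verdict = "inconclusive (here treated as part of the rejection zone)"
print(f"d = {d:.2f}: {verdict}")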
Once the significance of serial correlation has been ascertained, a procedure is started to
properly correct for the presence of correlation in the errors. Two different pathways are
available: the first constructs an adapted form of the OLS likelihood function and yields,
after maximization, an asymptotically efficient estimator of the kinetic parameters.
Alternatively, the choice may fall on an iterative scheme, performing a cyclic calculation
in which the regression parameters and the autocorrelation function are updated until
finally convergence is reached.
Given that equation (4-19) holds for the experimental error at every sampling point, the
covariance of the errors of the measurements separated by a time distance k is given by:
$$cov(\varepsilon(t), \varepsilon(t-k)) = \rho^{k} \qquad \text{(4-23)}$$
Therefore, the covariance matrix of the vector of experimental errors, all following
an 𝐴𝑅(1) model, is given by:
$$\boldsymbol{V} := \boldsymbol{V}(\boldsymbol{\varepsilon}) = \frac{\sigma_u^2}{1-\rho^2}\begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\ \rho & 1 & \rho & \cdots & \rho^{n-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{n-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{n-1} & \rho^{n-2} & \rho^{n-3} & \cdots & 1 \end{bmatrix} \qquad \text{(4-24)}$$
with 𝜀 = [𝜀(1)… 𝜀(𝑛)]𝑇.
The joint probability density of the vector of experimental errors 𝜀, still assumed to
follow a multidimensional normal distribution, is then given by:
$$p(\boldsymbol{\varepsilon}|\boldsymbol{\beta},\rho) = \frac{1}{\sqrt{(2\pi)^{n}|\boldsymbol{V}|}}\,\exp\{-S(\boldsymbol{\beta},\rho)\} \qquad \text{(4-25)}$$
with the extended sum of squares 𝑆(𝜷, 𝜌) given by:
$$S(\boldsymbol{\beta},\rho) = \boldsymbol{\varepsilon}^{T}\boldsymbol{V}^{-1}\boldsymbol{\varepsilon} = [\boldsymbol{y}-\boldsymbol{f}(\boldsymbol{x},\boldsymbol{\beta})]^{T}\boldsymbol{V}^{-1}[\boldsymbol{y}-\boldsymbol{f}(\boldsymbol{x},\boldsymbol{\beta})] = \frac{1}{\sigma_u^2}\left[(1-\rho^2)\,[y_1 - f(\boldsymbol{x}_1,\boldsymbol{\beta})]^2 + \sum_{i=2}^{n}\left[y_i - f(\boldsymbol{x}_i,\boldsymbol{\beta}) - \rho\,\big(y_{i-1} - f(\boldsymbol{x}_{i-1},\boldsymbol{\beta})\big)\right]^2\right] \qquad \text{(4-26)}$$
the latter expression resulting after the introduction of (4-24) and elaborating. As in
ordinary least squares, maximization of the joint probability density yields the maximum
likelihood estimates of the kinetic parameters 𝜷, the autocorrelation function 𝜌 and the
white noise variance 𝜎𝑢2 simultaneously. Under stringent restrictions, the consistency,
asymptotic normality and independence of the regression parameters are established.
As an alternative to the maximization of the joint probability density, an iterative procedure can be adopted (a minimal sketch is given below). A possible scheme starts with a regression of the nonlinear model by ordinary least squares. The set of residuals 𝑒 obtained this way serves as the basis to determine an approximation of the autocorrelation 𝜌:

$$\hat{\rho} = \frac{\sum_{i=2}^{n} e(i)\,e(i-1)}{\sum_{i=1}^{n}[e(i)]^2} \qquad \text{(4-27)}$$
being the value of 𝜌 that minimizes equation (4-26) for the set of kinetic parameters
obtained by OLS. Given this value, the regression parameters are updated by finding the
value that minimizes the approximated sum of squares:
$$\hat{S}(\boldsymbol{\beta}) = [\boldsymbol{y}-\boldsymbol{f}(\boldsymbol{x},\boldsymbol{\beta})]^{T}\hat{\boldsymbol{V}}^{-1}[\boldsymbol{y}-\boldsymbol{f}(\boldsymbol{x},\boldsymbol{\beta})] = \frac{1}{\sigma_u^2}\left[(1-\hat{\rho}^2)\,[y_1 - f(\boldsymbol{x}_1,\boldsymbol{\beta})]^2 + \sum_{i=2}^{n}\left[y_i - f(\boldsymbol{x}_i,\boldsymbol{\beta}) - \hat{\rho}\,\big(y_{i-1} - f(\boldsymbol{x}_{i-1},\boldsymbol{\beta})\big)\right]^2\right] \qquad \text{(4-28)}$$
This cycle is repeated until convergence. Under certain regularity conditions, the thus obtained set of kinetic parameters shares the same asymptotic distribution as the results of the direct maximization of the likelihood function, and the covariance matrix of the estimated model parameters follows accordingly.
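One possible implementation of this iterative scheme is sketched below, assuming a single response measured as one time series with an 𝐴𝑅(1) error structure; the model function and the synthetically generated data are purely illustrative.

# Illustrative sketch of the iterative AR(1) correction (equations 4-27 and 4-28):
# alternate between estimating rho from the residuals and re-estimating beta from
# the 'whitened' sum of squares. Model and data are hypothetical.
import numpy as np
from scipy.optimize import least_squares

def model(t, beta):
    return beta[0] * (1.0 - np.exp(-beta[1] * t))    # hypothetical batch response

t = np.linspace(0.5, 10.0, 20)
rng = np.random.default_rng(0)
beta_true = np.array([1.0, 0.4])
noise = np.zeros(t.size)                              # synthetic AR(1) noise, for
for i in range(1, t.size):                            # self-containedness only
    noise[i] = 0.6 * noise[i - 1] + rng.normal(scale=0.02)
y = model(t, beta_true) + noise

# start from an ordinary least squares estimate
beta_hat = least_squares(lambda b: y - model(t, b), x0=[0.8, 0.3]).x

for _ in range(10):                                   # iterate until convergence
    e = y - model(t, beta_hat)
    rho = np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)     # equation (4-27)

    def whitened_residuals(b):
        r = y - model(t, b)
        first = np.sqrt(1.0 - rho ** 2) * r[0]        # (1 - rho^2) * r_1^2 term
        rest = r[1:] - rho * r[:-1]                   # (r_i - rho * r_{i-1}) terms
        return np.concatenate(([first], rest))        # SSQ equals equation (4-28)

    beta_hat = least_squares(whitened_residuals, x0=beta_hat).x

print(beta_hat, rho)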
4.2.2 Second-order statistical regression
An alternative approach to deal with the cross-correlation between the experimental errors in time series data has recently been developed by Roelant et al. [17]. Their statistical technique, called Second-Order Statistical Regression, is mathematically well-founded and enables a more or less explicit estimation of the error variance matrix based on replicate experiments. The most relevant aspects of this methodology are briefly discussed below.
Let the matrix 𝒚 ∈ ℝ𝑛×𝑣 represent all observations for 𝑛 sets of experimental conditions
with 𝑣 measured responses, so that:
$$\boldsymbol{y} = \begin{bmatrix} y_{11} & \cdots & y_{1v} \\ \vdots & \ddots & \vdots \\ y_{n1} & \cdots & y_{nv} \end{bmatrix} \qquad \text{(4-29)}$$
where each element 𝑦𝑖𝑗, 𝑖 = 1. . 𝑛, 𝑗 = 1. . 𝑣 contains all data points obtained during time-
series measurements at 𝑛𝑡 subsequent moments:
$$y_{ij} = \begin{bmatrix} y_{ij}(t_1) \\ \vdots \\ y_{ij}(t_{n_t}) \end{bmatrix} \qquad \text{(4-30)}$$
If 𝑛𝑟𝑖 replicate experiments are available for each set of experimental conditions:
$$y_{ij}:\; y_{ij}^{(1)},\, y_{ij}^{(2)},\, \ldots,\, y_{ij}^{(n_{r_i})} \qquad \text{(4-31)}$$
the approximate 𝑛𝑡 × 𝑛𝑟𝑖 error matrix 𝐸𝑖𝑗 of the time series is defined as:
$$E_{ij} = \begin{bmatrix} \vdots & & \vdots \\ y_{ij}^{(1)} - \bar{y}_{ij} & \cdots & y_{ij}^{(n_{r_i})} - \bar{y}_{ij} \\ \vdots & & \vdots \end{bmatrix} \qquad \text{(4-32)}$$
with $\bar{y}_{ij}$ the average observation for that particular time series. The variance matrix of the actual experimental error of the time series $V(\varepsilon_{ij})$ is then approximately given by the sample variance matrix:
$$\hat{V}(\varepsilon_{ij}) = \frac{1}{n_{r_i}-1}\,E_{ij}E_{ij}^{T} \qquad \text{(4-33)}$$
Since this matrix is symmetric and positive definite, there exists an eigenvalue
decomposition:
$$\hat{V}(\varepsilon_{ij}) = \hat{U}_{ij}\cdot\hat{\Lambda}_{ij}\cdot\hat{U}_{ij}^{T} \qquad \text{(4-34)}$$
where $\hat{\Lambda}_{ij}$ is the $n_t \times n_t$ diagonal matrix with the positive eigenvalues of $\hat{V}(\varepsilon_{ij})$ as diagonal elements, ordered from high to low. The $n_t \times n_t$ matrix $\hat{U}_{ij}$ contains the associated eigenvectors as its columns. When transforming the experimental data sets according to:

$$y_{ij}' = \hat{U}_{ij}^{T}\cdot y_{ij} \qquad \text{(4-35)}$$
the components of the associated transformed error vector $\varepsilon_{ij}'$ are given by:

$$\varepsilon_{ij}' = \hat{U}_{ij}^{T}\cdot \varepsilon_{ij} \qquad \text{(4-36)}$$

and are mutually uncorrelated, since the corresponding variance matrix $\hat{V}(\varepsilon_{ij}')$ reads:

$$\hat{V}(\varepsilon_{ij}') = \hat{\Lambda}_{ij} \qquad \text{(4-37)}$$
Moreover, after performing an additional scaling operation on the transformed data:

$$y_{ij}'' = \hat{\Lambda}_{ij}^{-1/2}\cdot\hat{U}_{ij}^{T}\cdot y_{ij} \qquad \text{(4-38)}$$

the resulting error vector:

$$\varepsilon_{ij}'' = \hat{\Lambda}_{ij}^{-1/2}\cdot\hat{U}_{ij}^{T}\cdot \varepsilon_{ij} \qquad \text{(4-39)}$$

becomes virtually homoscedastic, as the corresponding variance matrix reads:

$$\hat{V}(\varepsilon_{ij}'') = I_{n_{r_i}-1} \qquad \text{(4-40)}$$
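For illustration, the fragment below constructs the error matrix of equation (4-32), the sample variance matrix of equation (4-33) and the scaled, approximately decorrelated responses of equation (4-38) for a single hypothetical time series with three replicates; only the eigenvectors associated with the non-zero eigenvalues are retained, reflecting the dimension reduction mentioned below.

# Illustrative sketch of the decorrelating transformation (equations 4-32 to 4-38)
# for a single time series measured in n_r replicate experiments. Data are hypothetical.
import numpy as np

# rows: n_t sampling times, columns: n_r replicate runs
y_reps = np.array([[0.10, 0.12, 0.11],
                   [0.35, 0.38, 0.33],
                   [0.55, 0.60, 0.52],
                   [0.71, 0.75, 0.69],
                   [0.80, 0.86, 0.78]])
n_t, n_r = y_reps.shape

y_mean = y_reps.mean(axis=1, keepdims=True)
E = y_reps - y_mean                          # error matrix, equation (4-32)
V_hat = E @ E.T / (n_r - 1)                  # sample variance matrix, eq. (4-33)

eigval, eigvec = np.linalg.eigh(V_hat)       # eigendecomposition, eq. (4-34)
order = np.argsort(eigval)[::-1]             # order eigenvalues from high to low
keep = order[: n_r - 1]                      # keep the n_r - 1 non-zero eigenvalues
lam, U = eigval[keep], eigvec[:, keep]

# transformed and scaled responses, equations (4-35) and (4-38)
y_prime = U.T @ y_reps
y_double_prime = np.diag(lam ** -0.5) @ U.T @ y_reps
print(y_double_prime)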
The pitfall of the analysis lies in the approximate character of the variance matrix
obtained above. Hence, the above equation does not hold unambiguously for the actual,
unknown error variance matrix, and reservation has to be made when it is concluded
from the analysis above that:
$$V(\varepsilon_{ij}'') \approx I_{n_{r_i}-1} \qquad \text{(4-41)}$$
The homoscedasticity of the errors associated with different measurements in the same
time-series and their mutual independence is therefore only approximately valid when
regressing the data points modified by the suggested transformations. Moreover, due to the dimension reduction during the transformation, some statistical information is inevitably lost, which results in wider individual confidence intervals for the model parameters.
4.3 Bayesian statistical assessment
4.3.1 A Bayesian view on parameter estimation
Classical approaches to determining the unknown parameters of an – often nonlinear –
model rely fundamentally on the idea that each of these variables has a fixed value,
waiting for elucidation by the experimenter or regressor. To obtain them, typically
experimental results are collected for a wide range of process conditions, which quantify
the impact of different input parameters on the process under study. Based on that
limited data set, the proposed model curve, stating an underlying mechanism derived
from theoretical principles, is fitted by varying the unknown model parameters that have
to be determined. The set of parameter values which minimizes the objective function of
the regression is then considered as the most likely estimate for the unknown model
parameters. The predicted responses calculated based on the resulting model statement
will differ from the experimentally obtained values, which allows for the determination
of confidence intervals around the point estimates, where the model parameters will be
located with a certain probability. Calculation of these regions requires the explicit allocation of a specific probability distribution to the model parameters and, subsequently, a linearization of the model function in the neighbourhood of the point estimates. Logically, the broader these intervals, the lower the quality of the regressed parameters will be. Predictions of the future outcome for different process conditions are then based on these point estimates.
This procedure is the outcome of the frequentist view on parameter estimation, treating the actual, unknown model parameters as fixed values, derivable from experimental data as long as a sufficient number of observations is collected [18]. Although widely applied,
the technique is vulnerable to some fundamental criticism on the validity of its statistical
framework. As an alternative, the Bayesian approach to parameter estimation reasons
from an entirely different starting point [13, 19]. In contrast to the frequentist maximum
likelihood estimation, it considers the exact determination of the unknown model
parameters as impossible when only a limited set of experimental data is available.
Indeed, since information on the full response behaviour of the model, i.e. over the entire
range of possible process conditions, is incomplete, only approximate conclusions are
possible from any statistical analysis. Model parameters are therefore better treated like
statistical variables, characterized by a probability distribution, rather than by exact
values. The actual inference obtained from a Bayesian estimation procedure is hence a
confidence interval comprising a certain, user-defined probability density, rather than a
point estimate. Since these intervals are determined based on a sampling routine, as will
be discussed below, rather than a local linearization of the model function, the need for
forcing the unknown model parameters into a multivariate normal distribution is
avoided.
The interpretation of the model parameter vector as a statistical variable allows for
assessing its estimation by means of classical principles of probabilistic calculus. As will
follow from the more detailed elaborations below, this approach enables an elegant, and
therefore attractive, solution to cope with the unknown correlational structure of the
experimental error. Hence, Bayesian estimation does not require the stringent restrictions
on the regularity of the error like classical regression schemes. Additionally, Bayesian
estimation enables a more explicit capture of the available knowledge and insights about
the studied phenomenon before any experiment has been conducted. The experimenter
does have some so-called prior information about the range in parameter space where the
final parameter values are most probably located, based on personal research experience
or available literature. For example, both the pre-exponential factors and the activation
energy present in rate coefficients appearing in chemical reaction networks, require a
positive value to be physically meaningful. Moreover, the order of magnitude of these
parameters is typically readily obtained from literature, by comparison of the studied
case to similar studies performed in the past. This way, the candidate regions in parameter space in which the pursued set of kinetic parameter estimates is located are drastically reduced. This particular feature of Bayesian estimation strongly differs from regression analysis, as the only way to include prior experience in this
classical approach is by choosing a suitable initial guess for the local minimization
routine, based on for example the Levenberg-Marquardt algorithm. It goes without
saying that this rather indirect way of transferring readily available insights about the
model into the estimation procedure is far less efficient than the Bayesian approach.
Nevertheless, it can be argued that the incorporation of prior information entails the risk of adding wrong insights to the estimation procedure as well. Hence, though often presented as a beneficial feature, it will potentially hinder the statistical analysis
instead of improving it. Unfortunately, at the moment of writing no benchmark analysis
on the impact of bad prior information on the performance of Bayesian methods was
available. Hence, a cautious approach, like those discussed below, has to be followed.
4.3.2 Bayesian parameter estimation
Named after the pioneering investigator of conditional statistics, the Bayesian approach
towards the estimation of parameters in modelling is based on the well-known statement
of conditional probability, which reads:
𝑝(𝐴|𝐵)𝑝(𝐵) = 𝑝(𝐵|𝐴)𝑝(𝐴) (4-42)
where 𝑝(𝐴|𝐵) denotes the probability that an event A is observed when it is given that
event B happens, or, stated differently, the chance of event A occurring conditional on
event B. The reformulation of the estimation of unknown model parameters from a
limited number of experimental data into such a conditional framework follows quite
straightforwardly when approaching it from a Bayesian point of view. Indeed, since a
Bayesian philosophy accounts explicitly for the intrinsic uncertainty on the final findings
about the model parameters obtained from the estimation procedure, the relation of these
estimated confidence intervals and the experimental data set they are being obtained
from is very strong. Speaking in conditional terms, the results of the estimation
procedure are hence conditional to the choice of that particular set of observations that
was fed to the routine. When adopting this way of reasoning, the following statement
expressing the mutual dependency of the experimental data 𝒚 (an 𝑛 ×𝑚 matrix in
general, 𝑛 being the number of experiments performed, 𝑚 denoting the number of
outputs measured for each experiment) and the set of unknown model parameters 𝜷
holds:
$$p(\boldsymbol{\beta}|\boldsymbol{y}) = \frac{p(\boldsymbol{y}|\boldsymbol{\beta})\,p(\boldsymbol{\beta})}{p(\boldsymbol{y})} \qquad \text{(4-43)}$$
for a non-linear model given in general terms as:
𝑦𝑖𝑗 = 𝑓𝑗(𝒙𝒊, 𝜷) (4-44)
which describes the j’th output for the i’th performed experiment, for experimental
conditions given by the vector 𝒙𝑖.
The prior probability density function 𝑝(𝜷) reflects the knowledge or intuition of the
researcher about the yet unknown parameters before any experiment has been
performed. Therefore, this statistical function does not depend on the experimental
output 𝑦. Prior information is expressed in different ways, depending on the level of the
insights present at the moment of the analysis. In the limiting case in which no
information is available, the choice falls on an unprejudiced prior function which assigns
an equal likelihood to every point of the parameter space, or, in particular, of those
regions of parameter space that have not been discarded based on prior beliefs.
Analytical expressions for these so-called non-informative priors are available based on the
theory developed by Jeffreys [20]. Those priors that are suggested for implementation in
a regression strategy for physicochemical modelling will be briefly discussed in what
follows. Likewise, any prior density function that does favour a certain region in
parameter space compared to others is being referred to as informative. It goes without
saying that these latter functions are of main interest for application when having an
advanced insight in the physicochemical phenomenon under study. It is worth
mentioning that, although prior functions are named density functions, it is not strictly
required that these functions are normalized. As long as the combination of prior
distribution and likelihood function yields an integrable and normalized posterior
density, the prior is free to take any reasonable form. Prior functions that do not obey the
criterion of normalization, e.g., a uniform density function for all positive parameter
values, are called improper. Likewise, priors that do integrate up to a finite value in
parameter space are called proper.
The factor 𝑝(𝒚|𝜷) has already been touched upon briefly in the above introduction of the prior density function. It represents the probability of obtaining a certain experimental output set if the unknown model parameter values were nevertheless known. To resolve this paradoxical situation, this probability function is handled by means of Fisher's interpretation of the likelihood function 𝐿(𝜷|𝒚) in case the experimental output is known.
This analysis showed that the following equality holds:
𝐿(𝜷|𝒚) ≔ 𝑝(𝒚|𝜷) (4-45)
which tackles the issue of the problematic interpretation of this factor. In contrast to the
original probability density, expressions for likelihood functions are available and their
characteristics are well understood. A more refined discussion of the likelihood functions
in case of physicochemical modelling is given below.
The final probability density that demands elucidation is the factor in the denominator of the right-hand side of equation (4-43). As was mentioned above, this probability function depends on the experimental output alone, not on the unknown model parameters. An explicit expression for this function is given by:

$$p(\boldsymbol{y}) = \int_{\boldsymbol{\beta}} p(\boldsymbol{y}|\boldsymbol{\beta})\,p(\boldsymbol{\beta})\,d\boldsymbol{\beta} \qquad \text{(4-46)}$$
Although this equation follows directly from the normalization criterion on the posterior
density function, the practical calculation of this integral is a non-trivial task when
studying non-linear models. Indeed, while it is demonstrated theoretically that the
likelihood function of the parameters for linear models takes a nice multivariate normal
form, the likelihood for their non-linear counterparts can be highly irregular in turn, as
illustrated in Figure 4-3. Therefore, the calculation of a closed analytical solution for
expressions like (4-46) is strongly hindered.
Figure 4-3 Typical likelihood distribution for the parameters in a linear (left) and non-
linear model (right) with two parameters
The posterior density function 𝑝(𝜷|𝒚) captures all available information, improving the
original, prior beliefs about the model parameters with the inference provided by the
executed experiments. Logically, any statistical analysis about the model parameters will
be based on the behavior and characteristics of this probability distribution. Given the
difficulty of calculating the integral in equation (4-46), the original expression for the
posterior density function is often shortened to:
𝑝(𝜷|𝒚) ∝ 𝐿(𝜷|𝒚)𝑝(𝜷) (4-47)
i.e., due to the lack of knowledge about the exact value of 𝑝(𝒚), the equality is replaced by a proportionality statement. Indeed, any study and analysis based on this new expression will yield the same qualitative information as the original statement, except for a constant scaling factor. This feature is commonly used in Bayesian routines as implemented in statistical software, e.g., Athena Visual Studio. Unfortunately, as will be
extensively discussed below, the Bayesian analysis in these simulation packages will
often stop at this point due to an inability to calculate the posterior density function
efficiently. To cope with this issue, the routines will assume explicitly that the unknown
posterior density function takes a predetermined, easily manipulable form. It is believed
that any statistical findings for the model parameters, e.g. on their confidence intervals,
which are based on that associated expression, will resemble the true statistical inference
from the original posterior closely.
Because it is intuitively clear that the correctness of the above strategy is doubtful, and
because some computational methods have been developed that get the calculation of
equation (4-46) within reach, at least approximately, a different approach will be
described in what follows. Combining the best of both worlds, the method will be based
on the strong theoretical framework which underlies the Bayesian routines in Athena
Visual Studio, followed by a quantitative assessment of the posterior density function by
means of Markov Chain Monte Carlo (MCMC) sampling schemes. These techniques,
with the Metropolis-Hastings and Gibbs algorithms as the most prominent exponents,
have been frequently suggested in literature as an elegant and computationally efficient way
to approximate the posterior density function by evaluating deliberately taken samples
from parameter space [21, 22]. A more extended discussion of sampling methods will be
given in Section 4.3.4.2.
4.3.3 Posterior density distribution for relevant scenarios in
kinetic parameter estimation
The determination of an analytical expression for the likelihood that a particular set of
values equals the unknown model parameters based on a given set of experimental data,
is completely similar to the analysis underlying the classical maximum likelihood
approach. Assuming an additive error model, the link between the actually observed
response and the model prediction is given by:
𝑦𝑖𝑗 = 𝑓𝑗(𝒙𝒊, 𝜷) + 𝜀𝑖𝑗 (4-48)
for the 𝑗’th response and the 𝑖’th experiment, where 𝜀𝑖𝑗 gives the corresponding
experimental error. The error matrix 𝜺 = {𝜀𝑖𝑗}𝑖=1..𝑛𝑗=1..𝑚
is assumed to obey a multivariate
normal distribution, with expected value 𝟎𝑛×𝑚 and an a priori unknown covariance
matrix 𝑽. By allowing for a multi-response character of the model to keep the scope of the
discussion as wide as possible, this covariance matrix is 4-dimensional in general, with
𝑽 = {𝑉𝑖𝑗𝑘𝑙} = {𝐸(𝜀𝑖𝑗𝜀𝑘𝑙)}𝑖,𝑘=1..𝑛𝑗,𝑙=1..𝑚
∈ ℝ𝑛×𝑚×𝑛×𝑚, which strongly complicates the
mathematical framework of the statistical analysis. Several authors have been working
on the solution of this issue and proposed additional assumptions on the error
covariance structure to obtain closed and practically useful analytical expressions [23-25].
Two of those approaches have been reported to be useful in practice, and will therefore
be discussed below. The second introduces an additional level of complexity compared to the first, allowing for a more general application at the expense of a tougher implementation.
A first step in the reduction of the high-dimensionality of the error covariance matrix 𝑽
was suggested by Box and Draper (1965) and further elaborated by Stewart et al. (1981)
[23, 26]. Therein, it is assumed that the covariance matrix of the experimental errors
between the different responses of one particular experiment is equal for all experiments,
up to a scaling factor. In short:
$$\boldsymbol{\Sigma}_i = \{E(\varepsilon_{ij}\varepsilon_{il})\}_{j,l=1..m} = \frac{1}{w_i}\,\boldsymbol{\Sigma}, \qquad i = 1..n \qquad \text{(4-49)}$$
where 𝚺𝑖 = 𝑽𝑖,:,𝑖,:. 𝑤𝑖 is the weighing factor corresponding to the 𝑖’th experiment and has
to be specified by the user explicitly. All correlations between observations from different
experiments are assumed to be 0. To keep the scope of the reasoning as wide as possible,
𝜮 will be handled as a full, completely unknown 𝑚 × 𝑚 matrix. In this case, the
likelihood function of the model parameters associated with the 𝑖’th experiment is given
by:
$$p(\boldsymbol{y}_i|\boldsymbol{\beta},\boldsymbol{\Sigma}) = \frac{1}{\sqrt{(2\pi)^{m}|\boldsymbol{\Sigma}|}}\exp\left\{-\frac{1}{2}\sum_{j,l=1}^{m} w_i\,\sigma^{jl}\,[y_{ij}-f_j(\boldsymbol{x}_i,\boldsymbol{\beta})]\,[y_{il}-f_l(\boldsymbol{x}_i,\boldsymbol{\beta})]\right\} \qquad \text{(4-50)}$$
where $\boldsymbol{\Sigma}^{-1} = \{\sigma^{jl}\}_{j,l=1..m}$. Since the reduced covariance matrix is as yet unknown, it is
explicitly taken as an argument of the function, besides the actual model parameters.
Following the assumption of non-correlation between the different experiments, the
following statement for the likelihood function for the complete set of observations
holds:
$$L(\boldsymbol{\beta},\boldsymbol{\Sigma}|\boldsymbol{y}) := p(\boldsymbol{y}|\boldsymbol{\beta},\boldsymbol{\Sigma}) = \prod_{i=1}^{n} p(\boldsymbol{y}_i|\boldsymbol{\beta},\boldsymbol{\Sigma}) = \frac{1}{\sqrt{(2\pi)^{nm}|\boldsymbol{\Sigma}|^{n}}}\exp\left\{-\frac{1}{2}\sum_{i=1}^{n}\sum_{j,l=1}^{m} w_i\,\sigma^{jl}\,[y_{ij}-f_j(\boldsymbol{x}_i,\boldsymbol{\beta})]\,[y_{il}-f_l(\boldsymbol{x}_i,\boldsymbol{\beta})]\right\} \qquad \text{(4-51)}$$
Upon introduction of the auxiliary matrix 𝒗(𝜷), containing the weighted sums of residual cross-products of the different responses and given by:

$$\boldsymbol{v}(\boldsymbol{\beta}) = \left\{\sum_{i=1}^{n} w_i\,[y_{ij}-f_j(\boldsymbol{x}_i,\boldsymbol{\beta})]\,[y_{il}-f_l(\boldsymbol{x}_i,\boldsymbol{\beta})]\right\}_{j,l=1..m} \qquad \text{(4-52)}$$
the expression of the likelihood function is significantly simplified:
$$L(\boldsymbol{\beta},\boldsymbol{\Sigma}|\boldsymbol{y}) = \left((2\pi)^{nm/2}\,|\boldsymbol{\Sigma}|^{n/2}\right)^{-1}\exp\left\{-\frac{1}{2}\sum_{j=1}^{m}\left[\boldsymbol{v}(\boldsymbol{\beta})\,\boldsymbol{\Sigma}^{-1}\right]_{jj}\right\} \propto |\boldsymbol{\Sigma}|^{-n/2}\exp\left\{-\frac{1}{2}\sum_{j=1}^{m}\left[\boldsymbol{v}(\boldsymbol{\beta})\,\boldsymbol{\Sigma}^{-1}\right]_{jj}\right\} \qquad \text{(4-53)}$$
which is a function of both the set of model parameters to be estimated and the unknown
covariance matrix of the experimental errors.
To complete the determination of the unknown parameters in a Bayesian framework, an
analytical expression for the prior function has to be identified. Although the exact form
of the covariance matrix 𝚺 of the experimental error is not known, its explicit appearance
in the expression for the likelihood function requires that a certain value has to be
provided for it, to allow for any inference on the model parameters 𝜷. To avoid this need
for an a priori, user-specified and hence potentially inaccurate covariance matrix, a good
practice is to treat 𝚺 as an additional variable to be determined and capture it in the
estimation procedure. Since both the model parameters and the covariance matrix of the
experimental errors are unknown in advance of the experimental program, an analytical
expression has to be found for the joint prior function 𝑝(𝜷, 𝜮). This search is significantly
facilitated by the assumption that 𝜷 and 𝜮 are not correlated, which allows for the joint
prior density function to be factorized:
𝑝(𝜷, 𝜮) = 𝑝(𝜷)𝑝(𝜮) (4-54)
so that the focus is now on finding the separate priors of both unknowns. This assumption is intuitively convincing, as the vector of model parameters and the covariance matrix of the experimental error, a measure for the quality of the performed experiments, represent in fact two different things, which makes it reasonable to assume, at least preliminarily, their mutual independence.
Since the prior insights on the error covariance matrix are typically scarce, the choice for
a non-informative prior function is seen as a cautious yet useful option. An unprejudiced
prior density function based on the method of Jeffreys was obtained for the covariance
matrix of the experimental errors as:
$$p(\boldsymbol{\Sigma}) \propto |\boldsymbol{\Sigma}|^{-(m+1)/2} \qquad \text{(4-55)}$$
where 𝑚 still denotes the number of responses [20].
The definition of a similar non-informative prior for the vector of model parameters is
not as straightforward, since its strong dependence on the specific structure of the model
hinders the stipulation of a generally applicable expression. Up to the moment of writing,
complete theoretical analyses were only found for the choice of a uniform prior in the
allowed range for the model parameters:
$$p(\boldsymbol{\beta}) = \begin{cases} 1, & \boldsymbol{\beta}_{min} \leq \boldsymbol{\beta} \leq \boldsymbol{\beta}_{max} \\ 0, & \text{elsewhere} \end{cases} \qquad \text{(4-56)}$$
It goes without saying that the available knowledge and insights about model
parameters is often appreciably higher than just having a notion about the permitted
ranges in parameter space. Information available from literature or obtained from
techniques like preliminary isothermal regression for chemical kinetics allows for a more
explicit expression of the prior beliefs on the model parameters, e.g. in the form of a
distribution. At the moment of writing, full theoretical analyses are however only
reported for prior functions like (4-56), and most, if not all, available computational
routines for Bayesian parameter estimation like, e.g. Athena Visual Studio, are making
the same assumption. A distinct and well-documented comparison of the performance of
different prior functions and their impact on the quality of the final inference on the
model parameters is not available.
Updating of equation (4-47) for the posterior density function with the results from (4-53)
and (4-55) yields:
$$p(\boldsymbol{\beta},\boldsymbol{\Sigma}|\boldsymbol{y}) \propto \begin{cases} |\boldsymbol{\Sigma}|^{-(m+n+1)/2}\exp\left\{-\dfrac{1}{2}\sum_{j=1}^{m}\left[\boldsymbol{v}(\boldsymbol{\beta})\,\boldsymbol{\Sigma}^{-1}\right]_{jj}\right\}, & \boldsymbol{\beta}_{min} \leq \boldsymbol{\beta} \leq \boldsymbol{\beta}_{max} \\ 0, & \text{elsewhere} \end{cases} \qquad \text{(4-57)}$$
giving the fullest possible information on both the model parameters and the error
covariance matrix, appearing both as arguments of this joint probability distribution.
Removal of the covariance matrix from this expression to obtain the marginal posterior distribution of the model parameters alone is achieved by replacing 𝜮 by its most probable value $\tilde{\boldsymbol{\Sigma}}(\boldsymbol{\beta})$ for each value of 𝜷 [27]. The modified posterior density function then obeys:

$$\tilde{p}(\boldsymbol{\beta}|\boldsymbol{y}) := p(\boldsymbol{\beta}, \tilde{\boldsymbol{\Sigma}}(\boldsymbol{\beta})|\boldsymbol{y}) \propto |\boldsymbol{v}(\boldsymbol{\beta})|^{-(m+n+1)/2} \qquad \text{(4-58)}$$
on the condition that 𝒗(𝜷) is non-singular, still within the permitted range for 𝜷. When minimizing |𝒗(𝜷)|, the resulting modal point $\hat{\boldsymbol{\beta}}$ serves as the Bayesian alternative to the point estimates obtained from non-linear regression analysis. Due to the replacement of the unknown covariance matrix of the error by an optimal 'representative', the user no longer has to make any prior assumptions on the correlational structure of the experimental errors. For example, in case of inter-response heteroscedasticity, an optimization of the covariance matrix, in that situation an 𝑚 × 𝑚 diagonal matrix with non-constant elements, will by itself result in a proper weighing of the errors on the responses, instead of requiring user-specified values as an input.
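As an illustration of this determinant criterion, the following sketch minimizes log|𝒗(𝜷)| for a hypothetical two-response model with unit weights; the model functions and the data are invented.

# Illustrative sketch: determinant criterion, minimizing |v(beta)| of
# equations (4-52) and (4-58) for a hypothetical two-response model.
import numpy as np
from scipy.optimize import minimize

def model(x, beta):
    # hypothetical responses, e.g. reactant conversion and product yield
    f1 = 1.0 - np.exp(-beta[0] * x)
    f2 = beta[1] * (1.0 - np.exp(-beta[0] * x))
    return np.column_stack([f1, f2])

x = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
y = np.array([[0.22, 0.11],
              [0.40, 0.19],
              [0.63, 0.33],
              [0.87, 0.42],
              [0.98, 0.51]])
w = np.ones(x.size)                       # user-specified weights of equation (4-49)

def log_det_v(beta):
    resid = y - model(x, beta)            # n x m residual matrix
    v = (w[:, None] * resid).T @ resid    # m x m matrix v(beta), equation (4-52)
    sign, logdet = np.linalg.slogdet(v)
    return logdet if sign > 0 else np.inf # minimize log|v(beta)| for stability

beta_hat = minimize(log_det_v, x0=[0.4, 0.5], method="Nelder-Mead").x
print(beta_hat)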
The assumption of a constant covariance matrix 𝜮 of errors for different responses,
independent of the experimental conditions, is not generally valid. The analysis of Box
and Draper (1972) acknowledged the issue of non-homogeneous variances associated
with different experiments and suggested an extension of the approach discussed above
[24]. This way, the former restriction on the error covariance structure posed by equation
(4-49) is abandoned and an inter-response covariance matrix is now considered for each
experiment separately, obeying:
$$\boldsymbol{\Sigma}_i = \{E(\varepsilon_{ij}\varepsilon_{il})\}_{j,l=1..m}, \qquad i = 1..n \qquad \text{(4-59)}$$
Still assuming zero correlation between the errors of the different experiments, the likelihood function for this situation reads:
$$L(\boldsymbol{\beta},\{\boldsymbol{\Sigma}_i\}|\boldsymbol{y}) = (2\pi)^{-nm/2}\prod_{i=1}^{n}\left[|\boldsymbol{\Sigma}_i|^{-1/2}\exp\left\{-\frac{1}{2}\sum_{j,l=1}^{m}\sigma^{(i)}_{jl}\,v^{(i)}_{jl}\right\}\right] \qquad \text{(4-60)}$$
where $\boldsymbol{\Sigma}_i^{-1} = \{\sigma^{(i)}_{jl}\}_{j,l=1..m}$, and:

$$\boldsymbol{v}^{(i)}(\boldsymbol{\beta}) = \{v^{(i)}_{jl}\}_{j,l=1..m} = \left\{[y_{ij}-f_j(\boldsymbol{x}_i,\boldsymbol{\beta})]\,[y_{il}-f_l(\boldsymbol{x}_i,\boldsymbol{\beta})]\right\}_{j,l=1..m} \qquad \text{(4-61)}$$
Application of the Jeffreys invariant prior for the ensemble {𝚺𝑖} of the unknown
covariance matrices reads:
$$p(\{\boldsymbol{\Sigma}_i\}) \propto \prod_{i=1}^{n}|\boldsymbol{\Sigma}_i|^{-(m+1)/2} \qquad \text{(4-62)}$$
and assuming a uniform prior function for the model parameters as given in equation (4-
56), the analogue of equation (4-58) becomes:
$$\tilde{p}(\boldsymbol{\beta}|\boldsymbol{y}) := p(\boldsymbol{\beta}, \{\tilde{\boldsymbol{\Sigma}}_i(\boldsymbol{\beta})\}|\boldsymbol{y}) \propto \prod_{i=1}^{n}\left|\boldsymbol{v}^{(i)}(\boldsymbol{\beta})\right|^{-(m+2)/2} \qquad \text{(4-63)}$$
on the condition that 𝒗(𝑖)(𝜷) is non-singular, for 𝑖 = 1. . 𝑛, and valid in the relevant
regions in parameter space. This feature has a high potential for unbiased parameter
estimation. Replacing the unknown error covariance matrices by an optimal candidate
allows for a kind of automatic weighing of the responses for each experiment. As a
consequence, the need for the researcher to stipulate the weighing factors by himself is
completely bypassed. Unfortunately, although the less stringent assumptions in (4-59) allow for a more general assessment of the experimental error than (4-49), this approach does not seem to have been encoded in a practically useful routine yet.
It is important to keep in mind that point values for the model parameters, obtained from
maximizing expressions (4-58) and (4-63), are in fact irrelevant in a Bayesian framework.
The true Bayesian inference is in the confidence intervals from their posterior density
function. Nevertheless, statistical software packages like Athena Visual Studio do
calculate such modal values, and use them for the approximate determination of
confidence intervals as well, as will be discussed in Section 4.3.4.1. It goes without saying
that the need for the minimization of a highly non-linear function, with potentially numerous local minima, introduces a pitfall similar to that encountered in classical non-linear regression. Only if the initial guess is chosen in the neighbourhood of the global minimum of the objective function will the optimal solution of the estimation procedure be found. To overcome this issue, a new methodology, based on Monte
Carlo sampling techniques, will have to be applied. The details of these methods will be
described in Section 4.3.4.2.
4.3.4 Posterior inference on model parameters
4.3.4.1 Statistical assessment by local approximation
In the introductory discussion of this chapter, attention has been paid repeatedly to the
importance of the final confidence intervals resulting from the statistical analysis when it
comes to inference about the unknown model parameters instead of point estimates. Due
to the beforehand undeterminable and often capricious behaviour of the posterior
density function, depending on the precise structure of the model, the calculation of
intervals in parameter space comprising a desired probability density is a non-trivial
task. For an optimal accuracy of the interval calculations, which accounts for all
particularities in the course of the posterior density function, a sampling routine has to be
applied which allows for a thorough scanning of parameter space.
Nevertheless, Bayesian routines in modelling software, e.g., Athena Visual Studio, often
simplify these calculations by determining the probability intervals approximately. Therein, the objective function 𝑆(𝜷) = −2𝑙𝑛[𝑝(𝜷|𝒚)] is expanded to second order around the modal value $\hat{\boldsymbol{\beta}}$ of the posterior density as:

$$\tilde{S}(\boldsymbol{\beta}) = S(\hat{\boldsymbol{\beta}}) + (\boldsymbol{\beta}-\hat{\boldsymbol{\beta}})^{T}\,\hat{\boldsymbol{H}}_{\boldsymbol{\beta}\boldsymbol{\beta}}\,(\boldsymbol{\beta}-\hat{\boldsymbol{\beta}}) \qquad \text{(4-64)}$$
where $\hat{\boldsymbol{H}}_{\boldsymbol{\beta}\boldsymbol{\beta}} = \frac{1}{2}\left\{\frac{\partial^2 \tilde{S}}{\partial\beta_i\,\partial\beta_j}\right\}_{i,j=1..p}$. Reformulating the definition of the objective function back to the – now modified – posterior density function yields:

$$p(\boldsymbol{\beta}|\boldsymbol{y}) \propto \exp\left\{-\frac{1}{2}\,(\boldsymbol{\beta}-\hat{\boldsymbol{\beta}})^{T}\,\hat{\boldsymbol{H}}_{\boldsymbol{\beta}\boldsymbol{\beta}}\,(\boldsymbol{\beta}-\hat{\boldsymbol{\beta}})\right\} \qquad \text{(4-65)}$$
which resembles a multivariate normal distribution with expected value $\hat{\boldsymbol{\beta}}$ and covariance matrix $[\hat{\boldsymbol{H}}_{\boldsymbol{\beta}\boldsymbol{\beta}}]^{-1}$. Hence, the highest posterior $(1-\alpha)$ probability density interval for the estimated parameters reads:

$$\hat{\beta}_i - \hat{\sigma}_{\beta_i}\,\mathcal{N}\!\left(\tfrac{\alpha}{2}\right) \;\leq\; \beta_i \;\leq\; \hat{\beta}_i + \hat{\sigma}_{\beta_i}\,\mathcal{N}\!\left(\tfrac{\alpha}{2}\right) \qquad \text{(4-66)}$$

where $\hat{\sigma}_{\beta_i} = \sqrt{\left\{[\hat{\boldsymbol{H}}_{\boldsymbol{\beta}\boldsymbol{\beta}}]^{-1}\right\}_{ii}}$, $i = 1..p$, and the factor $\mathcal{N}(\tfrac{\alpha}{2})$ denotes the $1-\tfrac{\alpha}{2}$ percentile of the standard normal distribution. The highest probability confidence intervals following from these approximations are hence symmetrical around the point estimates, and this applies to each model parameter.
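A minimal numerical sketch of this local approximation is given below: the mode of a hypothetical two-parameter posterior is located, the matrix of equation (4-64) is obtained by finite differences, and the symmetric intervals of equation (4-66) are reported. All numbers are illustrative.

# Illustrative sketch: approximate (1 - alpha) probability intervals from a local
# quadratic expansion of S(beta) = -2 ln p(beta|y), cf. equations (4-64)-(4-66).
# The posterior used here is hypothetical.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_two_log_posterior(beta):
    # hypothetical, banana-shaped unnormalized posterior in two parameters
    return (beta[0] - 1.0) ** 2 / 0.04 + (beta[1] - beta[0] ** 2) ** 2 / 0.01

beta_hat = minimize(neg_two_log_posterior, x0=[0.8, 0.8], method="Nelder-Mead").x

def hessian(f, x, h=1e-4):
    # central finite-difference Hessian of f at x
    p = x.size
    H = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            e_i, e_j = np.eye(p)[i] * h, np.eye(p)[j] * h
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4.0 * h ** 2)
    return H

H_bb = 0.5 * hessian(neg_two_log_posterior, beta_hat)     # matrix of equation (4-64)
cov = np.linalg.inv(H_bb)
alpha = 0.05
z = norm.ppf(1.0 - alpha / 2.0)                            # the N(alpha/2) factor
for i, (b, s) in enumerate(zip(beta_hat, np.sqrt(np.diag(cov)))):
    print(f"beta_{i}: {b:.3f} +/- {z * s:.3f}")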
4.3.4.2 Sampling schemes for Bayesian estimation
Since this approach explicitly approximates the unknown posterior density distribution
and forces it into a multivariate normal frame, a lot of information about the true
distribution is lost. Therefore, to explore the posterior density function in its finest detail, a Monte Carlo sampling procedure has to be followed.
As has already been noted above, the core of the Bayesian approach to parameter
estimation is the posterior density function 𝑝(𝜷|𝒚), since it comprises all information
available from both prior beliefs in the model parameters and inference from the
experimental results. All statistical analyses, including the calculation of confidence
intervals and correlational structures, require calculations of the form:

$$E[f(\boldsymbol{\beta})] = \int f(\boldsymbol{\beta})\,p(\boldsymbol{\beta}|\boldsymbol{y})\,d\boldsymbol{\beta} = \frac{\int f(\boldsymbol{\beta})\,p(\boldsymbol{y}|\boldsymbol{\beta})\,p(\boldsymbol{\beta})\,d\boldsymbol{\beta}}{\int p(\boldsymbol{y}|\boldsymbol{\beta})\,p(\boldsymbol{\beta})\,d\boldsymbol{\beta}} \qquad \text{(4-67)}$$
where the function 𝑓(𝜷) depends on the characteristic of the posterior density to be assessed, e.g., 𝑓(𝜷) = 𝜷 for the expected value 𝐸(𝜷) or 𝑓(𝜷) = [𝜷 − 𝐸(𝜷)]² to determine the variances of the different parameters.
Moreover, the confidence intervals showing where the model parameters are located
with a certain, user-defined probability are of particular interest when quantifying the
quality of the estimation procedure. To obtain such intervals for each separate parameter,
the joint posterior density, describing the statistics of all parameters simultaneously, is
integrated over parameter space to get the marginal posterior distribution 𝑝(𝛽𝑖|𝒚)
obeying:
p(\beta_i|\boldsymbol{y}) = \int p(\boldsymbol{\beta}|\boldsymbol{y})\, d\beta_1 \ldots d\beta_{i-1}\, d\beta_{i+1} \ldots d\beta_p   (4-68)
The calculation of the integrals appearing in equations (4-67) and (4-68) is a non-trivial task; their problematic computation has long been an impediment to fully quantitative results and has hence blocked the breakthrough of Bayesian methods in the field of parameter estimation.
Indeed, the classical approach of approximating the posterior density function by placing a discrete grid over parameter space and calculating the posterior density at the nodes is associated with some important drawbacks. First, before the actual discretization is performed, the relevant zone over which the grid is spanned has to be specified. Consequently, all regions outside this zone escape from the analysis, and any particularities in the posterior’s behaviour there will not be unraveled. Hence, sufficient prior
knowledge about the location of the most likely values for the kinetic parameters is
required. Secondly, the lack of knowledge about the behavior of the posterior makes it
hard, if not impossible, to specify an optimal resolution and get a sufficiently detailed
idea about the posterior’s behavior. Connected to this resolution issue, the number of
required numerical operations varies exponentially with the number of model
parameters. Hence, even for a limited number of parameters, the computational load of
the discretization scheme will strongly mount up for a decreasing cell width.
Fortunately, the development of a class of powerful Markov Chain Monte Carlo (MCMC)
algorithms allowing for a computationally efficient approximation of any probability
distribution via a sampling procedure has broadened the scope of Bayesian approaches,
which has resulted in a growing interest in the application of Bayesian routines in kinetic
modelling [19, 28]. In contrast to classical discretization procedures, these sampling
methods scan parameter space automatically for the zones with considerable posterior
probability density and are intrinsically apt to locate and elucidate those regions where
this density is the highest.
All relevant methods for handling calculations similar to equation (4-67) rely on a Monte Carlo implementation. In an attempt to get around the explicit integration of
the unknown posterior probability function, the expected value for a function 𝑓 of the
model parameters is approximated by drawing a number of random samples {𝜷𝑖}
from 𝑝(𝜷|𝒚), and calculating:
E[f(\boldsymbol{\beta})] = \frac{\int f(\boldsymbol{\beta})\, p(\boldsymbol{y}|\boldsymbol{\beta})\, p(\boldsymbol{\beta})\, d\boldsymbol{\beta}}{\int p(\boldsymbol{y}|\boldsymbol{\beta})\, p(\boldsymbol{\beta})\, d\boldsymbol{\beta}} \cong \frac{1}{n}\sum_{i=1}^{n} f(\boldsymbol{\beta}_i)   (4-69)
Hence, the expected value of any function of the model parameters is estimated by the mean value over the different samples. According to the law of large numbers, the approximation improves for an increasing value of $n$, on the condition that the samples are drawn independently [29]. Unfortunately, it is in general not straightforward to draw independent samples of the model parameters from a potentially highly complex probability distribution.
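As a minimal illustration of equation (4-69), and assuming for a moment that independent posterior samples were available (mimicked below by ordinary normal draws, purely for illustration), any expectation reduces to a simple sample average:
% Minimal sketch of eq. (4-69): approximating E[f(beta)] by a sample average.
% The samples are assumed to come from the posterior; here they are mimicked
% by draws from a normal distribution purely for illustration.
n       = 1e5;
samples = normrnd(5, 0.2, n, 1);          % hypothetical posterior samples of one parameter
f       = @(b) (b - mean(samples)).^2;    % e.g. f(beta) = [beta - E(beta)]^2 for the variance
Ef      = mean(f(samples));               % Monte Carlo estimate of E[f(beta)]
disp(Ef)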
The construction of a Markov chain to control the sampling was found to be extremely
useful to overcome this barrier. Thereby, the sample 𝜷𝑖 to be drawn at step 𝑖 is taken
from a distribution 𝑃(𝜷𝑖|𝜷𝑖−1) which depends only on the foregoing sample 𝜷𝑖−1. Hence,
the probability at which the current sample is taken is only influenced by the very near
history of the chain, i.e., earlier samples do not come into play. Nevertheless, the value of
the sample 𝜷𝑖 will depend on the choice of the starting point 𝜷0, say according to a
distribution 𝑄𝑖(𝜷𝑖|𝜷0) which explicitly depends on the time step 𝑖. Indeed, as more
samples are drawn, the ‘distance’ between the starting point and the final sample will
increase, and it is intuitively clear that this will change their mutual relation.
Surprisingly, for increasing 𝑖, 𝑄𝑖(𝜷𝑖|𝜷0) tends to converge to a stationary distribution 𝜙(𝜷)
which does no longer depend on the starting point 𝜷0. The trick that forms the bridge to
a sampling technique for the posterior density function is to design the Markov chain in
such a way that its stationary distribution equals the posterior, hence 𝑄𝑖(𝜷𝑖|𝜷0) →
𝜙(𝜷) ≡ 𝑝(𝜷|𝒚). This way, the more samples are taken, the more their distribution
resembles the desired posterior density function. The number of samples 𝑚 needed for
stabilization of the probability function is called the burn-in time.
The construction of a Markov Chain with a stationary distribution which equals the
posterior distribution is typically carried out by the Metropolis-Hastings algorithm,
named after its discoverers [30, 31]. First, a starting value 𝜷0 is chosen to initialize the
procedure. Then, during the 𝑖’th cycle, a candidate parameter set 𝑩 is sampled randomly
from a proposal distribution 𝑞(𝜷|𝜷𝑖−1) which solely depends on the previous sample
value 𝜷𝑖−1. This proposal has to be specified by the researcher and may be a fixed distribution as well as a function that is updated based solely on the parameter values of the previous step, 𝜷𝑖−1. If the proposal distribution does not depend on the current point at all, the algorithm is called independent. When the proposal distribution is symmetrical, i.e. 𝑞(𝜷|𝜷𝑖−1) = 𝑞(𝜷𝑖−1|𝜷), the technique is called Metropolis sampling. This is the case when choosing a multivariate normal proposal.
Based on the candidate 𝑩, the acceptance ratio 𝐴 is determined as follows:
A(\boldsymbol{B}, \boldsymbol{\beta}_{i-1}) = \min\left(1,\; \frac{p(\boldsymbol{B}|\boldsymbol{y})}{p(\boldsymbol{\beta}_{i-1}|\boldsymbol{y})}\, \frac{q(\boldsymbol{\beta}_{i-1}|\boldsymbol{B})}{q(\boldsymbol{B}|\boldsymbol{\beta}_{i-1})}\right)   (4-70)
When it holds that
𝐴 ≥ 𝑈 (4-71)
where 𝑈 is a randomly generated number from [0,1], 𝑩 is accepted, and hence 𝜷𝑖 = 𝑩;
otherwise, 𝜷𝑖 = 𝜷𝑖−1. It is important to remark that the unknown normalization factor in the statement of the posterior cancels out, since only ratios of the density functions appear in equation (4-70). Hence, given that the exact form of the posterior only has to be known up to a scaling factor that is constant in the model parameters, this technique is extremely well suited for this particular problem.
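The following minimal Matlab sketch illustrates the algorithm for a one-dimensional, unnormalized target density with a symmetric (Metropolis) normal proposal; the target function and the settings are illustrative assumptions rather than the posteriors treated in this work.
% Minimal random-walk Metropolis sketch for an unnormalized 1-D target density.
% The target only needs to be known up to a constant factor, as discussed above.
target = @(b) exp(-0.5*(b-2).^2) + 0.5*exp(-0.5*((b+2)/0.7).^2);  % illustrative, unnormalized
nSamples = 20000;      % total number of cycles n
burnIn   = 2000;       % burn-in samples m to be discarded
sigmaP   = 1;          % standard deviation of the symmetric normal proposal
b = zeros(nSamples,1); % Markov chain, started at beta_0 = 0
for i = 2:nSamples
    cand = b(i-1) + sigmaP*randn;              % candidate B drawn from q(.|beta_{i-1})
    A = min(1, target(cand)/target(b(i-1)));   % acceptance ratio; q cancels (symmetric proposal)
    if rand <= A
        b(i) = cand;                           % accept the candidate
    else
        b(i) = b(i-1);                         % reject and keep the previous value
    end
end
b = b(burnIn+1:end);                           % discard the burn-in samples
hist(b, 50)                                    % sampled representation of the target density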
As was discussed above, the distribution of the samples will gradually resemble the
unknown posterior density function while running the routine for a certain,
predetermined number of cycles 𝑛, as depicted in Figure 4-4 for an illustrative example,
not specifically encountered during parameter estimation. All inference about statistical
characteristics of the posterior density function 𝑝(𝜷|𝒚) is now equally accessible from the
distribution of the Markov chain samples. It goes without saying that all samples taken during the burn-in time are useless and therefore have to be discarded from the statistical analysis.
Because the explicit calculation of the posterior density function throughout the relevant
ranges in parameter space allows for the elucidation of all particular irregularities in its
course, the conclusions drawn based on the sampled distribution will be more accurate
than those relying on normal approximations as mentioned in section 4.3.4.1.
Figure 4-4 Illustration of the convergence of a MCMC sampling routine towards
an unknown probability distribution (full line) [32]
The proposal distribution 𝑞(. |. ) appearing in the calculation of the acceptance ratio has
to be specified beforehand. Although the distribution of the Markov chain will converge
to 𝑝(𝜷|𝒚) regardless of its exact form, choosing the proposal function wisely, i.e. to show
an as high as possible resemblance with the unknown posterior distribution, will
strongly enhance the rate of convergence; moreover, to avoid unnecessarily complex calculations, it is advisable to take a distribution which allows for easy sampling and evaluation. Because large-sample analysis stipulates that the posterior
probability density approaches the multivariate normal distribution for an increasing
number of samples [21], the proposal distribution is often chosen as:
𝑞(𝜷|𝜷𝑖−1) = 𝓝(𝜷|𝜷𝑖−1, 𝜎2𝑰𝑝) (4-72)
i.e. a multivariate normal distribution centered at the foregoing sample 𝜷𝑖−1 with an
uncertainty expressed by 𝜎2. The user-defined value for this variance is crucial for the
quality of the MCMC sampler, as is clearly demonstrated in Figure 4-5. Proposals that are too narrow yield a high acceptance rate but only very small steps, so that the routine scans parameter space far too slowly in its search for regions with higher posterior probability. On the other hand, in case of an excessively wide proposal, the low acceptance rate of the candidate samples causes the routine to get stuck at certain values for many iterations, which equally undermines the convergence of the sampled distribution to the targeted posterior density; such behaviour is often referred to as bad mixing [33]. When perfectly tuned, convergence to the target distribution is achieved fast, resulting in a regularly zigzagging trace plot, shown in the lower half of the figure. This plot gives the evolution of the sampled value throughout the iterations and clearly shows the fast convergence of the chain combined with an intense scanning along the parameter axis.
Figure 4-5 MCMC sampling from a one-dimensional target distribution for different
variances of the normal proposal distribution: 0.05 (left), 1 (middle) and 100 (right)
Top: actual distribution (red line) and sample histogram (blocks)
Bottom: trace-plot of the sampled parameter value as function of the iteration number
4.3.5 Including insights by an informative prior function
Instead of choosing the default uniform prior which includes no information on the
model parameters, it is proposed to follow a procedure often used in-house for ordinary
least squares minimization of chemical kinetic models. Whereas the technique aims in
fact at obtaining suitable initial guesses of the kinetic parameters to start the
minimization routine, the results will be useful in a Bayesian approach as well by
allowing for the construction of an informative prior distribution.
Specifically applied to the estimation of kinetic parameters for chemical reactions, the
technique consists of a grouping of the experimental data for each temperature being
studied. By performing a nonlinear regression on each of these data sets, for each
temperature a preliminary estimate is obtained for the rate coefficients associated with
the different reactions in the reaction mechanism. Assuming an Arrhenius dependence
on temperature, regression of the values for the rate coefficients with respect to the different temperatures yields estimates $\tilde{\boldsymbol{\beta}}_{IR} = \{\tilde{\beta}_{IR,i}\}_{i=1..p}$ of the corresponding pre-exponential factors and activation energies. In a standard nonlinear regression routine,
these values will be set as the initial guesses for the iterative optimization algorithm.
For a Bayesian analysis, these values may serve as the modes of the prior beliefs on the model parameters. By determining a ‘variance’ term for each of the parameters, reflecting the uncertainty on their estimated values, a preliminary covariance matrix $\tilde{\boldsymbol{V}}_{IR}$ of the model parameters is calculated. This allows in turn for expressing the prior
information by means of a multivariate normal distribution:
\tilde{p}(\boldsymbol{\beta}) = \frac{1}{\sqrt{(2\pi)^{p}\,|\tilde{\boldsymbol{V}}_{IR}|}}\, \exp\left\{-\tfrac{1}{2}\,[\boldsymbol{\beta}-\tilde{\boldsymbol{\beta}}_{IR}]^{T}\, \tilde{\boldsymbol{V}}_{IR}^{-1}\, [\boldsymbol{\beta}-\tilde{\boldsymbol{\beta}}_{IR}]\right\} \propto \exp\left\{-\tfrac{1}{2}\,[\boldsymbol{\beta}-\tilde{\boldsymbol{\beta}}_{IR}]^{T}\, \tilde{\boldsymbol{V}}_{IR}^{-1}\, [\boldsymbol{\beta}-\tilde{\boldsymbol{\beta}}_{IR}]\right\}   (4-73)
where 𝑝 represents the number of model parameters to be estimated and 𝑛𝑇 denotes the
number of different temperatures studied.
Following the same reasoning as before, an alternative posterior density function for the
model parameters is determined as:
\tilde{p}(\boldsymbol{\beta}|\boldsymbol{y}) \propto \prod_{i=1}^{n} \left|\boldsymbol{v}^{(i)}(\boldsymbol{\beta})\right|^{-(m+2)/2}\, \exp\left\{-\tfrac{1}{2}\,[\boldsymbol{\beta}-\tilde{\boldsymbol{\beta}}_{IR}]^{T}\, \tilde{\boldsymbol{V}}_{IR}^{-1}\, [\boldsymbol{\beta}-\tilde{\boldsymbol{\beta}}_{IR}]\right\}   (4-74)
The vector $\hat{\boldsymbol{\beta}}$ that maximizes this probability will serve as a new point estimate of the unknown model parameters. Given the difference in the functional form of the posterior density distribution, it is expected that the predictions acquired by implementation of this method will differ from those obtained with the non-informative prior.
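As an illustration of how such a prior enters the estimation, the sketch below evaluates the logarithm of the alternative posterior (4-74) for a single-response (m = 1) model; the isothermal estimates betaIR, their covariance VIR and the Arrhenius-type model function are hypothetical placeholders, not results from this work.
% Minimal sketch of the informative-prior posterior of eq. (4-74), single response (m = 1).
% betaIR and VIR would follow from the preliminary isothermal regressions; here they
% are hypothetical placeholders, as is the model function ycalc.
betaIR = [20; 50e3];                 % assumed prior modes (e.g. ln(A) and Ea in J/mol)
VIR    = diag([1, (5e3)^2]);         % assumed prior covariance matrix
ycalc  = @(beta,x) exp(beta(1) - beta(2)./(8.314*x));   % illustrative Arrhenius-type model
x      = linspace(300,360,20)';      % 'experimental' conditions (temperatures)
y      = ycalc(betaIR,x).*(1 + 0.05*randn(20,1));       % simulated observations
logpost = @(beta) -3/2*sum(log((y - ycalc(beta,x)).^2)) ...   % sum of ln|v_i|^(-(m+2)/2), m = 1
          - 1/2*(beta - betaIR)'*(VIR\(beta - betaIR));       % informative normal prior term
% The mode of eq. (4-74) can then be located with a generic optimizer:
bhat = fminsearch(@(b) -logpost(b), betaIR);
disp(bhat)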
4.4 References
1. Seber, G.A.F. and C.J. Wild, Nonlinear Regression. 2003: Wiley.
2. Pritchard, D.J., J. Downie, and D.W. Bacon, Further Consideration of Heteroscedasticity in
Fitting Kinetic Models. Technometrics, 1977. 19(3): p. 227-236.
3. Maria, G., A review of algorithms and trends in kinetic model identification for chemical and
biochemical systems. Vol. 18. 2004, Zagreb, CROATIE: Croatian Society of Chemical
Engineers. 28.
4. Carroll, R.J. and D. Ruppert, Transformation and Weighting in Regression. 1988: Taylor
& Francis.
5. Rawlings, J.O., S.G. Pantula, and D.A. Dickey, Applied Regression Analysis: A Research
Tool. 1998: Springer.
6. Box, G.E.P. and W.J. Hill, Correcting Inhomogeneity of Variance with Power
Transformation Weighting. Technometrics, 1974. 16(3): p. 385-389.
7. Carroll, R.J. and D. Ruppert, Diagnostics and Robust Estimation When Transforming the
Regression Model and the Response. Technometrics, 1987. 29(3): p. 287-299.
8. Carroll, R.J. and D. Ruppert, Power Transformations when Fitting Theoretical Models to
Data. Journal of the American Statistical Association, 1984. 79(386): p. 321-328.
9. Beal, S.L. and L.B. Sheiner, Heteroscedastic Nonlinear Regression. Technometrics, 1988.
30(3): p. 327-338.
10. Rousseeuw, P.J. and A.M. Leroy, Robust Regression and Outlier Detection. 2005: Wiley.
11. Barnett, V. and T. Lewis, Outliers in Statistical Data. 1994: Wiley.
12. Hampel, F.R., et al., Robust Statistics: The Approach Based on Influence Functions. 2011:
Wiley.
13. Motulsky, H.J. and R.E. Brown Detecting outliers when fitting data with nonlinear
regression - a new method based on robust nonlinear regression and the false discovery rate.
BMC bioinformatics, 2006. 7, 123 DOI: 10.1186/1471-2105-7-123.
14. Wooldridge, J., Introductory Econometrics: A Modern Approach. 2008: Cengage
Learning.
15. Durbin, J. and G.S. Watson, Testing for Serial Correlation in Least Squares Regression: I.
Biometrika, 1950. 37(3/4): p. 409-428.
16. Durbin, J. and G.S. Watson, Testing for Serial Correlation in Least Squares Regression. II.
Biometrika, 1951. 38(1/2): p. 159-177.
17. Roelant, R., Mathematical determination of reaction networks from transient kinetic
experiments. 2011.
18. Samaniego, F.J., A Comparison of the Bayesian and Frequentist Approaches to Estimation.
2010: Springer New York.
19. Hsu, S.-H., et al., Bayesian Framework for Building Kinetic Models of Catalytic Systems.
Industrial & Engineering Chemistry Research, 2009. 48(10): p. 4768-4790.
20. Jeffreys, H., The Theory of Probability. 1998: OUP Oxford.
21. Gelman, A., et al., Bayesian Data Analysis, Second Edition. 2003: Taylor & Francis.
22. Qian, S.S., C.A. Stow, and M.E. Borsuk, On Monte Carlo methods for Bayesian inference.
Ecological Modelling, 2003. 159(2–3): p. 269-277.
23. Box, G.E.P. and N.R. Draper, The Bayesian estimation of common parameters from
several responses. Biometrika, 1965. 52(3-4): p. 355-365.
24. Box, M.J. and N.R. Draper, Estimation and Design Criteria for Multiresponse Non-Linear
Models with Non-Homogeneous Variance. Journal of the Royal Statistical Society. Series
C (Applied Statistics), 1972. 21(1): p. 13-24.
25. Stewart, W.E., Multiresponse Parameter Estimation with a New and Noninformative Prior.
Biometrika, 1987. 74(3): p. 557-562.
26. Stewart, W.E. and J.P. Sørensen, Bayesian Estimation of Common Parameters from
Multiresponse Data with Missing Observations. Technometrics, 1981. 23(2): p. 131-141.
27. Stewart, W.E. and M. Caracotsios, Computer-Aided Modeling of Reactive Systems. 2008:
Wiley.
28. Galagali, N. and Y.M. Marzouk, Bayesian inference of chemical kinetic models from
proposed reactions. Chemical Engineering Science, 2015. 123(0): p. 170-190.
29. Gilks, W.R., S. Richardson, and D. Spiegelhalter, Markov Chain Monte Carlo in Practice.
1995: Taylor & Francis.
30. Metropolis, N., et al., Equation of State Calculations by Fast Computing Machines. The
Journal of Chemical Physics, 1953. 21(6): p. 1087-1092.
31. Hastings, W.K., Monte Carlo Sampling Methods Using Markov Chains and Their
Applications. Biometrika, 1970. 57(1): p. 97-109.
32. Andrieu, C., et al., An Introduction to MCMC for Machine Learning. Machine Learning,
2003. 50(1-2): p. 5-43.
33. Haario, H., E. Saksman, and J. Tamminen, Adaptive proposal distribution for random
walk Metropolis algorithm. Computational Statistics, 1999. 14(3): p. 375-395.
Chapter 5
Benchmark analysis of alternative parameter estimation techniques
In the upcoming section, the alternative techniques towards the estimation of unknown
model parameters that were suggested in Chapter 4 will be evaluated on their potential to be
a reliable, well-performing challenger of currently applied, classical nonlinear regression
theory. Routines were programmed to correct ordinary regression for heteroscedasticity and
for serial correlation and implement the Bayesian approach combined with MCMC
sampling. At first, a single response linear model was considered as a candidate case to
assess their performance. Indeed, for linear models ordinary regression methods are
straightforward to apply, which allows in turn for a relatively simple comparison to the
outcome of the alternative techniques. Specifically for the Bayesian procedures, the scope
was extended to simple nonlinear models as well. As will be discussed below more
extensively, this introduced the need for more refined sampling methods compared to those
described in Chapter 4.
All routines were coded in Matlab version 8.4 and can be found in the appendix. A set of 𝑛 = 20 experimental data points is randomly generated prior to the run of each script, in accordance with the assumption of a normally distributed, zero-mean error. For a single response model,
this boils down to:
𝒚 ~ 𝒩(𝒚𝒄𝒂𝒍𝒄, 𝑽) (5-1)
where 𝒚𝒄𝒂𝒍𝒄 is the exact response value corresponding to the independent variable 𝑥, i.e. for a
simple linear model:
𝒚𝒄𝒂𝒍𝒄 = 𝐴𝑥 + 𝐵 (5-2)
and 𝑽 a user-specified variance matrix of the ‘experimental’ error. As will follow from the
discussions below, the actual form of this matrix will vary throughout the benchmark
analysis, depending on which particular candidate method will be evaluated. The values for
the model parameters 𝜷 = [𝐴, 𝐵] were fixed beforehand at 5 and 1, respectively. The simulated experimental data set in (5-1) then serves as the basis for
the estimation of these values by the suggested techniques. Their performance and accuracy
will then be assessed by a comparison to the results from classical linear regression.
For ordinary linear least squares estimation the point estimates 𝒃 = [�̂�, �̂�] for the unknown
model parameters 𝜷 are given by:
𝒃 = (𝑿𝑇𝑿)−1𝑿𝑇𝒚 (5-3)
when assuming a normally distributed, uncorrelated and homoscedastic experimental error
[1]. In this case, the 100(1 − 𝛼)% confidence intervals on these parameter estimates are given by:
b_i - \sqrt{V_{\boldsymbol{b},ii}}\; t\!\left(1-\tfrac{\alpha}{2},\, n-p\right) \leq \beta_i \leq b_i + \sqrt{V_{\boldsymbol{b},ii}}\; t\!\left(1-\tfrac{\alpha}{2},\, n-p\right), \quad i = 1,2   (5-4)
with $\boldsymbol{X}^{T} = \begin{bmatrix} x_1 & \cdots & x_n \\ 1 & \cdots & 1 \end{bmatrix}$, $S(\boldsymbol{b}) = \sum_{i=1}^{n}\left(y_i - \hat{A}x_i - \hat{B}\right)^{2}$ and the number of model parameters $p = 2$. An unbiased estimator for the in general unknown covariance matrix of the parameter estimates $\boldsymbol{V_b}$ is given by:
\hat{\boldsymbol{V}}(\boldsymbol{b}) = (\boldsymbol{X}^{T}\boldsymbol{X})^{-1} s^{2} = (\boldsymbol{X}^{T}\boldsymbol{X})^{-1}\, \frac{S(\boldsymbol{b})}{n-p}   (5-5)
5.1 Data-based weighted regression
In Section 4.2, two slightly different techniques to correct for a non-constant variance of the
experimental error were discussed, i.e. the iterative Power-Transform-Both-Sides method
and the direct maximization of the joint posterior density obtained from a Bayesian approach. As has been mentioned in the literature survey, the difference in performance of
both techniques was reported to be minimal. Therefore, given that Matlab offers some built-
in optimization routines, the second method was chosen to be implemented. Based on its
performance, the need to properly account for heteroscedasticity will be evaluated. The error
covariance matrix 𝑽 was implemented as a diagonal matrix, with elements given by:
V_{ii} = \sigma^{2}\, |y_{calc,i}|^{2}   (5-6)
i.e., the uncertainty on the experimental error is set proportional to the actual magnitude of
the corresponding response, with the true ‘homoscedastic’ error variance 𝜎2 as a scaling
factor. Hence, if the weighted regression is performed, the optimal value for the
transformation parameter 𝜙 will ideally be around 0. The value for the scaling factor is free
to choose and will be varied to assess the robustness of the presented technique concerning
the quality of the experimental observations. Indeed, the higher 𝜎2, the more scattered the
generated data set will be around the true, model based response value.
After the experimental data set was simulated, both an ordinary and a weighted regression
were performed. As was mentioned above, the latter requires the use of available local optimization routines, which in turn need the definition of an initial guess, both on the model parameters and on the transformation parameter.
Figure 5-1 Estimates for the model parameters A (left) and B (right) for different values of the scaling factor 𝜎² for 10 subsequent runs. Filled symbols denote the results of the weighted regression, open markers correspond to classical least squares estimation. The dotted line corresponds to the true parameter values. Remark the varying scaling of the vertical axes.
For this purpose, the optimal
parameter estimates obtained from the ordinary regression were used, together with a
starting value of 1 for 𝜙, which corresponds to unweighted regression. The routine was then
run for 10 times for 𝜎 equal to 0.01, 0.1 and 1. The confidence intervals on the point estimates
of the model and transformation parameters are calculated approximately by application of
(5-4) in combination with the adapted covariance matrix of the parameter estimates, as
introduced in equation (4-11). To keep Figure 5-1 clear, these intervals are not shown explicitly. Nevertheless, it was found that the intervals for the weighted regression were almost as broad as those for ordinary least squares. Hence, from a purely statistical point of view, both techniques yield estimates which are equally informative.
Figure 5-1 depicts the results of this procedure. Inspection of the results shows that the
parameter estimates obtained by the automated weighing procedure are in almost all
situations considerably closer to the true values. The introduction of heteroscedasticity does
have a negative impact on the performance of ordinary least squares regression, which is
especially clear for the estimations of the intercept 𝐵, showing deviations higher than 5% for
multiple runs. In contrast, the estimates from the weighted estimation stay in the vicinity of
the true parameter values for all runs with low and all but one situations for mild values
of 𝜎, i.e. at superior and intermediate quality of the experimental data. Logically, for the
highly uncertain observations with 𝜎 = 1, the performance of both the classical and adapted
methodology decreases drastically yielding an output that has a highly variable accuracy for
different runs. Still, in most cases, the point estimates from the weighted regression still
outperform those obtained by the classical approach.
The point estimates for the transformation parameter 𝜙 are shown in Figure 5-2. Ideally, its
estimated value has to approach 0, since the optimal weighing factors are inversely
proportional to the squared model predictions for the error covariance matrix as in (5-6). It is
seen that all values, i.e. for all runs at each considered value of the scaling factor 𝜎, are estimated different from 1, the value which corresponds to ordinary least squares. However, it is noticed that for multiple situations, and irrespective of the scaling factor, the outcome of the estimation fluctuated strongly around 0, being in the close vicinity of the true value in only about half of the studied cases. This is truly remarkable, given that the inference on the model parameters was in fact quite satisfying. Obviously, the performance gain of the data-based weighing routine is hence somewhat indifferent to how well the true value of 𝜙 is estimated. Only for 𝜎 = 1 is a trend seen between the accuracy of the estimates of 𝜷 and 𝜙: the closer the latter approaches 0, the better the quality of the model parameter estimates becomes.
Figure 5-2 Optimal values for the transformation parameter for all three considered scaling
factors and for all runs
The weighted residuals {𝑒𝑤,𝑖}, 𝑖 = 1. . 𝑛, corresponding to the last run at 𝜎 = 0.01 are plotted
in Figure 5-3 for both the weighted and the ordinary regression. Given the point
estimates [�̂�, �̂�], these are given by:
𝑒𝑤,𝑖 = √𝑤(�̂�, �̂�) ∙ 𝑒𝑖 = √𝑤(�̂�, �̂�) ∙ [𝑦𝑖 − �̂�𝑥𝑖 − �̂�] (5-7)
Logically, the weight factors all equal 1 for the unweighted regression. While a clearly diverging scatter is noticed for the classical approach, even at the smallest considered value of the scaling factor, the residuals for the weighted regression have been stabilized remarkably, removing all trends and resulting in a bounded scatter around 0.
Based on this rudimentary comparison, and despite the analysis being limited to simple cases only, some trends have emerged. It is clear that unduly neglecting heteroscedasticity will have a strongly negative impact on the accuracy and quality of the inference on unknown model parameters by the analysis of experimental data. Accounting
explicitly for a non-constant variance of the experimental error during the regression was
shown to yield considerable gains in the accuracy of the parameter estimates, on the
condition that the overall quality of the observations is sufficient and that the uncertainty on
a certain response is proportional to its magnitude.
Figure 5-3 Weighted residuals for the ordinary (open symbols) and weighted least squares
estimation (filled markers)
Nevertheless, since weighted regression still relies on the minimization of a residual sum of
squares, it does not solve the issue of getting stuck in a local rather than the global minimum
of the objective function, yielding incorrect point estimates for the model parameters
especially in case of nonlinear models. Hence, although weighted regression will properly
correct for heteroscedasticity and therefore outperform ordinary least squares estimation, the
reliability of its outcome is still not ensured.
5.2 Explicit modelling of serial correlation in the data
A similar procedure was followed to evaluate the added value of correcting for serial
correlation between subsequent experiments, a phenomenon that is believed to be relevant in
particular when batch reactor setups are involved.
A set of 20 experimental data were randomly generated for the simple single-response linear
model introduced above. The covariance matrix 𝑽 was implemented in order to obey
an 𝐴𝑅(1) time dependence of the experimental errors, and hence reads:
\boldsymbol{V} = \{V_{ij}\} = \sigma^{2}\,\rho^{|i-j|}   (5-8)
with 𝜌 the tunable autocorrelation coefficient and 𝜎² once more the homoscedastic experimental error variance, free to choose. In what follows, its value will remain fixed at 0.1.
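Under these assumptions, a data set with AR(1)-correlated errors can be simulated as in the following minimal sketch; the variable names are illustrative and do not necessarily match the appendix routine.
% Minimal sketch: simulating a linear data set with AR(1)-correlated errors, eq. (5-8).
n = 20; A = 5; B = 1;
sigma2 = 0.1;                     % homoscedastic error variance
rho    = 0.99;                    % autocorrelation, to be varied between -1 and 1
x = linspace(-5,5,n)';
ycalc = A*x + B;
% Build the covariance matrix V_ij = sigma^2 * rho^|i-j| and draw correlated errors
[I,J] = meshgrid(1:n,1:n);
V = sigma2*rho.^abs(I-J);
e = mvnrnd(zeros(n,1), V)';
yobs = ycalc + e;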
The Two-Stage Iterative method as discussed in Chapter 4, which relies on an 𝐴𝑅(1)
modelling of the experimental error, was implemented in an attempt to unravel this
correlation from the scattered data points and properly correct for it. Keeping in mind that
the value of the autocorrelation is bounded between -1 and 1 to be physically relevant, the
performance of the code will be evaluated for 𝜌 equal to -0.99, 0.1 and 0.99, which
corresponds to strongly negative and mildly and strongly positive correlation, respectively.
The results from running the code ten times for each of the proposed autocorrelations are
given in Figure 5-4. The Durbin-Watson test criterion to detect considerable serial correlation
was calculated for each of the runs, and yielded values between -0.955 and 0.6986. Keeping
in mind that the null hypothesis on zero serial correlation for 20 observations and 2 model
parameters is rejected with 95% certainty when this test criterion is lower than 1.10, the test
clearly points at a non-negligible mutual dependence of the experimental data set.
In contrast to the correction for heteroscedasticity of the experimental error that was
discussed in the foregoing section and showed a considerable gain in the accuracy of the
point values of the parameter estimates, accounting for serial correlation yields less distinct
results.
For highly positive correlation, the performance of the correction method is somewhat
capricious. For about one third of the runs, the estimated point values of the slope 𝐴 via the
adapted regression are worse than those obtained by classical regression, while in 4 other
cases the difference in the estimates is negligibly small. The same applies to the results for
the parameter 𝐵, for which no improvement is remarked by applying the alternative routine.
The quality of the estimates is very low for both techniques, yielding values which are off by more than 100% for multiple experimental data sets.
At milder positive correlation, the theoretical difference between both techniques starts to
diminish. Studying the corresponding graphs indeed reveals that the difference between the
parameter estimates for both procedures is almost absent. It is a remarkable observation that
apparently, for such small autocorrelations, the impact of the presence of time propagation
of the experimental error is small enough to not be detected by the correction mechanism.
This is slightly different for the regression of the strongly negatively correlated data set.
Primarily for the slope parameter, the parameter estimates from the adapted procedure
approach the actual model parameters considerably better. The picture is more nuanced as
regards the second parameter, showing slight differences between the estimates from both
methods. In about half of the simulations, the classical regression outperformed the adapted
technique, an additional sign that the overall performance of this methodology to account for
serial correlation is at least doubtful.
Figure 5-4 Estimates for the model parameters A (left) and B (right) for different values of the
autocorrelation 𝝆 as function of the run number. Filled symbols denote the results of the Two
Stage Iterative regression, open markers correspond to classical least squares estimation. The
dotted line corresponds to the true parameter values. Remark the varying scaling of the
vertical axes.
Part of the explanation for the bad performance of the modified regression, especially at
highly positive correlation, follows from inspection of Figure 5-5. Obviously, the routine was not capable of accurately retrieving the actual autocorrelation in any of the runs. As this, now wrongly estimated, value explicitly occurs in the objective function from which the optimal values for the model parameters are determined, it is not surprising that the quality of the
resulting parameter estimates is only moderate. On the other hand, for the two other studied
autocorrelation factors, the routine was able to retrieve the specified value quite accurately
for all cases. Not truly surprisingly, this corresponds to considerably better estimates for the
model parameters as well.
Figure 5-5 Point estimates for the autocorrelation values for all runs as tinted markers with
the corresponding, true values given by the dotted lines
Although the correction for serial correlation has only a limited effect on the final parameter
estimates, a closer look at the lag plots indeed reveals its detrending effect. Figure 5-6 shows
both the corrected residuals and the lag plot for the last run for 𝜌 = −0.99. The plotted
corrected residuals are calculated as:
𝑒𝑐𝑜𝑟𝑟,𝑖 = 𝑒𝑖 − �̂�𝑒𝑖−1 (5-9)
where �̂� equals 0 for ordinary regression, and was estimated at -0.9507 for the modified
procedure. It is noticed that the introduction of a strong negative correlation between
subsequent experiments causes the sign of the associated residuals to alternate, although
their magnitude is bounded between 0 and 0.15. This regular flipping behavior is clearly
removed for the adapted regression scheme, while the variance of the residuals is
remarkably smaller as well. This demonstrates, at least in this case, that the alternative
procedure is capable of accounting correctly for the contribution of the foregoing response to
its successor. The same stabilization is noticed in the lag plot, giving each residual as a function of its predecessor. To guide the eye, the line 𝑒𝑐𝑜𝑟𝑟,𝑖 = −0.99𝑒𝑐𝑜𝑟𝑟,𝑖−1 was drawn, and it is readily seen that the residuals from the classical regression are distinctly scattered around it.
On the other hand, the corrected residuals from the modified regression are, besides smaller
in magnitude, also spread out more randomly. Hence, the latter resemble the behavior of
truly uncorrelated errors more closely.
Figure 5-6 Residual (left) and lag plot (right) for ordinary (open symbols) and corrected
regression (filled markers) for a run with 𝝆 having a pre-specified value of -0.99. The solid
line in the lag plot is given by 𝒆𝒄𝒐𝒓𝒓,𝒊 = −𝟎. 𝟗𝟗𝒆𝒄𝒐𝒓𝒓,𝒊−𝟏
Based on this analysis, the performance of the studied procedure to account for serial
correlation of the experimental data is concluded to be prone to considerable fluctuations, as
well as highly dependent on the actual degree of autocorrelation. Nevertheless, when the
method succeeded in elucidating the actual autocorrelation quite accurately, the resulting
parameter estimates were often better than those obtained from ordinary regression.
Moreover, in those situations the stabilizing effect of the method on the behavior of the
residuals was clearly demonstrated.
Given that the method’s stability is not assured even for simple models, it is questionable whether its application to complex, highly nonlinear problems will prove worthwhile.
5.3 Bayesian estimation by MCMC posterior sampling
As has already been pointed out in the second chapter, Bayesian procedures start from a
different view on the estimation of model parameters compared to classical regression. In
contrast to the methods that were evaluated above, which boiled down to tuning regression analysis to make it robust in ‘problematic’ situations, the successful implementation of a Bayesian routine generates an alternative pathway: capturing the information
contained in finite experimental data sets and translating it into useful and accurate findings on the unknown model parameters. In what follows, the number of model
parameters will be denoted as 𝑝.
The Matlab implementation relied on the theoretical findings on Bayesian estimation presented in Section 4.3. This way, it combines the insights from the most general analytical expression for the posterior density function as given in (4-63), which fully captures the lack of knowledge on the covariance structure of the experimental errors, with the strengths of Markov Chain Monte Carlo schemes to scan its 𝑝-dimensional surface efficiently and make well-founded decisions about the model parameters.
Again, the simple linear model 𝑦 = 5𝑥 + 1 was applied to test the performance of the code,
adding a normally distributed error 휀 ~ 𝒩(0,1) to the true response value to simulate the
experimental data set 𝒚 = {𝑦𝑖} for 20 conditions {𝑥𝑖}. Since the model under study is single-response, i.e. 𝑚 = 1, the posterior density function is given by:
p(A,B|\boldsymbol{y}) \propto \prod_{i=1}^{20} \left| y_i - (Ax_i + B) \right|^{-3}   (5-10)
with 𝐴 and 𝐵 the model parameters to be estimated. The Metropolis-Hastings algorithm as
introduced before was applied to perform the MCMC sampling of the posterior density. The
proposal distribution 𝑞 from which the samples are drawn was chosen as a bivariate normal,
centered at the foregoing sample value with constant variance:
q(\boldsymbol{\beta}|\boldsymbol{\beta}_{i-1}) = \mathcal{N}(\boldsymbol{\beta}_{i-1},\, c^{2}\boldsymbol{I}_2)   (5-11)
Choosing a suitable value for 𝑐 is vital for a good sampling quality, but literature on how to make a good guess is scarce. Therefore, the value was taken, quite arbitrarily, at 10. The number of samples to be drawn was set at 10000, with a burn-in time of 1000. Lastly, the parameter values to start the Markov chain were taken at 0. Once samples for the model parameters were collected, a two-dimensional histogram was constructed which resulted, after normalization, in a sampled representation of the posterior density function. Finally, by summing the probabilities along each of the dimensions, and repeating this for both parameters, the marginal sampled posterior density functions were calculated.
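A condensed sketch of this sampler, consistent with equations (5-10) and (5-11) but not necessarily identical to the appendix implementation, could read:
% Condensed sketch of the Metropolis-Hastings sampler for posterior (5-10),
% consistent with the description above (not necessarily the appendix code itself).
n = 20; x = linspace(-5,5,n)'; yobs = 5*x + 1 + randn(n,1);   % simulated data, eps ~ N(0,1)
logpost = @(b) -3*sum(log(abs(yobs - (b(1)*x + b(2)))));      % ln of eq. (5-10), up to a constant
c = 1;                          % standard deviation of the bivariate normal proposal (5-11)
nS = 10000; burnIn = 1000;
B = zeros(nS,2);                % chain of sampled [A, B] values, started at 0
for i = 2:nS
    cand = B(i-1,:) + c*randn(1,2);                           % candidate from N(beta_{i-1}, c^2*I_2)
    if rand <= min(1, exp(logpost(cand) - logpost(B(i-1,:))))
        B(i,:) = cand;
    else
        B(i,:) = B(i-1,:);
    end
end
B = B(burnIn+1:end,:);          % discard burn-in; histograms of B(:,1) and B(:,2)
                                % give the sampled marginal posteriors of A and B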
The results of this procedure are given in Figure 5-7. Inspection of the marginal probability densities shows that the accuracy of the sampled distribution is rather poor: the actual value of parameter 𝐴 is present as a small peak at 5, which is strongly dominated by two, almost equally high, peaks at around 4 and 6. Moreover, the marginal distribution of parameter 𝐵 shows two equally distinct maxima: besides a peak at the actual value 1, a similar peak is noticed around 0.5. Fortunately, the poor quality of the Bayesian inference is explained by
the trace plots at the bottom of the figure, which indicate that the mixing ability of the
sampling is too low. The specified variance of the proposal distribution is apparently too
high, which results in an excessive rejection rate of the candidate samples and causes the
scheme to get ‘stuck’ at certain values during multiple iterations. Hence, multiple peaks
result, which do not necessarily correspond to parameter values with a high posterior
probability.
Figure 5-7 Results of the Metropolis-Hastings procedure for c = 10, giving the marginal
probabilities from the sampled posterior (above) and the sampled values throughout the
iteration for the model parameters A (left) and B (right)
The code was therefore run again, now for 𝑐 = 1, for which the results are given in Figure 5-8.
Figure 5-8 Results of the Metropolis-Hastings procedure for c = 1, giving the marginal probabilities from the sampled posterior (above) and the sampled values throughout the iteration for the model parameters A (left) and B (right)
The marginal sampled posterior probabilities are now sharply peaked, having only one maximum, located in the close vicinity of its actual value. Simultaneously, the corresponding
trace plots reveal that the mixing behaviour of the Markov chain is remarkably better when
choosing a proposal with a lower variance. Nevertheless, since the iteration still tends to get stuck, an even lower value for 𝑐 seems to be required for optimal performance.
Apparently, the implemented sampling scheme shows very poor robustness to a suboptimal choice of the proposal distribution. The need for manual tuning to ensure proper functioning of the routine is logically considered very unattractive with respect to the overall applicability of the code to more complex models with a higher number of parameters to be estimated. Therefore, more advanced sampling techniques were sought in the literature, especially algorithms which allow for an automated, sample-based updating of the proposal density function.
A plausible candidate was found in the field of adaptive MCMC [2-4]. As the name suggests,
this routine uses a built-in mechanism to tune the proposal distribution ‘on the fly’ so that an
optimal performance of the MCMC sampling is obtained. For a bivariate normal distribution,
this boils down to updating both its mean and variance matrix for every iteration, based on
the values of preceding samples. The most general of the reported procedures, the so-called Global Adaptive Metropolis with componentwise adaptive scaling, was implemented. In this algorithm, the proposal distribution from which the candidate sample is drawn reads:
q(\boldsymbol{\beta}|\boldsymbol{\beta}_{i-1}) = \mathcal{N}\!\left(\boldsymbol{\beta}_{i-1},\, \boldsymbol{\Lambda}_{i-1}^{1/2}\,\boldsymbol{\Sigma}_{i-1}\,\boldsymbol{\Lambda}_{i-1}^{1/2}\right)   (5-12)
a 𝑝-dimensional multivariate distribution around the foregoing sample with a variable
covariance matrix. No specifications are made about the exact form of the latter, so that non-
constant variance of the different parameters and non-zero correlations between them are
allowed. The core of this matrix consists of a marginal contribution 𝚺𝑖−1, which gives the
variance of the collected samples, and is scaled by a 𝑝 × 𝑝 diagonal matrix 𝚲𝑖−1 to tune it
specifically for each component, i.e. each model parameter. The global property reflects that all parameters in the new sample are updated simultaneously, rather than one parameter at a time.
The complete algorithm is given by [4]:
1. Choose starting values $\boldsymbol{\beta}_0$, $\boldsymbol{\mu}_0$, $\boldsymbol{\Sigma}_0$ and $\boldsymbol{\Lambda}_0$
2. Repeat for each sample $i$:
2.1 Sample the candidate $\boldsymbol{B}_i$ as $\boldsymbol{\beta}_{i-1} + \boldsymbol{X}_i$ with $\boldsymbol{X}_i \sim \mathcal{N}(\boldsymbol{0},\, \boldsymbol{\Lambda}_{i-1}^{1/2}\boldsymbol{\Sigma}_{i-1}\boldsymbol{\Lambda}_{i-1}^{1/2})$ and draw $U$ randomly from $[0,1]$. If $U < \alpha(\boldsymbol{B}_i, \boldsymbol{\beta}_{i-1})$, set $\boldsymbol{\beta}_i = \boldsymbol{B}_i$; otherwise, set $\boldsymbol{\beta}_i = \boldsymbol{\beta}_{i-1}$;
2.2 Update, for $k = 1 \ldots p$:
$\ln(\lambda_i^k) = \ln(\lambda_{i-1}^k) + \gamma_i^k\left[\alpha(\boldsymbol{\beta}_{i-1} + X_{i,k}\boldsymbol{e}_k,\, \boldsymbol{\beta}_{i-1}) - \bar{\alpha}^{**}\right]$, with $\boldsymbol{e}_k = \{\delta_{kj}\}_{j=1..p}$ and $\lambda_i^k = \boldsymbol{\Lambda}_{i,kk}$
$\boldsymbol{\mu}_i = \boldsymbol{\mu}_{i-1} + \gamma_i(\boldsymbol{\beta}_i - \boldsymbol{\mu}_{i-1})$
$\boldsymbol{\Sigma}_i = \boldsymbol{\Sigma}_{i-1} + \gamma_i\left[(\boldsymbol{\beta}_i - \boldsymbol{\mu}_{i-1})(\boldsymbol{\beta}_i - \boldsymbol{\mu}_{i-1})^{T} - \boldsymbol{\Sigma}_{i-1}\right]$
The vanishing parameter 𝛾 is required to stabilize the adaptations of the proposal. Indeed, a continuous change of the proposal distribution can hinder the convergence of the stationary distribution of the chain to the actual posterior. Therefore, it is suggested to reduce the effect of recent samples on the properties of the proposal for an increasing number of iterations. For simplicity, 𝛾𝑖 was set at 1/𝑖. The value for the expected acceptance probability $\bar{\alpha}^{**}$ was taken at 0.44 as suggested. The starting values 𝜷0 and 𝝁0 were set at 0, while 𝚺0 and 𝚲0 were taken at 𝑰2.
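A compact Matlab sketch of the algorithm listed above, using the settings just described and the same illustrative two-parameter posterior as before, is given below; it is an interpretation of the published update rules, not the thesis code itself.
% Compact sketch of the Global Adaptive Metropolis algorithm with componentwise
% adaptive scaling, as listed above; logpost is the illustrative two-parameter posterior.
n = 20; x = linspace(-5,5,n)'; yobs = 5*x + 1 + randn(n,1);
logpost = @(b) -3*sum(log(abs(yobs - (b(1)*x + b(2)))));
p = 2; nS = 10000;
alphaStar = 0.44;                              % target acceptance probability
beta = zeros(nS,p); mu = zeros(1,p);
Sigma = eye(p); Lambda = eye(p);
acc = @(cand,prev) min(1, exp(logpost(cand) - logpost(prev)));   % acceptance probability
for i = 2:nS
    gam = 1/i;                                 % vanishing adaptation parameter
    sqrtLam = diag(sqrt(diag(Lambda)));        % Lambda^(1/2), Lambda is diagonal
    C = sqrtLam*Sigma*sqrtLam;                 % proposal covariance Lambda^1/2 Sigma Lambda^1/2
    Xi = mvnrnd(zeros(1,p), (C+C')/2);         % increment X_i, covariance symmetrized for safety
    cand = beta(i-1,:) + Xi;
    if rand < acc(cand, beta(i-1,:)), beta(i,:) = cand; else beta(i,:) = beta(i-1,:); end
    for k = 1:p                                % componentwise scaling update
        ek = zeros(1,p); ek(k) = Xi(k);
        ak = acc(beta(i-1,:) + ek, beta(i-1,:));
        Lambda(k,k) = exp(log(Lambda(k,k)) + gam*(ak - alphaStar));
    end
    d = beta(i,:) - mu;                        % deviation from the running mean mu_{i-1}
    mu = mu + gam*d;                           % global mean update
    Sigma = Sigma + gam*(d'*d - Sigma);        % global covariance update
end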
The main results of the run are shown in Figure 5-9. It is noticed immediately that, while the
marginal posterior densities are similar to those generated earlier, the mixing behaviour is
considerably worse. Apparently, the explicit incorporation of a permanent updating
mechanism does not yield the expected improvements in performance of the sampling
routine.
Both the initial guess on the proposal variance and the number of samples have been varied
manually to check for explanations for the lack of mixing, but without success. Hence, the
problematic convergence of the sampling scheme seems to be an intrinsic shortcoming of the
algorithm. Although adaptive MCMC sampling has been reported as a valuable technique,
its application to this particular problem showed that the performance of the algorithm has
at least some shortcomings. The literature survey was therefore resumed, which resulted in a recent sampling technique that is reported as a promising algorithm in cases with badly scaled proposal distributions, yet without requiring a mechanism for continuous updating [5, 6].
Figure 5-9 Results from the Adaptive Metropolis algorithm, giving the marginal probabilities
from the sampled posterior (above) and the sampled values throughout the iteration for the
model parameters A (left) and B (right)
This alternative sampling scheme introduces two new concepts compared to the previously
applied algorithms. First, the starting point of the method is to design a Markov chain generator which is affine invariant, i.e. the way in which candidate samples are drawn from the posterior distribution is not influenced by any linear transformation of its variables. This feature makes the convergence of the sampling routine more robust. Secondly, where the traditional approaches to MCMC sampling initialize one Markov chain and follow its evolution in time, i.e. throughout the iterations in the Metropolis algorithm, an alternative pathway is to start from multiple seeds which are allowed to explore parameter space in parallel, during a proportionally smaller number of iterations. By allowing explicitly for strong interactions between the different walkers, i.e. by making the evolution of one walker depend on the positions of the others, the moves of this ensemble of walkers are known to adapt automatically to the targeted posterior distribution.
One particular manner to realize an affine invariant walk through parameter space is by
means of a stretch move, as represented in Figure 5-10. This way, every candidate sample for
a certain walker 𝑿𝒌 from the ensemble is obtained by using the current value of one other
walker 𝑿𝒋, 𝑗 ≠ 𝑘, to obtain the linearly interpolated value 𝒀:
𝑋𝑘(𝑡) → 𝑌(𝑡 + 1) = 𝑋𝑗(𝑡) + 𝑍 (𝑋𝑘(𝑡) − 𝑋𝑗(𝑡)) (5-13)
where 𝑍 is a scaling factor to be sampled, as will be discussed below. Which of the complementary walkers is used for the interpolation is determined at random.
Figure 5-10 Stretch move of walker 𝑿𝒌 along the line through walker 𝑿𝒋, yielding candidate
sample 𝒀. All other walkers (grey dots) do not participate.
The complete algorithm for an ensemble consisting of 𝐾 walkers reads:
For each time step $t$:
1. Repeat, for each walker $\boldsymbol{X}_k$:
1.1 Choose a walker $\boldsymbol{X}_j$, $j \neq k$, at random
1.2 Draw a sample $Z$ from the distribution $g(z) \propto 1/\sqrt{z}$, $z \in [\tfrac{1}{2}, 2]$
1.3 Calculate $Y(t)$ according to (5-13)
1.4 Determine $q = z^{p-1}\, \dfrac{p(Y(t))}{p(X_k(t-1))}$ and sample $r$ from $[0,1]$
1.5 If $r \leq \min(1, q)$, set $X_k(t) = Y(t)$; otherwise, set $X_k(t) = X_k(t-1)$
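A compact sketch of this stretch-move ensemble sampler, again applied to the illustrative two-parameter posterior and not claimed to reproduce the thesis implementation, could read:
% Compact sketch of the affine-invariant stretch-move ensemble sampler listed above,
% applied to the illustrative two-parameter posterior of Section 5.3.
n = 20; x = linspace(-5,5,n)'; yobs = 5*x + 1 + randn(n,1);
logpost = @(b) -3*sum(log(abs(yobs - (b(1)*x + b(2)))));
p = 2; K = 100; T = 100;                      % number of parameters, walkers and time steps
X = randn(K,p);                               % ensemble of walkers, randomly initialized
samples = zeros(K*T,p);
for t = 1:T
    for k = 1:K
        j = randi(K-1); if j >= k, j = j+1; end               % complementary walker j ~= k
        z = (sqrt(0.5) + rand*(sqrt(2)-sqrt(0.5)))^2;         % inverse-CDF draw of g(z) on [1/2,2]
        Y = X(j,:) + z*(X(k,:) - X(j,:));                     % stretch move, eq. (5-13)
        q = (p-1)*log(z) + logpost(Y) - logpost(X(k,:));      % ln of the acceptance quantity q
        if log(rand) <= min(0, q)
            X(k,:) = Y;                                       % accept the stretched position
        end
    end
    samples((t-1)*K+1:t*K,:) = X;                             % store the whole ensemble at step t
end
% discard the first ensemble sweeps as burn-in before computing histograms of the columns
The inverse-CDF draw of Z and the z^(p-1) factor in the acceptance quantity are what make the move affine invariant, so no proposal variance has to be tuned by hand.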
Literature is not clear about which values to choose for the number of walkers and the
number of iterations to give optimal performance of the sampling scheme. One important
guideline is that a sufficiently large number of walkers is more beneficial for convergence
than a large number of time steps. It has to be remarked that this will increase the burn-in
time and hence the number of samples to be discarded [6].
In this case, the number of walkers was chosen at 100, so that 100 iterations were required to
generate the same number of samples as for the previous methods. The preliminary burn-in
time was taken at 10. The results of running the Matlab code are shown in Figure 5-11.
Again, two peaked marginal distributions are found, the one of the slope parameter being sharper than that of the intercept. In contrast to the previous sampling techniques, the trace-
plots now do show the desired mixing of the chain, which is promising concerning the quality of the samples.
Figure 5-11 Results from the affine invariant MCMC algorithm, giving the marginal probabilities from the sampled posterior (above) and the sampled values throughout the iteration for the model parameters A (left) and B (right)
All statistical properties of the posterior density function are determined from the samples,
including the uncertainty on the model parameters. The 𝛼-percent probability interval in
which the parameter values are most likely located, is defined as the smallest interval on the
parameter axis for which the surface integral of the marginal posterior density equals 𝛼. In
this case, the 95% probability interval on 𝐴 is given by [4.5306, 5.3469] and that for 𝐵 reads [0.5102, 1.7347]; both clearly include the actual parameter values. Classical regression on the simulated data set estimated the optimal parameter values at $[\hat{A}, \hat{B}]$ = [5.0846, 0.9966]. Application of equation (5-4) to parameters 𝐴 and 𝐵 yields 95% confidence intervals of [5.0752, 5.0940] and [0.9101, 1.0832] respectively, which are considerably smaller than those obtained from the Bayesian estimation.
The remarkable gain in performance of the last sampling scheme and its ability to yield
reasonable results for the test case, though very simplified, is an encouraging finding in the
search for reliable alternative routines towards parameter estimation. Naturally, more tests
will have to be performed on more complex models to truly assess the overall reliability of
the routine. Moreover, in the discussion above the quality of the sampling scheme was
evaluated from a visual inspection of the trace-plots, which can be argued to be a rather
subjective decision criterion. Unfortunately, at the moment of writing of this work, no solid and mathematically well-founded theory is available to draw incontestable conclusions on the convergence of the chain. Nevertheless, since Bayesian estimation procedures using MCMC
sampling are a growing field of interest in statistical research, progress on its theoretical
foundation might be expected in the near future.
5.4 References
1. Thybaut, J.W., Kinetic Modeling and Simulation - University Course. 2014, Ghent
University.
2. Haario, H., E. Saksman, and J. Tamminen, An adaptive Metropolis algorithm.
Bernoulli, 2001: p. 223-242.
3. Gelfand, A.E. and S.K. Sahu, On Markov Chain Monte Carlo Acceleration. Journal of
Computational and Graphical Statistics, 1994. 3(3): p. 261-276.
4. Andrieu, C. and J. Thoms, A tutorial on adaptive MCMC. Statistics and Computing,
2008. 18(4): p. 343-373.
5. Goodman, J. and J. Weare, Ensemble samplers with affine invariance.
Communications in Applied Mathematics and Computational Science, 2010. 5(1): p.
65-80.
6. Foreman-Mackey, D., et al., emcee: The MCMC hammer. Publications of the
Astronomical Society of the Pacific, 2013. 125(925): p. 306-312.
Chapter 6
Conclusions and future work
In this master thesis, the currently applied methodology to estimate unknown kinetic
parameters in chemical reaction modelling is critically reviewed. As these procedures rely on
classical regression analysis, the quality of the parameter estimates will only be guaranteed in those cases where the necessary theoretical assumptions, primarily on the regularity of the experimental error, are fulfilled. Unfortunately, these assumptions are often too strong and, as a consequence, the performance of the estimation methodology is not ascertained.
In the first part of this work, an attempt was made to evaluate the overall performance of the
current procedures for parameter estimation. To assess their robustness, it was attempted to evaluate whether consistent values for the kinetic parameters could be determined from data of both batch-reactor and continuous-flow experiments. Continuous-flow experiments for
the transesterification of ethyl acetate with methanol catalyzed by the ion-exchanging resin
Lewatit K2629 were performed for varying process conditions. Batch-data based kinetic
parameters were available from recent research on this topic.
A qualitative comparison of the results of the experiments yielded the expected trends. The conversion of the reactants was positively influenced by an increasing temperature and by an excess of one of the reactants, while higher flow rates through the reactor, and hence a shorter contact time of the mixture with the catalyst, were found to lower their consumption. Unfortunately, a quantitative analysis of the data showed some strong inconsistencies in the results, as the observed concentrations of ethanol and methyl acetate did not obey the required reaction stoichiometry. The explanation for this imbalance was sought in an improper functioning of the reactor setup, yet up to the moment of writing the true reason has not been identified. An attempt to fit the experimental observations with the reported kinetic model revealed strong deviations. Therefore, the evaluation of the robustness did not succeed and will have to be repeated, if desired, in the future.
Apart from its robustness, the current statistical methodology was tested on its applicability
to physical systems as well. For this purpose, the modelling of the electrical behavior of thin-
layer solar cells was chosen as it is a growing field of interest in the development of new
photovoltaic technology. Current-voltage experiments have been performed on two copper indium gallium selenide (CIGS) type solar cells which were cut out of the same mother panel, at ambient to heavily cooled temperatures. Based on these data, the parameters of the most extended model in the literature to date were successfully estimated, yielding both physically relevant and statistically significant estimates. Based on these values, a current
path analysis was made for both cells that revealed and quantified the contribution of
different current leakage mechanisms to the performance and efficiency of the entire cell.
Comparison of these analyses showed strong deviations between the cells, which
demonstrated the strongly localized nature of the parasitic effects. Therefore, the application
of the statistical methodology to physical model building was considered successful. It is
hoped and believed that this promising synergy will prove to be of even higher use in future
fundamental research on solar cells.
The second half of the thesis focuses on the evaluation of alternative techniques for
parameter estimation. Three candidate routines were selected from an extensive literature survey and coded in Matlab. Their performance was benchmarked by parameter estimation for a
simple, single-response linear model, which allowed for a first comparison with the results
from classical regression. The first method extended the scope of classical regression routines
to situations with heteroscedastic errors, i.e., with non-constant variance, by modelling the
weighing factors as proportional to a power of the model predictions. The performance of
this procedure was found to be rather variable, as its ability to retrieve the true parameter
values heavily depended on the overall quality of the data. Nevertheless, a significant gain in
accuracy was noticed with respect to ordinary least squares estimation, while the regularity
of the heteroscedastic error was drastically increased after weighing.
The second method accounted for non-zero correlation among the experimental errors, a situation that is of particular interest for time series experiments, e.g. data from batch-reactor setups for chemical modelling. Now, the experimental error was modelled as a time series obeying a first-order autoregressive model. Although again a considerable positive impact of the procedure was seen on the randomness of the errors, the improvement in the quality of the parameter estimates was not striking. For some runs of the routine, the calculated estimates were even worse. Although the routine was not explicitly
evaluated on a nonlinear model, as this introduces an additional level of complexity in the
algorithm, the only moderate performance on simple models does not leave much hope for
its added value for more complicated situations.
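As implemented in the routine of Appendix A.1.2, the experimental errors \(\varepsilon_i\) were assumed to follow

\[
\varepsilon_i = \rho\,\varepsilon_{i-1} + u_i ,
\]

with \(u_i\) independent, identically distributed disturbances, and the parameters were obtained by repeatedly minimizing the transformed sum of squares

\[
S(\beta) = \left(1-\rho^{2}\right)\left(y_1 - \hat{y}_1(\beta)\right)^{2} + \sum_{i=2}^{n} \left[\left(y_i - \hat{y}_i(\beta)\right) - \rho\left(y_{i-1} - \hat{y}_{i-1}(\beta)\right)\right]^{2},
\]

re-estimating \(\rho\) from the lag-one autocorrelation of the residuals after each pass until convergence.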
Finally, a Bayesian estimation procedure was implemented. The routine attempted to combine, for the first time, theoretical insights on the optimal design of the posterior density function, which allow, at least in theory, for an automated weighing of the experimental data, with the computational power of MCMC sampling schemes to evaluate that multidimensional posterior efficiently. Classical MCMC sampling schemes turned out not to converge properly, so that more advanced algorithms had to be sought. Although the obtained inference on the model parameters was not as efficient as the parameter estimates from classical regression, the calculated 95% confidence intervals being slightly broader, this first successful application of the new methodology is nonetheless seen as a promising result. Keeping in mind that this procedure allows not only for an automated weighing but also for the explicit inclusion of prior information on the model parameter values, further research on and testing of this interesting technique for more complex situations is highly recommended.
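For completeness, the marginal posterior density that is sampled by the affine invariant MCMC routine of Appendix A.1.3 reads, for a single-response model with uncorrelated errors,

\[
p(\theta \mid \mathbf{y}) \;\propto\; \prod_{i=1}^{n} \left| y_i - f(x_i,\theta) \right|^{-(m+2)},
\]

with \(n\) the number of observations and \(m\) the number of responses (\(m = 1\) for the benchmark model), as stated in the comment block preceding that routine.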
Appendix
A.1 Matlab routines for alternative techniques to estimate model parameters
A.1.1 Data-based weighing
clear all % format long

% Proposed model y = Aact*x+Bact
par = 2; Aact = 5; Bact = 1;

% Generate the heteroscedastic output set. It is assumed that the variance
% of the experimental error is proportional to the square of the actual
% value of the output
n = 20; sigma = 1;
x = linspace(-5,5,n)';
X = [x,ones(size(x))];
ycalc = X*[Aact;Bact];
V = diag(abs(ycalc).^2)*sigma^2;
% Mean passed to mvnrnd as a 1-by-n row vector so that a single
% n-dimensional draw is returned
yobs = ycalc+mvnrnd(zeros(1,n),V)';

% Ordinary Least Squares estimate and confidence intervals
b0 = (X'*X)^-1*X'*yobs;
yopt = X*b0;
Sb = (yobs-yopt)'*(yobs-yopt);
Vb0 = (X'*X)^-1*Sb/(n-par);
tval = tinv(1-0.05/2,n-par);
CIb0low = b0-tval*sqrt(diag(Vb0));
CIb0up = b0+tval*sqrt(diag(Vb0));

% Start of the weighted routine
weights = @(p,b) vpa(abs(X*b).^(2*p-2));
f1 = @(p,b) -1/2*log(prod(weights(p,b)))+n/2*log(sum(weights(p,b).*(yobs-X*b).^2));
f = @(v) f1(v(1),v(2:end));
vopt = fminsearch(f,[1;b0]);
popt = vopt(1); bopt = vopt(2:end);
yopt = X*bopt;
wopt = weights(popt,bopt);

% Confidence intervals
Vb = zeros(par+1,par+1);
res = yobs-yopt;
WSSR = sum(wopt.*res.^2);
a = (popt-1)./(yopt.^2)+n/WSSR*((popt-1)*(2*popt-3)*wopt.*res.^2./(yopt.^2)-4*(popt-1)*wopt.*res./yopt+wopt);
b = 1./yopt-n/WSSR*(wopt.*res.*(res./yopt+2*((popt-1)*res./yopt-1).*log(abs(yopt))));
k = sum(wopt.*res.^2.*log(abs(yopt)));
c = wopt.*res.^2.*(log(abs(yopt))).^2;
for i = 1:par+1
    for j = 1:i
        S = @(q) sum(wopt.*res.*((popt-1)*res./yopt-1).*X(:,q));
        if i <= par
            Vb(i,j) = sum(a.*X(:,i).*X(:,j))-2*n*S(i)*S(j)/WSSR^2;
            Vb(j,i) = Vb(i,j);
        elseif j <= par
            Vb(i,j) = -sum(b.*X(:,j))-2*n*k*S(j)/WSSR^2;
            Vb(j,i) = Vb(i,j);
        else
            Vb(i,j) = 2*n/WSSR*sum(c)-2*n*k^2/WSSR^2;
        end
    end
end
Vb = Vb^-1;
tval = tinv(1-0.05/2,n-(par+1));
CIblow = [bopt;popt]-tval*sqrt(diag(Vb));
CIbup = [bopt;popt]+tval*sqrt(diag(Vb));

disp([CIb0low b0 CIb0up])
disp([CIblow [bopt;popt] CIbup])

% figure
% subplot(2,2,1)
% scatter (x,yobs,'b')
% hold on
% scatter (x,X*b0,'r')
% hold off
% subplot(2,2,2)
% scatter (x,yobs,'b')
% hold on
% scatter (x,X*bopt,'r')
% hold off
% subplot(2,2,3)
% scatter (x,yobs-X*b0)
% subplot(2,2,4)
% scatter (x,sqrt(wopt).*res);
A.1.2 Correcting for serial correlation
% Benchmark study for the correction for serial correlation for a single
% response, linear model by implementing the "iterated two-stage" AR(1)
% model as suggested by Seber and Wild (2003) and introduced in Chapter 2

clear all

% Proposed model y = A*x+B
A = 5; B = 1;

% Generate the correlated output set, x = [-5,5], with predetermined value
% rhospec for the AR(1) parameter
n = 20; sigma = 0.1;
x = linspace(-5,5,n)';
X = [x,ones(size(x))];
rhospec = -0.99;
yexact = A*x+B;
V = zeros(n,n);
for i = 1:n
    for j = 1:n
        V(i,j) = rhospec^(abs(j-i));
    end
end
V = V*sigma^2;
% Mean passed to mvnrnd as a 1-by-n row vector so that a single
% n-dimensional draw is returned
yobs = yexact+mvnrnd(zeros(1,n),V)';

% Ordinary Least Squares estimate
b0 = (X'*X)^-1*X'*yobs;

% Calculate Durbin-Watson test criterion
% For alpha = 0.05, n = 20 and p = 2, dL = 1.10 and dU = 1.54
res0 = yobs-X*b0;
dL = 1.10; dU = 1.54;
d = sum(diff(res0).^2)/(res0'*res0);

% Iterated two-stage estimation of the model parameters and rho
tol = 1; bopt = b0;
while tol>10^-4
    bold = bopt;
    res = yobs-X*bopt;
    rho = res(1:n-1)'*res(2:n)/(res'*res);
    f = @(a,b) (1-rho^2)*(yobs(1)-X(1,:)*[a;b])^2+sum((yobs(2:n,:)-X(2:n,:)*[a;b]-rho*(yobs(1:n-1,:)-X(1:n-1,:)*[a;b])).^2);
    fun = @(b) f(b(1),b(2));
    bopt = fminsearch(fun,bold);
    tol = norm((bopt-bold)./bopt);
    % disp(rho)
end

% Define the new, uncorrelated experimental error vector
u = [res(1);res(2:n)-rho*res(1:n-1)];

disp(b0)
disp(bopt)
disp(rho)
disp([dL d dU])

% figure
% subplot(3,2,1)
% scatter (x,yobs,'b')
% hold on
% plot (x,X*bopt,'r')
% hold off
% subplot(3,2,2)
% scatter (x,yobs,'b')
% hold on
% plot (x,X*b0,'r')
% hold off
% subplot(3,2,3)
% scatter (x,yobs-X*b0)
% subplot(3,2,4)
% scatter (x,u)
% subplot(3,2,5)
% scatter(res0(1:end-1),res0(2:end));
% subplot(3,2,6)
% scatter(res(1:end-1),res(2:end)-rho*res(1:end-1));
A.1.3 Bayesian estimation with affine invariant MCMC
% Bayesian estimation of a single-response model
clear all

% Create the experimental data
par = 2; A = 5; B = 1;
% Number of "experimental" data points
n = 20;
% Number of responses
m = 1;
sigma = 1;
x = linspace(-5,5,n)';
y = @(a,b) a*x+b;
ycalc = y(A,B);
%y = @(a,b) a*exp(b*x);
%ycalc = y(A,B);
yobs = mvnrnd(ycalc,sigma^2);

% For a single-response model and heteroscedastic but non-correlated
% experimental data set, the marginal posterior density function p(O|y)
% is given by
% p(O|y) ~ prod[|y(i)-f(x(i))|^-(m+2)], i = 1..n
%p = @(theta) prod(abs(yobs-y(theta(1),theta(2),theta(3),theta(4))))^-(m+2);
p = @(theta) prod(abs(yobs-y(theta(1),theta(2))))^-(m+2);

% Initialize all walkers at 0;
K = 10000; T = 1000;
walkers = zeros(T*K,par);
% Initialize the walkers at a random position between -1 and 1.
% Initialization at 0 will make the walkers immobile.
walkers(1:K,:) = 2*rand(K,par)-1;

for t = 2:T
    temp = walkers((t-2)*K+1:(t-1)*K,:);
    for k = 1:K
        disp((t-1)*K+k);
        % Construct the set of complementary walkers
        if k == 1
            temp2 = temp(2:end,:);
        else
            temp2 = [temp(1:k-1,:);temp(k+1:end,:)];
        end
        % Random picking of the complementary walker for the stretch move
        pos = ceil((K-1)*rand);
        Xj = temp2(pos,:);
        % Sample from the distribution g(z) = 1/sqrt(z) , z = 1/a..a
        a = 2;
        intz = (2*a-2)/sqrt(a);
        z = (rand*intz*sqrt(a)+2)^2/4/a;
        Xk = temp(k,:);
        Y = Xj+z*(Xk-Xj);
        q = z^(par-1)*p(Y)/p(Xk);
        % Perform the acceptance/rejection of the candidate
        r = rand;
        if p(Y) == 0
            walkers((t-1)*K+k,:) = Xk;
        elseif r <= min(q,1)
            walkers((t-1)*K+k,:) = Y;
        else
            walkers((t-1)*K+k,:) = Xk;
        end
    end
end

burnin = 50;
walkers1 = walkers;
walkers = walkers(burnin*K+1:end,:);

% Now calculate the par-dimensional probability density distribution of the
% parameters of a par-dimensional histogram data set, using the function
% histcn.m added in the folder
edge1 = linspace(4,6,1000);
edge2 = linspace(0,2,1000);
[histdata,edges,mids] = histcn(walkers,edge1,edge2);
h = size(histdata,1);
edges = cell2mat(edges);
mids = cell2mat(mids);
% The vector histdata now contains the number of counts for a par-dimensional
% cube of h bins. The analogue for the marginal posterior density function
% for the j'th parameter is then found by summing over all dimensions,
% except for the j'th.
prob = zeros(h,par);

dist1 = mids(2)-mids(1);
dist2 = mids(h+2)-mids(h+1);
histdata = histdata/(sum(histdata(:)))/dist1/dist2;

% Calculate the marginal probability densities along each parameter axis
prob(:,1) = dist2*sum(histdata,2);
prob(:,2) = dist1*sum(histdata,1)';

figure
subplot(2,2,1)
plot(mids(1:h),prob(:,1));
subplot(2,2,2)
plot(mids(h+1:end),prob(:,2));
subplot(2,2,3)
plot(1:size(walkers,1),walkers(:,1));
subplot(2,2,4)
plot(1:size(walkers,1),walkers(:,2));
hold off

% Calculate the alpha percent confidence interval
alpha = 0.95;
int1 = 100; int2 = int1;
for i = 1:h-1
    for j = i+1:h
        area1 = sum(prob(i:j,1))*dist1;
        area2 = sum(prob(i:j,2))*dist2;
        if area1 >= alpha && (j-i+1)*dist1<int1
            int1 = (j-i+1)*dist1;
            unlim1 = edges(i);
            uplim1 = edges(j+1);
            Area1 = area1;
        end
        if area2 >= alpha && (j-i+1)*dist2<int2
            int2 = (j-i+1)*dist2;
            unlim2 = edges(h+1+i);
            uplim2 = edges(h+1+j+1);
            Area2 = area2;
        end
    end
end
A.2 Lab journal: table of contents
Overview of calibration experiments (pp. 1-4)
Overview of transesterification experiments (pp. 5-10)