Using Neural Networks to Predict Business Failures · 2016-02-12 · University of Warwick –...
University of Warwick – School of Engineering – Engineering Business Management
Using Neural Networks to Predict Business Failures ES327 – Undergraduate Research Project
By Miika Oskar Jokinen – Student ID: 1105638
Author’s Self-Assessment
Relevance and contribution to Engineering Business Management
Supply chain management is a crucial part of managing engineering businesses. According to
McKinsey & Company [1], companies have been struggling in recent years with the increasing
financial instability of their suppliers. A 2011 study by PricewaterhouseCoopers of more than 100
aerospace suppliers found that 20% were at high risk of being unable to keep up with rising
production and had weak financial strength [2]. Dun & Bradstreet (D&B), who provide software for
the aerospace industry, claim that they can “project 92 percent of all U.S. bankruptcies at least six
months in advance” [3]. This project develops a similar early-warning model, which is of great
importance to large corporations with increasingly wide, global and complex supply chains.
Achievements
The project can be considered an achievement with the following merits: a successful use of a
financial database encompassing data extraction and manipulation; the development of a neural
network model using the MATLAB programming language; the analysis of mathematical concepts
underlying the neural networks and their optimisation tools; the exploration of financial ratios
affecting the success and failure of a firm; and the consideration of the neural network model
applications. In addition, there are only two theses in the area of business failure prediction that have
been carried out at the University of Warwick. They were done by Hsu-Che Wu from the Department
of Engineering in 2007 and by Lin, Lee Hsuan from Warwick Business School in 1992, both being Ph.D.
theses [4]. This project adds value to the University of Warwick community in the area of forecasting
in business applications.
Areas for improvement
The project also has areas for improvement. It does not use a feature selection method to select
the most appropriate financial variables. Such methods include principal component analysis (PCA),
factor analysis and stepwise regression. They are useful for reducing noise in the data sample and
the size of the input data, both of which speed up the training of the neural network. Even though
many of the business failure prediction papers have used a simple
hold-out sample to validate the prediction results, including this project, the use of a 5-fold or 10-fold
cross-validation would enhance the robustness of the model.
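As a rough illustration of the k-fold idea mentioned above, the following Python sketch splits sample indices into k folds so that every company is used for testing exactly once. It is not part of the project's MATLAB code; the function name k_fold_indices is hypothetical, and in practice the indices would normally be shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds.

    Hypothetical helper for illustration only; not the project's code.
    """
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


if __name__ == "__main__":
    for fold, (train, test) in enumerate(k_fold_indices(10, 5), start=1):
        print(f"fold {fold}: train on {len(train)} samples, test on {test}")
```

With a hold-out sample the model is scored once on a single test set; with k folds the accuracy would be averaged over the k test sets, which is the robustness gain referred to above.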
Benefits to others
Regarding the benefits of this project for others, it is particularly useful in identifying fruitful areas for
further studies. Furthermore, due to the transparency of research methodology, especially data
selection, other researchers or students in the area have a solid foundation for future research.
Abstract
Bankruptcy prediction models and tools are of interest to a large number of individuals and
organisations including investors, bankers, governmental and regulatory bodies and auditors. This
project focuses on how financial ratios can be used in combination with neural networks to forecast
business failures from one to three years before bankruptcy. The prediction model uses 11 financial
ratios that are selected based on previous research and the availability of data on the Fame database.
Best prediction results with a hold-out sample are obtained two years before the failure with a
prediction accuracy of 88.2%. This accuracy is obtained with three years of past data that is used to
calculate the average, standard deviation and trend of the 11 financial ratios. The analysis also
demonstrates how more consistent results can be achieved when the financial ratios are inspected
over a longer period of time. In addition to reviewing a wide range of bankruptcy prediction studies,
the project takes a critical approach to the current direction of the academic research, which is
focused on developing more and more complicated prediction models. The author concedes that the
availability of published financial data is a large limitation, but a focus on the sample data would
potentially yield better prediction results. This approach has been picked up by the commercial world
including companies such as Growth Science and ZestFinance who have taken a very different
approach in comparison to the academic research.
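The average, standard deviation and trend of a ratio over several years of past data could be computed along the following lines. This is a minimal Python sketch, not the project's MATLAB code: the function name ratio_features is hypothetical, the standard deviation is taken as the sample standard deviation, and the trend is assumed to be the least-squares slope per year, since the abstract does not define it.

```python
from statistics import mean, stdev


def ratio_features(values):
    """Return (average, standard deviation, trend) for one ratio's yearly values.

    Assumptions for illustration: `values` is ordered oldest to newest and
    the trend is the least-squares slope of the ratio against the year index.
    """
    years = list(range(len(values)))
    avg = mean(values)
    year_avg = mean(years)
    # Least-squares slope: covariance(year, value) / variance(year).
    numerator = sum((t - year_avg) * (v - avg) for t, v in zip(years, values))
    denominator = sum((t - year_avg) ** 2 for t in years)
    return avg, stdev(values), numerator / denominator


if __name__ == "__main__":
    # Made-up working capital / total assets values for three years.
    print(ratio_features([0.10, 0.20, 0.30]))
```

Applied to each of the 11 ratios, this would give 33 inputs per company for three years of past data, under the assumptions stated.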
Table of Contents
Author’s Self-Assessment
Abstract
List of figures
List of tables
1. Introduction
1.1 Bankruptcy definitions
2. Project Specification
3. Literature review
3.1 Statistical models
3.2 Theoretical models
3.3 Artificially intelligent expert system models (AIES)
3.4 Usage frequency of different bankruptcy prediction models
3.3.1 Neural network architecture
3.3.2 Previous studies on bankruptcy prediction using neural networks
4. Research methodology
4.1 Data set
4.2 Design of the Neural Network model
4.2.1 Neural network structure
4.3 Cross-validation
5. Results
5.1 One year before failure
5.2 Two years before failure
5.3 Three years before failure
5.4 Error types I and II
6. Analysis and discussion of results
6.1 Analysis of financial ratios
6.2 Comparison to previous studies
6.2.1 Neural networks
6.2.2 Discriminant analysis
6.2.3 Logistic regression
6.2.4 Other methods
6.3 Application of the prediction model
6.3.1 Error types I and II
6.4 Advantages of neural networks
6.5 Disadvantages of neural networks
6.6 Limitations of financial ratios
7. Recommendations for Further Studies
7.1 Macroeconomic and qualitative measures
7.2 Big Data
7.3 Behaviour of financial ratios
8. Conclusions
8.1 Project costing
Acknowledgements
References
Appendices
Appendix 1 – MATLAB code
Appendix 2 – Input data selection criteria on the Fame database
Appendix 3 – Levenberg-Marquardt training algorithm
List of figures
Figure 1. Number of bankruptcies in the UK between 1960-2013 [6].
Figure 2. Relative frequency of occurrence of different bankruptcy prediction methods [9].
Figure 3. Basic layout of a neural network model. Adapted from [15].
Figure 4. Inputs, outputs and computations of a single neuron. Adapted from [16].
Figure 5. Neural network structure used in the project with the only variation in the number of inputs [71].
Figure 6. Data scatter for 2 years before failure with 3 years of past data.
Figure 7. Performance development.
Figure 8. Data scatter for 3 years before failure with 3 years of past data.
Figure 9. Prediction results 1, 2 and 3 years before bankruptcy with a varying number of years of past data used.
Figure 10. The development of the working capital/total assets ratio for a subsample from 4 years to 1 year before failure.
Figure 11. The development of the working capital/total assets ratio from 5 years to 1 year before bankruptcy. Adapted from [72].
Figure 12. The development of the working capital/total assets ratio for a subsample over 3 years for healthy companies.
Figure 13. The development of the current ratio for a subsample from 4 years to 1 year before failure.
Figure 14. The development of the current ratio from 5 years to 1 year before bankruptcy. Adapted from [72].
Figure 15. Comparison of the project’s results to previous neural network models represented in Table 9.
Figure 16. Comparison of the project’s results to discriminant methods represented in Table 10.
Figure 17. Comparison of the project’s results to logistic regression models represented in Table 11.
Figure 18. Comparison of the project’s results to methods represented in Table 12.
Figure 19. Standard & Poor’s framework for identifying the credit risk of a company [97].
Figure 20. Neural network structure and notation used in the Levenberg-Marquardt algorithm derivation [10].
List of tables
Table 1. Division of prediction models into three categories, adapted from [8].
Table 2. Statistical prediction models, adapted from [8].
Table 3. Theoretical prediction models, adapted from [8].
Table 4. Artificially intelligent expert system prediction models, adapted from [8].
Table 5. Data used in neural network studies.
Table 6. Prediction results one year before bankruptcy.
Table 7. Prediction results two years before bankruptcy.
Table 8. Prediction results three years before bankruptcy.
Table 9. Bankruptcy prediction results for six different neural network models.
Table 10. Bankruptcy prediction results for four different discriminant methods.
Table 11. Bankruptcy prediction results for two different logistic regression models.
Table 12. Bankruptcy prediction results for a number of different prediction methods.
Table 13. A list of advantages of neural networks. Adapted from [55].
Table 14. A list of the disadvantages of neural networks. Adapted from [55].
Table 15. Factors used in the credit risk analysis by Standard & Poor’s. Adapted from Langohr & Langohr (2008), source [98].
Table 16. Costs of the project.
Table 17. Input data selection criteria for bankrupt companies.
Table 18. Input data selection criteria for healthy companies.
1. Introduction
The financial stability of firms is of interest to a large number of stakeholders including investors,
bankers, governmental and regulatory bodies and auditors. If bankruptcy can be predicted accurately,
it may be possible for the firm to be restructured, thus avoiding failure. This would benefit owners,
employees, creditors, and shareholders alike. In addition, the government and regulatory authorities
require tools and techniques to help them monitor the general financial status of firms in order to
make sound economic and industrial policies. Piesse, et al. [5] point out how the failure of one firm
can affect a number of stakeholders including shareholders, debtors and employees. Moreover, if a
multitude of firms simultaneously face serious financial distress this can have a detrimental effect on
the national economy and possibly on that of other countries. The recent global financial crisis that
started snowballing in 2007 is a fine example of such concurrent failing of companies, including the
meltdown of a number of important institutions, such as Lehman Brothers, Merrill Lynch,
Fannie Mae, Freddie Mac, Washington Mutual, Wachovia and AIG [6]. Bankruptcy prediction models
and tools can be used by governments as early warning mechanisms, which will help them to develop
policies in time to maintain industrial cohesion and minimise the damage caused to the economy as a
whole [5].
Furthermore, the increase in insolvencies in the last few decades, illustrated by Figure 1, demonstrates
how bankruptcy prediction models are of increasing importance.
[Chart: Number of insolvencies in England and Wales from 1960 to 2013. Vertical axis: number of insolvencies in England and Wales, 0 to 30 000. Horizontal axis: year, 1960 to 2011 in three-year steps.]
Figure 1. Number of bankruptcies in the UK between 1960-2013 [7].
This report starts by defining bankruptcy, after which the literature review gives an overview of the
different bankruptcy prediction models that have been developed in the past. The research
methodology section describes the sample data selection process, the design of the neural network
model and the validation methodology of results. The Results section presents results from one to
three years before bankruptcy, and they are compared to previous research in the Analysis and
Discussion of Results. The potential and limitations of the prediction model are explored here, too.
Finally, the Recommendations for Further Studies section calls into question the current focus of
academic research on the development of more complex models rather than focusing on the quality
of the data sample. Appendix 1 displays the MATLAB code that was used to process the input
data, create the network model, validate the results and produce the graphs and tables that are
presented in the Results section. Appendix 2 shows the data sample selection criteria that were used to
extract financial data from the Fame database.
1.1 Bankruptcy definitions
The bankruptcy prediction literature has used a variety of terms to describe corporate distress.
According to Zopounidis & Dimitras [8] the most commonly used terms are ‘failure’, ‘insolvency’,
‘bankruptcy’ and ‘default’. Other commonly used terms are ‘financial distress’ and ‘business failure’
[5].
Most countries legislate for formal bankruptcy proceedings for the protection of the public interest,
such as Chapters 7 and 11 in the US and the Insolvency Act in the UK. The Insolvency Act was
enacted in 1986 and it contains six distinct procedures, which can be applied to different
circumstances to protect creditors, shareholders, or the firm as a whole from unnecessary loss,
thereby reducing the degree of individual as well as social loss [5]. The six procedures are listed
below:
1. Company voluntary arrangements
2. Administration order
3. Administrative receivership
4. Creditors’ voluntary liquidation
5. Members’ voluntary liquidation
6. Compulsory liquidation
[5]
The Fame database, which contains financial information on UK companies, is used in this project to
extract the financial data that is then fed into the prediction model. On the Fame database ‘liquidation’ is a
criterion that can be used to filter failed companies, and therefore procedures 4 to 6 are considered in
this study. The terms for business failure are used rather interchangeably in this study, but they
always refer to the three forms of liquidation defined by the Insolvency Act, unless otherwise stated.
Aziz & Dar [9] divide the bankruptcy prediction models into three categories: statistical models,
artificially intelligent expert system models (AIES) and theoretical models, as illustrated in Table 1. This
project focuses on the use of artificially intelligent expert system models, and more specifically
on the use of neural networks (NN).
Categories of prediction models
Model category: Main features
Statistical models
  - Focus on symptoms of failure
  - Use mainly information from company accounts
  - Univariate or multivariate (more common)
Theoretical models
  - Focus on qualitative causes of failure
  - Multivariate
  - Usually employ a statistical technique to provide quantitative support for the theoretical argument
Artificially intelligent expert system models (AIES)
  - Focus on symptoms of failure
  - Use mainly information from company accounts
  - Usually multivariate
  - Result of technological advancement and information development
  - Heavily depend on computer technology and computing power
Table 1. Division of prediction models into three categories, adapted from [9].
2. Project Specification
The purpose of the project was to use a neural network model to predict bankruptcies.
The stages in the project involved:
1. A literature review that was carried out to get an understanding of the usefulness of different
prediction models.
2. Choosing an appropriate model: Adaptive Neuro Fuzzy Inference System (ANFIS) was chosen
initially as there are only a few studies that have applied it to the area of bankruptcy
prediction and they have demonstrated promising results. However, after some research on
the topic, it was deemed to be out of scope for an undergraduate project, and instead a neural
network model was chosen as ANFIS is based on neural networks and excellent results have
been obtained with neural networks in previous studies.
3. Selection of an appropriate financial database: the data sample was formed using the Fame
database.
a. Companies that were extracted from the database were from mixed industries
b. Financial ratios were chosen based on previous studies and availability of data on Fame
c. Pre-processing of data
4. A neural network prediction model that uses the Levenberg-Marquardt backpropagation
training algorithm was developed using MATLAB® programming tools.
5. Training the neural network model with the data sample selected from Fame.
6. Results are validated with a hold-out sample.
7. Results are analysed and compared to earlier research.
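Stage 4 names the Levenberg-Marquardt training algorithm. As a rough illustration only, the Python sketch below applies the standard Levenberg-Marquardt update, delta = (J^T J + mu*I)^(-1) J^T e, to a single sigmoid neuron with one input. It is not the project's MATLAB implementation: the function name lm_fit, the toy data, the starting values and the rule of dividing or multiplying mu by 10 are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def lm_fit(xs, ys, w=0.0, b=0.0, mu=0.01, iterations=50):
    """Fit y ~ sigmoid(w*x + b) by Levenberg-Marquardt on the squared error.

    Illustrative sketch under the assumptions stated in the lead-in.
    Returns (w, b, sum_of_squared_errors).
    """
    def sse(w_, b_):
        return sum((y - sigmoid(w_ * x + b_)) ** 2 for x, y in zip(xs, ys))

    error = sse(w, b)
    for _ in range(iterations):
        # Accumulate J^T J (2x2, symmetric) and J^T e (2x1), where J holds
        # the derivatives of the neuron output with respect to (w, b).
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            out = sigmoid(w * x + b)
            slope = out * (1.0 - out)
            jw, jb = slope * x, slope
            e = y - out
            a11 += jw * jw
            a12 += jw * jb
            a22 += jb * jb
            g1 += jw * e
            g2 += jb * e
        while True:
            # Solve (J^T J + mu*I) * delta = J^T e for the 2x2 case.
            d11, d22 = a11 + mu, a22 + mu
            det = d11 * d22 - a12 * a12
            dw = (d22 * g1 - a12 * g2) / det
            db = (d11 * g2 - a12 * g1) / det
            new_error = sse(w + dw, b + db)
            if new_error < error:
                # Accept the step and move towards Gauss-Newton behaviour.
                w, b, error = w + dw, b + db, new_error
                mu /= 10.0
                break
            # Reject the step and move towards small gradient-descent steps.
            mu *= 10.0
            if mu > 1e10:
                return w, b, error
    return w, b, error


if __name__ == "__main__":
    xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
    ys = [0, 0, 1, 0, 1, 1]
    print(lm_fit(xs, ys))
```

The damping term mu is what distinguishes the method from plain Gauss-Newton: a large mu gives cautious gradient-descent-like steps, and a small mu gives faster Gauss-Newton-like steps once the error is decreasing.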
3. Literature review
A broad overview of different prediction methods is first presented based on the three-category
model developed by Aziz & Dar [9]. A more comprehensive review is then given on the use of neural
networks in the area of business failure prediction.
3.1 Statistical models
Different types of statistical prediction models
Models: Main features
Univariate (see Altman, 1993; Morris, 1998)
  - Conventionally focused on financial ratio analysis
  - Underlying rationale: if financial ratios exhibit significant differences across the failing and non-failing firms then they can be used as predictive variables
Multiple discriminant analysis (MDA) (see Klecka, 1981; Altman, 1993; Morris, 1998)
  - MDA model is a linear combination (a bankruptcy score) of certain discriminatory variables (often financial ratios)
  - Bankruptcy score is used to classify firms into bankrupt and non-bankrupt groups
  - Altman Z-score is a widely used bankruptcy score
Linear probability model (LPM) (see Maddala, 1983; Theodossiou, 1991; Gujarati, 1998; Morris, 1998)
  - LPM expresses the probability of failure or success of a firm as a dichotomous dependent variable. The variable is a linear function of a vector of explanatory variables such as financial ratios
  - Boundary values are obtained to distinguish between failing and non-failing firms
Logit model (see Maddala, 1983; Theodossiou, 1991; Gujarati, 1998; Morris, 1998)
  - Like LPMs, logit models express the probability of bankruptcy as a dichotomous dependent variable that is a function of a vector of explanatory variables
  - The dichotomous dependent variable of a logit model, however, is the logarithm of the odds that an event (fail/not-fail) will occur
  - Such a transformation of LPM is accomplished by replacing the LPM distribution with a logistic cumulative distribution function
  - In application to bankruptcy, a probability of 0.5 implies an equal chance of company failure or non-failure. Therefore, where 0 indicates bankruptcy, the closer the estimate is to 1 the less the chance of the firm becoming bankrupt
Probit model (see Maddala, 1983; Theodossiou, 1991; Gujarati, 1998; Morris, 1998)
  - It is possible to substitute the normal cumulative distribution function, rather than the logistic, to obtain the probit model
  - The rest of the interpretation remains the same as for the logit model
Cumulative sums (CUSUM) procedures (see Page, 1954; Healy, 1987; Kahya and Theodossiou, 1999)
  - CUSUM procedures are among the most powerful tools for detecting a shift in a distribution from one state to another
  - In the case of bankruptcy prediction, the time series behaviour of the attribute variables for each of the failed and non-failed firms is estimated by a finite order Vector Autoregression (VAR) model
  - The procedure, then, optimally determines the starting-point of the shift and provides a signal about the firm’s deteriorating state as soon as possible thereafter
  - The overall performance of the firm at any given point in time is assessed by a cumulative (dynamic) time-series performance score (a CUSUM score)
  - As long as a firm’s time series performance scores are positive and greater than a specific sensitivity parameter, the CUSUM score is set to zero, indicating no change in the firm’s financial condition
  - A negative score signals a change in the firm’s condition
Partial adjustment processes (see Laitinen and Laitinen, 1998; Gujarati, 1998)
  - Partial adjustment models are a theoretic rationale of the famous Koyck approach to estimate distributed-lag models
  - Application of these models in bankruptcy prediction can best be explained by using cash management behaviour of the firms as an example, with failure being defined as the inability of the firm to pay financial obligations as they mature
  - Elasticities of cash balances with respect to the motive factors will be smaller in absolute magnitude for a failing firm than for a similar healthy firm
  - Also, the adjustment rate for a failing firm will exceed the rate of a healthy firm
Table 2. Statistical prediction models, adapted from [8].
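To make the MDA, logit and probit rows of Table 2 more concrete, the Python sketch below shows a linear bankruptcy score and the two link functions. The coefficients are those of Altman's original 1968 Z-score for listed manufacturing firms, as commonly quoted with ratios entered as decimals; they are given for illustration and are not the model developed in this project. The function names and the sample ratios are hypothetical.

```python
import math


def altman_z(wc_ta, re_ta, ebit_ta, mve_tl, sales_ta):
    """Altman's 1968 Z-score: a linear combination of five financial ratios.

    wc_ta    working capital / total assets
    re_ta    retained earnings / total assets
    ebit_ta  earnings before interest and tax / total assets
    mve_tl   market value of equity / total liabilities
    sales_ta sales / total assets
    """
    return 1.2 * wc_ta + 1.4 * re_ta + 3.3 * ebit_ta + 0.6 * mve_tl + 1.0 * sales_ta


def logit_probability(score):
    """Logistic cumulative distribution function applied to a linear score."""
    return 1.0 / (1.0 + math.exp(-score))


def probit_probability(score):
    """Standard normal cumulative distribution function applied to a linear score."""
    return 0.5 * (1.0 + math.erf(score / math.sqrt(2.0)))


if __name__ == "__main__":
    # Made-up ratios for a single firm.
    print(altman_z(0.2, 0.3, 0.1, 1.0, 1.5))
    # A linear score of zero maps to an equal chance of either outcome.
    print(logit_probability(0.0), probit_probability(0.0))
```

The logit and probit models differ only in the cumulative distribution function that maps the linear score to a probability, which is why their interpretation in Table 2 is the same.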
3.2 Theoretical models
Different types of theoretical models
Model: Main features
Balance sheet decomposition measures (BSDM)/entropy theory (see Theil, 1969; Lev, 1973; Booth, 1983)
  - One way of identifying financial distress is to examine changes in the structure of balance-sheets, under the argument that firms try to maintain equilibrium in their financial structure
  - If a firm’s financial statements reflect significant changes in the composition of assets and liabilities on its balance-sheet it is more likely that it is incapable of maintaining the equilibrium state. Significant shifts from this equilibrium state indicate financial distress
Gambler’s ruin theory (see Scott, 1981; Morris, 1998)
  - In this approach, the firm can be thought of as a gambler playing repeatedly with some probability of loss, continuing to operate until its net worth goes to zero (bankruptcy)
  - With an assumed initial amount of cash, in any given period, there is a net positive probability that the firm’s cash flow will be consistently negative over a run of periods, ultimately leading to bankruptcy
Cash Management Theory (see Aziz et al., 1988; Laitinen and Laitinen, 1998)
  - Short-term management of corporate cash balances is a major concern of every firm, and the inability of a firm to pay its short-term obligations will cause bankruptcy
Credit risk theories (including JP Morgan’s CreditMetrics, Moody’s KMV model (see Black and Scholes, 1973; Merton, 1973), CSFB’s CreditRisk+ (see Crédit Suisse, 1997), and McKinsey’s CreditPortfolio View (see Wilson, 1997a, b, 1998))
  - Credit risk theories are linked to the Basel I and Basel II accords and mostly refer to financial firms
  - Credit risk is the risk that any borrower/counterparty will default, for whatever reason. Following Basel II guidelines, a number of recent attempts have been made to develop internal assessment models of credit risk. These models and their respective risk predictions are based on economic theories of corporate finance and are collectively referred to as credit risk theories
  - For example: JP Morgan’s CreditMetrics and Moody’s KMV models rely on option pricing theory (see note a), whereby default is endogenously related to capital structure and the firm may default on its obligations if the value of its assets falls below a critical level (determined by the credit risk model)
  - CSFB’s CreditRisk+ follows a framework of actuarial science to derive the loss distribution of a bond/loan portfolio where the default is assumed to follow an exogenous Poisson process. The model captures the essential characteristics of credit default events and allows explicit calculation of a full loss distribution for a portfolio of credit exposures
  - McKinsey’s CreditPortfolio View model takes a macroeconomic approach to risk measurement. Credit cycles follow business cycles closely, with the probability of default being a function of variables such as the unemployment rate, interest rates, growth rate, government expenses, foreign exchange rates, and aggregate savings, so that a worsening economy should be followed by an increase in the incidence of downgraded security ratings and defaults
Note a: An option is a financial claim that gives the holder a right to buy (call option) or sell (put option) an underlying asset in the future at a pre-determined exercise price. Merton [10] recognised that the model could be applied as a pricing theory for corporate liabilities in general. Option pricing as a valuation model for investment under uncertainty, “real options”, has been developed by Dixit and Pindyck [11].
Table 3. Theoretical prediction models, adapted from [8].
3.3 Artificially intelligent expert system models (AIES)
Different types of AIES models
Model Main features
Recursively partitioned decision trees (an inductive learning model) (see Friedman, 1977; Pompe and Feelders, 1997)
- A form of supervised learning in which a program learns by generalising from examples (thereby mimicking the behaviour of many human experts)
- This kind of learning is exploited by decision tree procedures that use recursive partitioning decision rules to transform a “training” sample of data
- In bankruptcy classification the training sample is recursively partitioned into a decision tree in which the final nodes contain firms of only one type, bankrupt or healthy
Case-based reasoning (CBR) models (see Kolodner, 1993)
- CBR solves a new classification problem with the help of similar previously solved cases
- CBR programs can be applied directly to bankruptcy prediction by application of their typical four-stage procedure of (1) identification of a new problem, (2) retrieval of solved cases from a “case library”, (3) adaptation of solved cases to provide a solution to the new problem, and (4) evaluation of the suggested solution and storage in the case library for future use
Neural networks (NN) (see Salchenberger et al., 1992; Coats and Fant, 1993; Yang et al., 1999)
- Neural networks perform classification tasks in a way intended to emulate brain processes
- The network consists of “neurons” that are nodes with weighted interconnections. The network consists of an input layer, hidden layers and an output layer with neurons in each layer. Each node in the input layer receives information about firms (such as financial ratios) and outputs a signal to the hidden layer, which processes the inputs in the neurons and then sends output signals to the output layer, which determines the probability of failure
Genetic algorithms (GA) (see Shin and Lee, 2002; Varetto, 1998)
- Based on the idea of genetic inheritance and the Darwinian theory of natural evolution (survival of the fittest), GAs use a stochastic search technique to find an optimal solution to a given problem from a large number of solutions
- GAs execute this search process in three phases: genetic representation and initialisation, selection, and genetic operation (crossover and mutation). The process continues until the actual population converges towards increasingly homogeneous strings
- In order to solve a classification problem such as bankruptcy, GAs are used to form a set of rules or conditions. These conditions are associated with certain cut-off values. Based on these conditions, the model predicts whether the firm is likely to go bankrupt
Rough sets model (see Pawlak, 1982; Ziarko, 1993; Dimitras et al. 1999)
- The aim of rough sets theory is to classify objects with imprecise information
- In a rough sets model, knowledge about the objects is presented in an information table that functions as a decision table. The table contains sets of conditions and decision attributes that are used to derive the decision rules by inductive learning principles. Every new object (for example, a firm) can then be classified (healthy or in financial distress) by matching its characteristics with the set of derived rules
Table 4. Artificially intelligent expert system prediction models, adapted from [8].
3.4 Usage frequency of different bankruptcy prediction models
From Figure 2 it can be seen that multivariate discriminant analysis (MDA) and logit regression models
are the most popular methods in the area of bankruptcy prediction. They are also simple to
understand and were the first models to be developed. Neural networks are in
third place with regard to the frequency of model usage.
Figure 2. Relative frequency of occurrence of different bankruptcy prediction methods [9].
3.3.1 Neural network architecture
Artificial neural networks (ANNs) have been inspired by the biological nervous system and the human
brain. They both consist of simple building blocks called neurons that are highly interconnected and
are capable of simple computations, although the artificial neurons are much simpler than their
biological counterparts [12]. The network consists of an input layer, hidden layers and an output layer
as shown by Figure 3. The network designed for this project comprises only one hidden layer as it
suffices for most classification problems as shown by [13], [14], [15] and [16]. The output layer
consists of a single neuron that outputs a value between 0 and 1 depending on the likelihood of
failure.
Figure 3. Basic layout of a neural network model. Adapted from [15].
[Figure 3 labels: inputs P1, P2, P3, …, PR; input layer; hidden layer; output node; feed-forward signal flow; backpropagation of errors.]
The neurons in the hidden layer and the output layer are similar in that they both receive
inputs p1, p2, …, pR of the input vector p, which are multiplied by the weights w1, w2, …, wR of the weight
matrix W [12]. The weighted inputs and a bias term b are summed by the summer of the neuron,
which results in the summer output n:
$$ n = \mathbf{W}\mathbf{p} + b = w_1 p_1 + w_2 p_2 + \dots + w_R p_R + b $$
where lowercase italic letters (n, b) are scalars, lowercase bold non-italic letters (p) are vectors and capital
bold non-italic letters (W) are matrices.
The summer output n is passed to the transfer function f, which produces the scalar neuron output a.
This is further illustrated by Figure 4. The transfer function is typically chosen by the designer. A
commonly used transfer function is the sigmoid function:
$$ f(n) = \frac{1}{1 + e^{-n}} $$
Figure 4. Inputs, outputs and computations of a single neuron. Adapted from [16].
[Figure 4 labels: weights W1, W2, W3, W4; summer ∑; bias b; summer output n; transfer function f; neuron output a. The figure depicts a single neuron in the hidden layer or the output layer.]
Now the outputs of the neurons can be written as
$$ a = f(\mathbf{W}\mathbf{p} + b) $$
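The computation of a single neuron described above can be sketched in a few lines of Python. This is only an illustration, assuming the log-sigmoid transfer function and the relation a = f(Wp + b); the input values, weights and bias are hypothetical and are not taken from the project's MATLAB code.

```python
import math


def logsig(n):
    """Log-sigmoid transfer function: maps any real n into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def neuron_output(weights, inputs, bias):
    """Return a = f(Wp + b) for one neuron with a log-sigmoid transfer function."""
    n = sum(w * p for w, p in zip(weights, inputs)) + bias  # summer output n
    return logsig(n)


# Hypothetical values: three scaled inputs with illustrative weights and bias.
p = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.2]
b = 0.1
a = neuron_output(w, p, b)
print(round(a, 4))
```

Because the log-sigmoid is bounded between 0 and 1, the output of the final neuron can be read as a score for the likelihood of failure.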
The input vector p is determined by the financial variables fed into the network, but the weights
and biases are adjustable parameters, which makes the network capable of learning and recognising
patterns [12]. The weights can be adjusted through the use of training algorithms. Artificial neural networks
are typically classified into two distinct types based on the type of training they use: supervised
and unsupervised. Supervised training requires input and output training pairs. A common type of
supervised training uses the backpropagation algorithm, which requires the calculation of the
difference between the predicted output and the desired output. Supervised training is performed in
retrospect, as past data is required on bankrupt and healthy companies. In contrast to supervised
training, unsupervised training algorithms require only input vectors for training. The
self-organising feature map developed by [17] is an example of unsupervised learning, which processes
the input vectors into similar output clusters [18]. Lee, et al. [18] performed experiments in
bankruptcy prediction with both a supervised Levenberg-Marquardt backpropagation algorithm and
an unsupervised self-organising feature map, and concluded that the supervised training method
outperformed the unsupervised training algorithm. Hence, the training algorithm in this project is
a supervised one, more specifically the Levenberg-Marquardt
backpropagation algorithm.
As mentioned earlier, in order to train the network the difference between the predicted output and
desired output needs to be calculated. This is done using a performance index. A common
performance index is the mean squared error (MSE):
$$ F(\mathbf{x}) = \frac{1}{Q} \sum_{q=1}^{Q} (t_q - a_q)^2 $$
where x is the vector of the scalar parameters that are being adjusted (the weights and biases in this case),
Q is the number of targets (in the case of bankruptcy prediction, the number of companies in the
dataset), t is the vector of target values (0 or 1 depending on whether the company has gone
bankrupt or not) and a is the vector of predicted output values [19]; t_q and a_q denote their q-th elements.
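As a small worked example of the performance index, the mean squared error can be computed as the average of the squared differences between targets and predicted outputs. The sketch below assumes that form of the MSE; the four target/output pairs are made up for illustration.

```python
def mean_squared_error(targets, outputs):
    """Average of squared differences between target values and predicted outputs."""
    if len(targets) != len(outputs):
        raise ValueError("targets and outputs must have the same length")
    q = len(targets)  # Q: number of companies in the data set
    return sum((t - a) ** 2 for t, a in zip(targets, outputs)) / q


# Hypothetical example: targets are 0 or 1, outputs are network scores in (0, 1).
t = [1, 0, 1, 0]
a = [0.9, 0.2, 0.6, 0.4]
mse = mean_squared_error(t, a)
print(round(mse, 4))  # (0.01 + 0.04 + 0.16 + 0.16) / 4 = 0.0925
```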
3.3.2 Previous studies on bankruptcy prediction using neural networks
ANNs have been widely studied in the area of bankruptcy prediction. The majority of studies analyse
financial ratios with variants of the backpropagation algorithm, which is often compared to statistical
models. This section reviews the use of ANNs in business failure prediction, based on an overview of
previous research by Zhang, et al. [20].
Odom & Sharda [21] were the first to use ANNs in business failure prediction. They used a three-layer
feed-forward network and compared the results to those of multivariate discriminant analysis (MDA).
They concluded that with a 50/50 split of bankrupt/non-bankrupt companies in the training set, ANNs
outperformed the MDA model [20].
The study done by Odom & Sharda [21] spurred an interest in academia to perform further research
on the use of ANNs in business failure prediction. Rahimian, et al. [22] used the same data set as
Odom & Sharda [21] for three different network designs: a conventional backpropagation network,
Athena and Perceptron. The study focused on improving the performance of the backpropagation
algorithm by varying the network training parameters [20].
Tam & Kiang [23] presented significant results that were based on earlier research done by one of the
coauthors [24], who completed a detailed analysis of the potential and limitations of neural networks
in a variety of business classification problems. They compared ANNs to statistical methods including
linear discriminant analysis, logistic regression, k nearest neighbour and a machine learning method
of decision tree induction. They concluded that in most cases neural networks produce higher
accuracies and are more robust for evaluating bank status than other available classification models
[20].
3.3.2.1 Type I and type II errors
Salchenberger, et al. [25] analysed how different cutoff values in the classification decision affect the
real-life costs related to type I and type II errors. A type I error (false positive) signifies the
classification of a healthy company as bankrupt, and in a type II error (false negative) a bankrupt
company is classified as healthy. Salchenberger, et al. [25] also found that ANNs perform as well as or
better than statistical logit models for 6, 12 and 18 months before bankruptcy of savings and loan
institutions. Boritz & Kennedy [26] and Boritz, et al. [27] investigated the effect of different types of
ANNs on the levels of type I and type II errors. They found that the optimal estimation theory based
network has the lowest level of type I errors and the highest level of type II errors. The results also
indicated that backpropagation networks have intermediate levels of type I and II errors while
conventional statistical methods tend to have high type I and low type II error levels [20].
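The effect of the cutoff value on the two error types can be illustrated with a short sketch. It follows the definitions given above (type I: healthy company classified as bankrupt; type II: bankrupt company classified as healthy). The function name, the scores and the rule that a score strictly above the cutoff means predicted failure are assumptions made for illustration.

```python
def classify_errors(outputs, targets, cutoff=0.5):
    """Count correct classifications, type I and type II errors at a given cutoff."""
    counts = {"correct": 0, "type_I": 0, "type_II": 0}
    for a, t in zip(outputs, targets):
        predicted = 1 if a > cutoff else 0  # 1 = predicted failure (assumed rule)
        if predicted == t:
            counts["correct"] += 1
        elif predicted == 1:
            counts["type_I"] += 1   # healthy company flagged as bankrupt
        else:
            counts["type_II"] += 1  # bankrupt company classified as healthy
    return counts


# Hypothetical network outputs and true labels (1 = bankrupt, 0 = healthy).
outputs = [0.9, 0.7, 0.3, 0.6, 0.4]
targets = [1, 0, 1, 1, 0]
at_half = classify_errors(outputs, targets, cutoff=0.5)
at_quarter = classify_errors(outputs, targets, cutoff=0.25)
print(at_half, at_quarter)
```

In this toy example, lowering the cutoff from 0.5 to 0.25 removes the type II error at the price of an additional type I error, which is the trade-off that makes the choice of cutoff a cost decision.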
3.3.2.2 Sample size
The sample sizes in bankruptcy prediction studies have varied significantly, largely due to the
limitations of available data and lack of transparency with regard to input data in published papers.
One of the very few studies that provide their data sample was done by Ignizio & Soltys [28]. The
sample sizes vary from 36 firms by Fletcher & Goss [29] to over 1000 firms used by Altman, et al. [30].
3.3.2.3 Different accounting policies
Kerling [31] compares business failure forecasting between France and the USA in a study related to [32],
which compares multilayer perceptron (MLP) networks and Kohonen’s learning vector quantizer
network to discriminant analysis. Kerling concludes that there is no significant difference in the
correct classification rates between the American and French companies although different
accounting rules and financial ratios are employed [20]. However, Morris [33] notes the impact of
different accounting policies, especially those resulting from data manipulation and managerial “numbers
making” to meet short-term goals, which is exemplified by Warren Buffett [34].
3.3.2.3 Neural network performance
Leshno & Spector [35] assess the forecasting capability of various neural network models. They vary
the neural network architecture, number of iterations and data span of the input data set. The key
findings of the paper are (1) the forecasting capability depends on the training sample size, (2)
different training algorithms have a significant effect on both model fitting and hold-out sample
performance and (3) overfitting problems start to occur when the number of iterations increases [20].
3.3.2.4 Neural network variants
Many variants of ANNs and their training algorithms have been developed including hybrid neural
network models (combining conventional statistical methods and newer neural network algorithms)
[36], the generalized adaptive neural network algorithm [37], the Madaline algorithm [38] and the
generalized reduced gradient optimizer for ANN training [39] [20]. Newer algorithms based on the
neural network principles have been applied to corporate failure prediction. One of these models is
the adaptive neuro-fuzzy inference system (ANFIS), which has been successfully applied by Akkoç [40]
and Chen [6].
3.3.2.5 Financial variables
Altman [41] was among the first to use financial ratios in the prediction of corporate bankruptcy, and
since then many papers [42] [43] [44] [22] [45] [46] in business failure prediction have utilised the
same five ratios as Altman used. The five ratios are (1) working capital/total assets, (2)
retained earnings/total assets, (3) earnings before interest and taxes (EBIT)/total assets, (4) market
value of equity/book value of total debt and (5) sales/total assets. Many other predictor variables have been
utilised. For instance, Raghupathi, et al. [47] employ 13 financial variables and Tam & Kiang [23] use
19 variables. Many authors select an initial sample of financial ratios and reduce the number of input
variables through factor analysis [48], stepwise regression [49] or principal component analysis [50].
The number of variables varies considerably between studies. For example, Leshno & Spector [35] use
41 financial variables whereas Fletcher & Goss [29] and Fanning & Cogger [37] use only three ratios
[20].
3.3.2.6 Criticism of neural networks
While many empirical studies on the performance of ANNs demonstrate the superiority of neural
networks over traditional statistical techniques, the results are not always uniform. For instance, Bell,
et al. [51] report disappointing results in predicting bank failures with neural networks, and Boritz, et
al. [27] find that neural networks do not perform any better than traditional statistical techniques
such as logit and discriminant analysis. Also, contrary to previous research by [46] and [45], Altman’s
study finds that discriminant analysis produces a slightly better prediction on the hold-out samples
than neural networks. Altman, et al. [30] discuss the great potential of neural networks to recognise
the health of companies but also criticise the black-box approach of ANNs, which calls for further
studies on neural networks [20]. Furthermore, Zhang, et al. [20] point out that most studies use
commercial neural network packages and, as a consequence, many papers have been published
without a clear understanding of the sensitivity of solutions to the initial starting conditions.
This is partly why reported prediction accuracies in the area vary from 60% all the
way up to 100% one year before business failure.
3.3.2.7 Other applications of neural networks
Neural networks and backpropagation algorithms have been used for a myriad of applications other
than bankruptcy prediction, too. Chen & Du [48] list applications including investigating long-term
tidal predictions [52], improving customer satisfaction [53] and predicting flank wear in drills [54].
Business applications include market segmentation, credit evaluation, construction demand
forecasting, general forecasting, tourism discrete choice modelling and new product acceptance
research [55].
4. Research methodology
Neural networks are used to study the correlation between the numerical values of financial ratios
and the probability of insolvency [20]. The stages that the project follows are:
1. Define business failure based on what data is available
2. Choose a data set
3. Divide data set into training, validation and test sets
4. Define an appropriate neural network architecture
5. Train neural network
6. Test and validate results
The stages are elaborated in the following subsections. A great emphasis is put on the transparency of
input data (subsection 4.1), which is a predominant issue in the majority of published papers in the
area of business failure prediction.
4.1 Data set
Table 5 demonstrates how previous studies on neural networks and bankruptcy prediction have drawn
company data from a variety of sources. They have also used varying sample sizes and differing
numbers of financial ratios as inputs to the neural network model. Karels & Prakash [56] note that the
large diversity of ratios used in bankruptcy prediction is a consequence of the limited theoretical basis
for choosing them. Furthermore, Gibson & Frishkoff [57] emphasise that financial ratios vary
across industries and with the accounting methods used [47] [58]. Dimitras et al. [59] further
report a lack of uniform language and terminology in business failure prediction
and differences in ratio definitions between authors. Moreover, Dimitras et al. [59] note that the
availability and reliability of data is a major issue in bankruptcy prediction.
Table 5. Data used in neural network studies.
| Author and year | Databases | No. of failed companies | No. of non-failed companies | No. of financial ratios used |
|---|---|---|---|---|
| Zhang, et al. (1999) | New York, American and NASDAQ exchanges, Office of the General Counsel of the Security Exchange Commission, Wall Street Journal Index (bankrupt companies), COMPUSTAT database (non-bankrupt) | 110 | 110 | 6 |
| Ahn, et al. (2000) | Korea Information Service (credit research company) | 1200 | 1200 | 8 |
| Atiya (2001) | US firms (database not disclosed) | 196 | 716 | 11 |
| Charitou, et al. (2004) | Compustat (Global) database, UK insolvency credit database (failed companies), Datastream, Worldscope European Disclosure, Silverplatter: UK Corporations | 51 | 51 | 26 |
| Lee, et al. (2005) | Security and Exchange Commission stored in the database of the Korean Investors Service | 84 | 84 | 5 |
| Chen & Du (2009) | Taiwan Stock Exchange Corporation database | 34 | 34 | 13 |
| Tseng & Hu (2010) | Datastream and FT EXTEL Company Research | 32 | 45 | 5 |
| Bae (2012) | Korea Credit Guarantee Fund | 944 | 944 | 11 |
| Jackson & Wood (2013) | Thomson One Banker, London Share Price Database | 101 | 6494 | 3 |
| Lee & Choi (2013) | Korea Stock Exchange | 138 | 91 | 5 |
Alfaro, et al. [60] report that in failure prediction studies, financial variables are normally selected on
the basis of three criteria: (1) they have been previously used in failure prediction literature, (2) the
information needed to calculate these ratios is available, and (3) they are suggested by the researchers'
knowledge of previous studies or by preliminary trials. This methodology is followed here: financial
ratios are chosen based on previous studies [41] [61] [6] and within the limits of data
availability. Four databases are examined for the purpose of finding financial ratios: Fame,
Amadeus, Bloomberg Professional Service (via Bloomberg Terminal) and Thomson One Banker.
Bloomberg has the widest range of financial variables and companies available, but Fame provides
the best tools to filter out companies with missing data and allows the extraction of a larger number of
companies from the database than Bloomberg does. Only one database is used so that financial
variables are calculated with a uniform methodology and no company appears more than once
in the data set.
A compromise between choosing the most appropriate financial ratios and the availability of data had
to be made. The following variables had to be eliminated from the original data set in order to ensure
a large enough number of companies with available data: Stock turnover, creditors’ payment days,
debtors’ turnover, debtor collection days and number of employees.
Financial ratios used as input to the neural network model are:
Profitability ratios:
1. Profit margin %: (Profit (Loss) before Tax/Turnover)*100%. A high profit margin
indicates good control of costs.
2. Return On Capital Employed (ROCE): describes how efficiently a company is using its
capital and is an important factor, especially in capital-intensive sectors. It is calculated
as Earnings Before Interest and Tax (EBIT) /(Total Assets-Current Liabilities).
3. EBIT margin: measures a company’s operating profitability and can be calculated as
Earnings Before Interest and Tax/Turnover.
4. Return On Total Assets (ROTA): describes how well a company is using its assets, and
can be calculated as EBIT/Total Assets.
5. EBITDA/Total Assets: Earnings Before Interest, Tax, Depreciation and
Amortisation/Total Assets. The ratio expresses the earning power of its assets and is
an indicator of insolvency when the total liabilities exceed the firm’s assets [41].
6. Net Assets Turnover: Turnover/(Total Assets-Current Liabilities). The ratio depicts the
revenue generating capability of the firm’s assets [41].
Operational ratios:
7. Working Capital/Total Assets: According to Altman [41], this is “the best indicator of
ultimate discontinuance”.
Structure ratios:
8. Liquidity ratio: (Current Assets-(Stock and Work in Progress))/Current Liabilities.
Denotes a company’s ability to pay back its short-term debt obligations.
9. Gearing Ratio: ((Short Term Loans & Overdrafts + Long-Term Liabilities)/Shareholders’
Funds)*100%. Gearing ratio measures the level of financial leverage used by the
company. In other words, this compares how much of the business is funded by the
company’s owner versus the creditors.
10. Current Ratio: Current Assets/Current Liabilities. A type of liquidity ratio.
11. Solvency ratio (asset based): (Shareholders’ Funds)/(Total Assets)*100%
Ratio definitions are based on Fame’s own formulae [62] and [63].
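The ratio definitions listed above translate directly into code. The following sketch is illustrative only: the dictionary keys and the account figures are hypothetical, and working capital is assumed to be current assets minus current liabilities, which the list does not spell out.

```python
def financial_ratios(acc):
    """Compute the eleven input ratios from a dict of account items (hypothetical keys)."""
    capital_employed = acc["total_assets"] - acc["current_liabilities"]
    # Assumption: working capital = current assets - current liabilities.
    working_capital = acc["current_assets"] - acc["current_liabilities"]
    return {
        "profit_margin_pct": acc["profit_before_tax"] / acc["turnover"] * 100,
        "roce": acc["ebit"] / capital_employed,
        "ebit_margin": acc["ebit"] / acc["turnover"],
        "rota": acc["ebit"] / acc["total_assets"],
        "ebitda_to_total_assets": acc["ebitda"] / acc["total_assets"],
        "net_assets_turnover": acc["turnover"] / capital_employed,
        "working_capital_to_total_assets": working_capital / acc["total_assets"],
        "liquidity_ratio": (acc["current_assets"] - acc["stock_and_wip"])
        / acc["current_liabilities"],
        "gearing_pct": (acc["short_term_loans"] + acc["long_term_liabilities"])
        / acc["shareholders_funds"] * 100,
        "current_ratio": acc["current_assets"] / acc["current_liabilities"],
        "solvency_pct": acc["shareholders_funds"] / acc["total_assets"] * 100,
    }


# Made-up account figures for a single company-year.
accounts = {
    "turnover": 1000.0, "profit_before_tax": 80.0, "ebit": 100.0, "ebitda": 130.0,
    "total_assets": 800.0, "current_assets": 300.0, "current_liabilities": 200.0,
    "stock_and_wip": 100.0, "short_term_loans": 50.0,
    "long_term_liabilities": 200.0, "shareholders_funds": 400.0,
}
ratios = financial_ratios(accounts)
for name, value in ratios.items():
    print(f"{name}: {value:.4f}")
```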
The methodology for choosing the data set is as follows:
1. Define failed companies: in this study, companies that have been liquidated or are currently in
liquidation are described as failed. Pursuant to the Insolvency Act, the forms of liquidation are
creditors’ voluntary liquidation, members’ voluntary liquidation and compulsory liquidation
[5] [64].
2. Filter companies so that only firms with data available for the last five years are chosen. See
selection criteria in Appendix 2.
3. Extract data from Fame to Excel.
4. Filter the data set (in Excel) so that companies with missing data are deleted.
5. Keep only companies that have their latest accounts date approximately one year before
liquidation (Excel).
6. Create nine different data sets in Excel:
   - Three sets of insolvent companies: 1, 2 and 3 years before bankruptcy. Divide each of
     the three sets of data into a further three sets:
       1. Calculate the average, standard deviation and trend of the 11 financial ratios
          with 4 years of past data. Previous researchers [65] [66] [67] have used the time
          trend, the coefficient of variation and shift away from the trend in the period(s)
          prior to failure.
       2. Calculate the average, standard deviation and trend with 3 years of past data.
       3. Use unprocessed financial ratios one, two and three years before failure.
7. Read in the financial information from Excel to MATLAB.
8. Select a sample of healthy and bankrupt companies from a large pool of firms. The ratio of
bankrupt to healthy companies is naturally very low. Hence, the number of bankrupt
companies in the sample is fixed first so as to maximise the size of the data set. The number of bankrupt
companies used in the study was 989. Based on previous research, a 50/50 split of
bankrupt and healthy companies is recommended so that the model does not learn a bias towards
bankrupt or healthy companies [21] [68]. Hence, from a large pool of healthy companies
(23,830) a matching number of healthy companies (989) is chosen. The 989 healthy companies
are extracted from the pool of 23,830 companies so that each bankrupt company is matched
with the firm from the pool of healthy companies that has the closest asset size. See the
‘%Choosing data set’ section of the MATLAB code in Appendix 1, which allocates
these matched companies.
9. Divide data sets into training, validation and testing samples. See Appendix 1, section ‘%
Setup Division of Data for Training, Validation, Testing’.
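Two of the steps above lend themselves to a short sketch: summarising each ratio over several years (step 6) and matching each bankrupt company with the healthy company of closest asset size (step 8). The code below is not the project's MATLAB code. It assumes that the trend is the least-squares slope per year, that the standard deviation is the sample standard deviation, and that each healthy company may be used only once; none of these details is stated in the text, and all names and figures are hypothetical.

```python
import statistics


def summarise_ratio(values):
    """Return (average, standard deviation, trend) of one ratio over consecutive years."""
    years = list(range(len(values)))
    mean_year = statistics.mean(years)
    mean_value = statistics.mean(values)
    # Assumption: "trend" is the least-squares slope of the ratio against the year index.
    slope = sum((y - mean_year) * (v - mean_value) for y, v in zip(years, values)) / sum(
        (y - mean_year) ** 2 for y in years
    )
    return mean_value, statistics.stdev(values), slope


def match_by_asset_size(bankrupt_assets, healthy_assets):
    """Pair each bankrupt firm with the unused healthy firm of closest asset size."""
    available = dict(healthy_assets)  # copy, so the caller's pool is left untouched
    pairs = {}
    for firm, assets in bankrupt_assets.items():
        best = min(available, key=lambda h: abs(available[h] - assets))
        pairs[firm] = best
        del available[best]  # assumption: each healthy firm is matched at most once
    return pairs


# Hypothetical data: one ratio over four years, and asset sizes for a few firms.
average, deviation, trend = summarise_ratio([0.10, 0.08, 0.05, 0.01])
pairs = match_by_asset_size(
    {"B1": 100.0, "B2": 520.0},
    {"H1": 90.0, "H2": 500.0, "H3": 130.0, "H4": 1000.0},
)
print(average, deviation, trend, pairs)
```

A greedy one-pass match like this depends on the order in which bankrupt companies are processed; an optimal assignment would need a different method.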
4.2 Design of the Neural Network model
As Zhang et al. [20] note, there are currently no systematic principles to guide the design of a neural
network model for a particular classification problem. For this reason the best network architecture is
chosen through experiments for this project. When designing the network the following factors have
to be taken into account: hidden layers, hidden nodes, data normalization and training methodology
[20].
Neural networks are characterized by their architectures, which refers to the number of layers and
nodes in each layer, as well as the number of arcs. Based on the results from [13] [14] [15] and [16],
networks with one hidden layer are adequate for most classification problems. Therefore, the
network architecture used in this study employs only one hidden layer. For classification problems,
such as bankruptcy prediction, the number of input nodes is determined by the number of predictor
variables. For this reason, the network developed has twelve input nodes in the first layer
corresponding to twelve financial variables. In addition, node biases are used to enhance the training
capability of the network and MATLAB default transfer functions (log-sigmoid) for pattern recognition
problems are employed in the hidden layer and the output neuron. Zhang, et al. [20] point out how
the number of hidden nodes is not easy to determine a priori and experimentation is required to
decide the number of nodes. Prior experiments are carried out to determine the optimum number of
neurons. The number of hidden neurons is varied between 1 and 15 and since ten produces the best
results, it is used for the neural network architecture.
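The architecture described above (twelve inputs, one hidden layer of ten log-sigmoid neurons and a single log-sigmoid output neuron) can be sketched as a plain forward pass. The weights below are random placeholders rather than trained values, and the helper names are hypothetical; the sketch only shows the shape of the computation and the number of adjustable parameters.

```python
import math
import random


def logsig(n):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-n))


def init_layer(n_inputs, n_neurons, rng):
    """Random weights (one row per neuron) and biases for a fully connected layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_neurons)]
    biases = [rng.uniform(-1, 1) for _ in range(n_neurons)]
    return weights, biases


def layer_forward(weights, biases, inputs):
    """Apply a = logsig(Wp + b) for every neuron in the layer."""
    return [
        logsig(sum(w * p for w, p in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def network_forward(hidden, output, inputs):
    """Feed the inputs through the hidden layer and then the single output neuron."""
    hidden_out = layer_forward(hidden[0], hidden[1], inputs)
    return layer_forward(output[0], output[1], hidden_out)[0]


rng = random.Random(0)
hidden = init_layer(12, 10, rng)   # 12 inputs -> 10 hidden neurons
output = init_layer(10, 1, rng)    # 10 hidden neurons -> 1 output neuron
n_parameters = 12 * 10 + 10 + 10 * 1 + 1  # weights and biases to be trained: 141
ratios = [rng.uniform(-1, 1) for _ in range(12)]  # placeholder for normalised inputs
failure_score = network_forward(hidden, output, ratios)
print(n_parameters, round(failure_score, 4))
```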
In this study a particular form of neural network structure is used: a multi-layer feed-forward network
(the same as a multi-layer perceptron network), which is the most common structure in forecasting
applications. The chosen training algorithm updates the link weights in the network so as to
minimise the mean squared error (MSE) between the desired and actual output values. The most
widely used training method is the backpropagation algorithm which, in effect, is a steepest (gradient)
descent method. Because the steepest descent method suffers from slow convergence and
inefficiency, a variation of the conventional backpropagation method is utilised: the
Levenberg-Marquardt backpropagation algorithm. It behaves like the steepest descent method when the MSE is large,
and switches to the Gauss-Newton method as the error function approaches a local minimum
[69] [20] [18].
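The blending of steepest descent and Gauss-Newton can be made concrete with a toy example. The sketch below fits a single weight w of the model a = logsig(w·p) using the standard Levenberg-Marquardt step Δw = −(JᵀJ + μ)⁻¹Jᵀe, where e holds the errors t − a and J their derivatives with respect to w. It is a one-parameter illustration, not the MATLAB training routine used in the project; the data, the starting value of μ and the factor of ten used to adapt μ are assumptions.

```python
import math


def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))


def sse(w, inputs, targets):
    """Sum of squared errors of the one-weight model a = logsig(w * p)."""
    return sum((t - logsig(w * p)) ** 2 for p, t in zip(inputs, targets))


def lm_fit(inputs, targets, w=0.0, mu=0.01, iterations=20):
    """Fit w with Levenberg-Marquardt steps, adapting the damping term mu."""
    for _ in range(iterations):
        outputs = [logsig(w * p) for p in inputs]
        errors = [t - a for t, a in zip(targets, outputs)]
        # Jacobian of the errors with respect to w: d(t - a)/dw = -a(1 - a)p
        jac = [-a * (1.0 - a) * p for a, p in zip(outputs, inputs)]
        jtj = sum(j * j for j in jac)
        jte = sum(j * e for j, e in zip(jac, errors))
        step = -jte / (jtj + mu)
        if sse(w + step, inputs, targets) < sse(w, inputs, targets):
            w, mu = w + step, mu / 10.0  # accept: behave more like Gauss-Newton
        else:
            mu *= 10.0                   # reject: behave more like steepest descent
    return w


# Hypothetical, non-separable data: one input per company and a 0/1 target.
inputs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
targets = [0, 0, 1, 0, 1, 1]
w_fit = lm_fit(inputs, targets)
print(round(w_fit, 3), round(sse(w_fit, inputs, targets), 4))
```

A large μ shrinks the step towards a small move along the negative gradient, while a small μ leaves an approximately Gauss-Newton step. In a real network w becomes the full parameter vector and the scalar division becomes a linear solve.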
Even though the Levenberg-Marquardt algorithm often produces lower mean squared errors than other training
algorithms [70], the training of neural networks is a nonlinear minimization problem, and
mathematically speaking, global solutions cannot be guaranteed. Hence, it is possible that a
backpropagation network stops training at a local minimum. However, the network can be reinitialized
with different starting weights, which changes the starting position on the error surface.
When the network is retrained with the new initial weights, it usually works correctly. Therefore, the
network is trained ten times for each data set and the best-performing network is chosen to obtain the maximum
performance [19].
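The idea of retraining from several starting points and keeping the best result can be shown on a deliberately simple stand-in problem. In the sketch below, plain gradient descent on the one-dimensional function f(x) = x⁴ − 3x² + x plays the role of network training: the function has a local minimum near x ≈ 1.13 and a lower, global minimum near x ≈ −1.30, so the outcome depends on the starting point. The function, learning rate, search interval and names are assumptions chosen for illustration only.

```python
import random


def error_surface(x):
    """Toy error function with one local and one global minimum."""
    return x ** 4 - 3 * x ** 2 + x


def train_once(x, learning_rate=0.01, steps=500):
    """Stand-in for network training: gradient descent from the starting point x."""
    for _ in range(steps):
        gradient = 4 * x ** 3 - 6 * x + 1
        x -= learning_rate * gradient
    return x, error_surface(x)


def best_of_restarts(n_restarts=10, seed=0):
    """Train from several random starting points and keep the lowest final error."""
    rng = random.Random(seed)
    best_x, best_error = None, float("inf")
    for _ in range(n_restarts):
        x, error = train_once(rng.uniform(-2.0, 2.0))
        if error < best_error:
            best_x, best_error = x, error
    return best_x, best_error


best_x, best_error = best_of_restarts()
print(round(best_x, 3), round(best_error, 3))
```

A single run started to the right of the hump ends in the shallower minimum, whereas ten restarts make it very likely that at least one run reaches the deeper one.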
To see the steps in adjusting weights and biases with the Levenberg-Marquardt backpropagation
algorithm, refer to Appendix 3.
4.2.1 Neural network structure
The final neural network is similar to the network configuration given in Figure 5 by [71], with the only
variation in the number of inputs:
Figure 5. Neural network structure used in the project with the only variation in the number of inputs [71].
4.3 Cross-validation
A common practice in neural network papers is to divide the entire data set into a training
(in-sample) set and a test (out-of-sample) set. This approach is adopted in the project. The in-sample data
is used to train the neural network with the Levenberg-Marquardt training algorithm, and the test set
comprises unseen company data that is used to assess the predictive capability of the network [20].
The test set results are then compared to a number of business failure prediction models developed
by other authors.
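A random division of the companies into training, validation and test sets can be sketched as follows. The 70/15/15 proportions are an assumption, chosen because they roughly match the set sizes reported in the results tables (1384, 297 and 297 of 1978 companies); the function name and the seed are hypothetical, and the exact counts produced here need not coincide with those of the project's MATLAB code.

```python
import random


def split_dataset(items, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle the items and split them into training, validation and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    n_val = round(val_frac * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder is held out for testing
    return train, val, test


# 989 bankrupt and 989 healthy companies, represented here by index only.
companies = list(range(1978))
train, val, test = split_dataset(companies)
print(len(train), len(val), len(test))
```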
5. Results
This section presents the prediction results from one year before failure up to three years before
failure. Different data sets were developed based on the number of years of past data used. The
first data set uses four years of past data, summarised by the arithmetic mean, standard
deviation and trend over the four years. The same is done for three years of past data, and finally
unprocessed ratios using only one year of past data are considered.
Results are given as overall correct classification percentages but also with two types of error classes.
The classification errors are divided into two types since each will have different consequences in real
life applications if classified incorrectly. Error type I signifies the classification of a healthy company as
bankrupt, and error type II represents the failure to identify a bankrupt company which is instead
predicted as healthy [23]. The #-sign indicates the number of companies that were classified correctly
or incorrectly.
5.1 One year before failure
1 year before bankruptcy
| Data set | Outcome | Training # | Training % | Validation # | Validation % | Test # | Test % |
|---|---|---|---|---|---|---|---|
| Data over past 4 years (trend, standard deviation, average) | Correct classification | 1059 | 76.5 % | 214 | 72.1 % | 224 | 75.4 % |
| | Type I error (false positive) | 180 | 13.0 % | 40 | 13.5 % | 40 | 13.5 % |
| | Type II error (false negative) | 145 | 10.5 % | 43 | 14.5 % | 33 | 11.1 % |
| Data over past 3 years (trend, standard deviation, average) | Correct classification | 963 | 69.6 % | 194 | 65.3 % | 211 | 71.0 % |
| | Type I error (false positive) | 232 | 16.8 % | 51 | 17.2 % | 50 | 16.8 % |
| | Type II error (false negative) | 189 | 13.7 % | 52 | 17.5 % | 36 | 12.1 % |
| 1 year of past data (unprocessed) | Correct classification | 911 | 65.8 % | 186 | 62.6 % | 193 | 65.0 % |
| | Type I error (false positive) | 201 | 14.5 % | 51 | 17.2 % | 48 | 16.2 % |
| | Type II error (false negative) | 272 | 19.7 % | 60 | 20.2 % | 56 | 18.9 % |
Table 6. Prediction results one year before bankruptcy.
5.2 Two years before failure
2 years before bankruptcy           Training          Validation        Test
                                    #      %          #      %          #      %
Data over past 4 years (trend, standard deviation, average)
  Correct classification            930    67.2 %     191    64.5 %     198    66.9 %
  Type I error (false positive)     189    13.7 %     37     12.5 %     43     14.5 %
  Type II error (false negative)    265    19.1 %     68     23.0 %     55     18.6 %
Data over past 3 years (trend, standard deviation, average)
  Correct classification            1294   93.5 %     263    88.9 %     261    88.2 %
  Type I error (false positive)     62     4.5 %      23     7.8 %      18     6.1 %
  Type II error (false negative)    28     2.0 %      10     3.4 %      17     5.7 %
1 year of past data (unprocessed)
  Correct classification            1247   90.1 %     248    83.8 %     259    87.5 %
  Type I error (false positive)     98     7.1 %      29     9.8 %      24     8.1 %
  Type II error (false negative)    39     2.8 %      19     6.4 %      13     4.4 %
Table 7. Prediction results two years before bankruptcy.
Figure 6 and Figure 8 demonstrate how the output values of the neural network model behave when
the prediction results are accurate (Figure 6) versus inaccurate (Figure 8).
Figure 6 demonstrates the real outputs on the left and predicted outputs on the right. The two top
graphs take into consideration all companies (training, validation and test sets) and the ones on the
bottom take into account only the hold-out sample companies that were used in testing the accuracy
of the prediction and applicability to real life scenarios.
The two graphs on the right hand side of Figure 6 show the predicted output values with blue dots.
The threshold value of 0.5, which is demonstrated by the red line, determines whether the company
is predicted to fail or not. If the value is above 0.5 then the company is assigned the value 1 indicating
a predicted failure. On the other hand if the value is below 0.5 the company is assigned the value 0
and predicted to stay in operation. The blue dots represent the values outputted by the neural
network algorithm and green dots represent the actual binary prediction value. The advantage of
assigning values between zero and one instead of pure binary values is the possibility of assigning
more than two categories, such as introducing an ‘undetermined’ category, which represents
companies that require further investigation beyond financial ratios. See Appendix 1, section ‘%
Introduction of a three category model: 'bankrupt', 'undetermined' and 'healthy'’ for how this can be
performed.
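The thresholding described above, and one possible three-category variant, can be sketched as follows. This is not the code from Appendix 1: the function names are hypothetical, the band limits 0.3 and 0.7 are assumed purely for illustration, and an output of exactly 0.5 (which the text does not cover) is assigned to the healthy class here.

```python
def binary_label(output, threshold=0.5):
    """Return 1 (predicted failure) if the output is above the threshold, else 0."""
    # Assumption: an output exactly equal to the threshold is treated as healthy.
    return 1 if output > threshold else 0


def three_category_label(output, lower=0.3, upper=0.7):
    """Map a network output in [0, 1] to 'healthy', 'undetermined' or 'bankrupt'.

    The limits `lower` and `upper` are illustrative assumptions, not values
    taken from the report.
    """
    if not lower < upper:
        raise ValueError("lower must be smaller than upper")
    if output >= upper:
        return "bankrupt"
    if output <= lower:
        return "healthy"
    return "undetermined"


# Hypothetical network outputs for four companies.
outputs = [0.05, 0.42, 0.58, 0.91]
binary = [binary_label(o) for o in outputs]
categories = [three_category_label(o) for o in outputs]
print(binary)
print(categories)
```

Widening the band between the two limits sends more companies to further investigation and leaves fewer automatic decisions, so the limits are a trade-off rather than fixed constants.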
Figure 6 that demonstrates the prediction output scatter two years before failure should be
contrasted with Figure 8 that illustrates the scatter of output data for the neural network algorithm
three years before failure. From Figure 6 it is clear that when the prediction accuracy is high the data
are more clustered around the actual binary values of 1 and 0, whereas in Figure 8 the output data are
widely scattered and only a slight tendency can be observed. This demonstrates the randomness of
output three years before failure and depicts the incapability of the model developed to predict
failure far in the future.
Figure 6. Data scatter for 2 years before failure with 3 years of past data.
Figure 7 shows an example of how training progresses and how the algorithm overlearns the
patterns in the training data set. As a consequence, the weights and biases of the network are
adjusted so that the network produces excellent results for the training data set but poor results
when new data are introduced to the network. This phenomenon is called overfitting.
A validation data set is used to stop this overfitting by checking the Mean Squared Error (MSE) at
every iteration. When the error on the validation data set reaches its minimum, training is stopped
and the network at that point is taken as the best-performing one.
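The validation-based stopping rule described above can be sketched as follows. A minimum can only be recognised in hindsight, so this sketch assumes a "patience" parameter: training stops once the validation MSE has failed to improve for that many consecutive iterations, and the iteration with the lowest validation MSE is kept. The function name, the patience value and the example error sequence are assumptions for illustration, not settings taken from the project.

```python
def early_stopping_epoch(validation_mse, patience=6):
    """Return (best_epoch, stop_epoch) for a sequence of validation errors.

    best_epoch: iteration with the lowest validation MSE seen before stopping.
    stop_epoch: iteration at which training would be halted.
    """
    if not validation_mse:
        raise ValueError("validation_mse must not be empty")

    best_epoch = 0
    best_error = validation_mse[0]
    for epoch, error in enumerate(validation_mse):
        if error < best_error:
            # New minimum: remember it and keep training.
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` iterations: stop here.
            return best_epoch, epoch
    # The sequence ended before the patience limit was reached.
    return best_epoch, len(validation_mse) - 1


# Hypothetical validation MSE per iteration: it falls, then rises as overfitting sets in.
mse_history = [0.30, 0.22, 0.18, 0.16, 0.17, 0.19, 0.21]
best, stopped = early_stopping_epoch(mse_history, patience=3)
print(best, stopped)
```

In this example the weights from iteration 3 would be kept, even though training only halts at iteration 6.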
Figure 7. Performance development.
5.3 Three years before failure
3 years before bankruptcy           Training          Validation        Test
                                    #      %          #      %          #      %
Data over past 4 years (trend, standard deviation, average)
  Correct classification            887    64.1 %     175    58.9 %     192    64.6 %
  Type I error (false positive)     260    18.8 %     60     20.2 %     59     19.9 %
  Type II error (false negative)    237    17.1 %     62     20.9 %     46     15.5 %
Data over past 3 years (trend, standard deviation, average)
  Correct classification            840    60.7 %     164    55.4 %     183    61.8 %
  Type I error (false positive)     397    28.7 %     86     29.1 %     74     25.0 %
  Type II error (false negative)    147    10.6 %     46     15.5 %     39     13.2 %
1 year of past data (unprocessed)
  Correct classification            914    66.0 %     181    61.1 %     195    65.9 %
  Type I error (false positive)     197    14.2 %     46     15.5 %     42     14.2 %
  Type II error (false negative)    273    19.7 %     69     23.3 %     59     19.9 %
Table 8. Prediction results three years before bankruptcy.
5.4 Error types I and II
When examining the type I and type II error levels for the test sets 1, 2 and 3 years before failure
(see Tables 6, 7 and 8), it can be seen that there are more type I errors when the data sample contains
three or four years of past data. The only exception is at two years before failure, where data over
four years produces more type II errors. In contrast, with one year of past data there are more type
II errors, the only exception again being at two years before failure. These results indicate that when
the company’s financial state is observed over a longer period of time, the results are “pessimistic” in
that more healthy companies are predicted to go bankrupt than vice versa. Conversely, when
the company’s financial condition is examined with only one year of past data, the model is more
prone to be “optimistic” by categorizing bankrupt companies as healthy.
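The comparison above can be reproduced directly from the test-set columns of Tables 6, 7 and 8. The sketch below copies those counts and labels each configuration as "pessimistic" (more type I errors) or "optimistic" (more type II errors); the dictionary layout and the function name are illustrative choices.

```python
# (years before failure, years of past data) -> (type I count, type II count),
# copied from the test-set columns of Tables 6, 7 and 8.
test_errors = {
    (1, 4): (40, 33), (1, 3): (50, 36), (1, 1): (48, 56),
    (2, 4): (43, 55), (2, 3): (18, 17), (2, 1): (24, 13),
    (3, 4): (59, 46), (3, 3): (74, 39), (3, 1): (42, 59),
}


def error_bias(type_1, type_2):
    """Label a configuration by which error type dominates."""
    if type_1 > type_2:
        return "pessimistic"  # more healthy firms flagged as bankrupt
    if type_2 > type_1:
        return "optimistic"   # more bankrupt firms passed as healthy
    return "balanced"


bias = {key: error_bias(*counts) for key, counts in test_errors.items()}
for (years_before, years_of_data), label in sorted(bias.items()):
    print(f"{years_before} yr before failure, {years_of_data} yr of data: {label}")
```

The two exceptions named in the text appear as the entries for (2, 4) and (2, 1).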
Figure 8. Data scatter for 3 years before failure with 3 years of past data.
6. Analysis and discussion of results
The best results (prediction accuracy of 88.2% with the hold-out sample) are obtained two years
before bankruptcy when three years of past data are used with the average, standard deviation and
trend of ratios. Close to this result comes the test set prediction accuracy of 87.5%, which is obtained
at the same horizon with one year of unprocessed data. However, data with the average, standard
deviation and trend of ratios over four years produces the most consistent results (see Figure 9).
Because failure is the result of a specific policy of the firm over a number of years, the financial
variables should be inspected over time to provide full information about the firm’s progress. This is
manifested with 4 years of past data. The average provides an overview of the financial ratio, the
standard deviation indicates whether the company has produced steady results over a number of
years and the trend indicates whether the value of the ratio has been declining or increasing in the
past few years [59].
Based on Figure 9, it seems that the symptoms of failure start to appear mostly two to three years
before bankruptcy, which is picked up by more recent data (3 years and 1 year of past data) but does
not emerge using data that are taken over four years. This would also explain why the data over four
and three years produce higher prediction accuracies one year before bankruptcy than data over only
one year.
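The three summary features discussed in this section (average, standard deviation and trend of a ratio over several years) can be computed as in the sketch below. The report does not state here how the trend was calculated, so the least-squares slope against the year index is an assumption, as is the use of the sample standard deviation; the function name and the example ratio values are hypothetical.

```python
from statistics import mean, stdev


def ratio_features(values):
    """Summarise one financial ratio observed over consecutive years.

    values: observations ordered from the oldest to the most recent year.
    Returns the average, the sample standard deviation and the trend,
    where the trend is taken to be the least-squares slope per year
    (an assumed definition).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two yearly observations are required")

    avg = mean(values)
    years = list(range(n))
    year_mean = mean(years)
    # Ordinary least-squares slope of the ratio against the year index.
    numerator = sum((t - year_mean) * (v - avg) for t, v in zip(years, values))
    denominator = sum((t - year_mean) ** 2 for t in years)
    return {"average": avg, "std": stdev(values), "trend": numerator / denominator}


# Hypothetical working capital/total assets values over four years.
features = ratio_features([0.30, 0.25, 0.20, 0.15])
print(features)
```

A negative trend with a small standard deviation, as in this example, describes a ratio that is declining steadily rather than fluctuating.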
Figure 9. Prediction results 1, 2 and 3 years before bankruptcy with a varying number of years of past data used.
6.1 Analysis of financial ratios
The data also suggest that something peculiar happens one year before bankruptcy: it almost
appears to show that the financial ratios go into a healthier direction. For instance, when observing
Figure 10, even though it is only a very small subsample, it seems to suggest that the working capital of
many companies grows relative to the size of their total assets, which is the opposite of what Beaver
[72] suggests with his graph reproduced as Figure 11.
[Figure 9 plot: vertical axis 0 to 100 %, horizontal axis 1 to 3; series: 4 years of data, 3 years of data, 1 year of data.]
Figure 10. The development of the working capital/total assets ratio for a subsample from 4 years to 1 year before failure.
If the working capital/total assets ratio of bankrupt companies (Figure 10) is contrasted with the
same ratio of healthy companies (Figure 12), it can be seen that the difference between healthy and
bankrupt companies is not as immediately apparent as Beaver’s [72] Figure 11 indicates. However, it
seems that for many of the healthy companies there are no such radical changes in the value of the
working capital/total assets ratio, which is captured by studying the slope of the curves and the
standard deviation.
[Figure 10 plot: Working Capital/Total Assets (%) against years before failure (1 to 4); series: Blayfield Limited, Shield Publishing Limited, Nomura GP Limited, Mansef Limited, Wotton Heritage Trading Company, Mainline Dairies Limited, Arthur Wilkinson Limited.]
Figure 11. The development of working capital /total assets ratio from 5 years to 1 year before bankruptcy. Adapted from [72].
Figure 12. The development of the working capital/total assets ratio for a subsample over 3 years for healthy companies.
This type of abnormal behavior, relative to previous research by Beaver [72], appears for other ratios
too. If the current ratio of a small subsample of bankrupt companies (see Figure 13) is compared to
Beaver’s [72] analysis of the current ratio (see Figure 14), the two exhibit different performances.
[Figure 12 plot: Working Capital/Total Assets (%) against years to failure (1 to 4); series: Old Bury Hill Block One Limited, Visioncare Mobile Opticians Limited, A.T. Boiler Services Limited, Balloon Meet Support Services Limited, Almack Holding Partnership GP Limited, Churchill House Flat Management CO. Limited, 10 York Road Montpelier Bristol Management Company Limited, Linksgold Ltd, Durkheim Press Limited.]
Figure 13. The development of the current ratio for a subsample from 4 years to 1 year before failure.
Despite this behaviour, the relatively small current ratios from 1 to 4
years before bankruptcy indicate weak financial strength. The model
developed in this project is able to detect this with the average of
the current ratio over 3 and 4 years. However, further research into
the behavior of financial ratios a few years before bankruptcy would
potentially give an indication as to which are the best variables to be
used for the prediction model.
[Figure 13 plot: Current ratio against years before failure (1 to 4); series: Blayfield Limited, Shield Publishing Limited, Nomura GP Limited, Mansef Limited, Wotton Heritage Trading Company, Mainline Dairies Limited, Arthur Wilkinson Limited, Eastbanks Limited, Woodbridge Construction Limited.]
Figure 14. The development of the current ratio from 5 years to 1 year before bankruptcy. Adapted from [72].
6.2 Comparison to previous studies
6.2.1 Neural networks
Number of years
before failure     NN [73]   NN [73]   NN [74]   NN (6)    NN [75]   NN [76]
0-1                -         -         -         -         -         -
1                  80.95     74.6      83.33     71.4      76.7      88.2
2                  65.63     78.13     76.19     64.6      73.85     84.3
3                  57.14     66.67     75        68.8      72.12     74.2
Table 9. Bankruptcy prediction results for six different neural network models.
Figure 15. Comparison of the project’s results to previous neural network models represented in Table 9.
[Figure 15 plot: prediction accuracy (%) against years before bankruptcy (1 to 3), titled "Neural Networks"; series: Series1 to Series6, 4 years of data, 3 years of data.]
From Figure 15 it can be seen that four years of data provides slightly below-average results compared
to other bankruptcy prediction studies conducted with neural networks. However, two years before
bankruptcy, the results obtained with three years of data (88.2%, Table 7) are considerably above the
average. This inconsistency in results could be avoided with a combination of the methods used in the
study. However, problems start to occur when the model is put into real use, as it is impossible to
know exactly when the bankruptcy is going to happen. A possible solution is to run the company data
with both four and three years of data: if three years of data gives a high probability of default, it is
likely that the company will fail in two years, whereas a high probability of default with four years of
data indicates a more imminent failure. The model is not robust enough to predict bankruptcy three
years before the failure. Similar observations are made with regard to prediction accuracies when
compared to other prediction models in sections 6.2.2 to 6.2.4.
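The two-model screening idea suggested above could be expressed as a simple decision rule, sketched below. The function name and return strings are hypothetical. The threshold of 0.5 is the one used in section 5.2, and giving the four-year model precedence when both outputs are high is an assumption, since the text does not say how that case should be read.

```python
def screening_signal(output_4yr, output_3yr, threshold=0.5):
    """Combine the outputs of the four-year and three-year models for one company.

    output_4yr: network output using four years of past data.
    output_3yr: network output using three years of past data.
    """
    if output_4yr > threshold:
        # High output from the four-year model: read as a more imminent failure.
        # Assumption: this takes precedence if both models give a high output.
        return "more imminent failure"
    if output_3yr > threshold:
        # High output from the three-year model only: failure likely in two years.
        return "failure likely in two years"
    return "no warning"


# Hypothetical outputs for three companies.
print(screening_signal(0.81, 0.35))
print(screening_signal(0.22, 0.74))
print(screening_signal(0.10, 0.15))
```

Because both models have non-trivial error rates, such a rule would flag companies for closer review rather than deliver a verdict.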
6.2.2 Discriminant analysis
Number of years
before failure     Linear DA [75]   Logistic DA [75]   Probit [75]   MDA [76]
0-1                -                -                  -             -
1                  75.57            63.07              62.5          88.2
2                  74.31            73.85              71.56         85.2
3                  74.52            72.12              73.08         75.3
Table 10. Bankruptcy prediction results for four different discriminant methods.
Figure 16. Comparison of the project’s results to discriminant methods represented in Table 10.
6.2.3 Logistic regression
Number of years
before failure     Logistic regression [71]   Logistic regression [76]
0-1                -                          -
1                  77.3                       86.9
2                  85                         84.7
3                  -                          77.5
Table 11. Bankruptcy prediction results for two different logistic regression models.
[Figure 16 plot: prediction accuracy (%) against years before bankruptcy (1 to 3), titled "Discriminant analysis and variations"; series: Linear DA, Logistic DA, Probit, MDA, 4 years of data NN, 3 years of data NN, 1 year of data NN.]
Figure 17. Comparison of the project’s results to logistic regression models represented in Table 11.
6.2.4 Other methods
Method                                     Number of years before failure
                                           0-1     1       2       3
Isotonic separation [75]                   -       76.7    74.31   73.08
Linear Programming Discrimination [75]     -       76.57   76.15   73.56
Learning Vector Quantization [75]          -       77.27   75.23   77.88
ID3/C4.5 decision tree [75]                -       79.55   74.77   71.63
OCI decision tree [75]                     -       78.98   76.15   73.56
Rough set [75]                             -       80.68   79.36   76.92
Support Vector Machine [76]                -       88.9    85.6    78.6
Table 12. Bankruptcy prediction results for a number of different prediction methods.
[Figure 17 plot: prediction accuracy (%) against years before bankruptcy (1 to 3), titled "Logistic regression"; series: Series1, Series2, 4 years of data NN, 3 years of data NN, 1 year of data NN.]
Figure 18. Comparison of the project’s results to methods represented in Table 12.
6.3 Application of the prediction model
In addition to helping governments and regulatory bodies, the bankruptcy prediction model is a useful
tool in a vast number of areas including bank lending, assessing the credit risk of bank loan portfolios,
assessing the going concern status of a firm by accounting firms and supply chain risk management.
Bank lending: Banks need to be able to predict the possibility of the default of a potential
counterparty before they extend a loan. This also helps banks to estimate a fair value of the interest
rate of a loan. Piesse, et al. [5] note how the prediction tools can help banks to make sounder lending
decisions, which results in significant savings [5].
Many smaller banks use the ratings published by the credit rating agencies, such as Moody’s and
Standard & Poor’s. However, these ratings tend to be reactive rather than predictive because credit
rating agencies usually wait until they have a considerably high level of confidence and evidence to
support their decision. There is a need, therefore, to develop early warning mechanisms that predict
failures before it is too late to react, as was the case in the financial crisis that started in 2007.
[Figure 18 plot: prediction accuracy (%) against years before bankruptcy (1 to 3), titled "Other methods"; series: Isotonic separation, Linear Programming Discrimination, Learning Vector Quantization, ID3/C4.5 Decision Tree, OCI Decision Tree, Rough Set, Support Vector Machine.]
Benefits to accounting firms: The Statement on Auditing Standards (SAS) 59 involves the assessment
of the going concern status of an entity. According to SAS 59, a company or other entity is assumed to
be a going concern if it is expected to continue in existence for the foreseeable future [47]. Hence, a
good decision support system is needed so that auditors can make confident predictions about
the future status of a company [77] [78].
Supply chain risk management: According to McKinsey & Company [1], companies have been
struggling in recent years with the increasing financial instability of their suppliers. Hence,
companies benefit from an early-warning model that enables them to diversify their supply sources
if there is reason for concern. For instance, Reuters reports that Boeing is constantly concerned about
supply disruptions due to bankruptcies [2]. Dun & Bradstreet (D&B), who provide software for the
aerospace industry, claim that they can “project 92 percent of all U.S. bankruptcies at least six months
in advance” [3]. In a similar way, the prediction model developed in this project is highly useful for
large corporations with increasingly wide, global and complex supply chains.
6.3.1 Error types I and II
It should be noted that both error types I and II have costs related to them. In a type I error (false
positive), where the healthy company is classified as bankrupt, the investor misses out on a good
investment. In error type II (false negative), where a bankrupt company is classified as healthy, the
investor can lose significant amounts in the case of default - potentially their whole investment [79].
As was observed in section 5.4, the model is more prone to classify healthy companies as bankrupt
when it is fed with data over a longer period of time. Conversely, when the data sample consists of
data over one year, the model tends to be overly “optimistic” by predicting bankrupt companies to be
healthy. In order to avoid large losses from investing in companies that are going to fail, the model
should be fed with the average, standard deviation and trend of financial ratios that are calculated
with three or four years of past data.
6.4 Advantages of neural networks
Advantages
NNs are able to learn any complex non-linear mapping
As non-parametric methods, NNs do not make a priori assumptions about the distribution of the data
NNs are very flexible with respect to incomplete, missing and noisy data/NNs are “fault tolerant”
Neural network models can be easily updated. Therefore, they are suitable for dynamic environments
NNs overcome some limitations of other statistical methods.
Hidden nodes, in feed-forward supervised NN models can be regarded as latent/unobservable variables.
NNs can be highly automated, minimizing human involvement.
Table 13. A list of advantages of neural networks. Adapted from [55].
Neural networks work well as universal approximators and in the case of bankruptcy prediction they
have the advantage of being able to see correlations and combinations of financial ratios. They are
also quick to implement and can be highly automated. Table 13 lists frequently quoted advantages of
ANNs in business applications.
6.5 Disadvantages of neural networks
Despite the many advantages of neural networks, they still do have limitations. Table 14 lists the most
commonly quoted disadvantages that occur in published research.
Limitations/disadvantages
NNs lack theoretical background concerning explanatory capabilities/NNs as “black boxes”
The selection of the network topology and its parameters lack theoretical background. As a result, it is still a matter of “trial and error”.
Neural networks learning process can be very time consuming.
Neural networks can overfit the training data, becoming useless in terms of generalization. However, using validation samples, this can be overcome.
There is no explicit set of rules to select a suitable NN learning algorithm.
NNs are too dependent on the quality and quantity of data available.
NNs can get stuck in local minima during the training process
NN techniques are still rapidly evolving and they are not robust enough yet
NNs lack classical statistical properties. Confidence intervals and hypothesis testing are not available
Table 14. A list of the disadvantages of neural networks. Adapted from [55].
One of the major limitations of neural networks is the lack of statistical concepts that have been
developed for them [80]. It should be noted that it is rare for papers in bankruptcy prediction to
scrutinize the samples that have been used as inputs. Harford [81] notes how it is routine, when
examining a pattern in data, to ask whether such a pattern might have emerged by chance. If it is
unlikely that the observed pattern has occurred at random, the pattern is called “statistically
significant”. This issue is not really addressed in bankruptcy prediction papers, probably partly
because the data that is available is limited, but also because it is easy to achieve high prediction
accuracies with a biased sample. The author of this project was able to achieve a 95% prediction
accuracy one year before bankruptcy simply by selecting healthy firms with high total assets, and
bankrupt companies with lower total assets. See the section ‘% Uncomment the following lines to get
a data sample with companies with determined total asset sizes (e.g. total asset sizes of the chosen
companies are above £100,000th).’ that creates such a data sample. The issue of sample error and
bias is further addressed by [82] and [83].
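The question of whether a result "might have emerged by chance" can be illustrated with a one-sided binomial test on a hold-out accuracy. The sketch below asks how likely it would be to classify at least 261 of 296 test companies correctly (Table 7) by guessing. The chance level of 0.5 assumes equal numbers of healthy and bankrupt companies in the test set, which is not stated in this section, and the function name is hypothetical. Such a test speaks only to chance, not to the sample-selection bias described above: a biased sample can pass it easily.

```python
from math import comb


def binomial_tail(n, k, p=0.5):
    """Probability of at least k successes in n independent trials with success rate p."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Test set, two years before failure, three years of past data (Table 7):
# 261 of 296 companies classified correctly.
# Assumption: guessing would be right half of the time (balanced classes).
p_value = binomial_tail(296, 261, p=0.5)
print(f"P(at least 261 of 296 correct by chance) = {p_value:.3g}")
```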
Because supervised networks must be provided with input/target pairs, their training and testing is
performed in a retrospective mode. Hence, in today’s fast-changing, real-time-based business
environment the neural networks have to be constantly updated and sometimes there can be
problems in finding readily available information that is required in the development of supervised
neural networks [18].
Furthermore, neural networks cannot be used to determine causal relationships between financial
ratios and the likelihood of bankruptcy. This is a result of the “black box” approach of neural networks
[18]. Because of the black-box approach, the abnormal behaviour of financial ratios that was observed
in section 6.1 cannot be explained with the neural network model and, therefore, other tools should
be considered for this task.
6.6 Limitations of financial ratios
Foster [84] notes that there is not necessarily a one-to-one correspondence between the non-
distressed/distressed and the non-bankrupt/bankrupt companies. For instance, if the company is
large and if the failure of such a firm has a significant impact on the economy, the government may
play an important role in the rescue of such a distressed firm. This in turn can lead to skewed results
by the prediction model, as a company on the brink of failure is kept from failing by a government
bailout. The extent of government rescues is even larger in the not-for-profit sector and hence, for
instance, the failure of universities is extremely rare [5].
As corporate failures can occur for a plethora of reasons, financial ratios do not necessarily have a
clear correlation with the company’s financial health. Piesse, et al. [5] demonstrate how insolvency
can be caused by a variety of reasons, adapted from [85]: (1) Low and declining profitability, (2)
inappropriate diversification: moving into unfamiliar industries or failing to move away from declining
ones, (3) import penetration into the firm’s home markets, (4) difficulties controlling new or
geographically dispersed operations and (5) failure to eliminate actual or potential loss-making
activities. The insolvency can also be a result of contractual problems or issues in the financial
structure of the company: (6) Inadequate financial control over contracts, (7) adverse changes in
contractual arrangements, (8) deteriorating financial structures, (9) over-trading in relation to the
capital base and (10) inadequate control over working capital [5].
As is demonstrated above, it is evident that poor financial ratios by themselves do not cause
bankruptcies but are rather symptoms of external or internal causes. In order to predict the
companies’ bankruptcies at an earlier stage, external factors such as macroeconomic and qualitative
variables should be taken into account. This would help to reduce the high reliance on the quality of
financial statements by the current prediction models [86]. Rose, et al. [87] were able to explain over
90% of changes in the rate of business failures over the business cycle using the following variables
[33]:
1. Change in a stock exchange index
2. Private investment/Gross National Product
3. Post tax company profits/value added by companies (e.g. wages, profits, etc.)
4. Two measures of interest rates
5. Retail sales/Gross National Product
However, as Morris [33] points out, simply understanding the rate of change in business failures does
not greatly help in bankruptcy prediction. His paper reports that the inclusion of
macroeconomic variables entails many difficulties: (1) samples are often relatively small, (2) data are
mostly pooled over a short period of time and (3) there is a selection bias towards including failures
at low points in the business cycle [33].
7. Recommendations for Further Studies
7.1 Macroeconomic and qualitative measures
A disproportionate number of studies have focused on developing more and more complicated
models that predict financial failures [88], but only a few studies have emphasized the importance of
the input data to the models. This is partly due to the limitations of the data that are available [59].
While the majority of academic research has focused on the prediction models, the commercial world
has realized the importance of externalities as prediction model inputs. For instance, Growth Science
found that “about 80% of the predictive value for a startup has to do with externalities - market,
customers, competitors, et cetera” [89].
Dimitras, et al. [59] reinforce this point: a company’s “performance and survival are influenced by
several factors; e.g. the environment and its changes, as well as national and international economic
conditions.” They also note that a good model, constructed under normal circumstances, may be
unable to predict failure successfully during periods of difficulty. Studies of macroeconomic variables
for failure prediction were carried out in the 1980s by Foster [84], Rose et al. [87] and Mensah [90],
who noted that different economic environments as well as different sectors lead to different models
for the prediction of failure [59].
Liou & Smith [91] use both financial ratios and macroeconomic variables in their multivariate
discriminant model. The macroeconomic variables are Gross Domestic Product (GDP), Industrial
Production Index (IPI), Interest Rate, Inflation, Retail Price Index (RPI) and FTSE All Share Index. Liu
[92] emphasises the importance of using interest rates. However, both authors fail to achieve
satisfactory results.
In addition to macroeconomic variables, many qualitative factors affect the success and failure of a
company, such as the quality of management, personnel, products, equipment, etc. [59] Some
researchers have accounted for these qualitative factors, including Zopounidis [93] who employed a
set of 'strategic criteria' to assess the risk of failure of French firms. These variables were: quality of
management, research and development level, diversification stage, market trend, market
niche/position, cash out method and world market share [59]. Similar measures were used by Shaw &
Gentry [94], while Peel, et al. [95] accounted for qualitative aspects with the lag in reporting accounts
of a firm, the number of director resignations and appointments and the changes in directors'
shareholdings [59].
Standard & Poor's Rating Services’ criteria publications demonstrate how the credit agency’s ratings
are determined. It is interesting to note how the credit agency’s approach differs from that of
academia and published research. Figure 19 illustrates how Standard & Poor’s produces corporate
credit ratings and Table 15 describes the factors and variables it uses in more detail.
Figure 19. Standard & Poor’s framework for identifying the credit risk of a company [97].
Corporate credit risk analysis factors
Business risk                                      Financial risk
Country risk                                       Accounting
Industry characteristics                           Corporate governance/Risk tolerance/Financial policies
Company position                                   Cash-flow adequacy
Product portfolio                                  Capital Structure/Asset Protection
Marketing                                          Liquidity/Short-term factors
Technology
Cost efficiency
Strategic and operational management competence
Profitability
Peer group comparisons
Table 15. Factors used in the credit risk analysis by Standard & Poor’s. Adapted from Langohr & Langohr (2008), source [98].
It should be noted from Figure 19 how the company’s competitive position and its microenvironment
play an important role in the credit rating. The factors that describe these competitive industry
aspects can be analysed through the popular methodology of Porter’s five forces [96]. However, the
difficult part is to translate such qualitative analysis into numerical values that can be input into a
neural network model. The other advantage that Standard & Poor's model has over published
research is that its models are highly industry-specific. While academia investigates broad
business areas such as manufacturing, again due to data availability limitations, the credit agency
examines specific sectors such as ‘Metals and Mining’ or ‘Solar’ manufacturing [97].
7.2 Big Data
Big data is currently a prominent topic in the business world. The term describes the huge volumes
of data generated by traditional business activities and from new sources such as social media [98].
ZestFinance is a company that has applied the concept of big data to assess a borrower’s risk of
default using 70,000 variables. The CEO of ZestFinance argues that the more accurately a lender can
price a loan, the cheaper such a loan can be offered to low-risk but low-income borrowers. Note the
enormous difference in the number of variables being analysed: around ten are normally used in
bankruptcy prediction, compared with 70,000 at ZestFinance [99]. This suggests that there may be
potential for big data in bankruptcy prediction.
7.3 Behaviour of financial ratios
Based on a comparison to the study conducted by Beaver [72], the financial ratios examined in this
project exhibit behaviour that is different to Beaver’s [72] analysis. Further research on the
performance of financial variables a few years before bankruptcy would potentially give an indication
as to what are the best variables to be used in the prediction of bankruptcy and how the input data
should be pre-processed.
8. Conclusions
This project successfully built and applied a neural network model to business failure prediction.
The best results on a hold-out sample were obtained two years before the failure, with a prediction
accuracy of 88.2%. This accuracy was achieved using three years of past data, from which the
average, standard deviation and trend of 11 financial variables were calculated. However, the results
were inconsistent when fewer years of data were used as inputs. A possible solution to the
inconsistent results is to run the model with both four and three years of past data. If three years of
data give a high probability of default, the company is likely to fail in two years, whereas a high
probability of default with four years of data indicates a sooner failure. The model is not robust
enough to predict bankruptcy three years before the failure.
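The two-model idea described above can be sketched as follows. This is only an illustration under stated assumptions: the probabilities are made-up values, the variable names are hypothetical, the 0.5 cut-off is assumed, and the rule that the four-year signal takes precedence when both models flag a company is an assumption rather than something tested in this project.

```matlab
% Hypothetical default probabilities from two trained networks (made-up values)
pThreeYears = [0.91 0.20 0.35 0.80]; % model fed with three years of past data
pFourYears  = [0.40 0.15 0.88 0.75]; % model fed with four years of past data
threshold   = 0.5;                   % assumed cut-off for a "high" probability

flagThree = pThreeYears > threshold;
flagFour  = pFourYears  > threshold;

% 0 = no warning
% 1 = failure expected in about two years
% 2 = failure expected sooner than that
warningLevel = zeros(size(pThreeYears));
warningLevel(flagThree) = 1;
warningLevel(flagFour)  = 2; % assumption: the four-year signal takes precedence

disp(warningLevel)
```

Each company ends up with a single warning level, which would make the combined output easier to read than two separate probabilities.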
A broad review of bankruptcy prediction studies was carried out and a new method of processing input
data was applied: three and four years of past data were summarised by calculating the average,
standard deviation and trend of the financial ratios.
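As a rough illustration of this pre-processing step, the sketch below computes the three summary values for one financial ratio. The numbers and variable names are hypothetical, and the definition of "trend" as the slope of a least-squares line through the yearly values is an assumption made for the example.

```matlab
% Hypothetical values of one financial ratio; rows are companies,
% columns are consecutive years (oldest first). Made-up numbers.
ratio = [0.12 0.10 0.07;
         0.30 0.31 0.33];
years = 1:size(ratio, 2);

avgRatio = mean(ratio, 2);   % average over the years
stdRatio = std(ratio, 0, 2); % sample standard deviation over the years

% Assumption: trend = slope of a least-squares line through the yearly values
trendRatio = zeros(size(ratio, 1), 1);
for c = 1:size(ratio, 1)
    p = polyfit(years, ratio(c, :), 1); % p(1) is the slope, p(2) the intercept
    trendRatio(c) = p(1);
end

% Three network inputs per ratio and company
features = [avgRatio, stdRatio, trendRatio];
```

Repeating this for 11 ratios gives 33 input columns per company, which matches the column count mentioned in the comments of Appendix 1.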
The author concedes that the limited availability of published financial data is a major limitation, but
a stronger focus on the sample data would potentially yield better prediction results. This approach
has been picked up in the commercial world by companies such as Growth Science and ZestFinance,
whose methods differ markedly from those of academic research. Other recommendations for future
research include analysing the behaviour of financial ratios in the years before bankruptcy.
8.1 Project costing
The cost breakdown of this project is presented in Table 16. The final cost of the project is
approximately £5,600.

Type of cost      Rate         Quantity                                        Total cost
Academic’s time   £50/hour     20 weeks x 1 hr/week = 20 hours                 £1,000
Student’s time    £15/hour     30 weeks x 10 hr/week = 300 hours               £4,500
Printing          £0.25/page   250 pages (includes draft and final reports)    £62.50
Total                                                                          £5,562.50

Table 16. Costs of the project.
Acknowledgements
I would like to thank my friend Adrian Radillo for his support and help in the project and Helen
McNamara for giving valuable feedback on the report.
References
[1] McKinsey & Company, "McKinsey on Supply Chain: Select Publication," January 2011. [Online]. Available:
https://www.mckinsey.com/~/media/mckinsey/dotcom/client_service/Retail/Articles/779922_McKinse
y_on_Supply_Chain_Select_Publications_20111.ashx. [Accessed 6 April 2014].
[2] A. Scott, "Insight: As Boeing, Airbus factories hum, suppliers get rattled," Reuters, 3 March 2014.
[Online]. Available: http://www.reuters.com/article/2014/03/04/us-boeing-suppliers-insight-
idUSBREA2212J20140304. [Accessed 9 April 2014].
[3] C. Adams, "Supply Chain: Problems and Solutions," Aviation Today, 1 June 2009. [Online]. Available:
http://www.aviationtoday.com/am/issue/cover/Supply-Chain-Problems-and-
Solutions_32278.html#.U0U3fPmSyDl. [Accessed 9 April 2014].
[4] University of Warwick, "University of Warwick Library catalogue," University of Warwick, 2014. [Online].
Available:
http://webcat.warwick.ac.uk/search~S1/?searchtype=X&searcharg=business+failure+prediction&search
scope=9&sortdropdown=-
&SORT=DZ&extended=0&SUBMIT=Search&searchlimits=&searchorigarg=Xbusiness+failure+prediction%
26SORT%3DD. [Accessed 23 April 2014].
[5] J. Piesse, C.-F. Lee, H.-C. Kuo and L. Lin, "Corporate Failure: Definitions, Methods, and Failure Prediction
Models," Encyclopedia of Finance, pp. 477-490, 2006.
[6] M.-Y. Chen, "A hybrid ANFIS model for business failure prediction utilizing particle swarm optimization
and subtractive clustering," Information Sciences, vol. 220, pp. 180-195, 2013.
[7] The Insolvency Service, "Insolvency Statistics Archive: Company liquidations in England and Wales, 1960
to present," 2013. [Online]. Available:
http://www.insolvencydirect.bis.gov.uk/otherinformation/statistics/historicdata/HDmenu.htm.
[Accessed 6 April 2014].
[8] C. Zopounidis and A. I. Dimitras, Multicriteria Decision Aid Methods for the Prediction of Business
Failure, Boston; Dordrecht; London: Kluwer Academic Publishers, 1998.
[9] M. A. Aziz and H. A. Dar, "Predicting corporate bankruptcy: where we stand?," Corporate Governance,
vol. 6, no. 1, pp. 18-33, 2006.
[10] R. Merton, "Theory of rational option pricing," Bell Journal of Economics and Management Science, vol.
4, pp. 141-183, 1973.
[11] A. Dixit and R. Pindyck, Investment under Uncertainty, Princeton, NJ: Princeton University Press, 1994.
[12] M. T. Hagan, H. B. Demuth and M. Beale, Neural Network Design, Boston; London: PWS Publishing
Company, 1996.
[13] G. Cybenko, "Approximation by superpositions of a sigmoidal function," Mathematics of control, signals
and systems, vol. 2, no. 4, pp. 303-314, 1989.
[14] K. Hornik, "Approximation capabilities of multilayer feedforward networks," Neural networks, vol. 4, no.
2, pp. 251-257, 1991.
[15] R. P. Lippmann, "An introduction to computing with neural nets," ASSP Magazine, IEEE, vol. 4, no. 2, pp.
4-22, 1987.
[16] E. Patuwo, M. Y. Hu and M. S. Hung, "Two-Group Classification Using Neural Networks," Decision
Sciences, vol. 24, no. 4, pp. 825-845, 1993.
[17] T. Kohonen, "Self-organized formation of topologically correct feature maps," Biological Cybernetics, vol.
43, no. 1, pp. 59-69, 1982.
[18] K. Lee, D. Booth and P. Alam, "A comparison of supervised and unsupervised neural networks in
predicting bankruptcy of Korean firms," Expert Systems with Applications, vol. 29, no. 1, pp. 1-16, 2005.
[19] M. Caudill, Understanding neural networks: computer explorations Vol. 1., Cambridge, Mass.; London:
MIT Press, 1992.
[20] G. Zhang, M. Y. Hu, B. E. Patuwo and D. C. Indro, "Artificial neural networks in bankruptcy prediction:
General framework and cross-validation analysis," European Journal of Operational Research, vol. 116,
pp. 16-32, 1999.
[21] M. D. Odom and R. Sharda, "A Neural Network Model for Bankruptcy Prediction," in Proceedings of the
IEEE International Conference on Neural Networks, San Diego, CA, 1990.
[22] E. Rahimian, S. Singh, T. Thammachote and R. Virmani, "Bankruptcy prediction by neural network," in
Neural Networks in Finance and Investing: Using Artificial Intelligence to Improve Real-World
Performance, R. Trippi and E. Turban, Eds., Chicago, IL, Probus, 1993, pp. 159-176.
[23] K. Y. Tam and M. Y. Kiang, "Managerial Applications of Neural Networks: The Case of Bank Failure
Predictions," Management Science, vol. 38, no. 7, pp. 926-947, 1992.
[24] K. Y. Tam, "Neural network models and the prediction of bank bankruptcy," OMEGA, vol. 19, no. 5, pp.
429-445, 1991.
[25] L. Salchenberger, E. Cinar and N. Lash, "Neural Networks: A New Tool for Predicting Thrift Failures,"
Decision Sciences, vol. 23, no. 4, pp. 899-916, 1992.
[26] J. E. Boritz and D. B. Kennedy, "Effectiveness of neural network types for prediction of business failure,"
Expert Systems with Applications, vol. 9, no. 4, pp. 503-512, 1995.
[27] J. E. Boritz, D. B. Kennedy and A. d. M. e. Albuquerque, "Predicting corporate failure using a neural
network approach," Intelligent Systems in Accounting, Finance and Management, vol. 4, pp. 95-111,
1995.
[28] J. P. Ignizio and J. R. Soltys, "Simultaneous design and training of ontogenic neural network classifiers,"
Computer & Operations Research, vol. 23, no. 6, pp. 535-546, 1996.
[29] D. Fletcher and E. Goss, "Forecasting with neural networks: An application using bankruptcy data,"
Information and Management, vol. 24, pp. 159-167, 1993.
[30] E. I. Altman, G. Marco and F. Varetto, "Corporate distress diagnosis: Comparisons using linear
discriminant analysis and neural networks (the Italian experience)," Journal of Banking and Finance, vol.
18, pp. 505-529, 1994.
[31] M. Kerling, "Corporate distress diagnosis - An international comparison," in Neural Networks in Financial
Engineering, A. P. N. Refenes, Y. Abu-Mostafa, J. Moody and A. Weigend, Eds., Singapore, World
Scientific, 1996, pp. 407-422.
[32] T. Poddig, "Bankruptcy prediction: A comparison with discriminant analysis," in Neural Networks in the
Capital Markets, A. P. N. Refenes, Ed., Chichester, Wiley, 1995, pp. 311-324.
[33] R. Morris, Early Warning Indicators of Corporate Failure: A critical review of previous research and
further empirical evidence, Aldershot; Brookfield USA; Singapore; Sydney: Ashgate, 1997.
[34] L. A. Cunningham, The Essays of Warren Buffett: Lessons for Investors and Managers, 3rd ed., Singapore;
Chichester: Wiley, 2009.
[35] M. Leshno and Y. Spector, "Neural network prediction analysis: The bankruptcy case," Neurocomputing,
vol. 10, pp. 125-147, 1996.
[36] K. C. Lee, I. Han and Y. Kwon, "Hybrid neural network models for bankruptcy predictions," Decision
Support Systems, vol. 18, pp. 63-72, 1996.
[37] K. M. Fanning and K. Cogger, "A comparative analysis of artificial neural networks using financial distress
prediction," Intelligent Systems in Accounting, Finance and Management, vol. 3, pp. 241-252, 1994.
[38] W. Raghupathi, "Comparing neural network learning algorithms in bankruptcy prediction," International
Journal of Computational Intelligence and Organizations, vol. 1, no. 3, pp. 179-187, 1996.
[39] M. J. Lenard, P. Alam and G. R. Madey, "The application of neural networks and a qualitative response
model to the auditor's going concern uncertainty decision," Decision Sciences, vol. 26, no. 2, pp. 209-226,
1995.
[40] S. Akkoç, "An empirical comparison of conventional techniques, neural networks and the three stage
hybrid Adaptive Neuro Fuzzy Inference System (ANFIS) model for credit scoring analysis: The case of
Turkish credit card data," European Journal of Operational Research, vol. 222, no. 1, pp. 168-178, 2012.
[41] E. I. Altman, "Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy," The
Journal of Finance, vol. 23, no. 4, pp. 589-609, 1968.
[42] P. Coats and L. F. Fant, "Recognizing financial distress patterns using a neural network tool," Financial
Management, pp. 142-155, 1993.
[43] R. C. Lacher, P. K. Coats, S. C. Sharma and L. F. Fant, "A neural network for classifying the financial health
of a firm," European Journal of Operational Research, vol. 85, pp. 53-65, 1995.
[44] M. Odom and R. Sharda, "A neural network model for bankruptcy prediction," 1990.
[45] R. Sharda and R. L. Wilson, "Neural network experiments in business-failure forecasting: Predictive
performance measurement issues," International Journal of Computational Intelligence and
Organizations, vol. 1, no. 2, pp. 107-117, 1996.
[46] R. L. Wilson and R. Sharda, "Bankruptcy prediction using neural networks," Decision Support Systems,
vol. 11, pp. 545-557, 1994.
[47] W. Raghupathi, L. L. Schkade and B. S. Raju, "A Neural Network Approach to Bankruptcy Prediction," in
Proceedings of the IEEE 24th Annual Hawaii International Conference on System Sciences, 1991.
[48] W.-S. Chen and Y.-K. Du, "Using neural networks and data mining techniques for the financial distress
prediction model," Expert Systems with Applications, vol. 36, no. 2, pp. 4075-4086, 2009.
[49] L. M. Salchenberger, E. M. Cinar and N. Lash, "Neural networks: A new tool for predicting thrift failures,"
Decision Sciences, vol. 23, no. 4, pp. 899-916, 1992.
[50] Y. Alici, "Neural networks in corporate failure prediction: The UK experience," in Neural Networks in
Financial Engineering, A. P. N. Refenes, Y. Abu-Mostafa, J. Moody and A. Weigend, Eds., Singapore,
World Scientific, 1996, pp. 393-406.
[51] T. B. Bell, G. S. Ribar and J. Verchio, "Neural nets vs. Logistic regression: A comparison of each model's
ability to predict commercial bank failures," in Proceedings of the 1990 Deloitte Touche/University of
Kansas Symposium on Auditing Problems, 1990, pp. 29-53.
[52] T. L. Lee, "Back-propagation neural network for long-term tidal predictions," Ocean Engineering, vol. 31,
no. 2, pp. 225-238, 2004.
[53] W.-J. Deng, W.-C. Chen and W. Pei, "Back-propagation neural network based importance-performance
analysis for determining critical service attributes," Expert Systems with Applications, vol. 34, no. 2, pp.
1115-1125, 2008.
[54] S. S. Panda, D. Chakraborty and S. K. Pal, "Flank wear prediction in drilling using back propagation neural
network and radial basis function network," Applied Soft Computing, vol. 8, no. 2, pp. 858-871, 2008.
[55] A. Vellido, P. J. Lisboa and J. Vaughan, "Neural networks in business: a survey of applications," Expert
Systems with Applications, vol. 17, no. 1, pp. 51-70, 1999.
[56] G. V. Karels and A. J. Prakash, "Multivariate Normality and Forecasting of Business Bankruptcy," Journal
of Business Finance & Accounting, vol. 14, no. Winter, pp. 573-593, 1987.
[57] C. H. Gibson and P. A. Frishkoff, Financial Statement Analysis: Using Financial Accounting Information,
3rd ed., Boston: Kent Publishing Company, 1986.
[58] R. R. Trippi and E. Turban, Neural Networks in Finance and Investing, Chicago, Illinois; Cambridge,
England: Probus Publishing Company, 1993.
[59] A. I. Dimitras, S. H. Zanakis and C. Zopounidis, "A survey of business failures with an emphasis on
prediction methods and industrial applications," European Journal of Operational Research, vol. 90, no.
3, pp. 487-513, 1996.
[60] E. Alfaro, N. García, M. Gámez and D. Elizondo, "Bankruptcy forecasting: An empirical comparison of
AdaBoost and neural networks," Decision Support Systems, vol. 45, no. 1, pp. 110-122, 2008.
[61] W. H. Beaver, M. F. McNichols and J.-W. Rhie, "Have Financial Statements Become Less Informative?
Evidence from the Ability of Financial Ratios to Predict Bankruptcy," Review of Accounting Studies, vol.
10, no. 1, pp. 93-122, 2005.
[62] Fame, "Formula of Accounts, Ratios and Trends," 2014. [Online]. Available:
https://webhelp.bvdep.com/Robo/BIN/Robo.dll?project=63_EN&newsess=1&refer=https%3A//fame2.b
vdep.com/version-201442/Search.QuickSearch.serv%3F_CID%3D1%26context%3D34F67CVHL2KCFTC.
[Accessed 6 April 2014].
[63] Investopedia, "Investopedia," 2014. [Online]. Available: http://www.investopedia.com/. [Accessed 14
April 2014].
[64] legislation.gov.uk, "legislation.gov.uk," 1986. [Online]. Available:
http://www.legislation.gov.uk/ukpga/1986/45/contents. [Accessed 6 April 2014].
[65] P. A. Meyer and H. W. Pifer, "Prediction of bank failures," Journal of Finance, vol. 25, no. 4, pp. 853-868,
1970.
[66] I. G. Dambolena and S. J. Khoury, "Ratio stability and corporate failure," The Journal of Finance, vol. 35,
no. 4, pp. 1017-1026, 1980.
[67] P. Falbo, "Credit scoring by enlarged discriminant analysis," OMEGA, vol. 19, no. 4, pp. 275-289, 1991.
[68] L. Zhou, "Performance of corporate bankruptcy prediction models on imbalanced dataset: The effect of
sampling methods," Knowledge-Based Systems, vol. 41, pp. 16-25, 2013.
[69] G. Zhang, B. E. Patuwo and M. Y. Hu, "Forecasting with artificial neural networks: The state of the art,"
International Journal of Forecasting, vol. 14, no. 1, pp. 35-62, 1998.
[70] MathWorks, "MathWorks," 2014. [Online]. Available:
http://www.mathworks.co.uk/help/nnet/ug/choose-a-multilayer-neural-network-training-
function.html. [Accessed 02 April 2014].
[71] K. Tam, "Neural Network Models and the Prediction of Bank Bankruptcy," Journal of Management
Science, vol. 19, no. 5, pp. 429-445, 1991.
[72] W. H. Beaver, "Financial Ratios as Predictors of Failure," Journal of Accounting Research, vol. 4, pp. 71-
111, 1966.
[73] A. F. Atiya, "Bankruptcy Prediction for Credit Risk Using Neural Networks: A Survey and New Results,"
IEEE Transactions on Neural Networks, vol. 12, no. 4, pp. 929-935, 2001.
[74] A. Charitou, E. Neophytou and C. Charalambous, "Predicting corporate failure: empirical evidence for the
UK," European Accounting Review, vol. 13, no. 3, pp. 465-497, 2004.
[75] Y. U. Ryu and T. Y. Wei, "Firm bankruptcy prediction: Experimental comparison of isotonic separation
and other classification approaches," Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE
Transactions on, vol. 35, no. 5, pp. 727-737, 2005.
[76] X.-F. Hui and J. Sun, "An application of support vector machine to companies’ financial distress
prediction," Modeling decisions for artificial intelligence, pp. 274-282, 2006.
[77] C. Harris, "An Expert Decision Support System for Auditor Going Concern Evaluation," The University of
Texas, Arlington, 1989.
[78] R. R. Trippi and E. Turban, Neural Networks in Finance and Investing, Chicago, Illinois; Cambridge,
England: Probus Publishing Company, 1993.
[79] S. H. Penman, Financial statement analysis and security valuation, 5th ed., New York: McGraw-Hill, 2013.
[80] S. A. Hamid and Z. Iqbal, "Using neural networks for forecasting volatility of S&P 500 Index future
prices," Journal of Business Research, vol. 57, no. 10, pp. 1116-1125, 2004.
[81] T. Harford, "Big data: are we making a big mistake?," Financial Times, 28 March 2014.
[82] J. P. A. Ioannidis, "Why Most Published Research Findings are False," PLoS Med, vol. 2, no. 8, 2005.
[83] The Economist, "Trouble at the lab," The Economist, 19 October 2013.
[84] G. Foster, Financial Statement Analysis, 2nd ed., New Jersey: Prentice-Hall, 1986.
[85] B. Rees, Financial analysis, London: Prentice-Hall, 1990.
[86] F. J. Ohlhorst, Big Data Analytics: Turning Big Data into Big Money, New York: Wiley, 2012.
[87] P. S. Rose, W. T. Andrews and G. A. Giroux, "Predicting business failure: A macroeconomic perspective,"
Journal of Accounting, Auditing and Finance, vol. 6, no. 1, pp. 20-31, 1982.
[88] R. P. Kumar and V. Ravi, "Bankruptcy prediction in banks and firms via statistical and intelligent
techniques - A review," European Journal of Operational Research, vol. 180, no. 1, pp. 1-28, 2007.
[89] T. Hayes, "This Prediction Algorithm Can Tell If Your Startup Will Fail," 2013. [Online]. Available:
http://www.fastcolabs.com/3021903/this-prediction-algorithm-can-tell-if-your-startup-will-fail.
[Accessed 5 April 2014].
[90] Y. M. Mensah, "An examination of the stationarity of multivariate bankruptcy prediction models: a
methodological study," Journal of Accounting Research, pp. 380-395, 1984.
[91] D.-K. Liou and M. Smith, "Macroeconomic Variables and Financial Distress," Journal of Accounting -
Business & Management, vol. 14, pp. 17-31, 2007.
[92] J. Liu, "Macroeconomic determinants of corporate failures: evidence from the UK," Applied Economics,
vol. 36, no. 1, pp. 939-945, 2004.
[93] C. Zopounidis, "A multicriteria decision making methodology for the evaluation of the risk of failure and
an application," Foundations of Control Engineering, vol. 12, no. 1, pp. 45-67, 1987.
[94] M. J. Shaw and J. A. Gentry, "Using an expert system with inductive learning to evaluate business loans,"
Financial Management, vol. 17, no. 3, pp. 45-56, 1988.
[95] M. J. Peel, D. Peel and P. F. Pope, "Predicting corporate failure - Some results for the UK corporate
sector," OMEGA, vol. 14, no. 1, pp. 5-12, 1986.
[96] M. E. Porter, "How competitive forces shape strategy," Harvard Business Review, pp. 21-38, 1979.
[97] Standard & Poor's, "Table Of Contents: Standard & Poor's Corporate Ratings Criteria," 2014. [Online].
Available:
http://www.standardandpoors.com/prot/ratings/articles/en/us/?articleType=HTML&assetID=12453662
44892. [Accessed 4 April 2014].
[98] The Financial Times, "Financial Times Lexicon," 2014. [Online]. Available:
http://lexicon.ft.com/Term?term=big-data. [Accessed 7 April 2014].
[99] S. Armstrong and J. Medeiros, "The New Financiers," The Wired: UK EDITION, p. 132, 13 September
2013.
[100] G. V. Karels and A. J. Prakash, "Multivariate Normality and Forecasting of Business Bankruptcy," Journal
of Business Finance & Accounting, vol. 14, pp. 573-593, 1987.
[101] R. H. Jackson and A. Wood, "The Performance of Insolvency Prediction and Credit Risk Models in the UK:
A Comparative Study," The British Accounting Review, vol. 45, no. 3, pp. 183-202, 2013.
[102] J. K. Bae, "Predicting financial distress of the South Korean manufacturing industries," Expert Systems
with Applications, vol. 39, no. 10, pp. 9159-9165, 2012.
[103] S. Lee and W. S. Choi, "A multi-industry bankruptcy prediction model using back-propagation neural
network and multivariate discriminant analysis," Expert Systems with Applications, vol. 40, no. 8, pp.
2941-2946, 2013.
[104] B. S. Ahn, S. S. Cho and C. Y. Kim, "The integrated methodology of rough set theory and artificial neural
network for business failure prediction," Expert Systems with Applications, vol. 18, no. 2, pp. 65-74,
2000.
[105] H. M. Langohr and P. T. Langohr, The rating agencies and their credit ratings : what they are, how they
work and why they are relevant, 2008 ed., Chichester, West Sussex, England; Hoboken, New Jersey: John
Wiley, 2008.
[106] Standard & Poor's, "Corporate Ratings Criteria," Standard & Poor's, 2006.
[107] F.-M. Tseng and Y.-C. Hu, "Comparing four bankruptcy prediction models: Logit, quadratic interval logit,
neural and fuzzy neural networks," Expert Systems with Applications, vol. 37, no. 3, pp. 1846-1853, 2010.
[108] K. J. Lee and K.-s. Shin, "Bankruptcy Prediction Modeling Using Multiple Neural Network Models,"
Knowledge-Based Intelligent Information and Engineering Systems, vol. 3214, pp. 668-674, 2004.
[109] K. Tam and M. Y. Kiang, "Managerial application of neural networks: The case of bank failure
predictions," Management Science, vol. 38, no. 7, pp. 926-947, 1992.
[110] Standard & Poor's, "Criteria | Corporates | General: Corporate Methodology," 2013. [Online]. Available:
http://www.standardandpoors.com/prot/ratings/articles/en/us/?articleType=HTML&assetID=12453633
89556. [Accessed 5 April 2014].
[111] M. E. Porter, "The five competitive forces that shape strategy," Harvard business review, vol. 86, no. 1,
pp. 25-40, 2008.
[112] W. Y. Huang and R. P. Lippmann, "Comparisons between neural net and conventional classifiers," in IEEE
First International Conference on Neural Networks, vol. IV, pp. 485–493, San Diego, CA, 1987.
[113] R. Sharda and R. L. Wilson, "Performance Comparison Issues in Neural Network Experiments for
Classification Problems," Wailea, HI, 1993.
[114] F. Y. Lin and S. McClean, "A data mining approach to the prediction of corporate failure," Knowledge-
Based Systems, vol. 14, no. 3-4, pp. 189-195, 2001.
[115] W. R. Klecka, Discriminant Analysis, London: Sage Publications, 1981.
[116] G. S. Maddala, Limited-Dependent and Qualitative Variables in Econometrics, Cambridge: Cambridge
University Press, 1983.
[117] P. T. Theodossiou, "Alternative models for assessing the financial condition of business in Greece,"
Journal of Business Finance and Accounting, vol. 18, no. 5, pp. 697-720, 1991.
[118] D. N. Gujarati, Basic Econometrics, Singapore: McGraw-Hill, 1998.
[119] E. S. Page, "Continuous inspection schemes," Biometrika, vol. 41, pp. 100-114, 1954.
[120] J. D. Healy, "A note on multivariate CUSUM procedures," Technometrics, vol. 29, no. 4, pp. 409-412,
1987.
[121] E. Kahya and P. Theodossiou, "Predicting corporate financial distress: a time-series CUSUM
methodology," Review of Quantitative Finance and Accounting, vol. 13, pp. 323-345, 1999.
[122] H. Theil, "On the use of information theory concepts in the analysis of financial statements,"
Management Science, vol. 15, no. 9, pp. 459-480, 1969.
[123] B. Lev, "Decomposition measures for financial analysis," Financial Management, pp. 56-63, 1973.
[124] P. J. Booth, "Decomposition measure and the prediction of financial failure," Journal of Business Finance
& Accounting, vol. 10, no. 1, pp. 67-82, 1983.
[125] J. Scott, "The probability of bankruptcy: a comparison of empirical predictions and theoretic models,"
Journal of Banking & Finance, vol. 5, no. 3, pp. 317-344, 1981.
[126] A. Aziz, D. Emanuel and G. Lawson, "Bankruptcy prediction – an investigation of cash flow," Journal of
Management Studies, vol. 25, no. 5, pp. 419-437, 1988.
[127] F. Black and M. Scholes, "The pricing of options and corporate liabilities," Journal of Political Economy,
vol. 81, pp. 637-654, 1973.
[128] Crédit Suisse, Credit Risk: A Credit Risk Management Framework, New York: Crédit Suisse Financial
Products, 1997.
[129] T. Wilson, "Portfolio credit risk (I)," Risk Magazine, October 1997a.
[130] T. Wilson, "Portfolio credit risk (II)," Risk Magazine, November 1997b.
[131] T. Wilson, "Portfolio credit risk," FRBNY Economic Policy Review, pp. 71-82, October 1998.
[132] J. H. Friedman, "A recursive partitioning decision rule for nonparametric classification," IEEE Trans.
Computers, vol. 26, no. 4, pp. 404-408, 1977.
[133] P. Pompe and A. Feelders, "Using machine learning, neural networks, and statistics to predict corporate bankruptcy,"
Microcomputers in Civil Engineering, vol. 12, no. 4, pp. 267-276, 1997.
[134] J. Kolodner, Case-Based Reasoning, San Mateo, CA: Morgan Kaufmann Publishers Inc., 1993.
[135] P. Coats and L. Fant, "Recognizing financial distress patterns using a neural network tool," Financial
Management, pp. 142-155, 1993.
[136] Z. Yang, M. Platt and H. Platt, "Probabilistic neural networks in bankruptcy prediction," Journal of
Business Research, vol. 44, no. 2, pp. 67-74, 1999.
[137] K. Shin and Y. Lee, "A genetic algorithm application in bankruptcy prediction modelling," Expert Systems
with Applications, vol. 23, no. 3, pp. 321-328, 2002.
[138] F. Varetto, "Genetic algorithms applications in the analysis of insolvency risk," Journal of Banking &
Finance, vol. 22, no. 10, pp. 1421-1439, 1998.
[139] Z. Pawlak, "Rough sets," International Journal of Computer & Information Sciences, vol. 11, no. 5, pp.
341-356, 1982.
[140] W. Ziarko, "Variable precision rough set model," Journal of computer and system sciences, vol. 46, no. 1,
pp. 39-59, 1993.
Appendices
Appendix 1 – MATLAB code
% Miika Jokinen - ID:1105638
% Third Year project - University of Warwick - School of Engineering
% Solve a Pattern Recognition Problem with a Neural Network
% Script generated by NPRTOOL
% Created Sun Mar 23 16:16:20 GMT 2014
%
% This script assumes these variables are defined:
%
% ActualInput - input data.
% ActualOutput - target data.
% Choosing data set
clear all
% B = total assets of the bankrupt companies
B=xlsread('I:\Third Year Project_6.2.2014\Input Data\test set 18\Bankrupt -5 (1 year before bankruptcy)', ...
    'filtered', 'G2:G990');
% Extra information about the bankrupt companies, such as their names
[BNumericData, BcompanyNames, BRawData]=xlsread('I:\Third Year Project_6.2.2014\Input Data\test set 18\Bankrupt -5 (1 year before bankruptcy)', ...
    'filtered', 'B2:B990');
% H = total assets of the healthy companies
H=xlsread('I:\Third Year Project_6.2.2014\Input Data\test set 18\Healthy -5 (large and small assets), 3 years before bankruptcy', ...
    'filtered', 'E2:E23831');
% Extra information about the healthy companies, such as their names
[HNumericData, HcompanyNames, HRawData]=xlsread('I:\Third Year Project_6.2.2014\Input Data\test set 18\Healthy -5 (large and small assets), 1 year before bankruptcy', ...
    'filtered', 'B2:B23831');
% Financial ratios of the healthy companies: 33 columns in total, the
% arithmetic mean, standard deviation and trend of 11 financial ratios
Hmatlab13x3=xlsread('I:\Third Year Project_6.2.2014\Input Data\test set 18\Healthy -5 (large and small assets), 1 year before bankruptcy', ...
    '12x3matlab');
% Financial ratios of the bankrupt companies: 33 columns in total, the
% arithmetic mean, standard deviation and trend of 11 financial ratios
Bmatlab13x3=xlsread('I:\Third Year Project_6.2.2014\Input Data\test set 18\Bankrupt -5 (1 year before bankruptcy)', ...
    '12x3matlab');
LengthB=length(B);
LengthH=length(H);
% Start company matching based on asset sizes: find a healthy company for each
% bankrupt company so that both have approximately the same total asset size.
% The search relies on B and H being sorted in ascending order of total assets.
Final=zeros(1,LengthB); % total asset sizes of the matched healthy companies
index=zeros(1,LengthB); % shows which companies were chosen from the healthy data set
% Cell structure used later to index the chosen companies and to write down their names
FinalCompanyNames=cell(2,LengthB);
start=2; % starting point of the next loop in terms of the healthy company total assets
Theta=0; % Theta denotes the chosen total asset value
for i=1:LengthB
    k=start; % k indicates the subscript of the healthy companies
    if B(i) < H(1) && i==1
        Theta=H(1); % Final(1) is set from Theta below
    else
        % increase k until a healthy company is found that has a larger
        % asset value than the bankrupt company
        while B(i) >= H(k)
            k=k+1;
        end
        % choose the closest asset size of the healthy company from the values
        % above and below the bankrupt company's asset size
        if B(i)-H(k-1) <= H(k)-B(i) && k>start && i>1
            Theta=H(k-1);
            start=k;
        elseif i==1
            Theta=H(1);
        else
            Theta=H(k);
            start=k+1;
        end
    end
    Final(i)=Theta;
    index(i)=start-1;
    % first row of the cell structure: index of the chosen healthy company
    FinalCompanyNames(1,i)={index(i)};
    % second row of the cell structure: name of the chosen healthy company
    FinalCompanyNames(2,i)={HcompanyNames(index(i))};
end
% Double checking that right companies were chosen for the data set/index
CompanyIndex=find(ismember(H, Final));
sameValue=find(diff(H)==0);
faultyIndex=CompanyIndex(find(ismember(CompanyIndex, sameValue)));
FinalCompanyIndex=find(ismember(faultyIndex,CompanyIndex));
% outputs the number of columns/ratios in Bmatlab13x3
RowLength=size(Bmatlab13x3,2);
% Input data set with approximately matching asset sizes
ActualInput=zeros(2*LengthB,RowLength);
ActualInput(1:LengthB,:)=Bmatlab13x3;
ActualInput((LengthB+1):2*LengthB,:)=Hmatlab13x3(index(1:LengthB),:);
%creating an input data set called ActualInput
ActualOutput=zeros(1,2*LengthB)';
ActualOutput(1:LengthB,:)=ones(1,LengthB);
% End matching companies with approximately same total asset sizes
% Uncomment the following lines to get a randomised data set
% ActualInput=zeros(2*LengthB,RowLength);
% ActualInput(1:LengthB,:)=Bmatlab13x3;
% ActualInput((LengthB+1):2*LengthB,:)=datasample(Hmatlab13x3,LengthB,'Replace',false); % creating an input data set called ActualInput
% Uncomment the following lines to get a data sample of companies within a
% chosen range of total asset sizes (here between £10,000th and £20,000th).
% BankruptCompanies=and(B>10000,B<20000); % assigning value 1 to all bankrupt companies with total asset size in the chosen range
% IndexBankruptCompanies=find(BankruptCompanies); % find which companies have been assigned the value 1, outputs the positions where ones exist
% LengthIndex=length(IndexBankruptCompanies);
% ActualInput=zeros(2*LengthIndex,RowLength);
% ActualInput(1:LengthIndex,:)=datasample(Bmatlab13x3(IndexBankruptCompanies,:),LengthIndex,'Replace',false);
% HealthyCompanies=and(H>10000,H<20000); % assigning value 1 to all healthy companies with total asset size in the chosen range
% IndexHealthyCompanies=find(HealthyCompanies); % find which companies have been assigned the value 1, outputs the positions where ones exist
% ActualInput((LengthIndex+1):2*LengthIndex,:)=datasample(Hmatlab13x3(IndexHealthyCompanies,:),LengthIndex,'Replace',false);
% ActualOutput=zeros(1,2*LengthIndex)';
% ActualOutput(1:LengthIndex,:)=ones(1,LengthIndex);
% End choosing data set
% Change column vectors (default when extracting data from Fame database)
% into row vectors
inputs = ActualInput';
targets = ActualOutput';
% Create a Pattern Recognition Network
hiddenLayerSize = 10;
net = patternnet(hiddenLayerSize);
% Choose Input and Output Pre/Post-Processing Functions
% For a list of all processing functions type: help nnprocess
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
net.outputs{2}.processFcns = {'removeconstantrows','mapminmax'};
% Setup Division of Data for Training, Validation, Testing
% For a list of all data division functions type: help nndivide
net.divideFcn = 'dividerand'; % Divide data randomly
net.divideMode = 'sample'; % Divide up every sample
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
% For help on training function 'trainlm' type: help trainlm
% For a list of all training functions type: help nntrain
net.trainFcn = 'trainlm'; % Levenberg-Marquardt
% Choose a Performance Function
% For a list of all performance functions type: help nnperformance
net.performFcn = 'mse'; % Mean squared error
% Choose Plot Functions
% For a list of all plot functions type: help nnplot
net.plotFcns = {'plotperform','plottrainstate','ploterrhist', ...
'plotregression', 'plotfit'};
% Train the Network
[net,tr] = train(net,inputs,targets);
% Test the Network
outputs = net(inputs);
errors = gsubtract(targets,outputs);
performance = perform(net,targets,outputs)
% Recalculate Training, Validation and Test Performance
trainTargets = targets .* tr.trainMask{1};
valTargets = targets .* tr.valMask{1};
testTargets = targets .* tr.testMask{1};
trainPerformance = perform(net,trainTargets,outputs)
valPerformance = perform(net,valTargets,outputs)
testPerformance = perform(net,testTargets,outputs)
% View the Network
view(net)
% Calculating overall prediction accuracy, including training, testing and validation
Threshold=0.5;
logic=[outputs>Threshold];
PredictionAccuracyBinary=sum([logic==ActualOutput'])/length(logic);
% Calculating testing prediction accuracy with two categories: healthy and bankrupt
testOutputs=outputs .*tr.testMask{1};
logic1=(testOutputs>Threshold).*tr.testMask{1};
Y = numel(tr.testInd); % number of samples in the test set (about 15% of all samples)
PredictionAccuracyBinaryTest=sum([logic1==testTargets])/Y;
lengthTest=Y; % size of the test set
% Introduction of a three category model: 'bankrupt', 'undetermined' and 'healthy'
Threshold1=1/3;
Threshold2=2/3;
category1=and(outputs>Threshold1,outputs<Threshold2)*0.5; % Values between 0.33 and 0.66 are assigned value 0.5, else 0
category2=outputs>Threshold2; % Values above 0.66 are assigned value 1, else zero
category=category1+category2; % No need to do category 3 for values below 0.33 because they have already been assigned the value 0
RightPredictions=category==ActualOutput'; % Vector with right predictions. However, 0.5s are zero now, need to change them to ones
[rowsOfHalves,CoordinatesOfHalves,valueOfHalves]=find(category==0.5); % Find the coordinates of the 0.5s
RightPredictions(CoordinatesOfHalves)=1; % Changes 0.5s to 1
PredictionAccuracy3Categories=sum(RightPredictions)/length(ActualOutput'); % Adds up all the ones/right predictions and divides by the number of columns/company samples
% Calculating testing prediction accuracy for three categories
testOutputs=outputs .*tr.testMask{1};
Threshold1=1/3;
Threshold2=2/3;
category1Test=and(testOutputs>Threshold1,testOutputs<Threshold2)*0.5.*tr.testMask{1}; % testOutputs are predicted outputs with the testing mask
category2Test=(testOutputs>Threshold2).*tr.testMask{1}; % parentheses needed because .* binds more tightly than >
categoryTest=category1Test+category2Test; % NaN for every sample outside the test set
RightPredictionsTest=categoryTest==ActualOutput'; % When NaN (i.e. a masked value) is compared with 0 or 1 using ==, the resulting value is always 0
[rowsOfHalves1,CoordinatesOfHalves1,valueOfHalves1]=find(categoryTest==0.5);
RightPredictionsTest(CoordinatesOfHalves1)=1;
Y1 = numel(tr.testInd); % number of samples in the test set (about 15% of all samples)
PredictionAccuracy3CategoriesTest=sum(RightPredictionsTest)/Y1;
lengthTest1=Y1; % size of the test set
% Plots data scatter figures. Figures 7 and 9 are produced with this code
figure
subplot(2,2,1); % 2=number of plot rows, 2=number of plot columns, 1=the following lines work on the first plot
plot(ActualOutput'); % Real outputs
title ('Real Output');
xlabel ('Company samples');
ylabel('Healthy 0/Bankruptcy 1')
subplot(2,2,2);
plot(outputs,'.');
title ('All Predictions')
xlabel ('Company samples');
ylabel('Healthy 0/Bankruptcy 1')
hold all
plot(logic,'.');
hold all
x=(1:length(ActualOutput'));
y=Threshold*ones(size(x)); % threshold as a vector so that plot draws a continuous line
plot(x,y,'r-','LineWidth', 2);
% Plots data scatter figures for testing sets. Figures 7 and 9 are produced with this code
subplot(2,2,3);
plot(ActualOutput'); % real outputs
title ('Real Output');
xlabel ('Company samples');
ylabel('Healthy 0/Bankruptcy 1');
subplot(2,2,4);
plot(testOutputs,'.');
title ('Test Predictions')
xlabel ('Company samples');
ylabel('Healthy 0/Bankruptcy 1');
hold all
plot(logic1,'.');
hold all
x=(1:length(testTargets));
y=Threshold*ones(size(x)); % threshold as a vector so that plot draws a continuous line
plot(x,y,'r-','LineWidth', 2);
% Plot confusion matrix, which includes total prediction accuracies and type
% I and type II error levels for training, validation and test sets
figure, plotconfusion(trainTargets,outputs,'Training',valTargets,outputs, ...
'Validation',testTargets,outputs,'Test',targets,outputs,'All')
% Other plots: Uncomment the lines below to enable various plots.
% figure, plotperform(tr)
% figure, plottrainstate(tr)
% figure, plotroc(targets,outputs)
% figure, ploterrhist(errors)
% Uncomment the lines below to simulate and use the network for real data
% The script assumes that a new data set xNew is defined.
% yNew=net(xNew);
% logicNew=[yNew>Threshold]; % a logicNew value of 1 indicates bankruptcy
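The three-category rule in the listing above counts every 'undetermined' output as a right prediction. The short sketch below reproduces that counting on six made-up samples and, for comparison, also computes the accuracy over the decided samples only. All values and variable names (outputsDemo, targetsDemo and so on) are hypothetical and are not part of the project script; the second measure is shown only as an illustration and was not used in the project.

```matlab
% Toy illustration of the three-category rule (hypothetical data, base MATLAB only)
outputsDemo = [0.05 0.40 0.90 0.70 0.20 0.60];  % assumed network outputs
targetsDemo = [0    0    1    0    1    1   ];  % 1 = bankrupt, 0 = healthy

% Outputs strictly between 1/3 and 2/3 fall into the 'undetermined' category
isUndetermined = outputsDemo > 1/3 & outputsDemo < 2/3;

% Outputs above 2/3 are classed as bankrupt (1), everything else as healthy (0)
predicted = double(outputsDemo > 2/3);

% Same counting as the script: undetermined samples are treated as right
isRight = (predicted == targetsDemo) | isUndetermined;
accuracy3 = sum(isRight)/numel(targetsDemo);

% Alternative measure (an assumption, not in the script): decided samples only
decided = ~isUndetermined;
accuracyDecided = sum(predicted(decided) == targetsDemo(decided))/sum(decided);

% Share of samples left undetermined
shareUndetermined = mean(isUndetermined);
```

Comparing accuracy3 with accuracyDecided shows how much of the three-category accuracy comes from samples the model declined to classify, which is worth reporting alongside shareUndetermined.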
Appendix 2 – Input data selection criteria on the Fame database
Table 17. Input data selection criteria for bankrupt companies.
Product name: Fame
Update number: 297
Software version: 54.00
Data update: 26/03/2014 (n° 7622)
Username: War-851
Export date: 27/03/2014
Cut off date: 31/03
Criteria 2 to 14 all use the same selection: All companies with a known value, Last available year, Last year -1, Last year -2, Last year -3, Last year -4, Last year -5, for all the selected periods.
| Criterion | Step result | Search result |
|---|---|---|
| 1. Date of liquidation/dissolution: on and after 01/01/1980 and up to and including 27/03/2014 (Dissolution, In liquidation) | 5,011,990 | 5,011,990 |
| 2. EBITDA/Total Assets | 8,888,346 | 5,011,976 |
| 3. Turnover/Total Assets | 8,888,346 | 5,011,976 |
| 4. Working Capital/Total Assets | 8,888,346 | 5,011,976 |
| 5. Return on Capital Employed (%) | 190,111 | 58,276 |
| 6. Profit margin (%) | 113,282 | 28,042 |
| 7. Current ratio (x) | 1,220,137 | 26,232 |
| 8. Gearing (%) | 237,416 | 9,043 |
| 9. Liquidity ratio (x) | 1,209,395 | 9,030 |
| 10. Return on Total Assets (%) | 207,913 | 9,030 |
| 11. Net Assets Turnover (x) | 102,686 | 9,028 |
| 12. Solvency ratio (Asset based) (%) | 1,461,319 | 9,028 |
| 13. EBIT margin (%) | 119,049 | 8,567 |
| 14. Total Assets | 1,732,613 | 8,567 |
Boolean search: 1 And 2 And 3 And 4 And 5 And 6 And 7 And 8 And 9 And 10 And 11 And 12 And 13 And 14
TOTAL: 8,567
Table 18. Input data selection criteria for healthy companies.
Product name: Fame
Update number: 297
Software version: 54.00
Data update: 26/03/2014 (n° 7622)
Username: War-851
Export date: 27/03/2014
Cut off date: 31/03
Criteria 2 to 14 all use the same selection: All companies with a known value, Last available year, Last year -1, Last year -2, Last year -3, Last year -4, Last year -5, for all the selected periods.
| Criterion | Step result | Search result |
|---|---|---|
| 1. Date of liquidation/dissolution: on and after 01/01/1980 and up to and including 27/03/2014 (Dissolution, In liquidation) | 5,011,990 | 0 |
| 2. EBITDA/Total Assets | 8,888,346 | 8,888,346 |
| 3. Turnover/Total Assets | 8,888,346 | 8,888,346 |
| 4. Working Capital/Total Assets | 8,888,346 | 8,888,346 |
| 5. Return on Capital Employed (%) | 190,111 | 190,111 |
| 6. Profit margin (%) | 113,282 | 95,766 |
| 7. Current ratio (x) | 1,220,137 | 90,144 |
| 8. Gearing (%) | 237,416 | 37,444 |
| 9. Liquidity ratio (x) | 1,209,395 | 37,410 |
| 10. Return on Total Assets (%) | 207,913 | 37,410 |
| 11. Net Assets Turnover (x) | 102,686 | 37,402 |
| 12. Solvency ratio (Asset based) (%) | 1,461,319 | 37,401 |
| 13. EBIT margin (%) | 119,049 | 36,658 |
| 14. Total Assets | 1,732,613 | 36,658 |
| 15. Active/Inactive: Active | 3,020,836 | 27,811 |
Boolean search: 2 And 3 And 4 And 5 And 6 And 7 And 8 And 9 And 10 And 11 And 12 And 13 And 14 And 15
TOTAL: 27,811
Appendix 3 - Levenberg-Marquardt training algorithm
Figure 20 by [12] is used to establish the notation that is used in the description of the Levenberg-
Marquardt training algorithm in this section.
Figure 20. Neural network structure and notation used in the Levenberg-Marquardt algorithm derivation [10].
The steps in adjusting weights and biases with the Levenberg-Marquardt backpropagation algorithm
are the following [12]:
1. Present all the financial variable inputs to the network and compute the corresponding network
outputs with the following equations: $\mathbf{a}^0 = \mathbf{p}$ and $\mathbf{a}^{m+1} = \mathbf{f}^{m+1}(\mathbf{W}^{m+1}\mathbf{a}^m + \mathbf{b}^{m+1})$ for $m = 0, 1, \ldots, M-1$, with $\mathbf{a} = \mathbf{a}^M$. Then calculate the errors with $\mathbf{e}_q = \mathbf{t}_q - \mathbf{a}_q^M$ and the sum of squared errors over all inputs using
$$F(\mathbf{x}) = \sum_{q=1}^{Q} (\mathbf{t}_q - \mathbf{a}_q)^T (\mathbf{t}_q - \mathbf{a}_q) = \sum_{q=1}^{Q} \mathbf{e}_q^T \mathbf{e}_q = \sum_{q=1}^{Q} \sum_{j=1}^{S^M} (e_{j,q})^2 = \sum_{i=1}^{N} (v_i)^2,$$
where $e_{j,q}$ is the jth element of the error for the qth input and target pair and $v_i$ is the ith error scalar.
Here $\mathbf{a}$ = output from transfer function, $\mathbf{p}$ = input/financial variable, $\mathbf{t}$ = target, $\mathbf{f}$ = transfer function, $\mathbf{W}$ = weight
matrix, $\mathbf{b}$ = bias vector, superscript $m$ = layer number, $M$ = number of layers in the network, $\mathbf{e}_q$ = error vector for the qth of the $Q$ input/target pairs, $F(\mathbf{x})$ = performance index (MSE) and $\mathbf{x}$ = vector of network weights and biases.
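2. Compute the Jacobian matrix as follows:
$$\mathbf{J}(\mathbf{x}) = \begin{bmatrix}
\dfrac{\partial e_{1,1}}{\partial w^1_{1,1}} & \dfrac{\partial e_{1,1}}{\partial w^1_{1,2}} & \cdots & \dfrac{\partial e_{1,1}}{\partial w^1_{S^1,R}} & \dfrac{\partial e_{1,1}}{\partial b^1_1} & \cdots \\
\dfrac{\partial e_{2,1}}{\partial w^1_{1,1}} & \dfrac{\partial e_{2,1}}{\partial w^1_{1,2}} & \cdots & \dfrac{\partial e_{2,1}}{\partial w^1_{S^1,R}} & \dfrac{\partial e_{2,1}}{\partial b^1_1} & \cdots \\
\vdots & \vdots & & \vdots & \vdots & \\
\dfrac{\partial e_{S^M,1}}{\partial w^1_{1,1}} & \dfrac{\partial e_{S^M,1}}{\partial w^1_{1,2}} & \cdots & \dfrac{\partial e_{S^M,1}}{\partial w^1_{S^1,R}} & \dfrac{\partial e_{S^M,1}}{\partial b^1_1} & \cdots \\
\dfrac{\partial e_{1,2}}{\partial w^1_{1,1}} & \dfrac{\partial e_{1,2}}{\partial w^1_{1,2}} & \cdots & \dfrac{\partial e_{1,2}}{\partial w^1_{S^1,R}} & \dfrac{\partial e_{1,2}}{\partial b^1_1} & \cdots \\
\vdots & \vdots & & \vdots & \vdots &
\end{bmatrix},$$
where $S^m$ = Sth neuron in layer $m$, $R$ = Rth input, $w^1_{1,1}$ = weight of the connection from first input to
first neuron of first layer, $w^2_{1,1}$ = weight of connection from first input to first neuron of the second
layer and so on.
Initialise the Levenberg-Marquardt backpropagation with
$$\tilde{\mathbf{S}}^M_q = -\dot{\mathbf{F}}^M(\mathbf{n}^M_q),$$
where
$$\dot{\mathbf{F}}^m(\mathbf{n}^m) = \begin{bmatrix}
\dot{f}^m(n^m_1) & 0 & \cdots & 0 \\
0 & \dot{f}^m(n^m_2) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \dot{f}^m(n^m_{S^m})
\end{bmatrix}
\quad \text{and} \quad
\dot{f}^m(n^m_j) = \frac{\partial f^m(n^m_j)}{\partial n^m_j},$$
so $\dot{f}^m(n^m_j)$ is a derivative with respect to the net input $n$ that can be defined as $n^m_i = \sum_{j=1}^{S^{m-1}} w^m_{i,j} a^{m-1}_j + b^m_i$.
Then calculate the Marquardt sensitivities with the recurrence relations
$$\tilde{\mathbf{S}}^m_q = \dot{\mathbf{F}}^m(\mathbf{n}^m_q)\,(\mathbf{W}^{m+1})^T\,\tilde{\mathbf{S}}^{m+1}_q.$$
Using the following equation, augment the individual matrices into the Marquardt sensitivities:
$$\tilde{\mathbf{S}}^m = \left[\, \tilde{\mathbf{S}}^m_1 \;|\; \tilde{\mathbf{S}}^m_2 \;|\; \cdots \;|\; \tilde{\mathbf{S}}^m_Q \,\right].$$
When $x_l$ is a weight compute the elements of the Jacobian matrix as follows:
$$[\mathbf{J}]_{h,l} = \frac{\partial v_h}{\partial x_l} = \frac{\partial e_{k,q}}{\partial w^m_{i,j}} = \tilde{s}^m_{i,h}\, a^{m-1}_{j,q}.$$
And when $x_l$ is a bias the elements of the Jacobian matrix can be calculated as follows:
$$[\mathbf{J}]_{h,l} = \frac{\partial v_h}{\partial x_l} = \frac{\partial e_{k,q}}{\partial b^m_i} = \tilde{s}^m_{i,h}.$$
In both cases $\tilde{s}^m_{i,h} = \partial v_h / \partial n^m_{i,q}$ is an element of the Marquardt sensitivity matrix $\tilde{\mathbf{S}}^m$.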
3. In order to obtain $\Delta\mathbf{x}_k$, solve the equation
$$\Delta\mathbf{x}_k = -\left[\mathbf{J}^T(\mathbf{x}_k)\,\mathbf{J}(\mathbf{x}_k) + \mu_k \mathbf{I}\right]^{-1} \mathbf{J}^T(\mathbf{x}_k)\,\mathbf{v}(\mathbf{x}_k),$$
where $\mu_k \mathbf{I}$ is an addition to the Gauss-Newton method. When the value of $\mu_k$ is increased the Levenberg-
Marquardt algorithm approaches the steepest descent algorithm and when it is decreased to near zero
the algorithm becomes Gauss-Newton.
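4. With $\mathbf{x}_k + \Delta\mathbf{x}_k$ recalculate the sum of squared errors. If the new sum of squared errors is smaller
than the one computed in step one with
$$F(\mathbf{x}) = \sum_{q=1}^{Q} (\mathbf{t}_q - \mathbf{a}_q)^T (\mathbf{t}_q - \mathbf{a}_q) = \sum_{q=1}^{Q} \mathbf{e}_q^T \mathbf{e}_q = \sum_{q=1}^{Q} \sum_{j=1}^{S^M} (e_{j,q})^2 = \sum_{i=1}^{N} (v_i)^2,$$
then divide $\mu$ by a factor $\vartheta > 1$
and let $\mathbf{x}_{k+1} = \mathbf{x}_k + \Delta\mathbf{x}_k$. Then return to step one. If the sum of squared errors is not reduced, then
multiply $\mu$ by $\vartheta$ and return to step three.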
By adding the term $\mu_k \mathbf{I}$ to the Gauss-Newton method, the algorithm provides a good compromise
between the speed of Newton’s method and the guaranteed convergence of the steepest descent
algorithm [12].
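The four steps above can be sketched in a few lines of base MATLAB. The sketch is illustrative only and rests on several assumptions that are not taken from the project code: it fits a hypothetical two-parameter exponential model instead of a neural network, it builds the Jacobian by forward differences rather than with the Marquardt sensitivities, and the names and values (model, mu, theta, the tolerances and the starting point) are chosen freely for the example.

```matlab
% Illustrative Levenberg-Marquardt loop on a toy curve-fitting problem.
% Assumptions (not from the project): the model, the data, mu, theta, tolerances.
p = linspace(0,2,20)';                  % inputs
t = 2*exp(-1.5*p);                      % targets generated from known parameters
model = @(x) x(1)*exp(x(2)*p);          % stand-in for the network output a(x)

x     = [1; -1];                        % initial parameter vector
mu    = 0.01;                           % initial damping parameter
theta = 10;                             % factor by which mu is divided or multiplied

v = t - model(x);                       % step 1: error vector
F = v'*v;                               % step 1: sum of squared errors
for k = 1:100
    % Step 2: Jacobian of the errors, here approximated by forward differences
    J = zeros(numel(v), numel(x));
    h = 1e-6;
    for l = 1:numel(x)
        xh = x;
        xh(l) = xh(l) + h;
        J(:,l) = ((t - model(xh)) - v)/h;
    end
    g = J'*v;                           % proportional to the gradient of F
    if norm(g) < 1e-10
        break;                          % converged
    end
    accepted = false;
    while ~accepted && mu < 1e10
        % Step 3: solve (J'J + mu*I) dx = -J'v
        dx = -((J'*J + mu*eye(numel(x))) \ g);
        vNew = t - model(x + dx);
        FNew = vNew'*vNew;
        if FNew < F
            % Step 4: error reduced, accept the step and reduce mu
            x = x + dx;
            v = vNew;
            F = FNew;
            mu = mu/theta;
            accepted = true;
        else
            % Step 4: error not reduced, increase mu and repeat step 3
            mu = mu*theta;
        end
    end
    if ~accepted
        break;                          % no improving step could be found
    end
end
disp(x')                                % fitted parameters
```

A large mu makes dx a short step along the negative gradient, while a small mu gives the Gauss-Newton step, which is the compromise described above. In the project itself this loop is carried out internally by the toolbox training function selected with net.trainFcn = 'trainlm'.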