CHAPTER 5
SOFTWARE DEFECTS CLASSIFICATION USING
FB-MLP NEURAL NETWORK
5.1 Introduction
High software quality is ensured by software reliability and
software quality assurance; both concepts are applied throughout the
software development and maintenance process. Activities such as
performance analysis, functional tests, quantifying time and budget,
and measurement of metrics are used to ensure quality [24]. In
addition, code reviews, key personnel assignment and automatic
test-case generation are other strategies applied to reach high
reliability [19].
Time, budget and mean time to failure are some of the factors
with which software quality is gauged. Improving software quality
through Alpha and Beta testing becomes very expensive, and achieving
zero defects is not possible. Software quality modeling therefore
deals not only with making the software defect free to the extent
possible, but also with keeping budget and time under control.
Defect prediction models built on quantifiable software metrics are
successful in predicting defects in modules. Such models relate two
kinds of variables: independent variables captured in the form of
process metrics, and a dependent variable which indicates whether or
not the module is likely to be faulty. Researchers take the
independent variables either from previous projects or from the
current project and use them in the model to predict faults in the
module.
When it is not possible to replace legacy systems, defect
prediction is widely practised as an enhancement; it is also a
cost-effective method of maintaining legacy systems.
5.2 Proposed Neural Network model – Fuzzy Bell Multi Layer
Perceptron Neural Network (FB-MLP Neural Network)
The proposed Fuzzy Bell Multi Layer Perceptron (FB-MLP)
Neural Network is a modification of the existing Multi Layer
Perceptron (MLP) model in which the introduction of a bell function
creates a fuzzy logic based hidden layer that exploits the
advantages of the membership function.
Fuzzy logic advantages:
The main advantage of fuzzy logic is that it mimics human decision
making: it handles vague concepts and can deal with imprecise or
imperfect information, and its intrinsic parallel processing nature
allows rapid computation. Fuzzy logic resolves conflicts by
collaboration, propagation and aggregation. It also offers improved
knowledge representation, reasoning under uncertainty, modeling of
complex non-linear problems, and natural language
processing/programming capability.
Fuzzy logic limitations:
Though fuzzy logic finds numerous applications, it is highly
abstract and heuristic and thus needs experts for rule discovery
(data relationships). Its major disadvantage is that it lacks the
self-organizing and self-tuning mechanisms of neural networks.
Advantages of Neural networks:
The main advantage of neural networks is that there is no need
to know the data relationships in advance. They have self-learning
and self-tuning capability and are applicable to modeling various
systems.
Limitations of Neural networks:
Critics say that a neural network is a "garbage in, garbage out"
system, which is an extreme view. The limitations faced by neural
networks are that they are unable to handle linguistic information
and cannot manage imprecise or vague information. The inability to
combine numeric data with linguistic or logical data is another
major disadvantage. It is difficult to reach the global minimum even
with complex BP learning, and determining the hidden layers and
nodes relies on trial and error.
Neurofuzzy:
Neurofuzzy refers to the combination of fuzzy set theory and neural
networks, with the advantages of both. A neurofuzzy system can
handle any kind of information (numeric, linguistic, logical, etc.);
it can manage imprecise, partial, vague or imperfect information; it
resolves conflicts by collaboration and aggregation; and it has
self-learning, self-organizing and self-tuning capabilities. There
is no need for prior knowledge of the relationships in the data, and
the system mimics the human decision making process. Fast
computation using fuzzy number operations is also obtained in
neurofuzzy systems.
5.2.1 Methodology
The proposed neural network consists of two hidden layers, the
first using a tanh activation function and the second using the
bell fuzzy activation function. Figure 5.1 shows a scaled-down
version of the proposed model. Table 5.1 shows the actual input
neurons, hidden neurons and other parameters of the proposed model.
Figure 5.1 : The proposed MLP model
Multilayer Perceptrons (MLPs) are feedforward neural networks
trained with the standard backpropagation algorithm. An MLP can be
trained to transform input data into a preferred response, and MLPs
are widely used for modeling prediction problems (K. Hornik, et al.,
1989) [48]. Backpropagation computes the sensitivity of the output
with respect to each weight in the network, and modifies each weight
by a value that is proportional to that sensitivity.
Suppose the total number of hidden layers is L. The input layer
is considered as layer 0. Let the number of neurons in hidden layer
l be N_l, l = 1, 2, ..., L. Let w^l_ij represent the weight of the
link between the jth neuron of the (l-1)th hidden layer and the ith
neuron of the lth hidden layer, and let α^l_i be the bias parameter
of the ith neuron of the lth hidden layer. Let x_i represent the ith
input parameter to the MLPNN. Let y^l_i be the output of the ith
neuron of the lth hidden layer, which can be computed according to
the standard MLPNN formulas as

$$y^{l}_{i} = f\Big(\sum_{j=1}^{N_{l-1}} w^{l}_{ij}\, y^{l-1}_{j} + \alpha^{l}_{i}\Big), \quad i = 1, \ldots, N_l,\; l = 1, \ldots, L$$

$$y^{0}_{i} = x_i, \quad i = 1, \ldots, N_0, \quad N_0 = N_x$$
where f(·) is the activation function. Let v_ki represent the
weight of the link between the ith neuron of the Lth hidden layer
and the kth neuron of the output layer, and let β_k be the bias
parameter of the kth output neuron. The outputs of the MLPNN can be
computed as

$$y_k = \sum_{i=1}^{N_L} v_{ki}\, y^{L}_{i} + \beta_k, \quad k = 1, \ldots, N_y$$
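These forward-pass formulas can be sketched as follows. The layer sizes follow Table 5.1 (20 inputs, hidden layers of 6 and 2, one output), but the weights are random placeholders and both hidden layers use tanh here for simplicity; the proposed model replaces the second hidden layer's activation with the bell fuzzy function.

```python
import numpy as np

def mlp_forward(x, hidden_weights, hidden_biases, v, beta, f=np.tanh):
    """Forward pass: y^l_i = f(sum_j w^l_ij * y^(l-1)_j + alpha^l_i),
    followed by a linear output layer y_k = sum_i v_ki * y^L_i + beta_k."""
    y = x                                  # layer 0 is the input layer
    for W, alpha in zip(hidden_weights, hidden_biases):
        y = f(W @ y + alpha)               # hidden layer l
    return v @ y + beta                    # output layer

# Illustrative weights only (random, not trained).
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((6, 20)), rng.standard_normal((2, 6))]
bs = [np.zeros(6), np.zeros(2)]
v, beta = rng.standard_normal((1, 2)), np.zeros(1)
out = mlp_forward(rng.standard_normal(20), Ws, bs, v, beta)
```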
A neural network model can be developed through a process
called training. Suppose the training data consists of N_p sample
pairs, {(x_p, d_p), p = 1, 2, ..., N_p}, where x_p and d_p are N_x-
and N_y-dimensional vectors representing the inputs and the desired
outputs of the neural network, respectively. Let w be the weight
vector containing all the N_w weights of the neural network. The
objective of training is to find w such that the error between the
neural network predictions and the desired outputs is minimized,
i.e. min E(w), where

$$E(w) = \frac{1}{2}\sum_{p=1}^{N_p}\sum_{k=1}^{N_y}\bigl(y_{pk}(x_p, w) - d_{pk}\bigr)^2 = \frac{1}{2}\sum_{p=1}^{N_p} e_p(w)$$

and d_pk is the kth element of vector d_p, and y_pk(x_p, w) is the
kth output of the neural network when the input presented to the
network is x_p. The term e_p(w) is the error in the output due to
the pth sample.
The backpropagation algorithm applies a correction Δw_ji(n) to
the synaptic weight w_ji(n) during the nth iteration. The correction
Δw_ji(n) applied to w_ji(n) is defined by the delta rule:

$$\Delta w_{ji}(n) = -\eta\,\frac{\partial E(n)}{\partial w_{ji}(n)}$$

where η is the learning-rate parameter of the backpropagation
algorithm. The negative sign represents gradient descent in weight
space, to reduce the value of the error.
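The delta rule can be sketched for a single linear unit trained by gradient descent on the squared-error objective above; the data and learning rate are illustrative, not from the thesis.

```python
import numpy as np

def train_delta_rule(X, D, eta=0.05, epochs=500, seed=0):
    """Gradient descent on E(w) = 1/2 * sum_p (y(x_p, w) - d_p)^2
    for a single linear unit y = w . x."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1]) * 0.1
    for _ in range(epochs):
        y = X @ w                      # outputs for all samples
        grad = X.T @ (y - D)           # dE/dw summed over samples
        w -= eta * grad                # delta rule: Delta w = -eta * dE/dw
    return w

# Recover the target mapping d = 2*x1 - x2 from noiseless samples.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
D = X @ np.array([2.0, -1.0])
w = train_delta_rule(X, D)
```

With a small enough learning rate the weights converge to the generating coefficients [2, -1].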
A good training algorithm will cut down the training time while
accomplishing better accuracy. The training process is thus a
significant feature of ANNs, where representative examples of the
information are continuously presented to the network so that it can
incorporate this information within its structure.
The Fuzzy Bell Multi Layer Perceptron (FB-MLP) Neural Network
proposed in this chapter uses the criteria specified in Table 5.1.
Table 5.1: Design metrics of the proposed Fuzzy Bell Multi Layer
Perceptron (FB-MLP) Neural network model
Parameters Values
Input Neuron 20
Output Neuron 1
Number of Hidden Layer 2
Number of processing elements – first layer 6
Transfer function of first hidden layer tanh
Learning rule momentum
Number of processing elements-second layer 2
Transfer function of second hidden layer bellfuzzy
Learning Rule of hidden layer Momentum
The fuzzy bell membership function is given by

$$F(x) = \frac{1}{1 + \left|\dfrac{x - w_0}{w_1}\right|^{2 w_2}}$$

where x is the input and the w_i are the weights.
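Assuming the standard generalized bell form shown above, with w_0 as the centre, w_1 the width and w_2 the slope, the function can be sketched as follows; the parameter values are illustrative.

```python
import numpy as np

def bell_membership(x, w0=0.5, w1=0.25, w2=2.0):
    """Generalized bell membership: F(x) = 1 / (1 + |(x - w0)/w1|^(2*w2)).
    w0 = centre, w1 = width, w2 = slope of the bell."""
    return 1.0 / (1.0 + np.abs((x - w0) / w1) ** (2.0 * w2))

# The bell peaks at the centre and decays smoothly and symmetrically.
vals = bell_membership(np.linspace(0.0, 1.0, 5))
```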
The membership function of the proposed method is derived
using the input and weight membership functions illustrated in
Figures 5.2 and 5.3. The definition points are given in Tables 5.2
and 5.3.
Figure 5.2 : Input membership function
Table 5.2 : Definition points of input membership function.
Term Name Definition Points (x, y)
very_low (0, 1) (0.16666, 1) (0.33334, 0)
(1, 0)
Low (0, 0) (0.16666, 0) (0.33334, 1)
(0.5, 0) (1, 0)
medium (0, 0) (0.33334, 0) (0.5, 1)
(0.66666, 0) (1, 0)
High (0, 0) (0.5, 0) (0.66666, 1)
(0.83334, 0) (1, 0)
very_high (0, 0) (0.66666, 0) (0.83334, 1)
(1, 1)
Figure 5.3 : Weight membership function
Table 5.3 : Definition points of weight membership function.
Term Name Definition Points (x, y)
very_negative (0, 1) (0.16666, 1) (0.33334, 0)
(1, 0)
negative (0, 0) (0.16666, 0) (0.33334, 1)
(0.5, 0) (1, 0)
Zero (0, 0) (0.33334, 0) (0.5, 1)
(0.66666, 0) (1, 0)
positive (0, 0) (0.5, 0) (0.66666, 1)
(0.83334, 0) (1, 0)
very_positive (0, 0) (0.66666, 0) (0.83334, 1)
(1, 1)
The output membership function obtained is shown in Figure
5.4, with the definition points for the output given in Table 5.4.
Figure 5.4 : Output membership function
Table 5.4 : Definition points of output membership function
Term Name Definition Points (x, y)
very_low (0, 0) (0.125, 1) (0.25, 0)
(1, 0)
Low (0, 0) (0.125, 0) (0.25, 1)
(0.375, 0) (1, 0)
medium_low (0, 0) (0.25, 0) (0.375, 1)
(0.5, 0) (1, 0)
medium (0, 0) (0.375, 0) (0.5, 1)
(0.625, 0) (1, 0)
medium_high (0, 0) (0.5, 0) (0.625, 1)
(0.75, 0) (1, 0)
High (0, 0) (0.625, 0) (0.75, 1)
(0.875, 0) (1, 0)
very_high (0, 0) (0.75, 0) (0.875, 1)
(1, 0)
The rules' 'if' part describes the situation for which the rules
are designed; the 'then' part describes the response of the fuzzy
system in that situation. The degree of support (DoS) is used to
weigh each rule according to its importance. The rules obtained
using Mean of Maximum defuzzification are shown in Table 5.5, and
the 3-D plot of the inputs versus the output is shown in Figure 5.5.
Table 5.5 shows the relationship between the linguistic inputs and
the membership function. The generated rules improve the processing
time, as the membership function reduces the data dimension.
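Mean of Maximum defuzzification, used above, can be sketched as follows. The firing degrees and the rule pairings are hypothetical; the output peaks are taken from the peak points of the output terms in Table 5.4.

```python
def mean_of_maximum(rule_outputs):
    """Mean-of-Maximum defuzzification: average the output-term peaks
    of the rules that fire with the maximal degree."""
    best = max(degree for degree, _ in rule_outputs)
    peaks = [peak for degree, peak in rule_outputs if degree == best]
    return sum(peaks) / len(peaks)

# Hypothetical fired rules as (firing degree, peak of output term):
# 'low' peaks at 0.25 and 'medium' at 0.5 per Table 5.4.
fired = [(0.3, 0.125), (0.7, 0.25), (0.7, 0.5)]
crisp = mean_of_maximum(fired)   # averages the peaks 0.25 and 0.5
```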
Table 5.5 : Rules generated using Fuzzy system with Degree of support (DoS)
IF THEN
X W DoS out
very_low very_negative 0.21 very_low
very_low very_negative 0.36 low
very_low very_negative 0.41 medium_low
very_low very_negative 0.64 medium
very_low very_negative 0.82 medium_high
very_low very_negative 0.16 high
very_low very_negative 0.52 very_high
very_low Negative 0.86 very_low
very_low Negative 0.51 low
very_low Negative 0.14 medium_low
very_low Negative 0.16 medium
very_low Negative 0.87 medium_high
very_low Negative 0.26 high
very_low Negative 0.68 very_high
very_low Zero 0.78 very_low
very_low Zero 0.41 low
very_low Zero 0.24 medium_low
very_low Zero 0.56 medium
very_low Zero 0.71 medium_high
very_low Zero 0.69 high
very_low Zero 0.59 very_high
very_low Positive 0.63 very_low
very_low Positive 0.09 low
very_low Positive 0.95 medium_low
very_low Positive 0.24 medium
very_low Positive 0.57 medium_high
very_low Positive 0.00 high
very_low Positive 0.98 very_high
very_low very_positive 0.13 very_low
very_low very_positive 0.30 low
very_low very_positive 0.93 medium_low
very_low very_positive 0.02 medium
very_low very_positive 0.92 medium_high
very_low very_positive 0.94 high
very_low very_positive 0.56 very_high
Low very_negative 0.43 very_low
Low very_negative 0.82 low
Low very_negative 0.65 medium_low
Low very_negative 0.85 medium
Low very_negative 0.63 medium_high
Low very_negative 0.30 high
Low very_negative 0.30 very_high
Low Negative 0.58 very_low
Low Negative 0.46 low
Low Negative 0.59 medium_low
Low Negative 0.40 medium
Low Negative 0.25 medium_high
Low Negative 0.79 high
Low Negative 0.63 very_high
Low Zero 0.19 very_low
Low Zero 0.51 low
Low Zero 0.24 medium_low
Low Zero 0.26 medium
Low Zero 0.20 medium_high
Low Zero 0.92 high
Low Zero 0.05 very_high
Low Positive 0.61 very_low
Low Positive 0.37 low
Low Positive 0.41 medium_low
Low Positive 0.81 medium
Low Positive 0.11 medium_high
Low Positive 0.92 high
Low Positive 0.09 very_high
Low very_positive 0.20 very_low
Low very_positive 0.34 low
Low very_positive 0.11 medium_low
Low very_positive 0.80 medium
Low very_positive 0.38 medium_high
Low very_positive 0.09 high
Low very_positive 0.07 very_high
Medium very_negative 0.16 very_low
Medium very_negative 0.14 low
Medium very_negative 0.76 medium_low
Medium very_negative 0.74 medium
Medium very_negative 0.27 medium_high
Medium very_negative 0.69 high
Medium very_negative 0.61 very_high
Medium Negative 0.36 very_low
Medium Negative 0.09 low
Medium Negative 0.80 medium_low
Medium Negative 0.82 medium
Medium Negative 0.23 medium_high
Medium Negative 0.56 high
Medium Negative 0.56 very_high
Medium Zero 0.02 very_low
Medium Zero 0.38 low
Medium Zero 0.32 medium_low
Medium Zero 0.24 medium
Medium Zero 0.20 medium_high
Medium Zero 0.91 high
Medium Zero 0.38 very_high
Medium Positive 0.54 very_low
Medium Positive 0.62 low
Medium Positive 0.40 medium_low
Medium Positive 0.38 medium
Medium Positive 0.85 medium_high
Medium Positive 0.70 high
Medium Positive 0.39 very_high
Medium very_positive 0.84 very_low
Medium very_positive 0.64 low
Medium very_positive 0.42 medium_low
Medium very_positive 0.28 medium
Medium very_positive 0.12 medium_high
Medium very_positive 0.69 high
Medium very_positive 0.29 very_high
High very_negative 0.52 very_low
High very_negative 0.44 low
High very_negative 0.26 medium_low
High very_negative 0.41 medium
High very_negative 0.55 medium_high
High very_negative 0.58 high
High very_negative 0.93 very_high
High Negative 0.65 very_low
High Negative 0.45 low
High Negative 0.77 medium_low
High Negative 0.19 medium
High Negative 0.48 medium_high
High Negative 0.55 high
High Negative 0.41 very_high
High Zero 0.46 very_low
High Zero 0.49 low
High Zero 0.98 medium_low
High Zero 0.80 medium
High Zero 0.77 medium_high
High Zero 0.07 high
High Zero 0.74 very_high
High Positive 0.02 very_low
High Positive 0.58 low
High Positive 0.42 medium_low
High Positive 0.84 medium
High Positive 0.75 medium_high
High Positive 0.14 high
High Positive 0.27 very_high
High very_positive 0.19 very_low
High very_positive 0.27 low
High very_positive 0.82 medium_low
High very_positive 0.45 medium
High very_positive 0.16 medium_high
High very_positive 0.63 high
High very_positive 0.67 very_high
very_high very_negative 0.71 very_low
very_high very_negative 1.00 low
very_high very_negative 0.85 medium_low
very_high very_negative 0.92 medium
very_high very_negative 0.09 medium_high
very_high very_negative 0.19 high
very_high very_negative 0.46 very_high
very_high Negative 0.44 very_low
very_high Negative 0.94 low
very_high Negative 0.10 medium_low
very_high Negative 0.19 medium
very_high Negative 0.91 medium_high
very_high Negative 0.13 high
very_high Negative 0.91 very_high
very_high Zero 0.97 very_low
very_high Zero 0.88 low
very_high Zero 0.52 medium_low
very_high Zero 0.01 medium
very_high Zero 0.13 medium_high
very_high Zero 0.29 high
very_high Zero 0.05 very_high
very_high Positive 0.01 very_low
very_high Positive 0.07 low
very_high Positive 0.61 medium_low
very_high Positive 0.76 medium
very_high Positive 0.70 medium_high
very_high Positive 0.10 high
very_high Positive 0.39 very_high
very_high very_positive 0.96 very_low
very_high very_positive 0.77 low
very_high very_positive 0.87 medium_low
very_high very_positive 0.23 medium
very_high very_positive 0.94 medium_high
very_high very_positive 0.77 high
very_high very_positive 0.99 very_high
Figure 5.5: Plot of the inputs vs the output
The learning capability and the generalization capability of the
proposed neural network model are evaluated using the mean square
error (MSE) as the performance measure. The MSE is given by

$$MSE = \frac{\sum_{j=0}^{E}\sum_{i=0}^{N}\bigl(O_{ij} - y_{ij}\bigr)^2}{N \cdot E}$$

where
E is the number of processing elements,
N is the number of exemplars,
O_ij is the desired output for exemplar i at processing element j, and
y_ij is the obtained output for exemplar i at processing element j.
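A minimal sketch of this MSE computation (numpy-based, not the thesis implementation):

```python
import numpy as np

def mse(desired, obtained):
    """MSE over N exemplars and E processing elements:
    sum of squared errors divided by N * E."""
    desired, obtained = np.asarray(desired), np.asarray(obtained)
    n, e = desired.shape               # exemplars x processing elements
    return np.sum((desired - obtained) ** 2) / (n * e)

# Two exemplars, two processing elements (illustrative values).
err = mse([[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.0], [0.0, 0.5]])
```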
The tanh activation function squashes the output of each neuron into
the range -1 to 1, and is given by

$$\tanh(i) = \frac{e^{i} - e^{-i}}{e^{i} + e^{-i}}$$

where i is the sum of the input patterns.
5.3 Results and Discussions
The proposed FB-MLP neural network algorithm was implemented
using Visual Studio and the classification accuracy was measured.
Figure 5.6 shows the plot of MSE versus epoch for various alpha
(learning rate) and momentum values.
Figure 5.6 : MSE versus number of epochs
From Figure 5.6 it is seen that for a high learning rate of 0.3
(typical values being 0.001 to 1) and a momentum of 0.6 (normal
values between 0 and 1) the convergence is fast and efficient,
requiring as few as 150 iterations.
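The momentum learning rule listed in Table 5.1 adds a fraction of the previous weight change to the current gradient step, which speeds convergence. A minimal sketch on a hypothetical quadratic objective, using the learning rate 0.3 and momentum 0.6 discussed above:

```python
def momentum_step(w, grad, velocity, eta=0.3, mu=0.6):
    """Momentum learning rule: Delta w(n) = -eta*grad + mu*Delta w(n-1)."""
    velocity = -eta * grad + mu * velocity
    return w + velocity, velocity

# Minimise E(w) = 0.5*w^2 (gradient is simply w), starting from w = 1.0.
w, v = 1.0, 0.0
for _ in range(50):
    w, v = momentum_step(w, w, v)
# w has converged close to the minimum at 0.
```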
The classification accuracy obtained, along with the accuracies
obtained in the previous chapter, is tabulated in Table 5.6 and
plotted in Figure 5.7.
Table 5.6 : Classification accuracy for Random tree, CART and
BL regression vs FB-MLP on the KC1 dataset

Method                           Accuracy (%) on KC1 dataset with
                                 proposed normalization
Random tree                      94.55
CART                             96.79
Bayesian logistic regression     95.67
Existing MLP                     94.28
Proposed FB-MLP                  98.2
The classification accuracy of the proposed method improves over the
existing MLP network by 3.92%, which is considerable. In addition,
the convergence of the proposed method is faster due to the
fuzzification.
Figure 5.7 : The classification accuracies obtained and compared with
other methods.
The sensitivity and specificity plot is shown in Figure 5.8.
Figure 5.8 : The sensitivity and specificity
The sensitivity and specificity of the proposed method converge
very efficiently. This is highly desirable, as the variance is
extremely low and the capability of detecting faulty modules is
extremely high. The performance of the proposed model is better than
that of the other techniques found in the literature.
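Sensitivity and specificity can be computed from a module-level confusion matrix; a sketch with hypothetical labels (1 = faulty module, 0 = clean module):

```python
def sensitivity_specificity(actual, predicted):
    """Sensitivity = TP/(TP+FN): fraction of faulty modules detected.
    Specificity = TN/(TN+FP): fraction of clean modules correctly passed."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    tn = sum((not a) and (not p) for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical classification of five modules.
sens, spec = sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```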
5.4 Conclusion
In this chapter a modified Multi Layer Perceptron Neural
Network (MLP-NN) was proposed for software defect classification
based on metrics that can be easily measured from software modules.
The proposed method is compared with other methods found in the
literature in Figure 5.9.
Figure 5.9 : Proposed method compared with other methods
found in literature.