A Probabilistic Software Risk Assessment and Estimation ...The assessment and estimation of software...

Procedia Computer Science 54 ( 2015 ) 353 – 361

Available online at www.sciencedirect.com

1877-0509 © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).Peer-review under responsibility of organizing committee of the Eleventh International Multi-Conference on Information Processing-2015 (IMCIP-2015)doi: 10.1016/j.procs.2015.06.041

ScienceDirect

Eleventh International Multi-Conference on Information Processing-2015 (IMCIP-2015)

A Probabilistic Software Risk Assessment and Estimation Modelfor Software Projects

Chandan Kumar∗ and Dilip Kumar YadavDepartment of Computer Applications, National Institute of Technology Jamshedpur, Jharkhand 831 014, India

Abstract

Software risk management plays a vital role in successful software project management. In fact, all the phases of the softwaredevelopment life cycle (SDLC) are potential sources of software risks since it involves hardware, software, technology, people,cost, and schedule. There are a number of software risk factors that affect the whole software development process. However,finding the correlation between risk factors and project outcome is the main focus of present research on software risk analysis.In this paper, a probabilistic software risk estimation model is proposed using Bayesian Belief Network (BBN) that focuseson the top software risk indicators for risk assessment in software development projects. In order to assess the constructedmodel, an empirical experiment has been performed, based on the data collected from software development projects used by anorganization.© 2015 The Authors. Published by Elsevier B.V.Peer-review under responsibility of organizing committee of the Eleventh International Multi-Conference on InformationProcessing-2015 (IMCIP-2015).

Keywords: Software risk management; Software risk factor; Software project management; Software development life cycle(SDLC); Bayesian belief network (BBN).

1. Introduction

Software risk management is a very complex and critical job in software project development. The software riskmanagement study1, 2 showed that industry-wide, only 16.2% of software projects are on time and on budget. Therest of the, 52.7% are delivered with reduced functionality and 31.1% are cancelled before completion. The mainreason for this large amount of less quality software and failure of software projects is the lack of proper software riskmanagement. Therefore, software risk management1–3 has been used widely because of risk management in softwaredevelopment will ensure that software project have the efficiency and it will be completed on time, within budgetand developed software have high quality. Boehm1 proposed principles and practices of software risk management.In this work, Boehm outlines the six phases, i.e. risk identification, analysis, prioritization, management, resolutionand monitoring of risk management. Dedolph2 studied the role of software risk management practices at LucentTechnologies in order to understand why risk management is often neglected, and he discussed examples of successfulrisk management. Freimut et al.3 study the implementation of software risk management in an industrial setting.They proposed Riskit, a systematic risk management method, and showed that Riskit provides benefit for the risk

∗Corresponding author. Tel.: +91-9334130036.E-mail address: [email protected]

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).Peer-review under responsibility of organizing committee of the Eleventh International Multi-Conference on Information Processing-2015 (IMCIP-2015)

http://crossmark.crossref.org/dialog/?doi=10.1016/j.procs.2015.06.041&domain=pdf

http://crossmark.crossref.org/dialog/?doi=10.1016/j.procs.2015.06.041&domain=pdf

354 Chandan Kumar and Dilip Kumar Yadav / Procedia Computer Science 54 ( 2015 ) 353 – 361

management team with acceptable costs. McConnell4 stated that 50–70% chance of project success will increase ifonly 5% of the total project budget is expensed on risk management.

In literature, it is found that subjective analysis or expert judgment is one of the methods which are generally usedin project risk management5. These types of method are based on the experience of an expert, but the experience of theexpert is not readily shared among different teams within an organization. Therefore, it is critical to develop perfectmodeling techniques that can provide more objective, repeatable, and observable decision – making support for riskmanagement. Among various existing modeling techniques, the Bayesian belief network (BBN) has concerned6–8.BBN has excellent ability in representing and reasoning with uncertainties. The correlation among risk factor isof greater interest among industry experts in software project risk planning because it can determine the causalfactor that directly affect project outcomes. Most of the software project risk, analysis research focused on thediscovery of correlations between risk factors and project outcomes9–12. In this paper, a model is proposed forsoftware risk assessment using the top ranked software risk indicator9. Our main objectives are: First, to design thecausal relationships among the top ranked software risk indicator using BBN. Second, constructing an empiricalBBN model for software project risk analysis based on software risk factor, which can be used in software riskassessment.

The rest of the paper is organized as follows. Section 2 provides a brief introduction of BBN. Section 3 describes theproposed model and the modeling concept. Section 4 presents the experiment and results. Finally, Section 5 representsthe conclusion and limitations of the proposed model.

Nomenclature

RS RS Requirement stabilityRC Requirement clarityRD Requirement dependenceRCom Requirement complexityRL Reuse levelIL Interface levelNPL Nos. of programming languagePS Product stabilityDIS Difficult level to implement securityEDP Experience on the development processDIA Development infrastructure availabilityDSA Development software availabilityPMEL Project manager experience level on managingPDF Project dependence levelML Maturity levelM−L Motivation levelERO Effective role of organizationTF Team focusTO Turn overTKL Team knowledge levelTEL Team experience levelTS Team sizeP−S Project sizeFF Financial feasibilityEDL External dependence levelCE Client experienceCPL Client participation level


2. Bayesian Belief Network

A Bayesian belief network (BBN) models the causal relationships of a system or dataset and provides a graphicalrepresentation of this causal structure through the use of directed acyclic graphs (DAGs) with nodes and edges. TheDAG representation, then provides a framework for inference and prediction. The nodes represent random variableswith probability distributions, while edges represent weighted causal relationships between the nodes. Each node has aprobability of having a certain value. A directed edge exists from a parent to a child. Each child node has a conditionalprobability table based on parental values. Bayesian belief network is based on Bayes’ Theorem, developed by theRev. Thomas Bayes, an 18th century mathematician13, and it is expressed as:

P(R/S) = P(S/R)P(R)

P(S)(1)

Equation 1 is the basic form of Bayes rule. Bayes rule is interpreted in terms of updating the belief (posteriorprobability of each possible state of a variable, that is, the state probabilities after considering all the available evidence)about a hypothesis R in the light of new evidence S. So the posterior belief P(R/S) is calculated by multiplying theprior belief P(R) by the likelihood P(S/R) that S will occur if R is true.

A BBN consists of two parts 1) Qualitative part: It represents the relationships among variables by the way of adirected acyclic graph, and 2) Quantitative part: It specifies the probability distributions associated with every node ofthe model. An example of BBN for taking the decision of purchasing the smart phone is shown in Fig. 1. The qualitativepart, i.e. the causal relationships between the 4 random variables (nodes) specifications (S), brand (B), performance ofsmart phone (P) and decision to purchase (D) is made. The casual relationships are: S → P , B → P and P → D.

The quantitative part of BBN i.e. the probability distribution associated with every node. Each node has a set ofpossible values called its state space. Here every node has two states like: specifications have two states: ‘high’ and‘low’; brand has two states: ‘good’ and ‘bad’; performance of smart phone has two states: ‘high’ and ‘low’; anddecision for purchasing has two states: ‘yes’ and ‘no’. For each node, there is the need to specify the node probabilitytable (NPT). Figure 1 also shows the NPTs of — P(S), P(B), P(P|S, B), and P(D|P). When the evidence is appliedto the network decision is made. For example, if the smart phone specification is “high” and the brand of smart phoneis “good” then the performance of smart phone will be “high” similarly if the performance of smart phone is “high”then the decision of purchasing the smart phone will be “yes”.

A BBN has several advantages:

i) It is a probabilistic (% chance) approach.ii) BBN can be developed with little data and quickly.

iii) BBN handles the situations where some data entries are missing or unavailable.iv) BBN can be used to model causal relationships.v) Expert data can be easily incorporated to the BBN.

vi) Group model – building can also be developed.vii) The update is easy whenever new knowledge is available.

3. Proposed Methodology

The architecture of the proposed model is shown in the Fig. 2. In the proposed model, the level of productengineering is predicted based on the measure of requirement specification, design and implementation, integration ofsoftware and hardware components and tests. Similarly the level of development environment is predicted based onthe measure of the development process, development system, management process, management methods and workenvironment. The level of program constraint is predicted based on the measure of resources, contract and programinterfaces.

The following steps are involved in this proposed methodology:

A. Selection of top ranked software risk indicator metrics in software project development.B. Construct the causal relationships among the software risk indicator metrics.


Fig. 1. An example of BBN.

Fig. 2. Proposed model.

C. Construct the node probability table for each node (metrics) of the model.D. Calculate the probability value of software risk for the project.

3.1 Selection of top ranked software risk indicator metrics

A number of software risk assessment and estimation model using the software risk factor has been proposed13–15.The assessment and estimation of software risk from these models may be useful for software project management.Almost all existing software risk assessment and estimation model has considered numbers of software risk factorsamong these risk factors some less important. However, assessment and estimation of software risk by taking all therisk factor have some drawbacks like: computationally complex, more expensive processing cost. Selection of mostimportant software risk factor could improve the assessment and estimation accuracy. In relevance to this issue, JulioMenezes et al.9 run a survey through questionnaire among the professionals and academics with expertise in projectmanagement and risk management in the area of software engineering. We have selected the top ranked software riskfactor from this survey and reproduced in Table 1.

3.2 Construct the causal relationships among the software risk indicator metrics

In BBN, causal relationships among the nodes can be constructed from historical data, experimental observationand with the help of domain expert. Here, causal relationships among the software risk indicator metrics is constructedbased on the software development risk Taxonomy16. The complete proposed model is constructed with the help ofNetica tool22. The constructed complete proposed model is shown in Fig. 3.


Table 1. Top software risk indicator of software development.

Fig. 3. Causal relationships of software risk indicators.

3.3 Construct the node probability table for each node (metrics) of the model

In BBN, the causal relationships between variables are defined by probability functions that receive input as aset of values of the parent nodes and calculate the given node’s probability. These probability functions are commonlyrepresented by tables – namely, node probability tables (NPTs). Designing the NPT data is one of the fundamentalissues associated with the BBN. There are no guidelines or rules that can be used to develop the NPT data that isappropriate for all types of problems. For example, manually defining NPT for Bayesian belief networks is a complextask and takes exponentially large effort. Several methods have been proposed in the literature17–20 to reduce thiscomplex task of defining NPT manually. In fact, Node information is stored in the domain expert in the form ofknowledge that can be determined through the qualitative value of the software metrics in the form of low, mediumand high and from the qualitative value of software metrics the corresponding probability can be generated with less


Fig. 4. NPT of work environment.

effort. Therefore we have constructed the node probability table for all the nodes using the qualitative value of softwaremetric, experimental observations and by the domain expert21. Constructed NPT of node work environment (W−E) isshown in Fig. 4.

4. Experiments and Results

In order to estimate the software risk from the proposed model 12 software projects data sets are used. The softwareprojects data are shown in Table 2 where the qualitative value of considered software risk indicators is represented interms of low(L), medium(M) and high(H). We applied the project data to the proposed model. The model producedthe probability value of software risk. The result is shown in Table 3 from these probability values we can identify thehigh risky or low risky projects.

The result is shown in Table 3 from these probability values we can identify the high risky or low risky projects.From the Fig. 5, it can be observed that the project no. 1, 4, 6, 8, and 11 are the high risky projects whereas projectno. 3, 7 and 10 are the medium risky projects and project no. 2, 5, 9 and 12 are the low risky projects. The probabilisticvalue of software risk will help the software project manager in decision making for example, if probability is: Low –very unlikely that the project will lead to fail. Medium – There is a 50–50 chance that the project will lead to fail.High – The chances is very high that the project will lead to fail.

The proposed model of software risk assessment and estimation does not depend on any specific software costestimation model and it generates the overall probability of risk for the software projects.

4.1 Model validation

The estimated value of software risk and actual value of the software risk is shown in Table 4. To validate theproposed model, commonly used and suggested evaluation measures23 have been taken:

• Mean Magnitude of Relative Error (MMRE): MMRE is the mean of absolute percentage errors and a measure ofthe spread of the variable z, where z = estimate/actual

MMRE = 1

n

n∑

i=1

|Yi − Yi |Yi

(2)

= 0.03842

where Yi is the actual value and Yi is the estimated value of variable of interest.• Balanced Mean Magnitude of Relative Error (BMMRE): MMRE is unbalanced and penalizes overestimates more

than underestimates. For this reason, a balanced mean magnitude of relative error measure is also considered


Table 2. Assessment of software projects.

Table 3. Probability value of software risk.

Project no. Low Medium High

1 0.27 0.325 0.4052 0.857 0.096 0.0473 0.177 0.517 0.3064 0.046 0.166 0.7885 0.812 0.106 0.0826 0.252 0.365 0.3837 0.197 0.576 0.2378 0.112 0.210 0.6789 0.783 0.119 0.09810 0.216 0.467 0.31711 0.137 0.245 0.61812 0.737 0.145 0.118

which is as follows:

B M M RE = 1

n

n∑

i=1

|Yi − Yi |min(Yi , Yi )

(3)

= 0.03911

The lesser value of MMRE and BMMRE indicates better accuracy of prediction.


Fig. 5. Project wise software risk graph.

Table 4. Estimated value of software risk.

Project no. Risk level Actual risk Estimated risk

1 High 0.4 0.4052 Low 0.8 0.8573 Medium 0.5 0.5174 High 0.75 0.7885 Low 0.82 0.8126 High 0.4 0.3837 Medium 0.55 0.5768 High 0.70 0.6759 Low 0.75 0.78310 Medium 0.5 0.46711 High 0.6 0.61812 Low 0.75 0.737

5. Conclusion

A probabilistic software risk assessment and estimation model is proposed. The model is easy to estimate theprobability value of software risk with help of the qualitative value of software risk indicator. In this work we haveapplied the BBN approach to construct the model as well as to calculate the probability value of software risk.Estimation of software risk focuses on top ranked software risk indicator of software development risk taxonomy andtheir causal relationships. This model is different from existing models because the existing model does not considerthe uncertainty of software risk indicator. The model is evaluated by MMRE and BMMRE and it has been found thatestimation accuracy is much better.

References

[1] Barry W. Boehm, Software Risk Management: Principles and Practices, IEEE Software, vol. 8(1), pp. 32–41, (1991).[2] Michael Dedolph, The Neglected Management Activity: Software Risk Management, Bell Labs Technical Journal, vol. 8(3), pp. 91–95,

(2003).[3] Bernd Freimut, Susanne Hartkopf, Peter Kaiser, Jyrki Kontio and Werner Kobitzsch, An Industrial Case Study of Implementing Software

Risk Management, In Proceedings of the 8th European Software Engineering Conference Held Jointly with 9th ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering, ESEC/FSE-9, pp. 277–287, (2001)

[4] S. McConnell, Software Project Survival Guide: How to Be Sure Your First Important Project Isn’t Your Last, Microsoft Press, Redmond,WA, (1997).

[5] S. Du, M. Keil, L. Mathiassen, Y. Shen and A. Tiwana, Attention-Shaping Tools, Expertise, and Perceived Control in IT Project RiskAssessment, Decision Support Systems, vol. 43(1), pp. 269–283, (2007).

[6] C. Fan and Y. Yu, BBN-Based Software Project Risk Management, Journal of Systems and Software, vol. 73(2), pp. 193–203, (2004).


[7] Lee, Y. Park and J. G. Shin, Large Engineering Project Risk Management Using a Bayesian Belief Network, Expert Systems with Applications,vol. 36(3), pp. 5880–5887, (2009).

[8] Yong Hu, Xiangzhou Zhang, E. W. T. Ngai, Ruichu Cai and Mei Liu, Software Project Risk Analysis Using Bayesian Networks with CausalityConstraints, Decision Support Systems, vol. 56, pp. 439–449, (2013).

[9] Julio Menezes Jr., Cristine Gusmao and Hermano Moura, Defining Indicators for Risk Assessment in Software Development Projects, CLEIElectronic Journal, vol. 16(1), (2013).

[10] J. Drew Procaccino, J. Verner, S. Overmyer and M. Darter, Case Study: Factors for Early Prediction of Software Development Success,Information and Software Technology, vol. 44(1), pp. 53–62, (2002).

[11] J. Jiang and G. Klein, Software Development Risks to Project Effectiveness, Journal of Systems and Software, vol. 52(1), pp. 3–10, (2000).[12] L. Wallace, M. Keil and A. Rai, Software Project Risks and their Effect on Outcomes, Communications of the ACM, vol. 47(4), pp. 68–73,

(2004).[13] Mohd Sadiq and Mohd Shahid, A Systematic Approach for the Estimation of Software Risk and Cost using EsrcTool, CSIT, vol. 1(3),

pp. 243–252, (2013).[14] Vittorio Cortellessa, Katerina Goseva-Popstojanova, Ajith R. Guedem, Ahmed Hassan and Walid Abdelmoez, Model-Based Performance

Risk Analysis, IEEE Transactions on Software Engineering, vol. 31(1), pp. 3–20, (2005).[15] Shareeful Islam, Software Development Risk Management Model - A Goal-Driven Approach, Technical Report, (2012).[16] M. J. Carr, S. L. Konda, I. Monarch, F. C. Ulrich and C. F. Walker, Taxonomy Based Risk Identification, Software Engineering Institute,

Carnegie Mellon University, USA, (1993).[17] K. Huang and M. Henrion, Efficient Search-Based Inference for Noisy-OR Belief Networks, Twelfth Conference on Uncertainty in Artificial

Intelligence, pp. 325–331, (1996).[18] F. J. Dıez, Parameter Adjustment in Bayes Networks: The Generalized Noisy Or-Gate, Proc. Ninth Conference on Uncertainty in Artificial

Intelligence, D. Heckerman and A. Mamdani, eds, pp. 99–105, (1993).[19] B. Das, Generating Node Probabilities for Bayesian Networks: Easing the Knowledge Acquisition Problem, arXiv preprint cs/0411034,

(2004).[20] N. E. Fenton, M. Neil and J. S. Caballero, Using Ranked Nodes to Model Qualitative Judgments in Bayesian Networks, IEEE Transactions

on Knowledge and Data Engineering, vol. 19, pp. 1420–1432, (2007).[21] C. Kumar and D. K. Yadav, A Method for Developing Node Probability Table Using Qualitative Value of Software Metrics, Proc. of 3rd

International conference on Computer, Communication, Control and Information Technology (C3IT), Hooghly, India, 7–8 Feb., IEEE Xplore;pp. 1–5, (2015).

[22] Netica, Available at http://www.norsys.com, (2010).[23] S. Chulani, B. Boehm and B. Steece, Bayesian Analysis of Empirical Software Engineering Cost Models, IEEE Transaction on Software

Engineering, vol. 25, pp. 573–583, (1999).

A Probabilistic Software Risk Assessment and Estimation ...The assessment and estimation of software...

Documents

Transcript of A Probabilistic Software Risk Assessment and Estimation ...The assessment and estimation of software...