Second Replicated Quantitative Analysis

8/10/2019 Second Replicated Quantitative Analysis


A SECOND REPLICATED QUANTITATIVE ANALYSIS OF FAULT DISTRIBUTIONS IN COMPLEX SOFTWARE

Tihana Ga

runeson,D



INTRODUCTION

Software Engineering

Importance of replication

Pareto Principle of fault distributions

Effects of difference in time on hypotheses



PARETO PRINCIPLE

80% of effects due to 20% of causes



BACKGROUND

Hypotheses analyzed in four groups

Related to Pareto principle of fault distribution

Related to persistence of faults

About effects of module size and complexity on fault proneness

About the quality in terms of fault densities



CONTEXT OF STUDY

Ericsson's product

Empirical data from five projects

Sequential releases of complex large scale telecommunication prod

Analyzed part: application part

Written in Programming Language for Exchanges (PLEX)



TESTING ACTIVITIES

Function Test (FT): Performed locally

System Test (ST): Performed by System Integration and Verification Ce

Site Test: Performed by Network Integration and Verification Organ

Operation: Failures during product operation



DATA COLLECTION

Passively collect data from several resources

Information about modules: quality reports

Information for each module

o Module name

o Identity and Revision

o Modified and Total size of code

o Number of faults during unit verification

Trouble Reports
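
The per-module data listed above could be held in a small record type. The following is a minimal sketch, not the study's actual tooling: the field names, the separate `TroubleReport` type, and the phase labels ("FT", "ST", "SITE", "OPERATION") are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleRecord:
    """Per-module information from the quality reports (hypothetical field names)."""
    name: str
    identity: str
    revision: str
    modified_loc: int              # modified size of code
    total_loc: int                 # total size of code
    unit_verification_faults: int  # faults found during unit verification


@dataclass(frozen=True)
class TroubleReport:
    """One trouble report; the phase labels are assumed, not taken from the source."""
    module_name: str
    phase: str  # e.g. "FT", "ST", "SITE", "OPERATION"


def faults_per_module(reports, phase):
    """Count trouble reports per module for a single testing phase."""
    return Counter(r.module_name for r in reports if r.phase == phase)


# Made-up example data, for illustration only.
reports = [
    TroubleReport("mod_a", "FT"),
    TroubleReport("mod_a", "ST"),
    TroubleReport("mod_b", "FT"),
    TroubleReport("mod_a", "FT"),
]
ft_counts = faults_per_module(reports, "FT")
```

Counting reports per module and per phase gives the fault counts that the later hypotheses compare.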



DATA ANALYSIS AND RESULTS

Hypotheses analyzed

Results for each group of hypotheses discussed

Relation to other studies elaborated



TERMINOLOGIES

Rel n, Rel n+1, Rel n+2, Rel n+3, Rel n+4: Projects during sequential releases

Number of units

Number of faults

Type of study: Original, Previous replicated study, This replicated study



HYPOTHESES RELATED TO PARETO PRINCIPLE

Hypothesis: A small number of modules contain most of the faults detected during prerelease testing

Figure 1 - Modules vs % of prerelease faults
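
A curve like the one in Figure 1 can be sketched by sorting modules from most to least faulty and accumulating their share of faults. This is a minimal illustration under assumed inputs: the function names and the example fault counts are hypothetical and are not data from the study.

```python
def pareto_curve(fault_counts):
    """Return (module share %, cumulative fault share %) pairs, most faulty modules first."""
    ordered = sorted(fault_counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no faults recorded")
    curve = []
    running = 0
    for rank, faults in enumerate(ordered, start=1):
        running += faults
        curve.append((100.0 * rank / len(ordered), 100.0 * running / total))
    return curve


def module_share_for(fault_counts, target_pct):
    """Smallest percentage of modules that accounts for at least target_pct of faults."""
    for module_pct, fault_pct in pareto_curve(fault_counts):
        if fault_pct >= target_pct:
            return module_pct
    return 100.0


# Made-up prerelease fault counts for ten modules.
example_faults = [40, 25, 15, 5, 5, 4, 3, 2, 1, 0]
share = module_share_for(example_faults, 80)
```

In this made-up example three of the ten modules carry 80% of the faults, which is the kind of concentration the hypothesis predicts.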



PARETO PRINCIPLE HYPOTHESIS 2

Hypothesis: If a small number of modules contain most of the postrelease faults, then it is because these modules constitute most of the size.

100% of postrelease faults: modules constituting 50, 88, 92, 50 and [...]% of system size

80% of faults: modules constituting 26, 39, 43, 28 and 22% of system size
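
The size shares quoted above answer the question "how much of the system do the faultiest modules make up?". A minimal sketch of that calculation follows, assuming each module is given as a (fault count, size) pair; the function name and the example values are hypothetical.

```python
def size_share_of_faulty_modules(modules, target_pct):
    """Percentage of total size held by the most faulty modules that together
    cover at least target_pct of all faults.

    modules: list of (fault_count, size) tuples.
    """
    ordered = sorted(modules, key=lambda m: m[0], reverse=True)
    total_faults = sum(f for f, _ in ordered)
    total_size = sum(s for _, s in ordered)
    if total_faults == 0 or total_size == 0:
        raise ValueError("need at least one fault and a non-zero total size")
    covered_faults = 0
    covered_size = 0
    for faults, size in ordered:
        # Stop as soon as the modules taken so far reach the target fault share.
        if 100.0 * covered_faults / total_faults >= target_pct:
            break
        covered_faults += faults
        covered_size += size
    return 100.0 * covered_size / total_size


# Made-up (postrelease faults, size) pairs for three modules.
example_modules = [(8, 300), (2, 100), (0, 600)]
share_80 = size_share_of_faulty_modules(example_modules, 80)
share_100 = size_share_of_faulty_modules(example_modules, 100)
```

If the resulting size share is much smaller than the fault share, size alone does not explain the concentration of faults.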



HYPOTHESIS RELATED TO PERSISTENCE OF FAULTS

Hypothesis: Higher incidence of faults in FT implies higher incidence of faults in ST

Scatter plots: relation of FT faults and ST faults

Pearson correlation coefficients r = 0.86, 0.82, 0.96, 0.83, 0.94 indicate strong correlations
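
For reference, the Pearson coefficient between per-module FT and ST fault counts can be computed as below. This is a plain-Python sketch of the standard formula; the function name and the example counts are made up and do not reproduce the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-module fault counts in function test (FT) and system test (ST).
ft_faults = [1, 2, 3, 4, 5]
st_faults = [2, 4, 6, 8, 10]
r = pearson_r(ft_faults, st_faults)
```

Values of r close to 1, such as those reported above, mean that modules with many FT faults also tend to have many ST faults.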



HYPOTHESES ABOUT EFFECTS OF MODULE SIZE

Hypotheses that failed

1. Smaller modules are less likely to be failure-prone than larger ones: correlation between total number of faults and total volume

2. Size metrics are good predictors of prerelease faults in a module: correlation coefficients of LOC vs. prerelease faults are low

3. Size metrics are good predictors of postrelease faults in a module: scatter plots of LOC vs. postrelease faults do not reveal any relationship

4. Size metrics are good predictors of a module's prerelease fault density: linear relationship between size and fault count not observable
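
One way to check whether a linear relationship between size and fault count is observable is a least-squares line fit with its coefficient of determination (R squared). The sketch below is an illustration only: it is not stated in the source that the study used this exact procedure, and the function name and example values are hypothetical.

```python
def linear_fit(sizes, faults):
    """Ordinary least-squares fit faults = intercept + slope * size.

    Returns (slope, intercept, r_squared).
    """
    if len(sizes) != len(faults) or len(sizes) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(faults) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("all sizes are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, faults))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in faults)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(sizes, faults))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 0.0
    return slope, intercept, r_squared


# Made-up module sizes (LOC) and fault counts with no clear linear trend.
slope, intercept, r_squared = linear_fit([1, 2, 3, 4], [2, 1, 2, 1])
```

An R squared near zero means the fitted line explains little of the variation in fault counts, which is what a failed size-based predictor looks like.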



HYPOTHESES ABOUT QUALITY IN TERMS OF FAULT DENSITY

Hypothesis: Fault densities at corresponding phases of testing and operation remain roughly constant between subsequent major releases of a software system

Fault density = Total number of faults / Total volume of code

Fault densities: remain approximately the same

Consistent results: indicate that the process is stable and repeatable

Fault densities decrease as system matures
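
The fault density formula above can be applied per release and per phase. The sketch below assumes code volume is measured in KLOC, which the slides do not state, and uses made-up fault counts and volumes; the function names are hypothetical.

```python
def fault_density(total_faults, total_volume):
    """Fault density = total number of faults / total volume of code.

    The unit of volume (assumed here to be KLOC) is not given in the source.
    """
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return total_faults / total_volume


def density_spread(densities):
    """Ratio of the largest to the smallest density; 1.0 means perfectly constant."""
    values = list(densities)
    if not values or min(values) <= 0:
        raise ValueError("need positive densities")
    return max(values) / min(values)


# Made-up (faults, volume in KLOC) per release for one testing phase.
releases = {
    "Rel n": (120, 400.0),
    "Rel n+1": (110, 420.0),
    "Rel n+2": (100, 440.0),
}
densities = {rel: fault_density(f, v) for rel, (f, v) in releases.items()}
spread = density_spread(densities.values())
```

A spread close to 1.0 across releases would support the claim that densities remain roughly constant; how close is close enough is a judgment the sketch leaves open.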



STRENGTHS

Real time experiments

Hypotheses based on general metrics

Most hypotheses turn out to be true

Data analyzed in detail



WEAKNESSES

Size-related predictors are not good enough

Not all factors considered when calculating fault densities

All hypotheses related to module-size failed

Programming languages not considered



QUESTIONS?
