Software Reliability Toolkit Tutorial
Ann Marie Neufelder
SoftRel, LLC
www.softrel.com
© Softrel, LLC 2016
This presentation may not be copied in part or in whole without written permission from Ann Marie
Neufelder
Help
Every worksheet has at least one online help file link to guide you through the toolkit.
Opening the toolkit
The toolkit is a macro-enabled spreadsheet.
Opening the toolkit• Prior to launching the software reliability toolkit you must
• Have a recent version of Microsoft Excel
• Make sure that the zip file is unzipped to c:/SWRT folder (note the files that should be extracted in the below figure)
• Enable macros in Microsoft Excel
• Activate the license
• Then launch the toolkit by simply selecting one of the macro enabled files and opening it with Microsoft Excel
Step 1. Predict size
The more effective code you have, the more that can go wrong.
More code means more defects which means a higher failure rate.
Step 1. Predict size
• Back in the 1960s software systems could be measured in 100s of lines of code.
• Today they are measured in millions if not 10s of millions lines of code.
• The size of software keeps getting bigger because systems become more and more intelligent.
• With increased intelligence comes more failures due to software.
• Size prediction is the first step because it is a required input for predicting any software reliability figure of merit.
• There are 2 types of software components.
• In-house developed – this method is used for any component for which the source code is available.
• Commercially developed – this method is used for any component for which only the installation is available.
Identify which software components to include in prediction
• These components are applicable for a SW reliability prediction:
  • All software that will be deployed on this system
  • All firmware that will be deployed on this system that is configurable and otherwise not represented in a hardware prediction
  • The Operating System (this will be a COTS component)
• These components don’t always exist in every system, but if they do exist and are deployed with the system they can be included in the prediction:
  • Any Middleware (this will be a COTS component)
  • Government Furnished Software (GFS)
  • FOSS (Free and Open Source Software)
  • COTS (Commercial Off the Shelf Software)
• Typically these components are not included in a SW reliability prediction:
  • Software that is not deployed with the system, such as compilers, development tools, etc.
  • BIOS (it is usually deterministic and will either work or not work by the time the software is deployed)
  • Firmware that is not configurable (it should be included in hardware predictions)
• Use the “In-house” size prediction tab when the KSLOC estimates are available
• Use the “COTS” size prediction when the vendor is not supplying the source code or the KSLOC estimates
Step 1. Predict size->In-house components size
1. Go to the “1- in-house size” worksheet.
2. Identify all components developed in house as well as any vendor supplied components in which the source code is part of the deliverable.
3. Enter the name of the software component and the organization responsible for developing that component.
Step 1. Predict size->In-house components size
4. For each component, identify the language that it is being developed in. The choices are:
• Assembler – this is most likely only used for low level firmware
• Second generation – this includes C, Fortran, Ada 83, and Basic
• Object oriented – this includes C++, C#, Java, Ada 9x, Visual Basic
• Hybrid – this is a mix of second generation and object oriented. This is common for legacy systems.
Step 1. Predict size->In-house components size
5. For each component, identify the predicted amount of new, modified and reused code in terms of KSLOC (1000 executable source lines of code).
• New – this code has not been deployed
• Modified – code that has been deployed and is being modified
• Reused – code that has been deployed will be used but not modified
Note that there are several size prediction methods and tools available which are outside of the scope of Frestimate.
Step 1. Predict size->In-house components size
6. Once the new KSLOC, modified KSLOC and reused KSLOC are identified, supply a weighting factor for the modified and reused KSLOC.
Weighting factor for new code is 1 or 100%. That means that all new code is 100% effective which also means that 0% of the code has been tested or deployed.
The bigger the weighting factor the more extensive the changes are expected to the code.
Minor changes will have a small weighting factor such as 5-10% (.05 to .10) while extensive changes may have up to 100% (1) weighting factor.
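The effective-size arithmetic described in steps 5 and 6 can be sketched as below; the weighting factors passed in are illustrative examples, not toolkit defaults:

```python
def effective_ksloc(new, modified, reused, w_modified, w_reused):
    """Effective size: new code counts at 100% (weighting factor 1),
    while modified and reused code are discounted by their weighting
    factors (0 to 1, per the extent of the changes)."""
    return new * 1.0 + modified * w_modified + reused * w_reused

# 10 KSLOC new, 20 KSLOC with minor changes (10%), 100 KSLOC reused (5%)
eksloc = effective_ksloc(10, 20, 100, w_modified=0.10, w_reused=0.05)
# 10 + 2 + 5 = 17 EKSLOC
```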
Step 1. Predict size->In-house components size
7. For each component, determine if there is any auto-generated code that will be deployed for this system.
Auto-generated code is often free of coding-related defects but not necessarily free of requirements-related defects; i.e., the auto-generated code can perform the wrong function perfectly.
Predict the auto-generated KSLOC and the weighting factor for effectiveness. Typically the weighting factor is lower than for modified code and may even be lower than for reused code.
Step 1. Predict size->In-house components size
8. For each component, determine the confidence of the predicted KSLOC values. The confidence is used to determine the upper and lower size bounds. The bigger the confidence, the wider the upper and lower size bounds.
Since the size estimate is a predicted value, it is important to establish the confidence in this prediction.
This can be determined by computing the relative accuracy of prior size predictions. For example, if the last increment or release 1000 KSLOC was predicted but the actual size was 1200 KSLOC then the confidence is .2 or 20%.
The closer the code is to being complete the more confidence there is in the size predictions – assuming that the size predictions are regularly updated. Once the code is complete, the size of it can be measured and at that point the KSLOC is an actual versus predicted value.
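A minimal sketch of the confidence computation described above, using the slide's 1000-vs-1200 KSLOC example:

```python
def size_confidence(predicted_ksloc, actual_ksloc):
    """Relative accuracy of a prior size prediction; used as the
    confidence input for the current prediction (0 = perfect)."""
    return abs(actual_ksloc - predicted_ksloc) / predicted_ksloc

# slide example: 1000 KSLOC predicted, 1200 KSLOC actual -> .2 (20%)
confidence = size_confidence(1000, 1200)
```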
In-house Component EKSLOC Summary
Scroll to the right and view the computed effective normalized KSLOC, which is combined with the defect density predictions to yield the predicted total fielded and testing defects.
The code expansion is determined by the language selected and is used to normalize the EKSLOC.
Step 1. Predict COTS Components Size
1. Go to the “1- COTS size” worksheet.
2. Identify all Commercial Off The Shelf (COTS) components.
3. Enter the name of the component and the vendor who is supplying that component.
Step 1. Predict COTS Components Size
4. Install the COTS software using the exact installation configuration that is planned for deployment. Identify all application files, .exe files and .dll files.
Measure the number of KB of each of these files and enter it in the third column.
Some COTS LRUs may have several of these files so combine the sizes as needed.
Make sure you compute the size in KB, not MB or GB!
Step 1. Predict COTS Components Size
5. Assess how many total customers this vendor has for this particular application.
• If the software is mass produced then select A.
• If this COTS package is distributed to 1,000 sites or fewer then select B.
• If this package is distributed in very low volume then select C.
Step 1. Predict COTS Components Size
6. Assess how many months this particular version and edition of this COTS package will have been fielded by the time it is fielded with your software application.
For example, you plan to ship the software system in 12 months with the current edition of an Operating System that was just released. In that case you would enter 12 in this column for this COTS component.
Step 1. Predict COTS Components Size
7. Estimate the confidence of your size prediction.
This input field is what determines the upper and lower bounds for the size prediction.
If you have the COTS software installed, this confidence can be set to 0.
However, if you do not have the COTS software installed and are relying on estimates from the vendor, input at least .5 until the actual size of the COTS installation is known.
The confidence is expressed as a fraction of the predicted size and can be greater than 1.
COTS Component EKSLOC Summary
Scroll to the right and see the computed normalized effective KSLOC for each COTS component.
These will be multiplied by the associated predicted defect density for each component to yield the predicted testing and fielded defects for each COTS component.
Step 2. Predict defect density
Software doesn’t fail as a function of calendar time. Hence, failure rate cannot be predicted in one step because the operational time across different application types can be dramatically different.
First, the defects that will become failures are predicted.
Then the rate at which those defects become failures is predicted.
Step 2. Predict defect density
• Defect density provides for a normalized measure
• It can be used to compare projects of different duty cycle and different size.
• Smaller projects will have fewer defects than larger projects so it is difficult to predict the defects in a software product without some normalized measure.
• Defect density can be measured with respect to any size measure, but in industry it is typically measured as defects per EKSLOC (1000 Effective Source Lines of Code).
• Defect density can be measured at a few different milestones as shown next
Step 2. Predict Defect Density
There are 2 major milestones that defect density can be measured with respect to.

Defect density milestone – How it is used
• Start of testing – Predicts the total number of defects that will be found during a software system level test. This does not include defects found during code reviews, unit testing, software integration testing, or field usage.
• Field – Predicts the total number of defects that will be found once the software is in operational use, continuing to the end of the growth for this particular software version. Typically this can be from 2 to 8 years depending on how many installed sites and end users are using the software. This defect density prediction does not take into account any defects that will be introduced by subsequent new features or new releases. How to handle subsequent releases will be discussed in another section.
• Combined – The above 2 defect densities can be added to determine how many total defects will be found from the start of testing onward.
Overview of available models to predict defect density

• Industry/application type – A lookup table of industry and application types. The application types covered span nearly every industry. Number of inputs: one.
• CMMi Level – A lookup table of CMMi levels. Not application specific. Number of inputs: one.
• SoftRel Shortcut Model – This model has several parameters and is used for more precise predictions as well as performing tradeoffs and planning improvements. The application types covered span nearly every industry as well as nearly every type of software (firmware, high level software, etc.). Number of inputs: 23.
• SoftRel Full-Scale Model – This model has several parameters and is used for more precise predictions as well as performing tradeoffs and planning improvements. The application types covered span nearly every industry as well as nearly every type of software (firmware, high level software, etc.). Number of inputs: 96-299.
• Historical Model – You must have historical defect density data from a similar type and scope of project to use this approach. The data is collected from your industry/products.
• Rome Laboratory model – Based on several development and test practices which are still in use even 20 years after the model was developed. The SoftRel models were originally based on the Rome Laboratory model, which has not been updated since 1992. Applicable to aircraft/airborne systems, but can be adapted for any industry with some modifications. Number of inputs: 44.
• Prediction using Closest Match from SoftRel database – This method allows you to compare your SoftRel survey inputs to those in our database and select the project from our database that has the most similar set of responses to your project. Not industry specific. Number of inputs: same as the full-scale model.
Step 2. Predict defect density->Lookup Tables
• Transition to the worksheets named “2- Application” and “2 – CMMi”.
• These lookup tables have only 1 input and consequently are usually the least accurate prediction.
• These pull-down menus are for informational purposes. Later in step 3, you will have the opportunity to select these models and the appropriate application type or CMMi.
Step 2. Predict defect density->Shortcut
• Transition to the worksheet named “2- Shortcut Survey”.
• The shortcut model has 23 questions. Most of the questions can be answered by persons who have access to a software development plan. None of the questions require knowledge of the design or code.
Shortcut Model Inputs
• View each question one at a time.
• Use the scroll to understand what the question is asking.
• Each question has a specific list of criteria that needs to be met in order to answer affirmatively.
• Answer the question based on what is planned for this particular software release.
• 16 questions are opportunities while 7 questions are related to risks.
• The "net" result is the number of opportunities - number of risks.
• The "net" result is used to predict one of three "Percentile groups"
Response pulldown menu
• 0 – The answer is not known or is no
• .5 – Somewhat, or only some criteria for the question are met
• 1 – Yes; all of the time, and all criteria and prerequisites for the question are met
Step 2. Predict defect density->Shortcut
• The number of opportunities and risks predicts the defect density percentile group:
  • 25% – better than average
  • 50% – average
  • 75% – distressed
• The predicted defect density is a function of the predicted percentile group.
• Both testing and fielded defect density are predicted.
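The shortcut scoring described above can be sketched as follows; the cutoff values mapping the net score to a percentile group are hypothetical placeholders, since the toolkit's calibrated thresholds are not given here:

```python
def shortcut_percentile(opportunity_scores, risk_scores):
    """Each of the 16 opportunity and 7 risk questions is answered
    0, .5 or 1 via the response pulldown. The 'net' result is
    opportunities minus risks and selects a percentile group.
    The cutoffs below are hypothetical placeholders, not SoftRel's
    calibrated values."""
    net = sum(opportunity_scores) - sum(risk_scores)
    if net >= 12:        # hypothetical cutoff
        return 25        # better than average
    if net >= 5:         # hypothetical cutoff
        return 50        # average
    return 75            # distressed
```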
Step 2. Predict defect density->Full-scale
Transition to the “2 – Full-scale survey”. It has 94 questions. When answering each question:
• Confirm that the required evidence to support an affirmative answer has been met.
• Confirm that all applicable personnel meet all of the criteria for an affirmative response.
• Confirm that the prerequisites for the particular question have been met.
Step 2. Predict defect density->Full-scale
Once the survey is complete, scroll to the top of the worksheet and view the
results. There are 16 categories that comprise the total score which is used to
predict one of the seven defect density percentile groups and the fielded and
testing defect density.
Step 2. Predict defect density->Full-scale
Scroll to the right to see the statistics behind each survey question. Each survey question is matrixed to 5 other software assessments. Notice that some survey questions are related to other assessments while some are not.
Step 2. Predict defect density->Full-scale
Scroll to the right and see how the organizations in our database answered the
same questions. The number of yes and no responses are shown for each
question.
Step 2. Predict defect density->Full-scale
Scroll to the right and see how the organizations within each of the 7 percentile
groups in our database answered the same questions. The 3% group is the world
class group while the 97% is the distressed group. Generally, there is a trend in
the percentage of affirmative responses among the 7 groups for each question.
Three Full-scale models

• Full-scale model – 94 questions. Usually the software development plan is sufficient to answer them. Worksheet: “2- Full-scale Survey”.
• Full-scale model B – 132 new questions plus 69 questions from the Full-scale model. Requires knowledge of techniques and approaches to development activities which may not be described in the Software Development Plan. Worksheet: “2- Full-scale Survey Form B”.
• Full-scale model C – 151 new questions plus 70 questions from the Full-scale model plus 131 questions from Full-scale model B. This survey requires a detailed review of SRS, SDS, code, and test plans from past/similar projects as well as detailed knowledge of techniques and approaches to development which may not typically be found in the Software Development Plan. Worksheet: “2- Full-scale Survey Form C”.
• The Full-scale model has 2 other forms – B and C.
• These forms have more questions. When answered accurately and completely these models can provide a more accurate prediction. However these models require more time to complete.
Full-scale model results
The table below shows how the scores map to the 7 percentile groups. So, for example, if the score on Survey A is 250 points then the predicted percentile group is the P97 group because there are not enough points to reach the P90 group.

Predicted percentile group  Survey A score range  Survey B score range  Survey C score range
Not assessed                0-222                 0-282                 0-328
P97                         0-222                 0-282                 0-328
P90                         223-289               283-386               329-457
P75                         290-355               387-490               458-587
P50                         355-443               491-631               588-762
P25                         444-505               631-729               763-885
P10                         506-567               730-830               885-1010
P3                          568 and above         831 and above         1010 and above
The table below shows the average defect densities for each of the 7 percentile groups.

Predicted percentile group  Average fielded defect density  95% confidence bounds on fielded defect density  Average testing defect density  95% confidence bounds on testing defect density
Not assessed                2.402                           0.7775                                           0.425                           0.136
P97                         2.402                           0.7775                                           0.425                           0.136
P90                         1.119                           0.537                                            11.606                          5.455
P75                         0.647                           0.122                                            2.252                           0.450
P50                         0.239                           0.0333                                           1.846                           0.369
P25                         0.1108                          0.0095                                           1.015                           0.203
P10                         0.0717                          0.0141                                           0.466                           0.093
P3                          0.0219                          0.0132                                           0.211                           0.042
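The two lookups above can be sketched together as follows; the data comes from the two tables, and the shared 355 boundary (which appears in both the P75 and P50 ranges) is assigned to P50 here as an assumption:

```python
# Survey A lower-bound scores per percentile group, from the score-range
# table above (the shared 355 boundary is assigned to P50 here).
SURVEY_A_GROUPS = [(568, "P3"), (506, "P10"), (444, "P25"), (355, "P50"),
                   (290, "P75"), (223, "P90"), (0, "P97")]

# group -> (average fielded defect density, average testing defect density)
AVG_DEFECT_DENSITY = {
    "P97": (2.402, 0.425), "P90": (1.119, 11.606), "P75": (0.647, 2.252),
    "P50": (0.239, 1.846), "P25": (0.1108, 1.015), "P10": (0.0717, 0.466),
    "P3":  (0.0219, 0.211),
}

def survey_a_group(score):
    """Map a Survey A score to its predicted percentile group."""
    for lower_bound, group in SURVEY_A_GROUPS:
        if score >= lower_bound:
            return group

fielded_dd, testing_dd = AVG_DEFECT_DENSITY[survey_a_group(600)]
```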
Step 2. Predict defect density->Historical Data
• Some organizations have historical data from testing or operations.
• This tab computes the average defect density of your historical data
• The historical data model allows you to input up to 3 sets of historical data from similar software systems that have been operational for at least 3 years.
• You can also enter historical data from testing.
Step 2. Predict defect density->Historical Data
• These are the steps to use historical data to predict defect density.
1. Assess maturity – Determine whether this project is mature enough to be used as historical data.
  • Testing defect density – You must have a complete set of data for the entire post-integration software testing period.
  • Fielded defect density – You must have at least 3 years of field data.
2. Gather size – Gather the size of the historical project in terms of EKSLOC.
3. Gather defects – Gather the number of fielded defects for the historical project.
4. Calibrate – Calibrate the historical project to the current project for which you are performing the prediction.
Gather historical size data
• Determine how big the historical project was in terms of KSLOC.
• Find out how much of that code was effective when that project was released.
• Determine the effective size of the historical project in exactly the same way that the effective size is determined for a new project. (See step 1).
• Normalize the actual EKSLOC to assembler as shown in step 1.
Gather historical defects
• If the data is available, count up how many actual operational defects fall into each of the severity categories.
• Otherwise, count only the defects that were critical enough to result in a corrective action.
• Do not count new feature requests or other changes.
• Do this for each set of historical data that you have.
• The software will sum all of the defects and compute the defect density as well as the critical defect density.
• The standard deviation and confidence is also computed (if you have more than 2 datasets).
• The testing defect density is computed in the same way.
• Instead of entering in the fielded defects by priority, enter in the testing defects by priority.
Step 2. Predict defect density->Rome Laboratory
Transition to the “2 – Rome Labs survey” worksheet. The steps are:
1. Select an application type
2. Select the development type
3. Answer Yes or No to the development factors
Select an application type

Application type   Explanation            Average testing defect density per assembler KSLOC
Airborne           Any military aircraft  5.974
Strategic          Missiles, satellites   4.136
Tactical           Command and control    3.676
Process control    Monitoring software    .919
Production center                         4.136
Developmental      Experimental           6.434
Average                                   4.596
• The application types provided by the Rome Laboratory model are largely related to military aircraft systems.
• The above values are also in terms of TESTING defect density and not operational/fielded defect density.
Select a development type
Determine whether this is an organic, semidetached or embedded environment.

Development type – Explanation – Average factor
• Organic (0.75) – The software engineers represent the end users of the software. An example would be a software reliability engineer writing software to predict software reliability. The software engineer is a domain expert or a domain user.
• Semidetached (1.0) – A mixture of organic and embedded. An example would be an organization developing electronic warfare software that chooses to bring in EW experts to review the requirements, design and test results.
• Embedded (1.3) – The software engineers are not the end users and the organization is large and remote from any real end users. Examples would include most defense or scientific software development environments.
• Notice that the factor for organic is much smaller than for embedded.
• The SoftRel models also indicate that an organic development environment reduces defect density.
• End user domain knowledge by the software engineers is an important development consideration.
The D factor survey
• Answer the survey questions
• Only yes or no answers are allowed.
• The score is computed by the toolkit = 1 - (number of yes answers on survey/43)
• The toolkit then computes the D factor based on the score and the development environment that you selected
Development environment  Computed D factor
Organic                  D = (.109*score - .04) / .014
Semi-Detached            D = (.008*score + .009) / .013
Embedded                 D = (.018*score - .003) / .008
There are minimum and maximum values for the D factor. If the D result < .5 then it is set to .5. Also, if the D result > 2.0 it is set to 2.0.
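The D factor computation can be sketched as follows, using the score formula and the environment equations above; the dictionary keys are my naming, and the clamping follows the stated 0.5/2.0 limits:

```python
def d_factor(yes_count, environment):
    """Rome Laboratory D factor: score = 1 - (yes answers / 43), mapped
    through the equation for the selected development environment and
    clamped to the stated [0.5, 2.0] range."""
    score = 1 - (yes_count / 43)
    formulas = {
        "organic":       lambda s: (0.109 * s - 0.04) / 0.014,
        "semi-detached": lambda s: (0.008 * s + 0.009) / 0.013,
        "embedded":      lambda s: (0.018 * s - 0.003) / 0.008,
    }
    d = formulas[environment](score)
    return min(max(d, 0.5), 2.0)   # floor and ceiling from the slide
```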
Step 3. Predict total defects
The total defects must be predicted in order to predict all of the reliability predictions such as MTTF, MTTCF, reliability, availability, etc.
Why do I need to predict the total defects?
• This step of the process doesn't require much work from the analyst because the toolkit already has most of the information required for this step.
• The user only needs to:
  • Select the model to use for each software component (first 2 tabs)
  • Select which components to roll into the overall prediction
• You can choose to see the reliability results for all components combined or for each of them one at a time. Or you may choose to do both.
Step 3. Select model for In-house or COTS components
• For each in-house and COTS component select the Prediction Method. This is one of the defect density models in step 2.
The “2- Select DD model for In-house Components” and the “2 – Select DD Model for COTS Components” worksheets have identical layouts. The only difference is that one tab lists all of the in-house components and the other lists all of the COTS components.
Step 3. Select model for In-house or COTS components
For each prediction method, the matrix below shows which inputs are mandatory, highly recommended, or optional:

• Application type – Type of software is mandatory; all other inputs are optional.
• CMMi assessment – The CMMi assessment result is mandatory.
• Shortcut assessment – The SoftRel Shortcut assessment result is mandatory.
• Full-scale assessment – The SoftRel Full-scale assessment is mandatory.
• Full-scale B assessment – The Full-scale and Full-scale B assessments are mandatory.
• Full-scale C assessment – The Full-scale, Full-scale B, and Full-scale C assessments are mandatory.
• Closest DB Match – The Full-scale assessment and the SoftRel Closest Database Match fielded defect density are mandatory; the Shortcut, Full-scale B, and Full-scale C assessments are highly recommended.
• Historical – The historical fielded defect density is mandatory.
• Rome Laboratory – The Rome Laboratory fielded defect density is mandatory.
Any input not listed as mandatory or highly recommended is optional.

Once a model is selected for each component, you only need to select the model assessment result for the model selected, as shown above.
Step 3. Select model for In-house or COTS components
• For each component select the model and then select the model result for that model.
• The above example shows both the “3 – Select Model for COTS Components” and the “3 – Select Model for In-house Components” worksheets.
Step 3. Select model for In-house or COTS components
• Example from the previous page:
  • The Application type model is selected for the antivirus, and its application type is “average”.
  • The Full-scale C model is selected for the New CSCI, and the result of the survey is the 90th percentile group.
  • The Modified CSCI uses the Closest Database Match model (only available to Frestimate users), and the result is 1.743071292.
  • The Reused CSCI uses the historical model, and the typed-in value is .032052633.
  • The computer-generated CSCI uses the Rome Laboratory model, and the typed-in value is .4352.
  • The firmware uses the Full-scale B assessment, and the result of that assessment is the 75th percentile group.
Predict defects in in-house or COTS components
• Transition to the “3- Predict COTS defects” and “3- Predict in house defects” worksheets.
• Select Yes or No to include each in-house/COTS component in your final results
• This feature is useful if you wish to generate both combined and individual component predictions
• Scroll to the right to see the predicted defects, defect density and normalized EKSLOC for each component.
Predict defects in in-house or COTS components
The totals/Averages row sums the predicted defects for each of the selected components as discussed on the next slide.
Predict defects in in-house or COTS components
Column header – Description
• Include in prediction? – Yes: the final results in tabs 4-7 will include this component. No: the final results in tabs 4-7 will not include this component.
• Predicted fielded defects – = Fielded defect density (per the model you selected) * Normalized EKSLOC (per step 1).
  Upper bound = (Fielded defect density + defect density confidence) * (Normalized EKSLOC + size confidence)
  Lower bound = (Fielded defect density - defect density confidence) * (Normalized EKSLOC - size confidence)
• Predicted testing defects – = Testing defect density (per the model you selected) * Normalized EKSLOC (per step 1).
  Upper bound = (Testing defect density + defect density confidence) * (Normalized EKSLOC + size confidence)
  Lower bound = (Testing defect density - defect density confidence) * (Normalized EKSLOC - size confidence)
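The bound arithmetic just described can be sketched as below; the input values in the usage lines are illustrative only:

```python
def predicted_defects(defect_density, eksloc, dd_confidence, size_confidence):
    """Nominal, lower- and upper-bound defect counts per the column
    descriptions above."""
    nominal = defect_density * eksloc
    upper = (defect_density + dd_confidence) * (eksloc + size_confidence)
    lower = (defect_density - dd_confidence) * (eksloc - size_confidence)
    return lower, nominal, upper

# illustrative inputs: fielded DD of .239 defects/EKSLOC over 100 EKSLOC
low, nominal, high = predicted_defects(0.239, 100,
                                       dd_confidence=0.0333,
                                       size_confidence=10)
```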
Step 4. Predict defect profile
In this step we predict when those defects will occur over operational time.
Why do I need to predict the defect profile?
• In step 3 we predicted the total volume of testing defects and the total volume of fielded (operational) defects. In this step we predict when those defects will occur over operational time.
• This is a prerequisite for predicting failure rate, MTBF, reliability, availability, maintenance staffing, and test staffing.
(Figures: fielded defect profile and testing defect profile)
Why do I need to predict the defect profile?
• Once the overall defect profile is predicted, there are 3 defect profiles that can be derived from it.
• Each profile is used to predict the other reliability metrics as shown below.
Defect profile – What it’s used for
• Predicted defects of any severity level (in month i) – used to predict MTTF
The below profiles are derived from the above profile:
• Predicted interruptions (in month i) – used to predict MTBI
• Predicted critical defects (in month i) – used to predict MTTCF, which is then used to predict reliability and availability
Step 4. Predict Defect Profile-> Inputs
• Transition to the “4 – Growth Rate” worksheet
• The growth rate is how fast the defects are predicted to become known
• The growth period is how long it will take before no new defects are being observed
Input – Definition
• Growth rate – How fast defects in the software become known or observed
• Growth period – How long (in calendar time) defects are found in the software before there are no more observances of defects FROM THIS PARTICULAR RELEASE
Select the most applicable growth rate

Growth     Description                                       Nominal  Confidence bounds  Months of growth  % defects removed in first year of operation
Very Slow  Very slow deployment of systems                   3.3      .4                 96                31%
Slow       One of a kind system (not more than 3 sites)      4.5      .6                 64                57%
Medium     Several installed sites but not mass distributed  6        .8                 48                78%
Fast       Mass distributed                                  9        1.1                32                97%
• The growth rate and growth period are determined by how many systems will be/are deployed.
• The more systems, the faster the growth rate and the shorter the growth period.
Select the most applicable growth rate
The below shows the relationship of growth rate and growth period.
If the growth rate is very high then the growth period is very short.
Similarly if the growth rate is very small then the growth period is very long.
Step 4. Predict Defect Profile-> Inputs
Transition to “4- Steps 4-7 Results” worksheet.
The predicted growth rate, confidence bounds and growth period are auto-filled in this page.
Next, identify the following:
• The expected ratio of interruptions to failures
• The expected percentage of all defects that will be critical
Both can be determined via past history from similar projects or by using the Softrel default values.
Interruptions and percentage of critical defects
• Ratio of interruptions to failures
  • Interruptions are the number of times the system is interrupted but not down. Typically there are more interruptions than failures.
  • The interruptions are computed by simply multiplying the predicted defects by the ratio of interruptions to defects.
• Percentage of defects with severe impact
  • Severe defects (as used here) are those that result in failures that have no workaround. These are typically 1 to 10% of the total failures.
  • Similarly, the number of severe defects is predicted by multiplying the predicted defects by the percentage of defects predicted to be critical.
Step 4. Predict Defect Profile-> Inputs
The “Months to next release” is the approximate number of months between feature releases.
Releases that are corrective action only do not count.
This field is used to determine the maximum feasible reliability growth.
Note that even though the software may grow for 48 months, if you add any new features to the releases prior to the 48 months, the reliability growth resets to include the growth for the new features as well as the growth for the existing features.
Step 4. Predict Defect Profile-> Inputs
The “average amount of effective code in subsequent releases as a percentage of size of this release” is used to determine the future reliability growth after this release.
Typically software releases take 4 or more years to “grow” in reliability.
However, software releases are typically scheduled much more frequently than every 4 years. Hence, it is crucial to establish how far apart the releases will be (previous input) and how big the future releases will be.
Step 4. Predict Defect Profile-> Inputs
Identify the expected duty cycle which is the expected number of hours per month that the software will be operating.
For some systems the duty cycle per month may be relatively constant.
For others the duty cycle may be ramping up, down or both.
Some systems may also have sporadic duty cycle in which there are periods of use followed by periods of no use.
Total usage = the sum of the duty cycle over the total estimated months of growth
Step 4. Predict Defect Profile-> Results
Now that the prediction inputs are entered, scroll down to the bottom of the worksheet and look for the results for step 4. The computations for those results are shown on the next page.
Defect profile calculations and results
The below are the calculations for each of the three defect profiles.

Predicted defects of any severity level(i) = N * (Exp(-(Q / TF) * T1) - Exp(-(Q / TF) * T2))
Predicted interruptions(i) = NI * (Exp(-(Q / TF) * T1) - Exp(-(Q / TF) * T2))
Predicted critical defects(i) = NC * (Exp(-(Q / TF) * T1) - Exp(-(Q / TF) * T2))

T   Duty cycle in month i
Q   Growth rate
TF  Growth period
T1  Beginning of the time interval that we are solving for. So, if we are solving for month 3, then T1 = 2.
T2  End of the time interval that we are solving for. So, in the previous example, T2 = 3.
N   Number of inherent defects delivered
NI  Number of interruptions delivered = N * ratio of interruptions to defects
NC  Number of critical defects delivered = N * percentage of defects that are critical

The subscript (i) indicates that this prediction is for a particular point in time during operation.
Step 5. Predict MTTF, MTBI, MTTCF
Why do I need to predict the failure rate or MTBF?
• If your goal is to predict the number of people needed to staff the software once it’s operational, you probably don’t need to predict this.
• If your goal is to perform sensitivity analysis, then you may not need to predict this.
• Otherwise, if your goal is to predict reliability or availability, or to merge the predictions with the hardware predictions, then you will need to predict the failure rate or MTBF first.
Step 5. Predict MTTF, MTBI, MTTCF
If steps 1-4 are complete, everything needed to compute the MTTF, MTBI and MTTCF was already input.
Scroll to the right of the defect profile results from step 4 and note the predicted MTTF, MTBI and MTTCF
Step 5. Predict MTTF, MTBI, MTTCF
Predicted MTTF(i) = T / (N * (Exp(-(Q / TF) * T1) - Exp(-(Q / TF) * T2)))
Predicted MTBI(i) = T / (NI * (Exp(-(Q / TF) * T1) - Exp(-(Q / TF) * T2)))
Predicted MTTCF(i) = T / (NC * (Exp(-(Q / TF) * T1) - Exp(-(Q / TF) * T2)))

T   Duty cycle in month i
Q   Growth rate
TF  Growth period
T1  Beginning of the time interval that we are solving for. So, if we are solving for month 3, then T1 = 2.
T2  End of the time interval that we are solving for. So, in the previous example, T2 = 3.
N   Number of inherent defects delivered. This is a result of step 4 of the prediction process.
NI  Number of interruptions delivered. This is a result of step 4 of the prediction process.
NC  Number of critical defects delivered. This is a result of step 4 of the prediction process.

The subscript (i) indicates that this prediction is for a particular point in time during operation.
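Each mean time is just the month's duty cycle divided by the expected failures for that month. A hedged Python sketch of the formula (function name and sample values are illustrative):

```python
import math

def mean_time_to_failure(duty_cycle, n, q, tf, month):
    """Predicted MTTF for a given month of operation (1-based).

    duty_cycle -- operating hours in that month (T)
    n          -- inherent defects delivered (N); pass NI to get MTBI,
                  or NC to get MTTCF
    """
    t1, t2 = month - 1, month
    rate = q / tf
    expected_failures = n * (math.exp(-rate * t1) - math.exp(-rate * t2))
    return duty_cycle / expected_failures
```

As fewer defects remain, the denominator shrinks month over month, so the predicted MTTF grows, which matches the reliability growth picture from step 4.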
Step 6. Predict Availability
If you have a system that continually operates, the availability metric may be more applicable than other reliability figures of merit.
Why do I need to predict availability?
• If the system continually operates the availability metric may be more applicable than other reliability figures of merit
• Otherwise, if the system has a well defined mission time, the other reliability figures of merit may be more applicable.
• Example of a system that continually operates – refrigerator
• Example of a system that has a defined mission time – dishwasher
Step 6. Predict Availability
If steps 1 through 5 are complete, there is only 1 input required to predict the software availability.
Scroll to the top of the worksheet and enter
• The Mean Time To Software Restore (MTSWR)
• See the next page for instructions on how to compute this
Scroll down to the results and review the software availability predictions for each month after deployment
MTSWR
MTSWR is a function of these software related maintenance actions:

Maintenance action | Relative percentage of defects in this category | Average amount of time in hours for this maintenance action
Restart – restart the software without having to restart the computer or hardware | % defects that can be cleared with a restart | Usually this will be in terms of minutes
Reboot – restart both the software and hardware AND return back to the same processing state | % defects that can be cleared with a reboot | This can be several minutes since hardware and software have to initialize
Workaround | % defects that can be avoided with a workaround | If the operator does not know the workaround this can be hours; otherwise it will usually be several minutes
New software release | % defects that cannot be addressed in operation with a restart, reboot, or workaround | This can be hours, days, or even weeks depending on the repeatability of the problem
• The MTSWR is the average of the above restore times, weighted by the relative percentages.
• It's possible that there may be other maintenance actions for your software.
• In any case, the MTSWR is a weighted average of all restore actions.
• Once you have determined the MTSWR, enter it in the worksheet.
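The weighted average can be sketched in a few lines of Python. The percentages and restore times below are placeholders for illustration, not SoftRel defaults; substitute your own project history:

```python
# Each entry: (fraction of defects cleared this way, mean hours to restore).
# These numbers are hypothetical -- use your own history or the defaults.
actions = {
    "restart":     (0.50, 0.1),
    "reboot":      (0.25, 0.25),
    "workaround":  (0.20, 0.5),
    "new release": (0.05, 40.0),
}

# The fractions must account for every defect
assert abs(sum(f for f, _ in actions.values()) - 1.0) < 1e-9

# MTSWR = weighted average restore time, in hours
mtswr = sum(frac * hours for frac, hours in actions.values())
```

Note how heavily the rare "new software release" path dominates the average; even at 5% of defects it contributes far more restore time than the frequent restarts.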
Step 7. Predict Reliability
The probability of success over the mission time.
Why do I need to predict reliability?
• If the system has a well defined mission time this reliability figure of merit may be applicable.
• Otherwise, if the system continually operates the availability metric may be more applicable
• Examples of systems that have a defined mission time – dishwashers, vehicles, aircraft
Step 7. Predict Reliability
If steps 1-5 are complete, there is only one additional input required – the mission time.
Scroll to the top of the worksheet and enter the mission time (see the next page for instructions).
Scroll down to the results and review the software reliability predictions for each month after deployment
Mission time
• Mission time should not be confused with duty cycle.
• Mission time is from a system perspective.
• For example, if an aircraft is required to fly for 12 hours prior to refueling then the mission time is 12 hours.
• That same aircraft may make several flights per month. However, the mission time is still 12 hours.
Step 7. Predict Reliability->Results
Reliability Predictioni = Exp(-Mission Time /MTTCFi)
• The subscript i indicates that this prediction is for a particular month of operation after deployment.
• The MTTCF predictions are used as inputs because only those defects that are critical will impact the mission's probability of success while the mission is ongoing.
• This is because it is assumed by definition that only those critical defects will impact the mission and have no workaround.
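The reliability formula above is a one-liner; a minimal Python sketch (the function name and the example values are illustrative):

```python
import math

def reliability(mission_time, mttcf):
    """Probability the software completes a mission of the given length
    without a critical failure, using the predicted MTTCF for that month.
    Both arguments must be in the same time units (e.g. hours)."""
    return math.exp(-mission_time / mttcf)
```

For example, a 12-hour mission flown against a predicted MTTCF of 1,200 hours gives a mission reliability of roughly 0.99; a longer mission or a smaller MTTCF drives the probability down exponentially.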
Step 8. Perform Sensitivity Analysis
Staffing, Test Staffing and Defect Density Improvement
Why do I need to perform sensitivity analysis?
• The sensitivity analyses identify
• Practices that you aren’t employing that may have a visible effect on any of the reliability figures of merit
• The staffing needed to maintain the current prediction of testing and operational defects
• Neglecting to support this release may cause future releases to be late because of unplanned maintenance support
Step 8. Staffing
• Contrary to popular belief, the number one reason why a software project is late is .... THE PREVIOUS PROJECT.
• Projects are late because they start late, because people are supporting problems in the field and that support was not in the plan.
• When an organization neglects to plan how many people are needed to support the software, eventually it encounters "Defect Pileup".
• The inputs to the analysis are the results from Steps 1-4.
• Namely, the total defects predicted for each month after deployment and the percentage of defects that are predicted to be in in-house developed components.
• Defects in COTS components cannot be fixed by the in-house development organization, so this input is used to determine which defects are the responsibility of the in-house development organization.
Step 8. Staffing
• The staffing worksheet allows you to predict how many people will be needed to support the software once it is deployed.
• Transition to the “8 – Staffing” worksheet
Step 8. Staffing -> Inputs
Input | Description
Average corrective action time per defect | Include discover, record, reproduce, isolate, correct, checkout, retest in this time. It is usually at least several hours and sometimes several days.
Hours per month available for corrective action | Unless you have a dedicated staff to support the software, this will be less than 100%.
Step 8. Staffing -> Results
Scroll down to view the staffing results as described on the next page.
Step 8. Staffing-> Results

Column | Description
Month after delivery | Starting from the field release date
Fraction of total defects predicted to be found this month | Usually more defects are found earlier. This is determined by the growth rate (step 4).
Total defects predicted to be found this month | The column to the left * total predicted fielded defects (step 4)
Total critical defects predicted to be found this month | The column to the left * percentage of defects predicted to be severe (as per step 4)
Average corrective action persons required to address all defects | (Corrective action time * total defects predicted to be found this month) / total corrective action hours available
Average corrective action persons required to address all critical defects | (Corrective action time * total critical defects predicted to be found this month) / total corrective action hours available
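The staffing computation reduces to one multiplication and one division per month. A sketch with hypothetical numbers (none of these values come from the toolkit):

```python
def staff_needed(defects_this_month, hours_per_defect, hours_available_per_person):
    """Average corrective-action staff needed for one month after delivery.

    defects_this_month         -- predicted defects found that month (step 4)
    hours_per_defect           -- average corrective action time per defect
    hours_available_per_person -- hours per person per month available for
                                  corrective action (less than a full month
                                  unless the staff is dedicated)
    """
    return defects_this_month * hours_per_defect / hours_available_per_person

# Hypothetical: 12 field defects predicted this month, 16 hours each,
# 40 hours/month per engineer available for corrective action
people = staff_needed(12, 16, 40)
```

The same function applied to the critical-defect column gives the staff needed for critical defects alone.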
Step 8. Test Staffing-> Inputs
• The test staffing tab is used to determine how many people are needed to support corrective actions during testing
• Steps 1-4 are the only prerequisites
• Enter the inputs as discussed on the next page
Step 8. Test Staffing-> Inputs
Input | Description
Average test hours per month | Multiply the total number of people testing per month by the average number of hours per day spent testing by 22 days per month.
Months of testing | Exclude periods of time in which there is no testing. Exclude unit level white box testing.
Average corrective action time per defect | Include discover, record, reproduce, isolate, correct, checkout, retest in this time. It is usually at least several hours and sometimes several days.
Hours per month available for corrective action during testing | It’s common for software engineers to start working on new code once their code is delivered to the verification and validation engineers. Unless the software engineers are dedicated full time to support during testing, this will be less than 176 hours, which is the total number of work hours in a month.
Duty cycle | This is the total combined number of operational hours for the software for each month of testing. It’s not uncommon for this to vary from month to month depending on availability of hardware and people.
Step 8. Test Staffing-> Results
Scroll down to view the results of the test staffing. The calculations are shown on the next page.
Step 8. Test Staffing-> Results
• The testing defect profile is computed by using the predicted testing defect density. See steps 2 and 3.
• The outputs below can be used to ensure that there are software engineers available to staff the corrective actions during the testing activities.

Column | Description
Month of software system level testing | From 1 to the number of months of testing
Average number of in-house persons required to address all defects* | (Corrective action time * total defects predicted to be found in testing this month) / total corrective action hours available
Average number of in-house persons required to address all critical defects* | (Corrective action time * total critical defects predicted to be found in testing this month) / total corrective action hours available
Definitions
• All definitions and formulas are defined in the technical manuals and help files
• Some help files are not provided with the evaluation edition
• The formulas and inputs are summarized in the next few pages
Definitions
• Software Reliability is a function of
• Inherent defects
• Introduced during requirements translation, design, code, corrective action, integration, and interface definition with other software and hardware
• Operational profile
• Duty cycle
• Spectrum of end users
• Number of install sites/end users
• Product maturity
Definitions
• Prediction models versus reliability growth models
• Prediction models are used before code is even written
• Use empirical defect density data
• Useful for planning and resource management
• Reliability growth models are used during software system level test
• Extrapolate observed defect data
• Used too late in the process for most risk mitigation
• Useful for planning warranty/field support
Definitions
• Defect density
• Normalized measure of software defects
• Usually measured at these 2 milestones
• Delivery/operation – also called escaped or latent defect density
• System level testing
• Useful for
• Predicting reliability
• Benchmarking
• Improving efficiency and reducing defects
• KSLOC – 1000 executable non-comment, non-blank lines of code
• EKSLOC – Effective size adjusting for reuse and modification
Basic Formulas
• Normalized size – Size normalized to EKSLOC of assembler via use of standard conversion tables
• Delivered Defects (Ndel)= predicted normalized size * predicted delivered defect density
• Critical defects = delivered defects * ratio of defects predicted to be critical in severity
• Testing defects (N0) = predicted normalized size * predicted testing defect density
• Interruptions = (Ratio of restorable events to all others) * Total predicted defects
• Restorable event – Usually the definition of an interruption is based on time in minutes (i.e. if the system can be restored in 6 minutes then it’s an interruption)
• Critical interruptions = interruptions * ratio of defects predicted to be critical in severity
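The basic formulas above chain together as a few multiplications. A sketch with hypothetical inputs (the real values come from the size and defect density predictions of steps 1-3, not from these numbers):

```python
# Hypothetical prediction inputs, for illustration only
normalized_size = 50.0          # normalized effective size (EKSLOC)
delivered_defect_density = 0.4  # defects per normalized EKSLOC at delivery
critical_ratio = 0.05           # fraction of defects critical in severity
interruptions_per_defect = 2.0  # ratio of restorable events to defects

# Delivered Defects (Ndel) = normalized size * delivered defect density
n_del = normalized_size * delivered_defect_density

# Critical defects = delivered defects * critical ratio
critical_defects = n_del * critical_ratio

# Interruptions = ratio of restorable events * total predicted defects
interruptions = n_del * interruptions_per_defect

# Critical interruptions = interruptions * critical ratio
critical_interruptions = interruptions * critical_ratio
```

With these placeholder inputs the chain yields 20 delivered defects, of which 1 is critical, 40 interruptions, and 2 critical interruptions.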
Basic Formulas
• MTTF(i) – Mean Time To Failure at some period in time i = T / (N * (exp(-(Q/TF)*(i-1)) - exp(-(Q/TF)*i)))
• N = total predicted defects
• Q = growth rate
• TF = growth period (approximate number of months it takes for all residual defects to be discovered)
• T = duty cycle for period i (this can be > 24/7 if multiple sites)
• MTTCF (i) – Mean Time To Critical Failure
• Same formula as MTTF except that Critical defects is substituted for N
• MTBI (i)– Mean Time Between Interruptions = Same formula as MTTF(i) except that N is substituted by predicted Interruptions
• MTBCI (i)– Same formulas as MTTF(i) except that N is substituted by predicted critical interruptions
• Failure Rate (i) = 1/MTTF(i)
• Critical Failure Rate(i) = 1/MTTCF(i)
• Interruption rate (i) = 1/MTBI(i)
• Critical interruption rate(i) = 1/MTBCI(i)
Basic Formulas
• End of Test MTTF = T/N
• End of Test failure rate = N/T
• Reliability(i) = Exp(-mission time / MTTCF(i))
• Mission time – duration for which software must continually operate to complete the mission
• Availability(i) = MTTCF(i) / (MTTCF(i) + MTSWR)
• MTSWR = Weighted average of workaround time, restore time and repair time by predicted defects in each category
• Average MTTF – Average of each point-in-time MTTF(i) over this release
• Similarly for the average MTTCF, Availability, Reliability, failure rate, critical failure rate, MTBI, MTBCI
• MTTF at next release – Point-in-time MTTF for the milestone which coincides with the next major release
• Similarly for the MTTCF, Availability, Reliability, failure rate, critical failure rate, MTBI, MTBCI at next release
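The availability formula from this slide is another simple ratio; a minimal sketch (function name and example values are illustrative):

```python
def availability(mttcf, mtswr):
    """Point-in-time software availability for month i: mean uptime between
    critical failures divided by that uptime plus the mean restore time.
    Both arguments must be in the same time units (e.g. hours)."""
    return mttcf / (mttcf + mtswr)

# Hypothetical: MTTCF of 998 hours, MTSWR of 2 hours -> availability 0.998
a = availability(998.0, 2.0)
```

A larger MTTCF or a smaller MTSWR both push availability toward 1, which is why the MTSWR estimate from step 6 matters even when restore times look small.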
About the Shortcut and Full-scale Surveys
• ALL prediction surveys were developed by a research organization that collected and organized defect data from many real projects
• SoftRel, LLC has been collecting this data since 1993 on more than 100 real software projects
• More than 600 software related characteristics
• Actual fielded and testing defects observed
• Actual normalized size
• Actual capability for on-time releases
• Relative cost and time to implement certain practices
• All surveys were developed using traditional statistics and modeling
• Predictive models are not novel
• The only thing that is relatively novel is applying them to software defects