09-Software Reliability Improvement
Transcript of 09-Software Reliability Improvement
8/4/2019 09-Software Reliability Improvement
Software Reliability Improvement
In a Maintenance Environment

AGENDA
- Overview of the technique
- Product environment assumptions
- Use of field data
- Two critical measures
- Test effectiveness & escaped bugs
- Improvement approach
- Effect on reliability
Strategy
Use field defect data to improve the ability of the pre-release testing to find the types of bugs that show up in field usage of the software.
This type of improvement in the pre-release test effectiveness will lead to higher reliability in the field (assuming the bugs found in testing are fixed before release).
The problem with software testing is that exhaustive testing is not possible.
Test smarter, not harder.
Product Environment
- There is a fielded software product (stand-alone or embedded).
- New versions are being issued on a regular, periodic basis.
- The new versions are incremental upgrades and enhancements to the previous version (not next generation).
[Figure: version timeline 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
Assumptions
- Each new version is an incremental change to the previous version.
- The amount of change is roughly the same in each version.
- The proportion of new features and bug fixes in each release is approximately constant.
- Each new version sees a similar usage profile and degree of usage.
[Figure: each release consists of code carried forward unchanged, new features & enhancements, and bug fixes.]
Two Critical Measures Needed
- Test effectiveness
- Software reliability
Field data on defects can be used to quantify both of these and to direct improvement efforts.
Availability of Field Data
An environment where there is a mechanism for reporting bugs from the field, and it is actually used:
- Field service organization
- Help desk
- Bug tracking system
Looking At All Stages of Testing in Concert
- Inspections (peer reviews)
- Unit testing
- Integration testing
- System testing
- Acceptance testing
[Figure: each stage filters out bugs before the next.]
Basic Method
- Look at bugs reported (from the field) after the release of a new version.
- Bugs reported from actual users.
- In a database (so you can do queries).
- Bug reports must have a date/time stamp on them.
[Figure: timeline of reported bugs before and after the release of a new version.]
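As a sketch of the database requirement, all this method really needs is one row per field report with a date/time stamp and a version. The table layout, field names, and dates below are illustrative assumptions, not an actual schema from the deck:

```python
import sqlite3

# Illustrative schema: one row per field-reported bug.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE bug_reports (
    id INTEGER PRIMARY KEY,
    reported_at TEXT,   -- ISO date/time stamp
    version TEXT,
    severity INTEGER,
    description TEXT)""")
conn.executemany(
    "INSERT INTO bug_reports (reported_at, version, severity, description)"
    " VALUES (?, ?, ?, ?)",
    [("2019-08-15", "1.5", 5, "Screen lay-out"),
     ("2019-09-04", "1.6", 1, "Report look-up causes crash."),
     ("2019-09-05", "1.6", 3, "Entry is lost.")])

# Query: all bugs reported from the field after a release date.
release_date = "2019-09-01"
rows = conn.execute(
    "SELECT id, reported_at, severity, description FROM bug_reports "
    "WHERE reported_at >= ? ORDER BY reported_at", (release_date,)).fetchall()
for row in rows:
    print(row)
```

With date/time stamps stored in a sortable form, "bugs reported after the release of a new version" is a single query like the one above.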
Test Effectiveness
The purpose of testing is to find bugs. An effective test process will do that completely. A measure of test effectiveness: escaped bugs.
Escaped Bugs
[Figure: bugs escaping past the testing of the version under development and into the field.]
The Surprise Factor
Problems in the software that are found before release can be fixed or managed in other ways (put in release notes, workarounds, etc.). This method assumes that they will be fixed.
Problems that are found for the first time after release frequently become crises because they are a surprise, and the software is in production use, where problems can cause down time, customer dissatisfaction, etc.
Quantifying Test Effectiveness
- Count the number of newly reported defects from the field after release of the version.
- Make sure they are not duplicates of defects previously reported (prior to release).
- This is a zero-defects type of metric (down is better).
- Trend it from version to version.
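The counting rule above can be sketched in a few lines. The record layout here (a version plus a defect key that identifies the underlying defect, so repeat reports are not double-counted) is an assumption made for illustration:

```python
from collections import defaultdict

def escaped_bug_counts(field_reports, known_before_release):
    """Count escaped bugs per version.

    field_reports: iterable of (version, defect_key) pairs from the field.
    known_before_release: set of defect_keys already found prior to release.
    """
    counts = defaultdict(set)
    for version, key in field_reports:
        if key not in known_before_release:
            counts[version].add(key)  # a set drops duplicate reports of one defect
    return {version: len(keys) for version, keys in counts.items()}

# Hypothetical reports: BUG-9 is reported twice, BUG-7 was known pre-release.
reports = [("1.5", "BUG-7"), ("1.6", "BUG-9"), ("1.6", "BUG-9"), ("1.6", "BUG-12")]
print(escaped_bug_counts(reports, known_before_release={"BUG-7"}))
# {'1.6': 2}
```

Run per version, this yields exactly the down-is-better number to trend from release to release.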
Software Test Effectiveness Trend
[Chart: "Number of New Defects Reported After Release of a Version": stacked bars of defects by severity (1 through 5) for Ver. 1.0 through Ver. 1.6, y-axis 0 to 120 defects.]
(The trend shown above is an undesirable trend.)
Software Reliability - Definition
Definition: The probability of failure-free operation of the software for a specified period of time in a specified environment.
Key aspects:
- Given time period
- Specified set of operating conditions
- Range of values: 0.000 to 1.000
Example: A software application has a reliability of 0.93 for 24 hours when used in a typical manner. This means that the software would operate without failure over a 24-hour period for 93 out of 100 of those periods.
Source: Software Reliability, Musa et al., p. 15
Assumptions - Reliability Calculation
- Released software is in use continuously.
- Each new version sees about the same number of users and about the same overall use profile.
Examples: 1) Web site, 2) A single system in extended, continuous use
Software Reliability Data Gathering
- Search the bug database for bugs reported from field use.
- Look at a time period immediately after release of a new version.
- Must judiciously select the time period.
- Could also do an extended run in the lab.
- Count bugs that cause a system failure.
  - Must establish some criteria here.
  - May want to categorize them by severity.
Field Data - Example
Version 1.6 release date: Sept. 1

Number  Date      Severity  Version  Description
121     Sept. 10  5         1.6      Spelling error
120     Sept. 7   4         1.6      Screen lay-out poor.
119     Sept. 5   3         1.6      Entry is lost.
118     Sept. 4   1         1.6      Report look-up causes crash.
117     Sept. 4   5         1.5      Wording is poor.
116     Sept. 3   3         1.6      Menu missing a selection.
115     Sept. 3   2         1.6      Wrong data displayed.
114     Sept. 3   2         1.6      Incorrect temperature calculated.
113     Sept. 1   3         1.4      Menu tree problem.
112     Aug. 31   4         1.5      Screen lay-out
111     Aug. 15   5         1.5      Screen lay-out
Software Reliability Calculation
Formula (Source: Software Reliability, Musa et al., p. 91):
R = exp(-λt)
where:
R = reliability
λ = the number of failures/hour (the failure rate)
t = the time period for which the reliability is to be calculated
Software Reliability Calculation
Example (using the data from the previous table): 5 failures in a 7-day (168-hour) period. The reliability for periods of usage of 24 hours in length is desired.
We have:
λ = 5/168 = 0.0298 failures/hour
t = 24 hours
Therefore:
R = exp(-λt) = exp(-(0.0298)(24)) = 0.489
What this tells us is that in 100 periods of time that are each 24 hours in length, this software will run failure-free (for all users) in 48.9 of those 24-hour periods.
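The arithmetic above is easy to check in code. This is a minimal sketch using the example's numbers; the function name is ours, not from the source:

```python
import math

def reliability(failures, observation_hours, mission_hours):
    """R = exp(-lambda * t), with lambda estimated as failures per hour."""
    failure_rate = failures / observation_hours  # lambda, in failures/hour
    return math.exp(-failure_rate * mission_hours)

# 5 failures observed over 7 days (168 hours); 24-hour mission time.
r = reliability(failures=5, observation_hours=168, mission_hours=24)
print(f"{r:.3f}")  # 0.490 (the deck's 0.489 comes from rounding lambda to 0.0298 first)
```

Note that rounding the failure rate before exponentiating, as the slide does, shifts the third decimal place; carrying full precision gives 0.490.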
Alternate Reliability Metric
- If the software versions are not in continuous use, a different reliability measure must be used: Mean Time Between Failures (MTBF).
- Concept of production time and non-production time.
- Count failures, as before, but must also determine the number of hours that the software was in production when it incurred those failures.
- MTBF = total number of production hours divided by the total number of failures.
- Works best when data from multiple installations is aggregated.
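Aggregating across installations just means pooling production hours and failure counts before dividing. A small sketch (the site figures are made up for illustration):

```python
def mtbf(installations):
    """MTBF pooled across sites.

    installations: list of (production_hours, failure_count) tuples,
    one per installation. Returns total hours / total failures.
    """
    total_hours = sum(hours for hours, _ in installations)
    total_failures = sum(failures for _, failures in installations)
    return total_hours / total_failures

# Three hypothetical sites with different amounts of production time.
sites = [(400, 2), (1200, 3), (250, 1)]
print(mtbf(sites))  # 1850 production hours / 6 failures
```

Pooling first, rather than averaging per-site MTBFs, correctly weights heavily-used installations.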
Improving Test Effectiveness
- For every escaped bug, do a root cause analysis.
- Determine why it was not caught by the testing of the version.
- Write a test case (or cases) to catch that and similar bugs, and include the test case(s) in the regression test suite.
- Net effect: Testing becomes more bullet-proof, and fewer defects are released to the field.
Plugging The Holes
Example
Problem Report: Clean recipes ran at the wrong time.
Analysis: Happens when running cleans after n wafers.
New test cases for system testing phase:
1) No cleaning
2) Clean after every wafer.
3) Clean after every 2 wafers.
4) Clean after every 24 wafers.
5) Clean after every 25 wafers.
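These new cases are classic boundary-value tests around the clean interval. A sketch of how they might be automated; the `clean_due` function below is a hypothetical stand-in for the real scheduler, not the actual product code:

```python
def clean_due(wafers_processed, clean_interval):
    """Hypothetical scheduler model: a clean is due after every
    `clean_interval` wafers; an interval of 0 means no cleaning."""
    if clean_interval == 0:
        return False
    return wafers_processed % clean_interval == 0

# Boundary-value cases mirroring the slide: none, every wafer, every 2, 24, 25.
cases = [
    (10, 0, False),   # 1) no cleaning configured
    (7, 1, True),     # 2) clean after every wafer
    (4, 2, True),     # 3) clean after every 2 wafers
    (24, 24, True),   # 4) clean after every 24 wafers
    (24, 25, False),  # 5) interval 25: clean not yet due at wafer 24
]
for wafers, interval, expected in cases:
    assert clean_due(wafers, interval) == expected
print("all clean-schedule boundary cases pass")
```

Once in the regression suite, these cases guard against the escaped bug and its near neighbors in every future version.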
Software Test Effectiveness Trend - Good
[Chart: "Number of New Defects Reported After Release of a Version": stacked bars of defects by severity (1 through 5) for Ver. 1.0 through Ver. 1.6, y-axis 0 to 160 defects, showing a declining (desirable) trend.]
Software Reliability Trend
[Chart: "Field Reliability of Released Versions": reliability on a 0 to 1.0 scale for Ver. 1.0 through Ver. 1.6.]
Summary
- Use field defect data to provide metrics on overall test effectiveness and software reliability. Trend them over time.
- For each escaped bug, write a test case (or cases) that will find that and similar bugs. Include this test case in the pre-release testing of future versions.
- Testing becomes oriented toward the types of problems that actually occur in the field. This is testing smarter, not harder.
- Test effectiveness and field reliability will go up.
Software Quality First
Jessee Ring, Principal Consultant
40119 San Carlos Place
Fremont, CA 94539
510-915-2353
www.sqa1st.com