
INFO 631 – Prof. Glenn Booker
www.ischool.drexel.edu

Week 2 – Reliability Models and Customer Satisfaction


Reliability Models

• Reliability models are used here to treat software as something from which we expect predictable performance when it is used on a regular basis

• Hence this assumes fairly stable requirements and a well-controlled environment


Why use Reliability Models?

• Determine the objective quality of the product
• Plan the resources needed to fix problems


Independent Variable in Reliability Growth Models

• Typical scope of measurement (X axis) includes one of these:
– Calendar time (days of testing)
– Cumulative testing effort (hours of testing)
– Computer execution time (e.g. number of CPU hours)


Key Dependent Variables for Reliability Growth Models

• Typical dependent variables (Y) include:
– Number of defects found per life cycle phase, or total number ever found
– Cumulative number of failures over time
– Failure rate over time
– Time between failures


Terminology

• Reliability – probability that the system functions without failure for a specified time or number of natural units in a specified environment
– "Natural unit" is related to an output of the system
• Per run of a program
• Per hour of CPU execution
• Per transaction (sale, shipment)


• Availability – probability at any given time that a system functions satisfactorily in a given environment

• Failure intensity – the number of failures per natural or time unit


Software Reliability Modeling

• Requires characterizing and applying:
– The required major development characteristics or goals
• Reliability
• Availability
• Delivery date
• Life-cycle cost (development, maintenance, training, etc.)


– The expected relative use of the software's functions (i.e. its operational profile)
• Focus resources on functions in proportion to their use and criticality


Operational Profile

• Operational profile – a complete set of operations with their probabilities of occurrence
– Operation = a major logical system task of short duration which returns control to the system when complete, a.k.a. a scenario


Types of Reliability Models

• Static and Dynamic

• Static
– Uses other product characteristics (size, complexity, etc.) to estimate the number of defects
– Good for module-level estimates (detailed)
– See discussions of size and complexity measures


• Dynamic
– Based on a statistical distribution; uses the current defect pattern to estimate future reliability
– Good for product-level estimates (large scale)
– Includes the Rayleigh and Exponential models


Dynamic Reliability Models

• Model the entire development process
– Rayleigh model

• Model back-end formal testing (after coding)
– Exponential model

• Both are a function of time or life cycle phase, and are part of the Weibull family of distributions

"Back-end" here refers to the later phases of the life cycle


Define

• PDF = Probability Density Function – the number of defects that will be found per life cycle phase

• CDF = Cumulative Distribution Function – the total number of defects that will be found, as a function of life cycle phase


Weibull Model

Let: m = shape parameter (determines the curve family), c = scale parameter, t = time

Then:
PDF = (m/t)*(t/c)^m * exp(-(t/c)^m)
CDF = 1 - exp(-(t/c)^m)

What can this look like? Lots of things! First, fix c = 1.5 and look at various 'm' values
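As a minimal sketch (not from the slides), the two formulas above can be evaluated directly; the function names and the sample value t = 1.0 are my own illustration.

```python
import math

def weibull_pdf(t, m, c):
    """Weibull PDF as above: (m/t)*(t/c)^m * exp(-(t/c)^m)."""
    return (m / t) * (t / c) ** m * math.exp(-((t / c) ** m))

def weibull_cdf(t, m, c):
    """Weibull CDF as above: 1 - exp(-(t/c)^m)."""
    return 1.0 - math.exp(-((t / c) ** m))

# Fix c = 1.5 (as on the following charts) and compare several shape values m at t = 1.0
for m in (0.5, 1, 1.5, 2, 3, 5, 7, 9):
    print(m, round(weibull_pdf(1.0, m, 1.5), 4), round(weibull_cdf(1.0, m, 1.5), 4))
```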


Weibull for m = 0.5 and 1, c = 1.5

[Chart: Weibull PDF vs. time (0.1 to 5.5) for m = 0.5 and m = 1, with c = 1.5]


Weibull for m = 1.5 and 2, c = 1.5

[Chart: Weibull PDF vs. time for m = 1.5 and m = 2, with c = 1.5]

Notice the Y axis range is changing


Weibull for m = 3 and 5, c = 1.5

[Chart: Weibull PDF vs. time for m = 3 and m = 5, with c = 1.5]


Weibull for m = 7 and 9, c = 1.5

[Chart: Weibull PDF vs. time for m = 7 and m = 9, with c = 1.5]

For large 'm' values, the Weibull PDF looks like a normal distribution centered on 'c'


Rayleigh Model

• The history of defect discovery across the life cycle phases often looks like the Rayleigh probability distribution

• Rayleigh model is a formal parametric model, used to produce estimates of the future defect count


• Rayleigh model and defect origin/found analyses deal with the defect pattern of the entire software development process

• Is a good tool, since it can provide sound estimates of defect discovery from fairly early in the life cycle


Let: m = 2, c = scale parameter, t = time

Then:
PDF = (2/t)*(t/c)^2 * exp(-(t/c)^2)
CDF = cumulative defect arrival pattern
CDF = 1 - exp(-(t/c)^2)
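A small sketch of my own showing the Rayleigh curve as the m = 2 special case of the Weibull formulas above; the value c = 1.5 and the sample times are illustrative only.

```python
import math

def rayleigh_pdf(t, c):
    """Rayleigh PDF: the Weibull PDF with m = 2, i.e. (2/t)*(t/c)^2 * exp(-(t/c)^2)."""
    return (2.0 / t) * (t / c) ** 2 * math.exp(-((t / c) ** 2))

def rayleigh_cdf(t, c):
    """Rayleigh CDF (cumulative defect arrival pattern): 1 - exp(-(t/c)^2)."""
    return 1.0 - math.exp(-((t / c) ** 2))

# Example: c = 1.5; fraction of defects arriving by phase-time t
for t in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(t, round(rayleigh_pdf(t, 1.5), 4), round(rayleigh_cdf(t, 1.5), 4))
```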


Rayleigh Model Assumptions

1. Defect rate during development is correlated with defect rate after release.

2. If defects are discovered and removed earlier in development, fewer will remain in later stages.

In short, “Do it right the first time.”


Rayleigh Model

• The value of 'c' determines when the curve peaks
– tmax = c/√2 is the peak

• The area up to tmax is where 39.35% of all defects will be found (ideally)

• Now look at the influence of the 'c' value on the curve shape
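A quick numerical check (my own sketch, using c = 1.5) of the two claims above: the PDF peaks at tmax = c/√2, and the CDF there equals 1 - exp(-1/2) ≈ 0.3935, i.e. about 39.35% of all defects.

```python
import math

c = 1.5
t_max = c / math.sqrt(2)

def rayleigh_pdf(t):
    return (2.0 / t) * (t / c) ** 2 * math.exp(-((t / c) ** 2))

def rayleigh_cdf(t):
    return 1.0 - math.exp(-((t / c) ** 2))

print(round(t_max, 4))                 # peak location: c/sqrt(2) ~ 1.0607 for c = 1.5
print(round(rayleigh_cdf(t_max), 4))   # ~ 0.3935, i.e. 39.35% of defects by the peak
# The PDF really does peak there: neighbouring points on either side are lower
print(rayleigh_pdf(t_max) > rayleigh_pdf(t_max - 0.05),
      rayleigh_pdf(t_max) > rayleigh_pdf(t_max + 0.05))
```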


Weibull for m = 2 and c = 1, 1.5, and 2

[Chart: Rayleigh (Weibull m = 2) PDF vs. time for c = 1, c = 1.5, and c = 2]


Rayleigh Model Implementation

• Various tools can model the Rayleigh curve
– PASW/SPSS (using the Regression module)
– SAS
– SLIM (by Quantitative Software Management)
– STEER (by IBM)


Rayleigh Model Reliability

• Statistical reliability relates to the confidence interval of the estimate, which is in turn related to sample size

• Small sample size (only 6 data points per project) means low statistical reliability, often underestimating actual later reliability

• Improve this by using other models and comparing results


PTR Submodel

• A variation on the Rayleigh model can be used to predict defects that will be found during integration of new software into a system
– PTR is a Program Trouble Report or Problem Tracking Report, a common mechanism for defect tracking

• Follows the same idea as Rayleigh


Reliability Growth Models - Exponential Model

• Exponential model is the basic reliability growth model - i.e. reliability will tend to increase over time

• Other reliability models include: Time Between Failure Models and Fault Count Models


Exponential Model

• Reliability growth models are based on data from the formal testing phase
– After the software has been completely integrated (compiled and built)
– When the software is being tested with test cases chosen randomly to approximate an operational (real-world usage) profile
– Testing is customer-oriented


• Rationale is that defect arrival during testing is a good indicator of the reliability of the product when used by customers

• During this testing phase, failures occur, defects are fixed, software becomes more stable, and reliability grows over time


• Is a Weibull distribution with m = 1

• Let: c = scale parameter, t = time, λ = 1/c
CDF = 1 - exp(-t/c) = 1 - exp(-λt)
PDF = (1/c)*exp(-t/c) = λ*exp(-λt)

• λ is the error detection rate or hazard rate

• This form also works for light bulb failures, computer electrical failures, etc.
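Another small sketch of my own: evaluating the exponential model's CDF with λ = 1/c, matching the formulas above. The scale value c = 100 test hours is a made-up example.

```python
import math

def exp_cdf(t, lam):
    """Exponential CDF: 1 - exp(-lambda*t), the expected fraction of defects found by time t."""
    return 1.0 - math.exp(-lam * t)

def exp_pdf(t, lam):
    """Exponential PDF: lambda*exp(-lambda*t)."""
    return lam * math.exp(-lam * t)

c = 100.0          # hypothetical scale parameter, e.g. 100 test hours
lam = 1.0 / c      # error detection (hazard) rate
for t in (10, 50, 100, 200, 400):
    print(t, round(exp_cdf(t, lam), 3), round(exp_pdf(t, lam), 5))
```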


Typical Time Between Failure Model Assumptions

• There are N unknown software faults at the start of testing

• Failures occur randomly
• All faults contribute equally to failure
• Fix time is negligibly small
• The fix for each fault is perfect


Time Between Failure Models

• Jelinski-Moranda (J-M) Model
– Assumes random failures, perfect zero-time fixes, and all faults equally bad

• Littlewood Models
– Like the J-M model, but assumes bigger faults are found first

• Goel-Okumoto Imperfect Debugging Model
– Like the J-M model, but with bad fixes possible


Fault Count Model Assumptions

• Testing intervals are independent of each other

• Testing during intervals is reasonably homogeneous

• The numbers of defects detected in the intervals are independent of each other


Fault Count Models

• Goel-Okumoto Nonhomogeneous Poisson Process Model (NHPP)
– Number of failures in a time period, with an exponential failure rate (i.e. the exponential model!)

• Musa-Okumoto Logarithmic Poisson Execution Time Model
– Like NHPP, but later fixes have less effect on reliability


Cumulative Defects versus Cumulative Test Hours

Goel-Okumoto model:

m(t) = a*(1 - exp(-b*t))

λ(t) = m'(t) = a*b*exp(-b*t)

where:
m(t) = expected number of failures observed by time t
λ(t) = failure density
a = expected total number of defects
b = constant
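A minimal sketch (the parameter values a = 150 and b = 0.02 are made up for illustration) of the Goel-Okumoto mean value and failure density functions defined above.

```python
import math

def go_mean(t, a, b):
    """Goel-Okumoto expected cumulative failures by time t: m(t) = a*(1 - exp(-b*t))."""
    return a * (1.0 - math.exp(-b * t))

def go_intensity(t, a, b):
    """Failure density lambda(t) = m'(t) = a*b*exp(-b*t)."""
    return a * b * math.exp(-b * t)

a, b = 150.0, 0.02      # hypothetical: 150 total expected defects, rate constant per test hour
for t in (0, 50, 100, 200, 400):
    print(t, round(go_mean(t, a, b), 1), round(go_intensity(t, a, b), 3))
```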


Fault Count Models

• The Delayed S and Inflection S Models
– Delayed S: recognizes the time between failure detection and fix
– Inflection S: as failures are detected, they reveal more failures


Mean Time to Failure (MTTF)

• Mean Time to Failure is the average amount of time using the product between failures

• MTTF = (total run time) / (number of failures)
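For example (the numbers are hypothetical), applying MTTF = total run time / number of failures:

```python
total_run_hours = 1200.0    # hypothetical cumulative run time across all usage
num_failures = 8            # hypothetical number of failures observed in that time
mttf = total_run_hours / num_failures
print(mttf)                 # 150.0 hours between failures, on average
```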


Software Reliability Modeling: Time Between Failures

• Time between failures is expected to increase, as failures occur and faults are fixed

[Diagram: failures marked along an execution time line starting at 0]


Reliability, R(t) – probability of failure-free operation for a specified period of time

[Chart: reliability R(t), starting at 1.0, plotted against time since last failure (t)]


WARNING

• Reliability models can be wildly inaccurate, particularly if based on little and/or irrelevant data (e.g. from other industries, or using bad assumptions)

• Validate estimates with other models and common sense


Reliability Modeling

1. Examine data on a scatter diagram. Look for trends and level of detail.

2. Select model(s) to fit the data.
3. Estimate the parameters of each model.
4. Obtain the fitted model using those parameters.
5. Check the goodness-of-fit and reasonableness of the models.
6. Make predictions using the fitted models.
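One possible sketch of steps 2-6 (my own illustration, with made-up cumulative defect data and the Goel-Okumoto exponential form from earlier as the chosen model), using scipy's curve_fit for the parameter estimation and a rough R^2 as the goodness-of-fit check.

```python
import numpy as np
from scipy.optimize import curve_fit

# Steps 1-2: hypothetical cumulative defect counts vs. cumulative test hours,
# with the Goel-Okumoto exponential form selected as the candidate model
hours   = np.array([ 40,  80, 120, 160, 200, 240, 280, 320], dtype=float)
defects = np.array([ 22,  41,  55,  67,  75,  81,  86,  89], dtype=float)

def go_mean(t, a, b):
    return a * (1.0 - np.exp(-b * t))

# Steps 3-4: estimate parameters and obtain the fitted model
(a_hat, b_hat), _ = curve_fit(go_mean, hours, defects, p0=(100.0, 0.01))

# Step 5: crude goodness-of-fit check (R^2 against the fitted curve)
resid = defects - go_mean(hours, a_hat, b_hat)
r2 = 1.0 - np.sum(resid**2) / np.sum((defects - defects.mean())**2)

# Step 6: predictions, e.g. defects expected by 500 test hours, and in total
print(round(a_hat, 1), round(b_hat, 4), round(r2, 3))
print(round(go_mean(500, a_hat, b_hat), 1), "defects expected by 500 hours, of about",
      round(a_hat, 1), "in total")
```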


Test Compression Factor

• Defect detection during testing is different from that by customer usage, hence the defect rates may change.

• Result is that fewer defects are found just after product release

• Or, testing is better at finding defects than customer usage


• Hence for maintenance, use reliability models ONLY for defect number or rate, and look for field defect rate patterns to be different from those found during development (number of defects found drops after release, due to less effective customer “testing”)


Customer Satisfaction


• Customer evaluation of software is the most critical “test”

• Want to understand what their priorities are, in order to obtain and keep their business


Total Quality Management

• Expanded from just product quality to maintaining a long-term customer relationship

• 5x cheaper to keep an existing customer than to find a new one

• Unhappy customers tell 7-20 people, whereas happy customers tell only 3-5 people


Customer Satisfaction Surveys

• Customer call-back after x days
• Customer complaints
• Direct customer visits
• Customer user groups
• Conferences


• Want representative sample of all customers

• Three main methods
– In-person interviews
Can note detailed reactions
May introduce interviewer bias
Expensive


– Telephone interviews
Can still be very valid
Cheaper than in-person interviews
Lack of interaction
Limited audience

– Mail questionnaires
How representative?
Low response rate
Very cheap


Sampling Methods

• Often can't survey the entire user population

• Four methods
– Simple random sampling
Must be truly random, not just convenient
– Systematic sampling
Use every nth customer from a list


Stratified Sampling

– Group customers into categories (strata); take a simple random sample from each category (stratum). Can be a very efficient method.

– Can weight each stratum equally (proportional s.s.) or unequally (disproportional s.s.)

– For unequal weighting, make each stratum's sampling fraction ~ the standard deviation of the stratum and ~ 1/sqrt(cost of sampling), i.e. F ~ s/sqrt(cost), where "sqrt" means "square root"

"~" means "is proportional to"
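A small sketch of my own interpretation of F ~ s/sqrt(cost): normalize the per-stratum weights so the fractions sum to 1 and then split a total sample size across strata. The strata, standard deviations, costs, and total sample size are all hypothetical.

```python
import math

# Hypothetical strata with estimated satisfaction-score standard deviation and per-response cost
strata = {
    "large accounts": {"s": 1.2, "cost": 9.0},
    "small accounts": {"s": 0.8, "cost": 4.0},
    "resellers":      {"s": 0.5, "cost": 1.0},
}
total_sample = 300   # hypothetical overall sample size

# F ~ s / sqrt(cost); normalize so the sampling fractions sum to 1
weights = {name: v["s"] / math.sqrt(v["cost"]) for name, v in strata.items()}
total_w = sum(weights.values())
for name, w in weights.items():
    print(name, round(w / total_w, 3), round(total_sample * w / total_w))
```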


Cluster Sampling

• Divide the population into (geographic) clusters, then do simple random samples within each selected cluster
– Try for representative clusters
– Not as efficient as simple random sampling, but cheaper
– Typically used for in-person interviews


Bias

• Look out for sample bias!
• E.g. basing a national voting survey on a Web-based poll


Sample Size

• How big is enough?

• Depends on:
– Confidence level (80-99%, to get Z)
– Margin of error (B = 3-5%)

• For a simple random sample, also need:
– Estimated satisfaction level (p), and
– Total population size (N = total number of customers)


What’s ‘Z’?

• ‘Z’ is the critical Z value for a two-sided test of means

• Here we are striving for a sample whose mean customer satisfaction is close enough to the population’s mean – where “close enough” is defined by the Z value


What Confidence Level?

• The results are always subject to the desired confidence level, since we are never perfectly sure of our results
– For analysis of medical test results, typically insist on 99% confidence
– Otherwise 95% is commonly used
– Software tests may use as low as 80%


Critical Z values

Confidence Level    2-sided critical Z
80%                 1.28
90%                 1.645
95%                 1.96
99%                 2.57
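These critical values can be reproduced (a sketch of my own using scipy) from the standard normal inverse CDF for a two-sided test:

```python
from scipy.stats import norm

for conf in (0.80, 0.90, 0.95, 0.99):
    z = norm.ppf(1 - (1 - conf) / 2)      # two-sided critical Z for the given confidence level
    print(f"{conf:.0%}  {z:.3f}")          # 1.282, 1.645, 1.960, 2.576
```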


Sample Size

• Sample size is given by n = [N*Z^2*p*(1-p)]/[N*B^2 + Z^2*p*(1-p)]

• Note that the sample size depends heavily on the answer we want to obtain, the actual level of customer satisfaction (p)!


• If we choose
– 80% confidence level, then Z = 1.28
– 5% margin of error, then B = 0.05
– and expect 90% satisfaction, then p = 0.90

• n = (N*1.28^2*0.9*0.1) / (N*0.05^2 + 1.28^2*0.9*0.1)

• n = 0.1475*N / (0.0025*N + 0.1475)

Notice that for B and p, percentages are converted to decimals!


Given: Z = 1.28, p = 0.9, B = 0.05
Hence: Z^2 = 1.6384, p(1-p) = 0.09, B^2 = 0.0025

Find n for each N:

N           n
10          8.550355
20          14.93558
50          27.06052
100         37.09996
200         45.54935
500         52.75873
1000        55.69724
10000       58.63655
100000      58.94763
1000000     58.97892
Infinity    58.9824

Sampling isn't very helpful for small populations!
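A short sketch of my own implementing the sample size formula from the earlier slide; with Z = 1.28, p = 0.9, and B = 0.05 it reproduces the table above.

```python
def sample_size(N, Z=1.28, p=0.9, B=0.05):
    """n = [N*Z^2*p*(1-p)] / [N*B^2 + Z^2*p*(1-p)] for a simple random sample."""
    k = Z * Z * p * (1 - p)
    return (N * k) / (N * B * B + k)

for N in (10, 20, 50, 100, 200, 500, 1000, 10000, 100000, 1000000):
    print(N, round(sample_size(N), 2))

# Limiting value as N goes to infinity: Z^2*p*(1-p)/B^2
print("Infinity", round(1.28**2 * 0.9 * 0.1 / 0.05**2, 2))   # ~ 58.98
```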


• If you don't know the customer satisfaction value 'p', use 0.5 as the worst-case estimate

• Once the real value of ‘p’ is known, solve for the actual value of B (margin of error)

• Key challenge is finding a truly representative sample


Analysis of Customer Satisfaction Data

• Use a five-point scale (very satisfied, satisfied, neutral, dissatisfied, very dissatisfied)

• May convert to numeric scale; 1=very dissatisfied, 2=dissatisfied, etc.

• Typically use 95% confidence level (Z=1.96), but 80% may be okay to show hint of trend


Presentation of Customer Satisfaction Data

• Make a running plot of % satisfied vs. time, with +/- the margin of error (B); see the sketch after this list

• Some like to plot percent dissatisfied instead

• May want to break satisfaction into detailed categories, and track each of them separately
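Following up on the running plot idea above, here is a sketch of my own for computing the plotted quantities; the survey counts are made up, and the margin-of-error formula B = Z*sqrt(p(1-p)/n) is the standard one for a simple random sample of a proportion (it is not spelled out on the slides, and it ignores any finite-population correction).

```python
import math

# Hypothetical quarterly survey results: number satisfied out of n responses
quarters  = ["Q1", "Q2", "Q3", "Q4"]
satisfied = [164, 171, 175, 183]
n         = [200, 205, 198, 210]
Z = 1.96   # 95% confidence

for q, s, nn in zip(quarters, satisfied, n):
    p_hat = s / nn
    B = Z * math.sqrt(p_hat * (1 - p_hat) / nn)   # margin of error for the proportion
    print(f"{q}: {p_hat:.1%} +/- {B:.1%}")
```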


Other Satisfaction Notes

• Key issues raised by customers may not be most needed areas of development (e.g. documentation vs reliability)

• Can examine correlation of specific satisfaction attributes to overall satisfaction; is bad X really an indicator of dissatisfied customers?

• Use regression analysis to answer this


CUPRIMDA (per IBM)

• Capability (functionality)
• Usability
• Performance
• Reliability
• Installability
• Maintainability
• Documentation
• Availability

Can measure customer satisfaction for each of these areas, plus overall satisfaction


Multiple Regression

• We have had models with one variable related to another, e.g.
Schedule = a*(Effort)^b

• Linear and logarithmic regression can also be done with many variables, like:
Overall Satisfaction = a + b*(Usability Sat.) + c*(Performance Sat.) + d*(Reliability Sat.)
and so on
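A sketch of my own (the per-customer survey rows are made up) of fitting such a multiple linear regression with numpy's least-squares solver; the attribute names follow the equation above.

```python
import numpy as np

# Hypothetical per-customer satisfaction scores on a 1-5 scale:
# columns = usability, performance, reliability; y = overall satisfaction
X = np.array([
    [4, 3, 5],
    [5, 4, 4],
    [3, 3, 3],
    [2, 4, 3],
    [4, 5, 5],
    [3, 2, 2],
], dtype=float)
y = np.array([4, 5, 3, 3, 5, 2], dtype=float)

# Add an intercept column so the model is y = a + b*usability + c*performance + d*reliability
A = np.column_stack([np.ones(len(y)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
a, b, c, d = coef
print(round(a, 2), round(b, 2), round(c, 2), round(d, 2))
```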


• This results in estimates of the constants a, b, etc.
– A linear regression is often better for real-valued data
– Logistic regression is often better for data which may only have two values (Yes/No, T/F)

• Sometimes both are tried to see which gives the best results


Now What?

• Plot each factor’s regression coefficient (a, b, …) vs. the customer satisfaction level (%) for that factor; then on this plot:

• Determine priorities for improving customer satisfaction from top to bottom (then left to right, if there are equal coefficients)


Non-product Satisfaction

• Many other areas can affect customer satisfaction
– Technical solutions - product factors and technologies used
– Support & Service - availability, knowledge
– Marketing - point of contact, information
– Administration - invoicing, warranty
– Delivery - speed, follow-through
– Company image - stability, trustworthiness


Next Steps

• Measure and monitor your and your competitors' customer satisfaction
– In order to compete, your satisfaction level must be better than your competition's
• Analyze which aspects are most critical to customer satisfaction
• Determine the root cause of shortcomings
• Set quantitative targets, both overall and for specific aspects
• Prepare and implement a plan to do the above