Introduction to Software Testing

Why Do We Test Software?

Dr. Pedro Mejia Alvarez

Testing in the 21st Century

Software defines behavior
– network routers, finance, switching networks, other infrastructure

Today's software market:
– is much bigger
– is more competitive
– has more users

Embedded Control Applications
– airplanes, air traffic control
– spaceships
– watches
– ovens
– remote controllers
– PDAs
– memory seats
– DVD players
– garage door openers
– cell phones

Agile processes put increased pressure on testers
– Programmers must unit test – with no training or education!
– Tests are key to functional requirements – but who builds those tests?

Industry is going through a revolution in what testing means to the success of software products

Software is a Skin that Surrounds Our Civilization

Quote due to Dr. Mark Harman

Software Faults, Errors & Failures

Software Fault: A static defect in the software

Software Error: An incorrect internal state that is the manifestation of some fault

Software Failure: External, incorrect behavior with respect to the requirements or other description of the expected behavior

Faults in software are equivalent to design mistakes in hardware. Software does not degrade.

Fault and Failure Example

A patient gives a doctor a list of symptoms – Failures

The doctor tries to diagnose the root cause, the ailment – Fault

The doctor may look for anomalous internal conditions (high blood pressure, irregular heartbeat, bacteria in the blood stream) – Errors

Most medical problems result from external attacks (bacteria, viruses) or physical degradation as we age. Software faults, in contrast, were there at the beginning and do not "appear" when a part wears out.

A Concrete Example

public static int numZero (int[] arr) {
    // Effects: If arr is null, throw NullPointerException
    // else return the number of occurrences of 0 in arr
    int count = 0;
    for (int i = 1; i < arr.length; i++) {
        if (arr[i] == 0) {
            count++;
        }
    }
    return count;
}

Fault: Should start searching at 0, not 1

Test 1: [2, 7, 0]   Expected: 1   Actual: 1
– Error: i is 1, not 0, on the first iteration
– Failure: none

Test 2: [0, 2, 7]   Expected: 1   Actual: 0
– Error: i is 1, not 0
– Error propagates to the variable count
– Failure: count is 0 at the return statement
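To make the error/failure distinction concrete, here is a small, hypothetical Java harness (not part of the original slides) that runs both tests against the faulty numZero and against an assumed corrected version that starts the loop at index 0:

import java.util.Arrays;

public class NumZeroDemo {

    // Faulty version from the slide: the loop starts at index 1.
    static int numZero(int[] arr) {
        int count = 0;
        for (int i = 1; i < arr.length; i++) {
            if (arr[i] == 0) { count++; }
        }
        return count;
    }

    // Assumed corrected version: the loop starts at index 0.
    static int numZeroFixed(int[] arr) {
        int count = 0;
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == 0) { count++; }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] test1 = {2, 7, 0};  // error occurs, but no failure: both versions return 1
        int[] test2 = {0, 2, 7};  // error propagates to count: faulty version returns 0, expected 1

        System.out.println(Arrays.toString(test1) + " faulty=" + numZero(test1) + " fixed=" + numZeroFixed(test1));
        System.out.println(Arrays.toString(test2) + " faulty=" + numZero(test2) + " fixed=" + numZeroFixed(test2));
    }
}

Test 1 puts the program into an erroneous state but still returns the expected value, so no failure is observed; Test 2 lets the error propagate to count and produces a visible failure.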

The Term Bug

Bug is used informally. Sometimes speakers mean fault, sometimes error, sometimes failure … often the speaker doesn't know what it means!

This class will try to use words that have precise, defined, and unambiguous meanings.

“It has been just so in all of my inventions. The first step is an intuition, and comes with a burst, then difficulties arise—this thing gives out and [it is] then that 'Bugs'—as such little faults and difficulties are called—show themselves and months of intense watching, study and labor are requisite. . .” – Thomas Edison

“an analyzing process must equally have been performed in order to furnish the Analytical Engine with the necessary operative data; and that herein may also lie a possible source of error. Granted that the actual mechanism is unerring in its processes, the cards may give it wrong orders. ” – Ada, Countess Lovelace (notes on Babbage’s Analytical Engine)

Spectacular Software Failures

Intel's Pentium FDIV fault: public relations nightmare

THERAC-25 radiation machine: poor testing of safety-critical software can cost lives – 3 patients were killed

NASA's Mars Polar Lander: September 1999, crashed due to a units integration fault

Ariane 5: exception-handling bug forced self-destruct on maiden flight (64-bit to 16-bit conversion); about $370 million lost – very expensive

We need our software to be dependable. Testing is one way to assess dependability.

Northeast Blackout of 2003

Affected 10 million people in Ontario, Canada

Affected 40 million people in 8 US states

Financial losses of $6 billion USD

508 generating units and 256 power plants shut down

The alarm system in the energy management system failed due to a software error, and operators were not informed of the power overload in the system.

Costly Software Failures

NIST report, "The Economic Impacts of Inadequate Infrastructure for Software Testing" (2002)
– Inadequate software testing costs the US alone between $22 and $59 billion annually
– Better approaches could cut this amount in half

Huge losses due to web application failures
– Financial services: $6.5 million per hour (just in the USA!)
– Credit card sales applications: $2.4 million per hour (in the USA)

In Dec 2006, amazon.com's BOGO offer turned into a double discount

2007: Symantec says that most security vulnerabilities are due to faulty software

World-wide monetary loss due to poor software is staggering

Testing in the 21st Century

More safety-critical, real-time software

Embedded software is ubiquitous … check your pockets

Enterprise applications mean bigger programs, more users

Paradoxically, free software increases our expectations!

Security is now all about software faults
– Secure software is reliable software

The web offers a new deployment platform
– Very competitive and very available to more users
– Web apps are distributed
– Web apps must be highly reliable

Industry desperately needs our inventions!

What Does This Mean?

Software testing is getting more important.

What are we trying to do when we test? What are our goals?

Testing Goals Based on Test Process Maturity

Level 0 : There’s no difference between testing and debugging

Level 1 : The purpose of testing is to show correctness

Level 2 : The purpose of testing is to show that the software doesn’t work

Level 3 : The purpose of testing is not to prove anything specific, but to reduce the risk of using the software

Level 4 : Testing is a mental discipline that helps all IT professionals develop higher quality software


Level 0 Thinking

Testing is the same as debugging

Does not distinguish between incorrect behavior and mistakes in the program

Does not help develop software that is reliable or safe

This is what we teach undergraduate CS majors

Level 1 Thinking

Purpose is to show correctness

Correctness is impossible to achieve

What do we know if there are no failures? Good software or bad tests?

Test engineers have no:
– Strict goal
– Real stopping rule
– Formal test technique

Test managers are powerless

This is what hardware engineers often expect

Level 2 Thinking

Purpose is to show failures

Looking for failures is a negative activity

Puts testers and developers into an adversarial relationship

What if there are no failures?

This describes most software companies. How can we move to a team approach?

Level 3 Thinking

Testing can only show the presence of failures

Whenever we use software, we incur some risk

Risk may be small and consequences unimportant

Risk may be great and consequences catastrophic

Testers and developers cooperate to reduce risk

This describes a few "enlightened" software companies

Level 4 Thinking

A mental discipline that increases quality

Testing is only one way to increase quality

Test engineers can become technical leaders of the project

Their primary responsibility is to measure and improve software quality

Their expertise should help the developers

This is the way "traditional" engineering works

Where Are You?

Are you at level 0, 1, or 2?

Is your organization at level 0, 1, or 2? Or 3?

We hope to teach you to become "change agents" in your workplace … advocates for level 4 thinking.

Tactical Goals: Why Each Test?

Written test objectives and requirements must be documented

What are your planned coverage levels? How much testing is enough?

Common objective – spend the budget … test until the ship date
– Sometimes called the "date criterion"

If you don't know why you're conducting each test, it won't be very helpful

Here! Test This!

My first "professional" job: a stack of computer printouts—and no documentation.

[Slide image: a boxed product, "MicroSteff – big software system for the Mac, V.1.5.1", on a Verbatim DataLife MF2-HD 1.44 MB floppy disk]

Why Each Test ?

1980: “The software shall be easily maintainable”

Threshold reliability requirements?

What fact does each test try to verify?

Requirements definition teams need testers!

If you don’t start planning for each test when the functional requirements are formed, you’ll never know why you’re conducting the test


Cost of Not Testing

Testing is the most time-consuming and expensive part of software development

Not testing is even more expensive

If we have too little testing effort early, the cost of testing increases

Planning for testing after development is prohibitively expensive

Poor program managers might say: "Testing is too expensive."

Cost of Late Testing

[Chart: fault origin (%), fault detection (%), and unit cost per fault (X) across the phases Requirements, Design, Programming/Unit Test, Integration Test, System Test, and Post-Deployment. Assuming a $1000 unit cost per fault and 100 faults, the annotated totals grow from $6K, $13K, and $20K in the early phases to $100K, $250K, and $360K in the later phases. Source: Software Engineering Institute, Carnegie Mellon University, Handbook CMU/SEI-96-HB-002.]

Summary: Why Do We Test Software?

A tester's goal is to eliminate faults as early as possible
• Improve quality
• Reduce cost
• Preserve customer satisfaction


Software Testing

Testing is the process of exercising a program with the specific intent of finding errors prior to delivery to the end user.

What Testing Shows

– errors
– requirements conformance
– performance
– an indication of quality


Strategic Approach

To perform effective testing, you should conduct effective technical reviews. By doing this, many errors will be eliminated before testing commences.

Testing begins at the component level and works "outward" toward the integration of the entire computer-based system.

Different testing techniques are appropriate for different software engineering approaches and at different points in time.

Testing is conducted by the developer of the software and (for large projects) an independent test group.

Testing and debugging are different activities, but debugging must be accommodated in any testing strategy.

V & V

Verification refers to the set of tasks that ensure that software correctly implements a specific function.

Validation refers to a different set of tasks that ensure that the software that has been built is traceable to customer requirements.

Boehm [Boe81] states this another way:
– Verification: "Are we building the product right?"
– Validation: "Are we building the right product?"

Who Tests the Software?

Developer
– understands the system
– but will test "gently"
– and is driven by "delivery"

Independent tester
– must learn about the system
– but will attempt to break it
– and is driven by quality

Testing Strategy

Development activities: system engineering, analysis modeling, design modeling, code generation

Testing activities: unit test, integration test, validation test, system test

Testing Strategy

We begin by "testing-in-the-small" and move toward "testing-in-the-large"

For conventional software
– the module (component) is our initial focus
– integration of modules follows

For OO software
– our focus when "testing in the small" changes from an individual module (the conventional view) to an OO class that encompasses attributes and operations and implies communication and collaboration

Strategic Issues

– Specify product requirements in a quantifiable manner long before testing commences.
– State testing objectives explicitly.
– Understand the users of the software and develop a profile for each user category.
– Develop a testing plan that emphasizes "rapid cycle testing."
– Build "robust" software that is designed to test itself (see the sketch after this list).
– Use effective technical reviews as a filter prior to testing.
– Conduct technical reviews to assess the test strategy and the test cases themselves.
– Develop a continuous improvement approach for the testing process.
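The slides do not show code for "software that is designed to test itself"; one common reading is antibugging with assertions. A minimal, hypothetical Java sketch (the class and method names are assumptions):

// A hypothetical illustration of "software designed to test itself":
// the method checks its own inputs and results with assertions, so many
// faults announce themselves during testing instead of hiding.
public class SelfCheckingSort {

    static int[] sortedCopy(int[] input) {
        assert input != null : "input must not be null";
        int[] result = input.clone();
        java.util.Arrays.sort(result);
        // Built-in self-check: the result must really be ordered.
        for (int i = 1; i < result.length; i++) {
            assert result[i - 1] <= result[i] : "postcondition violated at index " + i;
        }
        return result;
    }

    public static void main(String[] args) {
        // Run with "java -ea SelfCheckingSort" so assertions are enabled.
        int[] sorted = sortedCopy(new int[] {3, 1, 2});
        System.out.println(java.util.Arrays.toString(sorted));
    }
}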

Unit Testing

[Diagram: the software engineer derives test cases, applies them to the module to be tested, and evaluates the results]

Unit Testing

Test cases for the module to be tested focus on:
– interface
– local data structures
– boundary conditions
– independent paths
– error handling paths

Unit Test Environment

[Diagram: a driver invokes the module under test, and stubs stand in for the modules it calls; test cases covering the interface, local data structures, boundary conditions, independent paths, and error handling paths are applied, and the results are collected]
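To illustrate the driver/stub idea, here is a rough, hypothetical Java sketch (none of these class names come from the slides): a driver applies test cases to a module under test while a stub replaces the subordinate module it depends on.

// Hypothetical module under test: computes an order total, normally
// calling a subordinate TaxService to look up the tax rate.
interface TaxService {
    double rateFor(String region);
}

class OrderModule {
    private final TaxService taxService;
    OrderModule(TaxService taxService) { this.taxService = taxService; }

    double total(double subtotal, String region) {
        if (subtotal < 0) throw new IllegalArgumentException("negative subtotal");
        return subtotal * (1.0 + taxService.rateFor(region));
    }
}

// Stub: stands in for the real subordinate module and returns canned data.
class TaxServiceStub implements TaxService {
    public double rateFor(String region) { return 0.10; }  // fixed 10% for testing
}

// Driver: applies test cases to the module and reports results.
public class OrderModuleDriver {
    public static void main(String[] args) {
        OrderModule module = new OrderModule(new TaxServiceStub());

        // boundary condition: zero subtotal
        System.out.println("zero subtotal -> " + module.total(0.0, "MX"));   // expect 0.0
        // normal path
        System.out.println("100 subtotal  -> " + module.total(100.0, "MX")); // expect 110.0
        // error handling path
        try {
            module.total(-1.0, "MX");
        } catch (IllegalArgumentException e) {
            System.out.println("negative subtotal correctly rejected");
        }
    }
}

The stub keeps the unit isolated, so a failing check points at the module itself rather than at its subordinates.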

Integration Testing Strategies

Options:
• the "big bang" approach
• an incremental construction strategy

Top Down Integration

– the top module is tested with stubs
– stubs are replaced one at a time, "depth first"
– as new modules are integrated, some subset of tests is re-run

[Diagram: a module hierarchy A through G, integrated from the top module A downward]

Bottom-Up Integration

– drivers are replaced one at a time, "depth first"
– worker modules are grouped into builds (clusters) and integrated

[Diagram: a module hierarchy A through G, with the low-level worker modules grouped into a cluster]


Regression Testing

Regression testing is the re-execution of some subset of tests that have already been conducted to ensure that changes have not propagated unintended side effects

Whenever software is corrected, some aspect of the software configuration (the program, its documentation, or the data that support it) is changed.

Regression testing helps to ensure that changes (due to testing or for other reasons) do not introduce unintended behavior or additional errors.

Regression testing may be conducted manually, by re-executing a subset of all test cases or using automated capture/playback tools.
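As an illustration only, a regression subset could be automated with a test framework. The sketch below assumes JUnit 5 is available and reuses the hypothetical numZeroFixed method from the earlier NumZeroDemo sketch, with tests tagged so the same subset can be re-executed after every change.

import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

// Hypothetical regression subset for the numZero example: the @Tag lets the
// build tool re-run exactly this subset whenever the software is corrected.
class NumZeroRegressionTest {

    @Test
    @Tag("regression")
    void countsZeroAtTheEnd() {
        assertEquals(1, NumZeroDemo.numZeroFixed(new int[] {2, 7, 0}));
    }

    @Test
    @Tag("regression")
    void countsZeroAtTheStart() {
        // This is the test that exposed the original fault.
        assertEquals(1, NumZeroDemo.numZeroFixed(new int[] {0, 2, 7}));
    }
}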

Object-Oriented Testing

– begins by evaluating the correctness and consistency of the analysis and design models
– the testing strategy changes
  – the concept of the "unit" broadens due to encapsulation
  – integration focuses on classes and their execution across a "thread" or in the context of a usage scenario
  – validation uses conventional black box methods
– test case design draws on conventional methods, but also encompasses special features

OO Testing Strategy

Class testing is the equivalent of unit testing
– operations within the class are tested
– the state behavior of the class is examined

Integration applies three different strategies
– thread-based testing: integrates the set of classes required to respond to one input or event
– use-based testing: integrates the set of classes required to respond to one use case
– cluster testing: integrates the set of classes required to demonstrate one collaboration
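As a sketch of class testing (the class and its states below are hypothetical, not from the slides), a test can exercise the operations of a class and check its state behavior directly:

// Hypothetical class under test: a bank account with simple state behavior.
class Account {
    enum State { OPEN, FROZEN, CLOSED }
    private State state = State.OPEN;
    private double balance = 0.0;

    void deposit(double amount) {
        if (state != State.OPEN) throw new IllegalStateException("account not open");
        balance += amount;
    }
    void freeze() { state = State.FROZEN; }
    void close() {
        if (balance != 0.0) throw new IllegalStateException("balance must be zero");
        state = State.CLOSED;
    }
    State state() { return state; }
    double balance() { return balance; }
}

// Class test: exercises operations and checks state transitions.
// Run with "java -ea AccountClassTest" so the assert statements are checked.
public class AccountClassTest {
    public static void main(String[] args) {
        Account a = new Account();
        a.deposit(100.0);
        assert a.balance() == 100.0 : "deposit should update balance";

        a.freeze();
        boolean rejected = false;
        try { a.deposit(10.0); } catch (IllegalStateException e) { rejected = true; }
        assert rejected : "deposits must be rejected while frozen";
        assert a.state() == Account.State.FROZEN;
    }
}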


WebApp Testing - I

The content model for the WebApp is reviewed to uncover errors.

The interface model is reviewed to ensure that all use cases can be accommodated.

The design model for the WebApp is reviewed to uncover navigation errors.

The user interface is tested to uncover errors in presentation and/or navigation mechanics.

Each functional component is unit tested.

WebApp Testing - II

Navigation throughout the architecture is tested.

The WebApp is implemented in a variety of different environmental configurations and is tested for compatibility with each configuration.

Security tests are conducted in an attempt to exploit vulnerabilities in the WebApp or within its environment.

Performance tests are conducted.

The WebApp is tested by a controlled and monitored population of end-users. The results of their interaction with the system are evaluated for content and navigation errors, usability concerns, compatibility concerns, and WebApp reliability and performance.

High Order Testing

Validation testing
– focus is on software requirements

System testing
– focus is on system integration

Alpha/Beta testing
– focus is on customer usage

Recovery testing
– forces the software to fail in a variety of ways and verifies that recovery is properly performed

Security testing
– verifies that protection mechanisms built into a system will, in fact, protect it from improper penetration

Stress testing
– executes a system in a manner that demands resources in abnormal quantity, frequency, or volume

Performance testing
– tests the run-time performance of software within the context of an integrated system
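For instance, a stress test can be sketched as a harness that drives a component with abnormal volume and concurrency; everything in this hypothetical Java example (the handleRequest stand-in, the request count, the thread pool size) is an assumption for illustration:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stress test: hammers a component with far more concurrent
// requests than it would see in normal operation and counts failures.
public class StressTestSketch {

    // Stand-in for the component under test.
    static String handleRequest(int id) {
        return "response-" + id;
    }

    public static void main(String[] args) throws InterruptedException {
        int requests = 100_000;                                   // abnormal volume
        ExecutorService pool = Executors.newFixedThreadPool(64);  // abnormal concurrency
        AtomicInteger failures = new AtomicInteger();

        for (int i = 0; i < requests; i++) {
            final int id = i;
            pool.submit(() -> {
                try {
                    handleRequest(id);
                } catch (RuntimeException e) {
                    failures.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);
        System.out.println("failures under stress: " + failures.get());
    }
}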


Debugging: A Diagnostic Process


The Debugging Process


Debugging Effort

[Chart: debugging effort is split between the time required to diagnose the symptom and determine the cause, and the time required to correct the error and conduct regression tests]

Symptoms & Causes


symptom and cause may be geographically separated

symptom may disappear when another problem is fixed

cause may be due to a combination of non-errors

cause may be due to a system or compiler error

cause may be due to assumptions that everyone believes

symptom may be intermittent

Consequences of Bugs

[Chart: damage caused by a bug ranges from mild, annoying, disturbing, and serious to extreme, catastrophic, and infectious, depending on the bug type]

Bug categories: function-related bugs, system-related bugs, data bugs, coding bugs, design bugs, documentation bugs, standards violations, etc.

Debugging Techniques

– brute force / testing
– backtracking
– induction
– deduction


Correcting the Error

Is the cause of the bug reproduced in another part of the program? In many situations, a program defect is caused by an erroneous pattern of logic that may be reproduced elsewhere.

What "next bug" might be introduced by the fix I'm about to make? Before the correction is made, the source code (or, better, the design) should be evaluated to assess coupling of logic and data structures.

What could we have done to prevent this bug in the first place? This question is the first step toward establishing a statistical software quality assurance approach. If you correct the process as well as the product, the bug will be removed from the current program and may be eliminated from all future programs.

What if the bug persists and I cannot correct it? Can I apply fault tolerance?


Final Thoughts

– Think before you act to correct
– Use tools to gain additional insight
– If you're at an impasse, get help from someone else
– Once you correct the bug, use regression testing to uncover any side effects