Sommerville, Ammann-Offutt, Mejia, 2009 Software Engineering Slide 1 Verification and Validation.

Sommerville, Ammann-Offutt, Mejia, 2009 Software Engineering Slide 1

Verification and Validation

Objectives

To introduce software verification and validation and to discuss the distinction between them

To describe the program inspection process and its role in V & V

To explain static analysis as a verification technique To describe the Cleanroom software development

process

Topics covered

Introduction to testing Sources of errors Why do we need testing ? Verification and validation planning Software inspections Automated static analysis Cleanroom software development

Nasty question

Suppose you are being asked to lead the team to test the software that controls a new X-ray machine. Would you take that job?

Would you take it if you could name your own price?

What if the contract says you’ll be charged with murder in case a patient dies because of a mal-functioning of the software?

A few spectacular software failuresA few spectacular software failures

Failures in Production Software

NASA’s Mars lander, September 1999, crashed due to a units integration fault—over $50 million US !

Huge losses due to web application failures• Financial services : $6.5 million per hour• Credit card sales applications : $2.4 million per hour

In Dec 2006, amazon.com’s BOGO offer turned into a double discount 2007 : Symantec says that most security vulnerabilities are due to faulty

software Stronger testing could solve most of these problems

World-wide monetary loss due to poor software is World-wide monetary loss due to poor software is staggeringstaggering

Thanks to Dr. Sreedevi Sampath

Bypass Testing Results

— Vasileios Papadimitriou. Masters thesis, Automating Bypass Testing for Web Applications, GMU 2006

Why Does Testing Matter?

NIST report, “The Economic Impacts of Inadequate Infrastructure for Software Testing” (2002)

– Inadequate software testing costs the US alone between $22 and $59 billion annually

– Better approaches could cut this amount in half Major failures: Ariane 5 explosion, Mars Polar

Lander, Intel’s Pentium FDIV bug Insufficient testing of safety-critical software can cost

lives: THERAC-25 radiation machine: 3 dead

We want our programs to be reliable– Testing is how, in most cases, we find out if they

Mars PolarLander crashsite?

THERAC-25 design

Ariane 5:exception-handlingbug : forced selfdestruct on maidenflight (64-bit to 16-bitconversion: about370 million $ lost)

Software is a Skin that Surrounds Our Civilization

Airbus 319 Safety Critical Software Control

Loss of autopilot

Loss of both the commander’s and the co‑pilot’s primary flight and navigation displays

Loss of most flight deck lighting and intercom

Northeast Blackout of 2003

Affected 10 million people in Ontario,

Canada

Affected 40 million people in 8 US

states

Financial losses of

$6 Billion USD

508 generating units and 256

power plants shut down

The alarm system in the energy management system failed due to a software error and operators were not informed of

the power overload in the system

Software testing is getting more Software testing is getting more importantimportant

Testing in the 21st Century We are going through a time of change Software defines behavior

• network routers, finance, switching networks, other infrastructure Today’s software market :

• is much bigger

• is more competitive

• has more users

Agile processes put increased pressure on testers Embedded Control Applications

• airplanes, air traffic control

• spaceships

• watches

• ovens• remote controllers

– PDAs– memory seats – DVD players– garage door openers– cell phones

Industry is going through a revolution

in what testing means to the

success of software products

Testing in the 21st Century

More safety critical, real-time software Enterprise applications means bigger programs, more users Embedded software is ubiquitous … check your pockets Paradoxically, free software increases our expectations ! Security is now all about software faults

• Secure software is reliable software The web offers a new deployment platform

• Very competitive and very available to more users• Web apps are distributed• Web apps must be highly reliable

Industry desperately needs our inventions !Industry desperately needs our inventions !

Mismatch in Needs and Goals

Industry wants testing to be simple and easy• Testers with no background in computing or math

Universities are graduating scientists• Industry needs engineers

Testing needs to be done more rigorously Agile processes put lots of demands on testing

• Programmers must unit test – with no training, education or tools !

• Tests are key components of functional requirements – but who builds those tests ?

Bottom line—lots of crappy Bottom line—lots of crappy softwaresoftware

Here! Test This!

MicroSteff – bigsoftware systemfor the mac

V.1.5.1 Jan/2007

VerdatimDataLifeMF2-HD1.44 MBBig software

program

Jan/2007

My first “professional” job

A stack of computer printouts—and no documentation

Cost of Testing

In the real-world, testing is the principle post-design activity

Restricting early testing usually increases cost

Extensive hardware-software integration requires more testing

You’re going to spend at least You’re going to spend at least half of your development half of your development budget on testing, whether you budget on testing, whether you want to or notwant to or not

State-of-the-Art

30-85 errors are made per 1000 lines of source code extensively tested software contains 0.5-3 errors per 1000

lines of source code testing is postponed, as a consequence: the later an error is

discovered, the more it costs to fix it. error distribution: 60% design, 40% implementation. 66% of

the design errors are not discovered until the software has become operational.

Relative cost of error correction

RE design code test operation

Why Test?

Written test objectives and requirements are rare

What are your planned coverage levels?

How much testing is enough?

Common objective – spend the budget …

If you don’t know If you don’t know whywhy you’re you’re conducting a test, it won’t be very conducting a test, it won’t be very helpfulhelpful

Lessons

Many errors are made in the early phases

These errors are discovered late

Repairing those errors is costly

It pays off to start testing real early

Why Test?

1980: “The software shall be easily maintainable”

Threshold reliability requirements?

What fact is each test trying to verify?

Requirements definition teams should include testers!

If you don’t start planning for each test If you don’t start planning for each test when the functional requirements are when the functional requirements are formed, you’ll never know why you’re formed, you’ll never know why you’re conducting the testconducting the test

Cost of Not Testing

Not testing is even more expensive

Planning for testing after development is prohibitively expensive

A test station for circuit boards costs half a million dollars …

Software test tools cost less than $10,000 !!!

Program Managers often Program Managers often say: “Testing is too say: “Testing is too

expensive.”expensive.”

How then to proceed?

Exhaustive testing most often is not feasible

Random statistical testing does not work either if you want to find errors

Therefore, we look for systematic ways to proceed during testing

Some preliminary questions

What exactly is an error?

How does the testing process look like?

When is test technique A superior to test technique B?

What do we want to achieve during testing?

When to stop testing?

Error, fault, failure

an error is a human activity resulting in software containing a fault

a fault is the manifestation of an error

a fault may result in a failure

When exactly is a failure a failure?

Failure is a relative notion: e.g. a failure w.r.t. the specification document

Verification: evaluate a product to see whether it satisfies the conditions specified at the start:

Have we built the system right?

Validation: evaluate a product to see whether it does what we think it should do:

Have we built the right system?

What is our goal during testing?

Objective 1: find as many faults as possible

Objective 2: make you feel confident that the software works OK

Verification: "Are we building the product right”.

The software should conform to its specification.

Validation: "Are we building the right product”.

The software should do what the user really requires.

Verification vs validation

Is a whole life-cycle process - V & V must be applied at each stage in the software process.

Has two principal objectives• The discovery of defects in a system;• The assessment of whether or not the system is

useful and useable in an operational situation.

The V & V process

V& V goals

Verification and validation should establish confidence that the software is fit for purpose.

This does NOT mean completely free of defects.

Rather, it must be good enough for its intended use and the type of use will determine the degree of confidence that is needed.

V & V confidence

Depends on system’s purpose, user expectations and marketing environment• Software function

• The level of confidence depends on how critical the software is to an organisation.

• User expectations• Users may have low expectations of certain kinds of

software.

• Marketing environment• Getting a product to market early may be more

important than finding defects in the program.

Software inspections. Concerned with analysis of the static system representation to discover problems (static verification)• May be supplement by tool-based document and code

analysis Software testing. Concerned with exercising and

observing product behaviour (dynamic verification)• The system is executed with test data and its operational

behaviour is observed

Static and dynamic verification

Static and dynamic V&V

Formalspecification

High-leveldesign

Requirementsspecification

Detaileddesign

Program

Prototype Programtesting

Softwareinspections

Can reveal the presence of errors NOT their absence.

The only validation technique for non-functional requirements as the software has to be executed to see how it behaves.

Should be used in conjunction with static verification to provide full V&V coverage.

Program testing

Defect testing• Tests designed to discover system defects.• A successful defect test is one which reveals the

presence of defects in a system.• Covered in Chapter 23

Validation testing• Intended to show that the software meets its

requirements.• A successful test is one that shows that a requirements

has been properly implemented.

Types of testing

Defect testing and debugging are distinct processes.

Verification and validation is concerned with establishing the existence of defects in a program.

Debugging is concerned with locating and repairing these errors.

Debugging involves formulating a hypothesis about program behaviour then testing these hypotheses to find the system error.

Testing and debugging

The debugging process

Locateerror

Designerror repair

Repairerror

Retestprogram

Testresults

Specification Testcases

Careful planning is required to get the most out of testing and inspection processes.

Planning should start early in the development process.

The plan should identify the balance between static verification and testing.

Test planning is about defining standards for the testing process rather than describing product tests.

V & V planning

The V-model of development

Systemspecification

Systemdesign

Detaileddesign

Module andunit codeand test

Sub-systemintegrationtest plan

Systemintegrationtest plan

Acceptancetest plan

ServiceAcceptance

testSystem

integration testSub-system

integration test

Requirementsspecification

The structure of a software test plan

The testing process. Requirements traceability. Tested items. Testing schedule. Test recording procedures. Hardware and software requirements. Constraints.

The software test plan

Software inspections

These involve people examining the source representation with the aim of discovering anomalies and defects.

Inspections not require execution of a system so may be used before implementation.

They may be applied to any representation of the system (requirements, design,configuration data, test data, etc.).

They have been shown to be an effective technique for discovering program errors.

Inspection success

Many different defects may be discovered in a single inspection. In testing, one defect ,may mask another so several executions are required.

The reuse domain and programming knowledge so reviewers are likely to have seen the types of error that commonly arise.

Inspections and testing

Inspections and testing are complementary and not opposing verification techniques.

Both should be used during the V & V process. Inspections can check conformance with a

specification but not conformance with the customer’s real requirements.

Inspections cannot check non-functional characteristics such as performance, usability, etc.

Program inspections

Formalised approach to document reviews Intended explicitly for defect detection (not

correction). Defects may be logical errors, anomalies in

the code that might indicate an erroneous condition (e.g. an uninitialised variable) or non-compliance with standards.

Inspection pre-conditions

A precise specification must be available. Team members must be familiar with the

organisation standards. Syntactically correct code or other system

representations must be available. An error checklist should be prepared. Management must accept that inspection will

increase costs early in the software process. Management should not use inspections for staff

appraisal ie finding out who makes mistakes.

The inspection process

Inspectionmeeting

Individualpreparation

Overview

Planning

Rework

Follow-up

Inspection procedure

System overview presented to inspection team.

Code and associated documents are distributed to inspection team in advance.

Inspection takes place and discovered errors

are noted. Modifications are made to repair discovered

errors. Re-inspection may or may not be required.

Inspection roles

Inspection checklists

Checklist of common errors should be used to drive the inspection.

Error checklists are programming language dependent and reflect the characteristic errors that are likely to arise in the language.

In general, the 'weaker' the type checking, the larger the checklist.

Examples: Initialisation, Constant naming, loop termination, array bounds, etc.

Inspection checks 1

Inspection checks 2

Inspection rate

500 statements/hour during overview. 125 source statement/hour during individual

preparation. 90-125 statements/hour can be inspected. Inspection is therefore an expensive

process. Inspecting 500 lines costs about 40

man/hours effort - about £2800 at UK rates.

Automated static analysis

Static analysers are software tools for source text processing.

They parse the program text and try to discover potentially erroneous conditions and bring these to the attention of the V & V team.

They are very effective as an aid to inspections - they are a supplement to but not a replacement for inspections.

Static analysis checks

Stages of static analysis

Control flow analysis. Checks for loops with multiple exit or entry points, finds unreachable code, etc.

Data use analysis. Detects uninitialised variables, variables written twice without an intervening assignment, variables which are declared but never used, etc.

Interface analysis. Checks the consistency of routine and procedure declarations and their use

Stages of static analysis

Information flow analysis. Identifies the dependencies of output variables. Does not detect anomalies itself but highlights information for code inspection or review

Path analysis. Identifies paths through the program and sets out the statements executed in that path. Again, potentially useful in the review process

Both these stages generate vast amounts of information. They must be used with care.

LINT static analysis

Use of static analysis

Particularly valuable when a language such as C is used which has weak typing and hence many errors are undetected by the compiler,

Less cost-effective for languages like Java that have strong type checking and can therefore detect many errors during compilation.

Verification and formal methods

Formal methods can be used when a mathematical specification of the system is produced.

They are the ultimate static verification technique.

They involve detailed mathematical analysis of the specification and may develop formal arguments that a program conforms to its mathematical specification.

Arguments for formal methods

Producing a mathematical specification requires a detailed analysis of the requirements and this is likely to uncover errors.

They can detect implementation errors before testing when the program is analysed alongside the specification.

Arguments against formal methods

Require specialised notations that cannot be understood by domain experts.

Very expensive to develop a specification and even more expensive to show that a program meets that specification.

It may be possible to reach the same level of confidence in a program more cheaply using other V & V techniques.

The name is derived from the 'Cleanroom' process in semiconductor fabrication. The philosophy is defect avoidance rather than defect removal.

This software development process is based on:• Incremental development;• Formal specification;• Static verification using correctness arguments;• Statistical testing to determine program reliability.

Cleanroom software development

The Cleanroom process

Constructstructuredprogram

Definesoftware

increments

Formallyverifycode

Integrateincrement

Formallyspecifysystem

Developoperational

profileDesign

statisticaltests

Testintegratedsystem

Error rework

Cleanroom process characteristics

Formal specification using a state transition model.

Incremental development where the customer prioritises increments.

Structured programming - limited control and abstraction constructs are used in the program.

Static verification using rigorous inspections. Statistical testing of the system (covered in

Ch. 24).

Formal specification and inspections

The state based model is a system specification and the inspection process checks the program against this mode.l

The programming approach is defined so that the correspondence between the model and the system is clear.

Mathematical arguments (not proofs) are used to increase confidence in the inspection process.

Specification team. Responsible for developing and maintaining the system specification.

Development team. Responsible for developing and verifying the software. The software is NOT executed or even compiled during this process.

Certification team. Responsible for developing a set of statistical tests to exercise the software after development. Reliability growth models used to determine when reliability is acceptable.

Cleanroom process teams

The results of using the Cleanroom process have been very impressive with few discovered faults in delivered systems.

Independent assessment shows that the process is no more expensive than other approaches.

There were fewer errors than in a 'traditional' development process.

However, the process is not widely used. It is not clear how this approach can be transferred to an environment with less skilled or less motivated software engineers.

Cleanroom process evaluation

Key points

Verification and validation are not the same thing. Verification shows conformance with specification; validation shows that the program meets the customer’s needs.

Test plans should be drawn up to guide the testing process.

Static verification techniques involve examination and analysis of the program for error detection.

Key points

Program inspections are very effective in discovering errors.

Program code in inspections is systematically checked by a small team to locate software faults.

Static analysis tools can discover program anomalies which may be an indication of faults in the code.

The Cleanroom development process depends on incremental development, static verification and statistical testing.

Sommerville, Ammann-Offutt, Mejia, 2009 Software Engineering Slide 1 Verification and Validation.

Documents

Transcript of Sommerville, Ammann-Offutt, Mejia, 2009 Software Engineering Slide 1 Verification and Validation.

Prime Mobile asphalt mixing plant - Ammann · Ammann has now been successfully constructing asphalt mixing plants for over 100 years. At the present time, over 4.000 Ammann mixing

Real Talk with Dr. Offutt Harnessing the Power of Social Media to Improve Teen Health Laura Offutt M.D. Founder and Creator Real Talk with Dr. Offutt,

Sophie Ammann MA Dissertation

Cost / Benefits Arguments for Automation and Coverage Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu.

Ch1 Ian Sommerville

State of Tennessee Department of State Tennessee …...29. Untitled 30. Martin 31. Untitled 32. Untitled 33. Untitled 34. Merrill 35. Offutt 36. Offutt, James 37. Downman (Offutt)

Software Engineering Experimentation Software Engineering Specific Issues (Mostly CS as well) Jeff Offutt offutt

The Model-Driven Test Design Process Based on the book by Paul Ammann & Jeff Offutt offutt/softwaretest/ Jeff Offutt Professor, Software.

10/8/2015© Jeff Offutt, 2001-20071 Menu Design Guidelines Jeff Offutt offutt/ SWE 432 Design and Implementation of Software for.

Offutt Squadron - Apr 2007

Offutt Right Start - Medical

Sommerville (2004) Pulling out the intentional structure ...wexler.free.fr/library/files/sommerville (2004) pulling out the... · In a recent study (Woodward & Sommerville, 2000),

Ammann Whitney Structural Engineering Considerations

Of 33 Improving Logic-Based Testing Jeff Offutt Professor, Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu.

User Interface Overview Jeff Offutt offutt/ SWE 432 Design and Implementation of Software for the Web.

Jan Ammann - Art and Soul Managementpresse.artandsoul.eu/janammann/Ammann, Jan, CV 2017.TXT.EN.pdf · Roman Polanski’s smash hit musical TANZ DER VAMPIRE. In October 2010, Jan Ammann

A Mutation Carol Past, Present and Future Jeff Offutt Software Engineering George Mason University Fairfax, VA USA offutt/offutt@gmu.edu.

A. Jefferson Offutt, PhD

SOMMERVILLE. - Project Euclid

Introduction to Software Testing Chapter 3.1, 3.2 Logic ... · Introduction to Software Testing Chapter 3.1, 3.2 Logic Coverage Paul Ammann & Jeff Offutt