Chapter 10: Testing Lecturer: Dr. Mai Fadel. Basic Definitions A failure is an unacceptable behavior...

Chapter 10: Testing

Lecturer: Dr. Mai Fadel

Basic Definitions

• A failure is an unacceptable behavior exhibited by a system.

– E.g. the production of incorrect outputs, the system running too slowly, or the user having trouble finding online help.

• A defect is a flaw in any aspect of the system including the requirements, the design, and the code, that contributes, or may potentially contribute, to the occurrence of one or more failures. A defect is also known as a fault.

• E.g. in a chat program, forwarding a message to oneself causes the system to hang or crash (an infinite loop); or no documentation or source code comments are provided for a complex algorithm.

Effective and efficient testing

• Testing is the process of deliberately trying to cause failures in a system in order to detect any defects that might be present.
– To test effectively, you must use a strategy that uncovers as many defects as possible.
– To test efficiently, you must find the largest possible number of defects using the fewest possible tests.

• Black-box testing
– Testers treat a system as a black box, i.e. they provide the system with inputs and observe the outputs, but they cannot see what is going on inside.
– They can see neither the code, the internal data, nor any documentation describing the internals of the system.

• Glass-box testing
– An alternative approach is to treat the system as a glass box.
– The tester can examine the design documents and the code, and observe at run time the steps taken by algorithms and their internal data.

• Glass-box testing is more time-consuming, but it removes much of the guesswork. It is usually applied by individuals to test their own code.

• Most of the techniques in this chapter are oriented toward black-box testing.

Equivalence classes: a strategy for choosing what to test

• In order to test efficiently, you should divide the possible inputs into groups that you believe will be treated similarly by reasonable algorithms. Such groups are called equivalence classes.

• A tester needs only to run one test per equivalence class, using a representative member of that class as input.

• The set of equivalence classes for a system as a whole is the set of all possible combinations of the equivalence classes of its individual inputs.

• The total number of equivalence classes for the system is the product of the numbers of classes of the individual inputs.
– E.g. 6 inputs => 2 x 4 x 11 x 3 x 3 x 4 = 3168.

• Combinatorial explosion: the total number of system-wide classes becomes very large.

• You cannot realistically test every possible system-wide equivalence class.

• A reasonable approach:
– Test the equivalence classes for each individual input.
– Then, where possible, test all combinations where one input is likely to affect the interpretation of another.

• In addition to testing a representative value of each equivalence class, you should also test values at the boundaries of the equivalence classes (checking the extreme values).
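The slide's arithmetic and the idea of "one representative per class, plus the boundaries" can be sketched in a few lines of Python. The month-number input and the function `is_valid_month` are hypothetical, chosen only for illustration; they do not come from the lecture.

```python
import math

# Number of equivalence classes for each of the six inputs in the example above.
classes_per_input = [2, 4, 11, 3, 3, 4]
total_system_classes = math.prod(classes_per_input)

# Hypothetical input: a month number, assumed valid from 1 to 12.
def is_valid_month(month: int) -> bool:
    return 1 <= month <= 12

# Three equivalence classes: below the range, inside it, above it.
month_tests = {
    -5: False,  # representative of "below the range"
    6: True,    # representative of "inside the range"
    20: False,  # representative of "above the range"
    0: False,   # boundary: just below the lower limit
    1: True,    # boundary: lower limit
    12: True,   # boundary: upper limit
    13: False,  # boundary: just above the upper limit
}

for value, expected in month_tests.items():
    assert is_valid_month(value) == expected, f"unexpected result for month={value}"
```

Seven tests cover this single input; multiplying such counts across all inputs is what produces the combinatorial explosion described above.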

Detecting specific categories of defects

• The next few sections give a non-exhaustive list of some of the most common categories of defects.

• This will help in designing the appropriate test cases.

• It will help when designing software (avoiding such defects)

Defects in ordinary algorithms

• These defects are common in all types of algorithms.
– They are most often introduced by the designer or the programmer.

• Incorrect logical conditions
– Defect: the logical conditions that govern looping and if-then-else statements are wrongly formulated.
– Sometimes the conditions need complete restructuring.
– More subtle defects include nesting parentheses incorrectly, reversing comparison operators (< becomes >), or mishandling the equality case (<= becomes <).

• Testing strategy: use equivalence class and boundary testing. To compute the equivalence classes consider each variable used in the logical condition as an input.
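As a minimal sketch of why boundary values matter for logical conditions, consider a hypothetical rule "orders of 50 or more ship free". Both functions and the threshold are assumptions made for this example. The second version contains a seeded defect in which the equality case is mishandled.

```python
def free_shipping_correct(total: float) -> bool:
    return total >= 50

def free_shipping_buggy(total: float) -> bool:
    # Seeded defect: the equality case is mishandled (>= became >).
    return total > 50

# One representative from each equivalence class (below / at-or-above the threshold).
representatives = [10, 200]
# Values at and around the boundary between the two classes.
boundaries = [49, 50, 51]

def disagreements(values):
    """Return the inputs for which the buggy version gives the wrong answer."""
    return [v for v in values if free_shipping_correct(v) != free_shipping_buggy(v)]

missed = disagreements(representatives)   # the representatives alone reveal nothing
caught = disagreements(boundaries)        # the boundary value exposes the defect
```

Here `missed` is empty while `caught` contains only the value 50, which is why the strategy pairs equivalence classes with boundary tests.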

Defects in algorithms: performing a calculation in the wrong part of the control construct

• Defect: the program performs an action when it should not, or does not perform an action when it should.
– Typically caused by inappropriately excluding the action from, or including the action in, a loop or if-then-else construct.

• Testing strategy: design tests that execute each loop zero times, exactly once, and more than once.
– Ensure that anything ‘unusual’ that could happen while looping is made to occur on the first and last iteration.
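The "zero times, exactly once, more than once" strategy can be illustrated with a small sketch. The order-total functions and the handling fee are hypothetical; the buggy version wrongly performs an action inside the loop that belongs after it.

```python
HANDLING_FEE = 5  # assumed: charged once per order

def order_total_buggy(prices):
    total = 0
    for price in prices:
        total += price
        total += HANDLING_FEE  # Seeded defect: this belongs after the loop.
    return total

def order_total_fixed(prices):
    total = 0
    for price in prices:
        total += price
    return total + HANDLING_FEE

# Each test makes the loop run zero times, exactly once, or more than once.
loop_tests = [
    ([], 5),             # zero iterations
    ([10], 15),          # exactly one iteration
    ([10, 20, 30], 65),  # more than one iteration
]

def failing_cases(func):
    """Return the inputs for which func does not produce the expected total."""
    return [prices for prices, expected in loop_tests if func(prices) != expected]
```

Note that the single-iteration test passes even for the buggy version; only the zero-iteration and multi-iteration tests reveal the defect, so all three cases are needed.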

Defects in algorithms

• Not terminating a loop or recursion
– Ensure they reach a terminating case.
– Analyze what causes a repetitive action to be stopped, and run test cases that might not be handled correctly.

• Not setting up the correct preconditions for an algorithm
– Run test cases in which each precondition is not satisfied.

• Not handling null conditions (there is normally data to process, but sometimes there is none)
– Determine all possible null conditions and run test cases for them.

• Not handling singleton or non-singleton conditions (there is normally more than one of something, but sometimes only one, or vice versa)
– Brainstorm to determine unusual conditions.

• Off-by-one errors
– Develop boundary tests in which you verify that the program computes the correct numerical answer, or performs the correct number of iterations.
– In graphical applications, study the display to see if objects slightly overlap or have slight gaps.
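A short sketch of tests for null, singleton, and off-by-one conditions follows. Both functions (`largest` and `page_count`) and the page size of 10 are hypothetical examples, not part of the lecture.

```python
def largest(values):
    """Return the largest value, or None for the null condition (no data)."""
    if not values:
        return None
    result = values[0]
    for value in values[1:]:
        if value > result:
            result = value
    return result

def page_count(item_count: int, per_page: int = 10) -> int:
    """Number of pages needed to show item_count items (ceiling division)."""
    return (item_count + per_page - 1) // per_page

# Null and singleton conditions.
assert largest([]) is None      # null condition: nothing to process
assert largest([7]) == 7        # singleton condition: exactly one item
assert largest([3, 9, 4]) == 9  # the usual case

# Boundary tests aimed at off-by-one errors.
assert page_count(0) == 0
assert page_count(1) == 1
assert page_count(10) == 1   # exactly fills one page
assert page_count(11) == 2   # one item spills onto a second page
```

The values 10 and 11 are the interesting ones: a version written as `item_count // per_page + 1` would pass the test for 11 but report two pages for 10.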

Defects in algorithms

• Operator precedence errors
– Apply code inspection.

• Use of inappropriate standard algorithms
– Detecting these requires knowledge of the properties of algorithms, and tests designed accordingly.
– Examples of bad choices:
– Inefficient sort algorithm, e.g. ‘bubble sort’: increase the number of items to be sorted and observe the execution time. If doubling the number of items makes the sort take four times as long, the algorithm is inefficient.

– Inefficient search algorithm: ensure that the search time does not increase unexpectedly. Check if the search time is affected by the position of the item in the list.

– A search or sort that is case sensitive when it should not be, or vice versa: check with mixed case data.
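The checks above can be sketched as follows. Because wall-clock timings vary from run to run, this sketch counts comparisons instead of measuring execution time; that substitution is an assumption made to keep the example repeatable. The bubble sort shown is a plain textbook version written for this illustration.

```python
def bubble_sort_comparisons(items):
    """Bubble-sort a copy of items; return (sorted_list, number_of_comparisons)."""
    data = list(items)
    comparisons = 0
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons

# Double the input size and compare the amount of work done.
sorted_small, work_small = bubble_sort_comparisons(range(100, 0, -1))
sorted_large, work_large = bubble_sort_comparisons(range(200, 0, -1))
growth = work_large / work_small  # roughly 4 => quadratic behavior

# Mixed-case data reveals whether a sort is case sensitive.
words = ["apple", "Banana", "cherry"]
case_sensitive = sorted(words)                   # uppercase letters sort first
case_insensitive = sorted(words, key=str.lower)  # alphabetical regardless of case
```

A growth factor near 4 when the input doubles is the signature of a quadratic algorithm; an efficient sort would show a factor only slightly above 2.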

Defects in numerical algorithms

• Numerical computation defects are a special class of algorithmic defects. They can occur in any software that performs mathematical calculations (e.g. involving floating-point values).

• Not using enough bits or digits to store maximum values

• Using insufficient places after the decimal point or too few significant figures => the system may round excessively => data is stored inaccurately, or errors build up.

Defects in numerical algorithms

• Ordering operations poorly so that errors build up
– Occurs when you do small operations on large floating-point numbers, e.g. if you add 1 to or subtract 1 from 3.54 x 10^28, the answer will be equal to the large number. The defect occurs when the programmer intends that the large number should be modified (not enough significant figures).

• Assuming a floating-point value will be exactly equal to some other value
– E.g. in a loop condition: incorrect -> d != 10.0; correct -> d < 10.0.
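Both floating-point defects can be reproduced in a few lines of Python, whose `float` is a 64-bit double. The step size of 0.1, the target of 1.0 (rather than the slide's 10.0), and the helper `count_steps` with its safety limit are assumptions chosen to keep the sketch small.

```python
import math

# A small operation on a large floating-point number is lost entirely.
big = 3.54e28
unchanged = (big + 1 == big)

# Repeatedly adding 0.1 does not land exactly on 1.0.
d = 0.0
for _ in range(10):
    d += 0.1
exactly_equal = (d == 1.0)           # relies on exact equality
close_enough = math.isclose(d, 1.0)  # compares within a tolerance instead

def count_steps(keep_going, limit=1000):
    """Add 0.1 while keep_going(d) holds; the limit guards against endless loops."""
    value, steps = 0.0, 0
    while keep_going(value) and steps < limit:
        value += 0.1
        steps += 1
    return steps

steps_not_equal = count_steps(lambda value: value != 1.0)  # never sees exactly 1.0
steps_less_than = count_steps(lambda value: value < 1.0)   # stops once 1.0 is passed
```

With the `!=` condition the loop only stops because of the safety limit, whereas the `<` condition terminates on its own.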

Defects in timing and co-ordination: deadlocks, livelocks, and critical races

• Timing and co-ordination defects arise in situations involving some form of concurrency.
– Several threads or processes interact in inappropriate ways.

• A deadlock is a situation in which two or more threads or processes are stopped, each waiting for another to do something before it can proceed. Since none of them can do anything, they permanently stop each other from proceeding.

• A livelock is similar to a deadlock, in the sense that the system is stuck in a particular behavior that it cannot get out of. The difference is that in a deadlock the system is normally hung, with nothing going on, whereas in a livelock the system can do some computation, but it can never get out of a limited set of states.

• A critical race is a defect in which one thread or process can sometimes experience a failure because another thread or process interferes with the ‘normal’ sequence of events. The defect is not the interference of the other thread, but that the system allows interference to occur.
– E.g. two threads are working to achieve some outcome; if one is sped up or slowed down, the result will be different (for example, both set and get the same data).
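Real races are hard to reproduce on demand, so the following sketch simulates one deterministically instead of using actual threads. The two workers, the shared counter, and the `run` function are all hypothetical. Each worker increments the counter in two steps, a read followed by a write, and the schedule decides how those steps interleave.

```python
def run(schedule):
    """Replay a schedule of (worker, step) pairs against a shared counter."""
    shared = {"counter": 0}
    local = {}  # each worker's private copy of the value it last read
    for worker, step in schedule:
        if step == "read":
            local[worker] = shared["counter"]
        else:  # "write": store the privately incremented value back
            shared["counter"] = local[worker] + 1
    return shared["counter"]

# Worker A finishes before worker B starts: both increments are kept.
sequential = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

# Worker B reads before A has written: one increment is silently lost.
interleaved = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]

result_sequential = run(sequential)
result_interleaved = run(interleaved)
```

The same code yields different results depending only on timing, which is the essence of a critical race. In real multithreaded code the usual remedy is to make the read and write a single protected step, for example with a lock.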

Defects in handling stress and unusual situations

• These are encountered only when a system is being heavily used, or forced to its limits in some other way.

• They represent a lack of robustness.

• To test for such defects you must run the system intensively: supply it with a very large amount of input, run many copies of it, or run it on a computer or network that is busy running other systems.

• Insufficient throughput or response time on minimal configurations

• Incompatibility with specific configurations of hardware or software

• Defects in handling peak loads or missing resources

• Inappropriate management of resources

• Defects in the process of recovering from a crash

Writing formal test cases and test plans

• A test case is an explicit set of instructions designed to detect a particular class of defect in a software system, by bringing about a failure.

• A test plan is a document that contains a complete set of test cases for a system, along with other information about the testing process. (a standard type of documentation)

• No test plan => ad-hoc testing => poor-quality software.

• Testing can start long before the testing phase, for example by writing test cases for use cases during requirements analysis.

Information to include in a formal test case

• Each test case should have the following information:
– Identification and classification. A test id, a descriptive title that indicates its purpose, the part of the system being tested, and the importance of the test case.
– Instructions. Tell the tester exactly what to do: how to put the system in the required initial state and what input to provide.
– Expected results.
– Cleanup (when needed). Tells the tester how to make the system go ‘back to normal’ or shut down after the test, e.g. when erroneous data has been added to the database.

• Test cases can be organized into groups or tables (common instructions can be given once for the group rather than for each test case).

• It is common to completely automate the testing process. Each test case may become a method that throws an exception if the test fails. Each test case would need to report an identification of what failed.

Levels of importance of test cases

• Classify test cases according to their importance; the most important should be executed first.

• Level 1: first-pass critical test cases. These are designed to verify that the system runs and is safe.

• Level 2: general test cases. These verify that the system performs its day-to-day functions correctly => ‘success’.
– A failure here is important to fix, but may still permit testing other aspects of the system.

• Level 3: test cases of lesser importance, e.g. testing cosmetic aspects (a button becomes greyed-out when it is not needed).

Strategies for testing large systems

• There are several strategies that can be used to test an entire system that has many subsystems and thousands of test cases.

Integration testing: big bang versus incremental approaches

• Integration testing is testing how the parts of a system work together.

• Unit testing is testing an individual module or component in isolation.

• Big bang testing (integration): you take the entire integrated system and test it all at once.
– Satisfactory, but when a failure occurs it may be hard to tell in which subsystem the defect lies.

• Incremental testing (integration): you first test each individual subsystem in isolation, and then continue testing as you integrate more and more subsystems.

• Incremental testing can be performed horizontally or vertically depending on the architecture of the system.

• Horizontal testing can be used when the system is divided into separate sub-applications; you simply test each sub-application in isolation.

Vertical incremental testing

• Can be top-down, bottom-up, or sandwich.

• Top-down testing: you start by testing only the user interface, with the underlying functionality simulated by stubs.
– Then you work downwards, integrating lower and lower layers, each time creating stubs for the layers that remain un-integrated.
– A stub is a piece of code that has the same interface (API) as the lower-level layers, but which does not perform any real computations or manipulate any real data.
– Any call to a stub will typically return a default value.
– If a failure occurs, you can be reasonably confident that the defect is in the layer that calls the stubs.
– Drawback: you must write the code for the stubs, although there are automated tools for doing so.
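A minimal sketch of top-down testing with a stub follows. The class names (`BalanceScreen`, `AccountStoreStub`), the method `get_balance`, and the default value of 100 are hypothetical, invented for this illustration.

```python
class AccountStoreStub:
    """Stub with the same interface as the (not yet integrated) data layer."""
    def get_balance(self, account_id):
        # No real data access: every call returns the same default value.
        return 100

class BalanceScreen:
    """User-interface layer under test; it only talks to the store's interface."""
    def __init__(self, store):
        self.store = store

    def render(self, account_id):
        balance = self.store.get_balance(account_id)
        return f"Account {account_id}: balance {balance}"

# Test the top layer alone, with the lower layer simulated by the stub.
screen = BalanceScreen(AccountStoreStub())
output = screen.render("A-1")
```

Because the stub cannot be wrong in any interesting way, an unexpected `output` points to the user-interface layer. When the real data layer is integrated later, it replaces the stub without any change to `BalanceScreen`.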

Vertical incremental testing

• Bottom-up testing: you start by testing the very lowest levels of the software.
– This might include a database layer, a network layer, or a layer that performs some algorithmic computations.
– You need drivers to test the lower layers. A driver is a simple program designed specifically for testing; it makes calls to the lower layers.
– A test harness is a driver that fully automates the testing of the lower layers.

• Sandwich testing (mixed testing) is a hybrid of bottom-up and top-down testing: you test the user interface using stubs, and test the very lowest-level functions using drivers.
– The middle layer remains, on which you perform the final set of tests.
– This is the most effective approach.
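A driver for bottom-up testing might look like the sketch below. The lowest-layer function `parse_amount` (converting a decimal string to cents) and its test data are hypothetical. Because the driver runs all its cases and collects the failures without human intervention, it also acts as a simple test harness.

```python
def parse_amount(text):
    """Hypothetical lowest-layer function: convert '12.50' to 1250 cents."""
    units, _, cents = text.partition(".")
    # Pad or truncate the fractional part to exactly two digits.
    return int(units) * 100 + int((cents + "00")[:2])

def driver():
    """Call the lower layer with known inputs and collect any mismatches."""
    cases = [
        ("12.50", 1250),
        ("3", 300),      # no fractional part
        ("0.05", 5),     # leading zero in the cents
        ("7.5", 750),    # single fractional digit
    ]
    failures = []
    for text, expected in cases:
        actual = parse_amount(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures

failures = driver()
```

An empty `failures` list means every case passed; otherwise each entry records the input, the expected value, and the actual value.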

Deciding when to stop testing

• Not a practical approach: going on re-testing the software until all the test cases have passed.

• It is a poor strategy to stop testing merely because you have run out of time or money => poor-quality system.

• Establish a set of criteria to decide when testing should be complete:
– All the level 1 test cases must have been successfully executed.
– Certain predefined percentages of level 2 and 3 test cases must have been executed successfully.
– The targets must have been achieved and then maintained for at least two cycles of ‘builds’. A build involves compiling and integrating all the components of the software, incorporating any changes since the last build.

Testing performed by users and clients: alpha, beta and acceptance

• Testing is first performed by software engineers in the development organization.

• The testing process normally also involves users.
– Users test versions of the system that are almost ready to be put into production.
– This starts once the developers believe the software has reached a sufficient level of quality.

• Alpha testing is testing performed by users and clients (invited users), under the supervision of the software development team.

• Beta testing is testing performed by the user or client in their normal work environment.
– Users know that the software will contain more defects, but they have access to its functionality before others do.
– Beta testers are responsible for reporting problems when they discover them.

• Acceptance testing is performed by users and clients to decide whether the software is of sufficient quality to purchase.
