
www.ischool.drexel.edu

INFO 631 Prof. Glenn Booker

Week 4 – Testing and Object-Oriented Metrics


Software Testing Overview

• Software testing is the process of executing a software system to determine if it matches its specification and executes correctly in its intended environment

• Requires running an executable program with real or simulated inputs to determine its response

Software Testing Overview

• Incorrect software behavior leads to a failure
– Produces incorrect outputs
– Code executes too slowly
– Code uses too much memory

• Failures are caused by faults

Testing Prerequisites

• Code reviews – uncompiled source code is read and analyzed statically
– May use formal methods to provide proof of correctness (Z, Larch)
• Specification – defines correct behavior so that incorrect behavior is easier to identify
– If you don’t know what it should do, when is it broken?

Software Testing by Phase

• Unit testing tests individual software components (modules)
– To test, need to define the input domain for the unit in question and ignore the rest of the system
• Integration (or component) testing tests combinations of units that have each passed previous unit testing
– Focus is on a larger subset of the domain, representing communication between units


Software Testing by Phase

• Regression testing tests code containing bug fixes or new features to ensure original functions still work (may be used during development and maintenance)

• System testing tests a collection of components which are regarded as a deliverable product

Testing S Curve

• A common method for tracking test cases is to count the total number of test cases planned over time
– Often looks like an S-shaped curve

[Chart: number of test cases vs. time]

Testing S Curve

• Then plot the number of test cases attempted, and the number successfully completed, over the same time period

• Main purpose is to help estimate test completion


Defect Arrival

• As a project progresses, defects are identified in documents, code, etc.

• A typical mechanism for tracking these defects is a Problem Tracking Report (PTR)

• The rate of testing, and quality of the code, determine the rate at which new PTRs will be discovered

Defect Arrival

• Generally want more PTRs found early in testing, so there are few defects found near an intended release or product shipment
– Can track trends in PTR discovery across several releases
– Also key to track the backlog of PTRs remaining to be fixed


Product Size

• Product size is critical to estimation of development effort and schedule, making it the most common basic measurement of the product

• Can track the number of lines of code released over time (so each key event is a release date and its size)

Other Testing Measures

• Measures during testing could also include
– Percent of CPU utilization
– Number of system crashes and hangs per week
– Average time between crashes or hangs (the text refers to unplanned IPL – Initial Program Load, i.e. rebooting)

Severity of Problems

• As problems are identified through PTRs, their severity or criticality is important to assess
– Resolving PTRs which cause complete system failure or severe trouble can preempt all other activity

– Release criteria can include specific numbers of PTRs by severity


Testing Process Measurement

• Often helps to use calendar time to measure process-related metrics, instead of life cycle phase

• Time-based measurements are best presented relative to the release date, to help focus attention on the goal

• Qualify measurements as good/bad or red/yellow/green

Defect Cause Classification

• Look for patterns of defect arrival, and what kind of defects are identified
– Patterns such as by testing phase (unit, component, regression, or system test)
– Kind of defect, such as per the ODC or some other classification scheme

Measures That Work

• No set of testing measures is “the” correct set, so monitor the effectiveness of your measures in providing meaningful information

• Change measures slowly, to allow for significant data to be collected for each set

Measures That Work

• Likewise, specific measures and goals for them need to be defined:
– In order to evaluate a vendor-developed product, or
– Determine when a release is ready to go

Testing by Test Case Type

• Testing may be done for structural and/or functional test cases
– Structural testing focuses on correctness of the code structure
– Functional testing focuses on the software meeting its requirements
• Both types are generally used; some more in certain life cycle phases

Structural Testing

• Structural testing – inputs based solely on structure of the source code or its data structures
– Also called “code based testing” or “white box testing”
– Structural testing may focus on
• Control flow – which module is currently being run?
• Data flow – how software moves data from one location to another

Structural Test Cases

• Criteria (measures) for structural testing completeness
– Has testing covered all common programming errors?
– Has all source code been exercised?
• Statements, branches, independent paths
– Has all internal data been initialized and used?

Structural Test Cases

• Example: Control flow based
– Draw control flow graph for code
– Determine its cyclomatic complexity
– Determine basis set of linearly independent paths (also used for regression)
– Prepare test cases that will force execution of each path in basis set
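Those steps can be illustrated with a small worked example. The control flow graph below is hypothetical, and the function uses the standard formula V(G) = E - N + 2 for a single connected graph:

```python
def cyclomatic_complexity(edges, nodes):
    """V(G) = E - N + 2 for one connected control flow graph."""
    return len(edges) - len(nodes) + 2

# Hypothetical control flow graph: an if/else decision inside a loop.
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (5, 2), (5, 6)]

v = cyclomatic_complexity(edges, nodes)
print(v)  # 3
```

The result is the number of linearly independent paths the basis set must contain, and hence the minimum number of test cases for this criterion.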

Predicate Testing (Structural)

• What can go wrong with predicates:
– Boolean operator error (not, and, or)
– Boolean variable error (wrong variable)
– Boolean parentheses error (unmatched)
– Relational operator error (<, <=, etc.)
– Arithmetic expression error (+, -, etc.)

Predicate Testing (Structural)

• If (cond1 or cond2) then …
– Cover all possible T/F conditions; e.g. if there are two T/F conditions, check:
• cond1 true, cond2 true
• cond1 true, cond2 false
• cond1 false, cond2 true
• cond1 false, cond2 false
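That truth-table enumeration can be generated mechanically for any number of conditions; a minimal Python sketch, where the predicate `cond1 or cond2` is the one from the slide:

```python
from itertools import product

def condition_combos(n):
    """All True/False assignments for n boolean conditions."""
    return list(product([True, False], repeat=n))

# Two conditions -> the four test cases listed above.
for cond1, cond2 in condition_combos(2):
    print(cond1, cond2, "->", cond1 or cond2)
```

For n conditions this produces 2**n cases, which is why full condition coverage is usually only practical for small predicates.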

Loop Testing (Structural)

• Simple loop: Let N be the maximum number of allowable passes through the loop
• Define test cases which will:
• Skip the loop entirely
• One pass through the loop
• Two passes through the loop
• Some typical middle number of passes through the loop
• N-1 passes through the loop
• N passes through the loop
• N+1 passes through the loop

Functional Testing

• Functional testing – test cases are selected without regard to source code structure
– Based on attributes of the specification or operational environment
– Also called “black box testing,” “behavioral testing,” or “specification based testing”

Functional Test Selection

• Criteria for functional testing completeness:
– Have ways that software can fail been thought through?
• Have test cases been selected that show it doesn’t fail?
• Have all of the inputs been applied?
• Have all possible software states been explored?
• Have all user scenarios been run?

Functional Test Selection

• Design tests to answer the following questions:
– How is functional validity tested?
– What classes of inputs will make good test cases?
– Is the system sensitive to certain input values?
– How are boundaries of a data class isolated?


Functional Test Selection

– What data rates and data volume can the system tolerate?

– What effect will specific combinations of data have on system operation?

Unit Testing Strategy

• Focus testing effort on small unit of design
– Module consists of a Function, Subroutine, Subprogram, Class, or Package
• Often unit testing is white box (structural) oriented

Unit Testing Strategy

• Develop test cases to check
– Interfaces
– Boundary conditions
– Local data structures
– Independent paths
– Error handling paths

• Measure percent of each covered during testing

Unit Testing Procedures

• Module is generally invoked by or invokes other modules
– Calls or is called by subprograms
– Uses or is used by classes, packages

[Diagram: test cases feed a driver, which invokes the module under test; the module calls Stub1, Stub2, …]

The Idea of Stubs

[Diagram: one test module calls a stub that displays a trace message; another passes data to a stub that displays the passed parameter]

Stubs are the start of a module, used as a placeholder

The Idea of Stubs

[Diagram: one test module calls a stub that returns a value from a table or external file; another calls a stub that does a table search for the input parameter and returns the associated output parameter]
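A minimal sketch of that last stub variant, where a table lookup stands in for the real module; the table contents and names are hypothetical:

```python
# Canned input -> output pairs standing in for the real module's behavior.
LOOKUP = {"user-1": "alice", "user-2": "bob"}

def get_username_stub(user_id):
    """Stub: search the table for the input parameter and return the
    associated output parameter, instead of calling the real code."""
    return LOOKUP.get(user_id, "<unknown>")

print(get_username_stub("user-1"))  # alice
print(get_username_stub("user-9"))  # <unknown>
```

The module under test calls `get_username_stub` exactly as it would call the real routine, so it can be exercised before its subordinates exist.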

Integration Testing – Top Down Integration

• Typical Steps (see next slide)
– Main module (m1) used as test driver
– Use stubs for subordinates (m2, m3, m4)
– Replace subordinate stubs one at a time with actual modules
– Tests are conducted as each module is integrated
– Continue to replace stubs with modules and test
– Regression test to ensure no new errors introduced

Integration Testing – Top Down Integration

[Diagram: hierarchy of code modules m1 through m8, with main module m1 at the top and subordinates m2, m3, and m4 directly beneath it]

Integration Testing – Bottom Up Integration

• Typical Steps (see next slide)
– Low level modules are combined into builds (clusters)
– Drivers (higher level stubs) are written to control test case I/O (input and output)
– Cluster is tested
– Drivers are removed, clusters are combined moving upward

Integration Testing – Bottom Up Integration

[Diagram: four driver/test-module pairs – A: driver invokes a subordinate; B: driver sends a parameter from a table or external file; C: driver displays a parameter; D: a combination of drivers B and C]

First test A, then B & C separately, then integrate into D

Regression Testing

• Regression testing occurs when
– Software faults are repaired, and/or
– Software is enhanced
• Given some version N of software
– A collection of faults to be repaired is bundled with a collection of enhancements to be made
– This modified software constitutes version N+1

Regression Testing

• How much retesting of the N+1 version is required to determine that faults were repaired and no new faults were added in the enhanced portions of the software?
• For faults which are repaired, four outcomes are possible:
– Fix the fault that was reported
– Fail to fix the fault
– Fix the fault, but inject a new fault
– Fail to fix the fault, and inject a new fault

Adequacy of Test Coverage

• Fault seeding – insert faults intentionally into the code
– Based on capture/recapture models in biology
– Apply test cases, discover what fraction of seeded faults are discovered
– Assume unseeded faults are found in the same proportion
– Given the number of seeded faults vs. the seeded faults uncovered, and the number of unseeded faults found, can then estimate the number of unseeded faults remaining
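The estimate is simple capture/recapture arithmetic; the counts below are made up for illustration:

```python
seeded = 100        # faults intentionally inserted into the code
seeded_found = 50   # seeded faults the test cases uncovered
native_found = 40   # unseeded ("native") faults found by the same tests

# Assume native faults are detected at the same rate as seeded ones.
detection_rate = seeded_found / seeded            # 0.5
native_total = native_found / detection_rate      # estimated native faults
native_remaining = native_total - native_found    # estimated still in code
print(native_total, native_remaining)  # 80.0 40.0
```

Here half the seeded faults were found, so the 40 native faults found are assumed to be half of roughly 80, leaving about 40 undiscovered.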

Object Oriented Metrics

• Many of the traditional software metrics can also be applied to object-oriented software
– Lines of code
– Cyclomatic complexity
– Percent of comments

• In addition, several sets of metrics specifically for OO development exist

OO Counting Metrics

• Often counting basic characteristics is helpful, such as
– Number of classes
– Average number of LOC per class
– Average number of methods per class
– Average number of LOC per method
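These counts are straightforward to compute once per-class data is collected; a sketch with hypothetical numbers:

```python
# Hypothetical per-class measurements: lines of code and method counts.
classes = {
    "Polygon":  {"loc": 120, "methods": 4},
    "Triangle": {"loc": 90,  "methods": 3},
    "Circle":   {"loc": 60,  "methods": 2},
}

n = len(classes)
total_loc = sum(c["loc"] for c in classes.values())          # 270
total_methods = sum(c["methods"] for c in classes.values())  # 9

print(n)                          # number of classes: 3
print(total_loc / n)              # average LOC per class: 90.0
print(total_methods / n)          # average methods per class: 3.0
print(total_loc / total_methods)  # average LOC per method: 30.0
```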

OO Productivity

• Productivity during object oriented development is measured much like any other code creation process
– Typical measures are LOC/hour, function points per person-month (PM), or number of classes per person-year (PY)
– Any kind of size measure per unit time could be used


OO Quality

• Quality measures can include the number of defects per class, but defects per KLOC are much more commonly accepted

• Quality measures are typically used to identify problem areas of code, much like procedural quality measures

Object Oriented Metrics

• The six “CK” metrics are a common foundation for OO measurement
– Weighted methods per class (WMC)
– Response for a class (RFC)
– Lack of cohesion of methods (LCOM)
– Coupling between objects (CBO)
– Depth of inheritance tree (DIT)
– Number of children (NOC)

Sample Class Diagram

[Class diagram: Convex Set, with method perimeter(), is the root; Polygon, with method area(), and Ellipse inherit from it; Triangle and Quadrilateral inherit from Polygon; Circle inherits from Ellipse; Scalene, Isosceles, and Equilateral inherit from Triangle]

Weighted methods per class (WMC)

• WMC is defined for each class
• WMC could be defined two ways
– The number of methods (messages) for each class
• E.g. class Triangle has WMC=2 (for methods area and perimeter), but Quadrilateral only has WMC=1 (perimeter)
– The sum of complexities for all methods for each class
• Code specific

Response for a class (RFC)

• RFC is the sum of how many ways all methods may be used by the class to implement all messages
– Class Polygon has five objects which may invoke perimeter differently
– Class Triangle has three objects which may invoke area differently
– Hence RFC for Polygon is 5+3=8

Lack of cohesion of methods (LCOM)

• Measures how isolated classes are by how often they use the same attributes
– One method:
• Find the percent of methods which use a given attribute
• Average those percentages for the class
• Subtract from 100%
– Low resulting value means higher cohesion (usually a good thing)
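A sketch of that first LCOM calculation; the class layout (which attributes each method uses) is hypothetical:

```python
# For each attribute: fraction of methods using it; average those
# fractions over all attributes; subtract from 1 (i.e. from 100%).
methods = {
    "area":      {"base", "height"},
    "perimeter": {"base", "height", "side"},
    "describe":  {"name"},
}
attributes = sorted({"base", "height", "side", "name"})

usage = [
    sum(1 for used in methods.values() if attr in used) / len(methods)
    for attr in attributes
]
lcom = 1.0 - sum(usage) / len(usage)  # low LCOM -> high cohesion
print(round(lcom, 2))  # 0.5
```

Here `base` and `height` are each used by 2 of 3 methods, while `side` and `name` are used by 1 of 3, so the average usage is 50% and LCOM is 0.5.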

Lack of cohesion of methods (LCOM)

– Another method:
• Look for pairs of methods which share an attribute
• Count how many different groups of pairs were found
– Low number means less cohesion (bad)

Coupling between objects (CBO)

• Coupling between object classes arises from
– Passing messages between objects
– Methods in one object using methods or attributes from the other class
– Inheritance
• CBO is how many classes are coupled to a class, other than by inheritance; prefer low coupling

Depth of inheritance tree (DIT)

• DIT is how many classes lie between the current class and the root of the inheritance tree
– Scalene class has DIT=3 (Triangle, Polygon, and Convex Set)
• Higher DIT lowers maintainability
• A related metric is the number of methods inherited (NMI)

Number of children (NOC)

• NOC is the number of classes immediately below the current class
– Triangle has NOC=3 (Scalene, Isosceles, and Equilateral)
– Convex Set has NOC=2


Object Oriented Metrics

• Skill requirements during iterative OO development (p. 353) show that experts are needed early on, while beginners should only participate in later iterations