White Box Testing


White-box testing

From Wikipedia, the free encyclopedia

White-box testing (a.k.a. clear box testing, glass box testing, transparent box testing, or structural testing) is a

method of testing software that tests internal structures or workings of an application, as opposed to its

functionality (i.e. black-box testing). In white-box testing an internal perspective of the system, as well as

programming skills, are required and used to design test cases. The tester chooses inputs to exercise paths

through the code and determine the appropriate outputs. This is analogous to testing nodes in a circuit, e.g. in-

circuit testing (ICT).

While white-box testing can be applied at the unit, integration and system levels of the software

testing process, it is usually done at the unit level. It can test paths within a unit, paths between units during

integration, and between subsystems during a system level test. Though this method of test design can

uncover many errors or problems, it might not detect unimplemented parts of the specification or missing

requirements.

White-box test design techniques include:

Control flow testing

Data flow testing

Branch testing

Path testing
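As a minimal sketch of branch testing, the example below picks one input per branch by reading the code rather than a specification. The function classify and its inputs are hypothetical, chosen only for illustration.

```python
def classify(n):
    """Hypothetical unit under test with three branches."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    return "positive"


# One input per branch, selected by inspecting the conditions above.
branch_cases = {-5: "negative", 0: "zero", 7: "positive"}

# Run every case and collect what the unit actually returned.
results = {value: classify(value) for value in branch_cases}
```

Each entry in branch_cases exercises a different branch, so comparing results with branch_cases shows whether every branch behaves as expected. As the text notes, such a test says nothing about requirements that were never implemented.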

Hacking

In penetration testing, white-box testing refers to a methodology where an ethical hacker has full knowledge of

the system being attacked. The goal of a white-box penetration test is to simulate a malicious insider who has

some knowledge and possibly basic credentials to the target system.

Compare with black-box testing.

Control flow

From Wikipedia, the free encyclopedia

Not to be confused with Flow control.

In computer science, control flow (or alternatively, flow of control) refers to the order in which the

individual statements, instructions, or function calls of an imperative or a declarative program are executed or

evaluated.


Within an imperative programming language, a control flow statement is a statement whose execution results

in a choice being made as to which of two or more paths should be followed. For non-strict functional

languages, functions and language constructs exist to achieve the same result, but they are not necessarily

called control flow statements.

The kinds of control flow statements supported by different languages vary, but can be categorized by their

effect:

continuation at a different statement (unconditional branch or jump),

executing a set of statements only if some condition is met (choice - i.e. conditional branch),

executing a set of statements zero or more times, until some condition is met (i.e. loop - the same

as conditional branch),

executing a set of distant statements, after which the flow of control usually returns

(subroutines, coroutines, and continuations),

stopping the program, preventing any further execution (unconditional halt).

Interrupts and signals are low-level mechanisms that can alter the flow of control in a way similar to a

subroutine, but usually occur as a response to some external stimulus or event (that can

occur asynchronously), rather than execution of an 'in-line' control flow statement. Self-modifying code can also

be used to affect control flow through its side effects, but usually does not involve an explicit control flow

statement (an exception being the ALTER verb in COBOL).

At the level of machine or assembly language, control flow instructions usually work by altering the program

counter. For some CPUs the only control flow instructions available are conditional or

unconditional branches (sometimes called jumps).

Primitives

Labels

Main article: Label (programming language)

A label is an explicit name or number assigned to a fixed position within the source code, and which may be

referenced by control flow statements appearing elsewhere in the source code. Other than marking a position

within the source code a label has no effect.

Line numbers are an alternative to a named label (and used in some languages such as Fortran and BASIC),

that are whole numbers placed at the beginning of each line of text within the source code. Languages which

use these often impose the constraint that the line numbers must increase in value in each subsequent line, but

may not require that they be consecutive. For example, in BASIC:

10 LET X = 3

20 PRINT X


In other languages such as C and Ada a label is an identifier, usually appearing at the beginning of a line and

immediately followed by a colon. For example, in C:

Success: printf ("The operation was successful.\n");

The Algol 60 language allowed both whole numbers and identifiers as labels (both attached by colons to the

following statement), but few if any other variants of Algol allowed whole numbers.

Goto

Main article: GOTO

The goto statement (a combination of the English words go and to, and pronounced accordingly) is the most

basic form of unconditional transfer of control.

Although the keyword may either be in upper or lower case depending on the language, it is usually written as:

goto label

The effect of a goto statement is to cause the next statement to be executed to be the statement appearing at

(or immediately after) the indicated label.

Goto statements have been considered harmful by many computer scientists, notably Dijkstra.

Subroutines

Main article: Subroutine

The terminology for subroutines varies; they may alternatively be known as routines, procedures, functions

(especially if they return results) or methods (especially if they belong to classes or type classes).

In the 1950s, computer memories were very small by current standards, so subroutines were used primarily

to reduce program size; a piece of code was written once and then used many times from various other

places in the program.

Nowadays, subroutines are more frequently used to help make a program more structured, e.g. by isolating

some particular algorithm or hiding some particular data access method. If many programmers are working on

a single program, subroutines are one kind of modularity that can help split up the work.

Minimal structured control flow

See also: Structured program theorem

In May 1966, Böhm and Jacopini published an article[1] in Communications of the ACM which showed that any

program with gotos could be transformed into a goto-free form involving only choice (IF THEN ELSE) and

loops (WHILE condition DO xxx), possibly with duplicated code and/or the addition of Boolean variables


(true/false flags). Later authors have shown that choice can be replaced by loops (and yet more Boolean

variables).
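The Boolean-flag technique can be sketched in Python. The hypothetical function find_index below performs a linear search using only a while loop, an if/else and a flag, where an unstructured version would jump out of the loop as soon as the item is found. It is an illustration of the idea, not the construction used in the Böhm and Jacopini article.

```python
def find_index(items, target):
    """Linear search written with only a loop, a choice, and a Boolean flag."""
    index = 0
    found = False  # flag that stands in for a jump out of the loop
    while index < len(items) and not found:
        if items[index] == target:
            found = True
        else:
            index += 1
    # Report the position, or -1 when the loop ran off the end of the list.
    return index if found else -1
```

The extra variable found is the price paid for avoiding the jump, which matches the remark above about the addition of Boolean variables.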

The fact that such minimalism is possible does not necessarily mean that it is desirable; after all, computers

theoretically only need one machine instruction (subtract one number from another and branch if the result is

negative), but practical computers have dozens or even hundreds of machine instructions.

What Böhm and Jacopini's article showed was that all programs could be goto-free. Other research showed

that control structures with one entry and one exit were much easier to understand than any other form,

primarily because they could be used anywhere as a statement without disrupting the control flow. In other

words, they were composable. (Later developments, such as non-strict programming languages - and more

recently, composable software transactions - have continued this line of thought, making components of

programs even more freely composable.)

Control structures in practice

Most programming languages with control structures have an initial keyword which indicates the type of control

structure involved. Languages then divide as to whether or not control structures have a final keyword.

No final keyword: Algol 60, C, C++, Haskell, Java, Pascal, Perl, PHP, PL/I, Python, PowerShell. Such

languages need some way of grouping statements together:

Algol 60 and Pascal : begin ... end

C, C++, Java, Perl, PHP, and PowerShell: curly brackets { ... }

PL/I: DO ... END

Python: uses indentation level (see Off-side rule)

Haskell: either indentation level or curly brackets can be used, and they can be freely mixed

Final keyword: Ada, Algol 68, Modula-2, Fortran 77, Mythryl, Visual Basic. The forms of the final keyword

vary:

Ada: final keyword is end + space + initial keyword e.g. if ... end if, loop ... end loop

Algol 68, Mythryl: initial keyword spelled backwards e.g. if ... fi, case ... esac

Fortran 77: final keyword is end + initial keyword e.g. IF ... ENDIF, DO ... ENDDO

Modula-2: same final keyword END for everything

Visual Basic: every control structure has its own keyword. If ... End If; For ... Next; Do ... Loop

Choice

Main article: Conditional (programming)

Loops


"Program loop" redirects here. For a specific type of loop that listens to and dispatches messages, see Event

loop. For other uses, see Loop.

A loop is a sequence of statements which is specified once but which may be carried out several times in

succession. The code "inside" the loop (the body of the loop, shown below as xxx) is obeyed a specified

number of times, or once for each of a collection of items, or until some condition is met.

In functional programming languages, such as Haskell and Scheme, loops can be expressed by

using recursion or fixed point iteration rather than explicit looping constructs. Tail recursion is a special case of

recursion which can be easily transformed to iteration.
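The transformation from tail recursion to iteration can be sketched as follows. Both functions are hypothetical illustrations that sum the integers from 1 to n. Python itself does not eliminate tail calls, so the recursive form here is limited by the interpreter's recursion depth.

```python
def sum_to_recursive(n, acc=0):
    """Tail-recursive form: the recursive call is the last action taken."""
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)


def sum_to_iterative(n):
    """Same computation: the parameters become variables updated in a loop."""
    acc = 0
    while n != 0:
        acc, n = acc + n, n - 1
    return acc
```

The loop body performs exactly the parameter update that the recursive call would have made, which is why the rewrite is mechanical.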

Count-controlled loops

Main article: For loop

See also: Loop counter

Most programming languages have constructions for repeating a loop a certain number of times. Note that if N

is less than 1 in these examples then the language may specify that the body is skipped completely, or that the

body is executed just once with N = 1. In most cases counting can go downwards instead of upwards and step

sizes other than 1 can be used.

FOR I = 1 TO N            for I := 1 to N do begin
    xxx                       xxx
NEXT I                    end;

DO I = 1,N                for ( I=1; I<=N; ++I ) {
    xxx                       xxx
END DO                    }

In many programming languages, only integers can be reliably used in a count-controlled loop. Floating-point

numbers are represented imprecisely due to hardware constraints, so a loop such as

for X := 0.1 step 0.1 to 1.0 do

might be repeated 9 or 10 times, depending on rounding errors and/or the hardware and/or the compiler

version. Furthermore, if the increment of X occurs by repeated addition, accumulated rounding errors may

mean that the value of X in each iteration can differ quite significantly from the expected sequence 0.1, 0.2, 0.3,

..., 1.0.
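The accumulated rounding error can be observed directly. The sketch below assumes Python floats are IEEE 754 double-precision values, which is the case on common platforms. It adds 0.1 ten times and then derives the same sequence from an integer counter instead.

```python
import math

# Repeated addition: rounding errors accumulate at every step.
x = 0.0
for _ in range(10):
    x += 0.1

reaches_one_exactly = (x == 1.0)         # False under IEEE 754 doubles
reaches_one_closely = math.isclose(x, 1.0)

# Integer counter: each value is computed independently, so nothing accumulates.
values = [i / 10 for i in range(1, 11)]
```

Driving the loop with an integer and deriving the floating-point value from it keeps the iteration count exact, which is why integer counters are generally preferred for count-controlled loops.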

Condition-controlled loops

Main article: While loop


See also: Do-while loop

Most programming languages have constructions for repeating a loop until some condition changes. Note that

some variations place the test at the start of the loop, while others have the test at the end of the loop. In the

former case the body may be skipped completely, while in the latter case the body is always executed at least

once.

DO WHILE (test)           repeat
    xxx                       xxx
LOOP                      until test;

while (test) {            do
    xxx                       xxx
}                         while (test);
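The difference between the two test placements can be checked with a short Python sketch. Python has no loop with the test at the end, so the second function emulates one with while True and break. Both function names are hypothetical.

```python
def count_test_first(n):
    """Test at the start: the body may be skipped completely."""
    runs = 0
    while n > 0:
        runs += 1
        n -= 1
    return runs


def count_test_last(n):
    """Test at the end (emulated): the body always runs at least once."""
    runs = 0
    while True:
        runs += 1
        n -= 1
        if not n > 0:
            break
    return runs
```

With n = 0 the first function never executes its body, while the second executes it once. For positive n both execute the body the same number of times.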

Collection-controlled loops

Main article: Foreach

Several programming languages (e.g. Ada, D, Smalltalk, Perl, Java, C#, Mythryl, Visual

Basic, Ruby, Python, JavaScript, Fortran 95 and later) have special constructs which allow implicitly looping

through all elements of an array, or all members of a set or collection.

someCollection do: [:eachElement |xxx].

foreach (item; myCollection) { xxx }

foreach someArray { xxx }

Collection<String> coll; for (String s : coll) {}

foreach (string s in myStringCollection) { xxx }

$someCollection | ForEach-Object { $_ }

forall ( index = first:last:step... )

General iteration

General iteration constructs such as C's for statement and Common Lisp's do form can be used to express

any of the above sorts of loops, as well as others -- such as looping over a number of collections in parallel.

Where a more specific looping construct can be used, it is usually preferred over the general iteration construct,

since it often makes the purpose of the expression more clear.


Infinite loops

Infinite loops are used to ensure that a program segment loops forever or until an exceptional condition arises, such

as an error. For instance, an event-driven program (such as a server) should loop forever handling events as

they occur, only stopping when the process is terminated by an operator.

Often, an infinite loop is unintentionally created by a programming error in a condition-controlled loop, wherein

the loop condition uses variables that never change within the loop.

Continuation with next iteration

Sometimes within the body of a loop there is a desire to skip the remainder of the loop body and continue with

the next iteration of the loop. Some languages provide a statement such as continue, skip, or next which

will do this. The effect is to prematurely terminate the innermost loop body and then resume as normal with the

next iteration. If the iteration is the last one in the loop, the effect is to terminate the entire loop early.

Redo current iteration

Some languages, like Perl and Ruby, have a redo statement that restarts the current iteration from the

beginning.

Restart loop

Ruby has a retry statement that restarts the entire loop from the initial iteration.

Early exit from loops

When using a count-controlled loop to search through a table, it might be desirable to stop searching as soon

as the required item is found. Some programming languages provide a statement such as break or exit,

whose effect is to terminate the current loop immediately and transfer control to the statement immediately

following that loop. One can also return out of a subroutine executing the looped statements, breaking out of

both the nested loop and the subroutine. Things can get a bit messy if searching a multi-dimensional table

using nested loops (see Proposed control structures below).

The following example is done in Ada which supports both early exit from loops and loops with test in the

middle. Both features are very similar and comparing both code snippets will show the difference: early

exit needs to be combined with an if statement while a condition in the middle is a self contained construct.

with Ada.Text_IO;
with Ada.Integer_Text_IO;

procedure Print_Squares is
   X : Integer;
begin
   Read_Data : loop
      Ada.Integer_Text_IO.Get (X);
      exit Read_Data when X = 0;
      Ada.Integer_Text_IO.Put (X * X);  -- Put for Integer is provided by Integer_Text_IO
      Ada.Text_IO.New_Line;
   end loop Read_Data;
end Print_Squares;

Python supports conditional execution of code depending on whether a loop was exited early (with

a break statement) or not by using an else clause with the loop. For example,

for n in set_of_numbers:
    if isprime(n):
        print("Set contains a prime number")
        break
else:
    print("Set did not contain any prime numbers")

Note that the else clause in the above example is attached to the for statement, and not the

inner if statement. Both Python's for and while loops support such an else clause, which is executed only if

early exit of the loop did not occur.

Loop variants and invariants

Loop variants and loop invariants are used to express correctness of loops.[2]

In practical terms, a loop variant is an integer expression which has an initial non-negative value. The variant's

value must decrease during each loop iteration but must never become negative during the correct execution of

the loop. Loop variants are used to guarantee that loops will terminate.

A loop invariant is an assertion which must be true before the first loop iteration and remain true after each

iteration. This implies that when a loop terminates correctly, both the exit condition and the loop invariant are

satisfied. Loop invariants are used to monitor specific properties of a loop during successive iterations.
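A loop variant and a loop invariant can be written as run-time assertions. The sketch below uses a hypothetical integer square root routine. The invariant is r * r <= n and the variant is n - r * r, both chosen for this illustration.

```python
def integer_sqrt(n):
    """Return the largest r with r * r <= n, for n >= 0."""
    r = 0
    while (r + 1) * (r + 1) <= n:
        variant_before = n - r * r          # non-negative integer expression
        assert r * r <= n                   # invariant holds before the iteration
        r += 1
        assert r * r <= n                   # invariant still holds afterwards
        # The variant strictly decreases and never becomes negative.
        assert 0 <= n - r * r < variant_before
    return r
```

Because the variant is a non-negative integer that decreases on every iteration, the loop must terminate. On exit, the invariant together with the failed loop condition gives r * r <= n < (r + 1) * (r + 1).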

Some programming languages, such as Eiffel, contain native support for loop variants and invariants. In other

cases, support is an add-on, such as the Java Modeling Language's specification for loop statements in Java.

Loop system cross reference table

Programming language | conditional: begin | conditional: middle | conditional: end | loop: count | loop: collection | loop: general | loop: infinite [1] | early exit | continuation | redo | retry | correctness: variant | correctness: invariant
Ada | Yes | Yes | Yes | Yes | arrays | No | Yes | deep nested | No | | | |
C | Yes | No | Yes | No [2] | No | Yes | No | deep nested [3] | deep nested [3] | No | | |
C++ | Yes | No | Yes | No [2] | No [9] | Yes | No | deep nested [3] | deep nested [3] | No | | |
C# | Yes | No | Yes | No [2] | Yes | Yes | No | deep nested [3] | deep nested [3] | | | |
Common Lisp | Yes | Yes | Yes | Yes | Yes | Yes | Yes | deep nested | No | | | |
Eiffel | Yes | No | No | Yes [10] | Yes | Yes | No | one level [10] | No | No | No [11] | Yes | Yes
F# | Yes | No | No | Yes | Yes | No | No | No [6] | No | No | | |
FORTRAN 77 | Yes | No | No | Yes | No | No | No | one level | Yes | | | |
Fortran 90 | Yes | No | No | Yes | No | No | Yes | deep nested | Yes | | | |
Fortran 95 and later | Yes | No | No | Yes | arrays | No | Yes | deep nested | Yes | | | |
Haskell | No | No | No | No | Yes | No | Yes | No [6] | No | No | | |
Java | Yes | No | Yes | No [2] | Yes | Yes | No | deep nested | deep nested | No | | non-native [12] | non-native [12]
JavaScript | Yes | No | Yes | No [2] | Yes | Yes | No | deep nested | deep nested | No | | |
OCaml | Yes | No | No | Yes | arrays, lists | No | No | No [6] | No | No | | |
PHP | Yes | No | Yes | No [2][5] | Yes [4] | Yes | No | deep nested | deep nested | No | | |
Perl | Yes | No | Yes | No [2][5] | Yes | Yes | No | deep nested | deep nested | Yes | | |
Python | Yes | No | No | No [5] | Yes | No | No | deep nested [6] | deep nested [6] | No | | |
REBOL | No [7] | Yes | Yes | Yes | Yes | No [8] | Yes | one level [6] | No | No | | |
Ruby | Yes | No | Yes | Yes | Yes | No | Yes | deep nested [6] | deep nested [6] | Yes | Yes | |
Standard ML | Yes | No | No | No | arrays, lists | No | No | No [6] | No | No | | |
Visual Basic .NET | Yes | No | Yes | Yes | Yes | No | Yes | one level per type of loop | one level per type of loop | | | |
Windows PowerShell | Yes | No | Yes | No [2] | Yes | Yes | No | ? | Yes | | | |

1. while (true) does not count as an infinite loop for this purpose, because it is not a dedicated language structure.

2. C's for (init; test; increment) loop is a general loop construct, not specifically a counting one, although it is often used for that.

3. Deep breaks may be accomplished in C, C++ and C# through the use of labels and gotos.

4. Iteration over objects was added in PHP 5.

5. A counting loop can be simulated by iterating over an incrementing list or generator, for instance, Python's range().

6. Deep breaks may be accomplished through the use of exception handling.

7. There is no special construct, since the while function can be used for this.

8. There is no special construct, but users can define general loop functions.

9. The C++11 standard (known as C++0x during development) introduced the range-based for. In the STL there is a std::for_each template function which can iterate over STL containers and call a unary function for each element.[3] The functionality can also be constructed as a macro on these containers.[4]

10. Count-controlled looping is effected by iteration across an integer interval; early exit by including an additional condition for exit.

11. Eiffel supports a reserved word retry; however, it is used in exception handling, not loop control.

12. Requires the Java Modeling Language (JML) behavioral interface specification language.

Structured non-local control flow

Many programming languages, particularly those which favor more dynamic styles of programming, offer

constructs for non-local control flow. These cause the flow of execution to jump out of a given context and

resume at some predeclared point. Conditions, exceptions, and continuations are three common sorts of non-

local control constructs.

Conditions

PL/I has some 22 standard conditions (e.g. ZERODIVIDE, SUBSCRIPTRANGE, ENDFILE) which can be

raised and intercepted with a statement of the form ON condition action; programmers can also define and use their own

named conditions.

As with the unstructured if, only one statement can be specified, so in many cases a GOTO is needed to decide

where flow of control should resume.

Unfortunately, some implementations had a substantial overhead in both space and time (especially

SUBSCRIPTRANGE), so many programmers tried to avoid using conditions.


Common Syntax examples:

ON condition GOTO label

Exceptions

Main article: Exception handling

Modern languages have a structured construct for exception handling which does not rely on the use of GOTO:

try {
    xxx1                                  // Somewhere in here
    xxx2                                  //     use: throw someValue;
    xxx3
} catch (someClass& someId) {             // catch value of someClass
    actionForSomeClass
} catch (someType& anotherId) {           // catch value of someType
    actionForSomeType
} catch (...) {                           // catch anything not already caught
    actionForAnythingElse
}

Any number and variety of catch clauses can be used above. In D, Java, C#, and Python

a finally clause can be added to the try construct. No matter how control leaves the try, the code

inside the finally clause is guaranteed to execute. This is useful when writing code that must relinquish

an expensive resource (such as an opened file or a database connection) when finished processing:

FileStream stm = null;                    // C# example
try {
    stm = new FileStream("logfile.txt", FileMode.Create);
    return ProcessStuff(stm);             // may throw an exception
} finally {
    if (stm != null)
        stm.Close();
}

Since this pattern is fairly common, C# has a special syntax:

using (FileStream stm = new FileStream("logfile.txt", FileMode.Create)) {
    return ProcessStuff(stm);             // may throw an exception
}

Upon leaving the using-block, the compiler guarantees that the stm object is released.

Python's with statement and Ruby's block argument to File.open are used to similar effect.
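The same guarantees can be sketched in Python. The function process, the events list and the temporary file path are hypothetical names introduced for this illustration. The first part shows a finally clause running on both a normal return and an exception, and the second shows the with statement closing a file when its block is left.

```python
import os
import tempfile

events = []


def process(fail):
    """Record that the finally clause runs however control leaves the try."""
    try:
        events.append("acquire")
        if fail:
            raise ValueError("simulated failure")
        return "ok"
    finally:
        events.append("release")  # runs on return and on exception alike


result = process(fail=False)
try:
    process(fail=True)
except ValueError:
    events.append("caught")

# The with statement closes the file as soon as the block is left.
path = os.path.join(tempfile.mkdtemp(), "logfile.txt")
with open(path, "w") as stm:
    stm.write("entry\n")
file_closed = stm.closed
```

In both calls "release" is recorded immediately after "acquire", before the caller sees either the return value or the exception.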

All these languages define standard exceptions and the circumstances under which they are thrown.

Users can throw exceptions of their own (in fact C++ allows users to throw and catch values of almost

any type, while Python 3 requires exceptions to derive from BaseException).


If there is no catch matching a particular throw, then control percolates back through subroutine calls

and/or nested blocks until a matching catch is found or until the end of the main program is reached, at

which point the program is forcibly stopped with a suitable error message.
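This percolation through subroutine calls can be sketched in Python. The three functions are hypothetical. The exception raised in inner is not handled there or in middle, so it travels back up the call chain until outer supplies a matching handler.

```python
def inner():
    # No handler here: the exception leaves this function immediately.
    raise KeyError("missing")


def middle():
    # No handler here either, so the exception passes straight through.
    inner()


def outer():
    try:
        middle()
    except KeyError as exc:
        # First matching handler on the way back up the call chain.
        return f"handled {exc.args[0]}"
    return "no error"


outcome = outer()
```

Had outer not caught the KeyError, the interpreter would have stopped the program with a traceback, which corresponds to the forced stop described above.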

The AppleScript scripting programming language provides several pieces of information to a "try" block:

try
    set myNumber to myNumber / 0
on error e number n from f to t partial result pr
    if ( e = "Can't divide by zero" ) then display dialog "You idiot!"
end try

Continuations

Main article: Continuation

Non-local control flow cross reference

Programming language conditions exceptions

Ada No Yes

C No No

C++ No Yes

C# No Yes

D No Yes

Eiffel No Yes

Haskell No Yes

Java No Yes


Mythryl Yes Yes

Objective-C No Yes

PHP No Yes

PL/I Yes No

Python No Yes

REBOL Yes Yes

Ruby No Yes

Visual Basic .NET Yes Yes

Windows PowerShell No Yes

Proposed control structures

In a spoof Datamation article[5] in 1973, R. Lawrence Clark suggested that the GOTO statement could be

replaced by the COMEFROM statement, and provided some entertaining examples. This was actually

implemented in the INTERCAL programming language, a language designed to make programs as

obscure as possible.

In his 1974 article "Structured Programming with go to Statements",[6] Donald Knuth identified two

situations which were not covered by the control structures listed above, and gave examples of control

structures which could handle these situations. Despite their utility, these constructions have not yet found

their way into mainstream programming languages.

Loop with test in the middle

The following was proposed by Dahl in 1972:[7]

loop                          loop
    xxx1                          read(char);
while test;                   while not atEndOfFile;
    xxx2                          write(char);
repeat;                       repeat;

If xxx1 is omitted we get a loop with the test at the top. If xxx2 is omitted we get a loop with the test at the

bottom. If while is omitted we get an infinite loop. Hence this single construction can replace several

constructions in most programming languages. A possible variant is to allow more than one while test;

within the loop, but the use of exitwhen (see next section) appears to cover this case better.

Languages lacking this construct generally emulate it using an equivalent infinite-loop-with-break idiom:

while (true) {
    xxx1
    if (not test)
        break
    xxx2
}
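A concrete instance of this idiom in Python might look like the following sketch, which mirrors the read and write example above. The in-memory stream and the output list are assumptions made so the example is self-contained.

```python
import io

source = io.StringIO("abc")   # stands in for an input file
output = []

while True:
    char = source.read(1)     # xxx1: runs before the test
    if not char:              # test in the middle: end of input reached
        break
    output.append(char.upper())  # xxx2: runs only if the test passed
```

The read happens once more than the write, which is exactly the behaviour a test at the top or at the bottom of the loop cannot express without duplicating code.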


In Ada, the above loop construct (loop-while-repeat) can be represented using a standard infinite loop

(loop - end loop) that has an exit when clause in the middle (not to be confused with

the exitwhen statement in the following section).

with Ada.Text_IO;
with Ada.Integer_Text_IO;

procedure Print_Squares is
   X : Integer;
begin
   Read_Data : loop
      Ada.Integer_Text_IO.Get (X);
      exit Read_Data when X = 0;
      Ada.Integer_Text_IO.Put (X * X);  -- Put for Integer is provided by Integer_Text_IO
      Ada.Text_IO.New_Line;
   end loop Read_Data;
end Print_Squares;

Naming a loop (like Read_Data in this example) is optional, but it makes it possible to exit the outer loop of several

nested loops.


Multiple early exit/exit from nested loops

This was proposed by Zahn in 1974.[8] A modified version is presented here.

exitwhen EventA or EventB or EventC;
    xxx
exits
    EventA: actionA
    EventB: actionB
    EventC: actionC
endexit;

exitwhen is used to specify the events which may occur within xxx; their occurrence is indicated by using

the name of the event as a statement. When some event does occur, the relevant action is carried out,

and then control passes just after endexit. This construction provides a very clear separation between

determining that some situation applies, and the action to be taken for that situation.

exitwhen is conceptually similar to exception handling, and exceptions or similar constructs are used for

this purpose in many languages.

The following simple example involves searching a two-dimensional table for a particular item.

exitwhen found or missing;
    for I := 1 to N do
        for J := 1 to M do
            if table[I,J] = target then found;
    missing;
exits
    found:   print ("item is in table");
    missing: print ("item is not in table");
endexit
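Since exitwhen is not available in mainstream languages, the table search can be emulated with an exception, as suggested above. The sketch below is one possible Python rendering. The exception class Found and the function describe are hypothetical names.

```python
class Found(Exception):
    """Signals the 'found' event from inside the nested loops."""


def describe(table, target):
    try:
        for row in table:
            for value in row:
                if value == target:
                    raise Found       # leaves both loops at once
    except Found:
        return "item is in table"     # action for the 'found' event
    return "item is not in table"     # action for the 'missing' event
```

For example, describe([[1, 2], [3, 4]], 3) takes the found path, while describe([[1, 2], [3, 4]], 9) falls through both loops to the missing path. Returning directly from inside the loops would work equally well here; the exception form is shown because it keeps the event and its action separate, as exitwhen does.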

Control flow diagram

From Wikipedia, the free encyclopedia


Example of a so-called "performance seeking control flow diagram".[1]

A control flow diagram (CFD) is a diagram to describe the control flow of a business process, process or

program.

Control flow diagrams were developed in the 1950s, and are widely used in multiple engineering disciplines.

They are one of the classic business process modeling methodologies, along with flow charts, data flow

diagrams, functional flow block diagrams, Gantt charts, PERT diagrams, and IDEF.[2]

Overview

A control flow diagram can consist of a subdivision to show sequential steps, with if-then-else conditions,

repetition, and/or case conditions. Suitably annotated geometrical figures are used to represent operations,

data, or equipment, and arrows are used to indicate the sequential flow from one to another.[3]

There are several types of control flow diagrams, for example:

Change control flow diagram, used in project management

Configuration decision control flow diagram, used in configuration management

Process control flow diagram, used in process management

Quality control flow diagram, used in quality control.


In software and systems development, control flow diagrams can be used in control flow analysis, data flow

analysis, algorithm analysis, and simulation. Control and data flow analysis are most applicable for real-time and

data-driven systems. These flow analyses transform logic and data requirements text into graphic flows which

are easier to analyze than the text. PERT, state transition, and transaction diagrams are examples of control

flow diagrams.[4]

Types of Control Flow Diagrams

Process Control Flow Diagram

A flow diagram can be developed for the process control system for each critical activity. Process control is

normally a closed cycle in which a sensor provides information to a process control software

application through a communications system. The application determines if the sensor information is within the

predetermined (or calculated) data parameters and constraints. The results of this comparison are fed to an

actuator, which controls the critical component. This feedback may control the component electronically or may

indicate the need for a manual action.[5]

This closed-cycle process has many checks and balances to ensure that it stays safe. The investigation of how

the process control can be subverted is likely to be extensive because all or part of the process control may be

oral instructions to an individual monitoring the process. It may be fully computer controlled and automated, or

it may be a hybrid in which only the sensor is automated and the action requires manual intervention. Further,

some process control systems may use prior generations of hardware and software, while others are state of

the art.[5]

Performance seeking control flow diagram

The figure presents an example of a performance seeking control flow diagram of the algorithm. The control

law consists of estimation, modeling, and optimization processes. In the Kalman filter estimator, the inputs,

outputs, and residuals were recorded. At the compact propulsion system modeling stage, all the estimated inlet

and engine parameters were recorded.[1]

In addition to temperatures, pressures, and control positions, such estimated parameters as stall margins,

thrust, and drag components were recorded. In the optimization phase, the operating condition constraints,

optimal solution, and linear programming health status condition codes were recorded. Finally, the actual

commands that were sent to the engine through the DEEC were recorded.[1]

Data flow diagram

From Wikipedia, the free encyclopedia


Data flow diagram example.[1]

A data flow diagram (DFD) is a graphical representation of the "flow" of data through an information system.

DFDs can also be used for the visualization of data processing (structured design).

On a DFD, data items flow from an external data source or an internal data store to an internal data store or an

external data sink, via an internal process.
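
The rule above can be read as saying that every individual data flow has a process at one end or the other. The following is a minimal sketch of that check in Python, under that reading. The names Flow, invalid_flows and the example order system are hypothetical and do not come from any DFD tool.

```python
from dataclasses import dataclass

# Node kinds used in this sketch (hypothetical names, not a standard API).
PROCESS, STORE, EXTERNAL = "process", "store", "external"


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    label: str


def invalid_flows(kinds, flows):
    """Return the flows that do not have a process at either end."""
    return [f for f in flows
            if PROCESS not in (kinds[f.source], kinds[f.target])]


# Hypothetical example system: one external entity, one process, one store.
kinds = {
    "Customer": EXTERNAL,
    "Process Order": PROCESS,
    "Orders": STORE,
}
flows = [
    Flow("Customer", "Process Order", "order"),
    Flow("Process Order", "Orders", "order record"),
    Flow("Customer", "Orders", "order"),  # bypasses the process: invalid
]

bad = invalid_flows(kinds, flows)
print([f"{f.source} -> {f.target}" for f in bad])  # ['Customer -> Orders']
```

The sketch deliberately says nothing about timing or ordering of the processes, since a DFD does not carry that information.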

A DFD provides no information about the timing of processes, or about whether processes will operate in

sequence or in parallel. It is therefore quite different from a flowchart, which shows the flow of control through

an algorithm, allowing a reader to determine what operations will be performed, in what order, and under what

circumstances, but not what kinds of data will be input to and output from the system, nor where the data will

come from and go to, nor where the data will be stored (all of which are shown on a DFD).

Overview

Data flow diagram example.

Data flow diagram - Yourdon/DeMarco notation.

It is common practice to draw a context-level data flow diagram first, which shows the interaction between the

system and external agents which act as data sources and data sinks. On the context diagram (also known as

the 'Level 0 DFD') the system's interactions with the outside world are modelled purely in terms of data flows

across the system boundary. The context diagram shows the entire system as a single process, and gives no

clues as to its internal organization.

This context-level DFD is next "exploded", to produce a Level 1 DFD that shows some of the detail of the

system being modeled. The Level 1 DFD shows how the system is divided into sub-systems (processes), each

of which deals with one or more of the data flows to or from an external agent, and which together provide all of

the functionality of the system as a whole. It also identifies internal data stores that must be present in order for

the system to do its job, and shows the flow of data between the various parts of the system.

Data flow diagrams were proposed by Larry Constantine, the original developer of structured design,[2] based

on Martin and Estrin's "data flow graph" model of computation.

Data flow diagrams (DFDs) are one of the three essential perspectives of the structured systems analysis and

design method (SSADM). The sponsor of a project and the end users will need to be briefed and consulted

throughout all stages of a system's evolution. With a data flow diagram, users are able to visualize how the

system will operate, what the system will accomplish, and how the system will be implemented. The old

system's data flow diagrams can be drawn up and compared with the new system's data flow diagrams in order

to design a more efficient system. Data flow diagrams can be used to give the end user

a concrete idea of how the data they input ultimately affects the structure of the whole system,

from order to dispatch to report. How any system is developed can be determined through a data flow diagram.

In the course of developing a set of levelled data flow diagrams the analyst/designer is forced to address how

the system may be decomposed into component sub-systems, and to identify the transaction data in the data

model.

There are different notations to draw data flow diagrams (Yourdon & Coad and Gane & Sarson[3]), defining

different visual representations for processes, data stores, data flow, and external entities.[4]

Developing a data flow diagram

Event partitioning approach

Event partitioning was described by Edward Yourdon in Just Enough Structured Analysis.[5]

A context level Data flow diagram created using Select SSADM.

This level shows the overall context of the system and its operating environment and shows the whole system

as just one process. It does not usually show data stores, unless they are "owned" by external systems, i.e.

they are accessed by but not maintained by this system; however, such stores are often shown as external entities.[6]

Level 1 (high level diagram)

This level (level 1) shows all processes at the first level of numbering, data stores, external entities and the

data flows between them. The purpose of this level is to show the major high-level processes of the system

and their interrelation. A process model will have one, and only one, level-1 diagram. A level-1 diagram must be balanced with its

parent context-level diagram, i.e. there must be the same external entities and the same data flows. These can

be broken down into more detail in the level-1 diagram; for example, the "enquiry" data flow could be split into "enquiry

request" and "enquiry results" and still be valid.[6] This is all about using your creativity.
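
The balancing rule can be sketched as a set comparison. The Python example below is an illustration only: the function is_balanced, the (entity, flow name) representation and the refinements mapping are assumptions made for this sketch, and flow direction is ignored for brevity.

```python
def is_balanced(parent_flows, child_flows, refinements=None):
    """Check that a child diagram's external flows match its parent's.

    Each flow is an (external_entity, flow_name) pair. `refinements` maps a
    parent flow name to the set of child flow names it was split into.
    """
    refinements = refinements or {}
    expected = set()
    for entity, name in parent_flows:
        # A flow that was not refined must reappear under the same name.
        for child_name in refinements.get(name, {name}):
            expected.add((entity, child_name))
    return expected == set(child_flows)


# Hypothetical diagrams based on the "enquiry" example in the text.
context = {("Customer", "enquiry")}
level_1 = {("Customer", "enquiry request"), ("Customer", "enquiry results")}
split = {"enquiry": {"enquiry request", "enquiry results"}}

print(is_balanced(context, level_1, split))  # True
print(is_balanced(context, level_1))         # False: the split is not declared
```

The same comparison could be applied between a level-1 process and its level-2 decomposition.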

Level 2 (low level diagram)

A Level 2 Data flow diagram showing the "Process Enquiry" process for the same system.

This level is a decomposition of a process shown in a level-1 diagram; as such, there should be a level-2

diagram for each and every process shown in a level-1 diagram. In this example, processes 1.1, 1.2 & 1.3 are

all children of process 1. Together they wholly and completely describe process 1, and combined must perform

the full capacity of this parent process. As before, a level-2 diagram must be balanced with its parent level-1

diagram.

Cyclomatic complexity

Cyclomatic complexity (or conditional complexity) is a software metric (measurement). It was developed by

Thomas J. McCabe, Sr. in 1976 and is used to indicate the complexity of a program. It directly measures the

number of linearly independent paths through a program's source code. The concept, although not the method,

is somewhat similar to that of general text complexity measured by the Flesch-Kincaid Readability Test.

Cyclomatic complexity is computed using the control flow graph of the program: the nodes of

the graph correspond to indivisible groups of commands of a program, and a directed edge connects two

nodes if the second command might be executed immediately after the first command. Cyclomatic complexity

may also be applied to individual functions, modules, methods or classes within a program.

One testing strategy, called Basis Path Testing by McCabe who first proposed it, is to test each linearly

independent path through the program; in this case, the number of test cases will equal the cyclomatic

complexity of the program.[1]

Description

A control flow graph of a simple program. The program begins executing at the red node, then enters a loop (group of three

nodes immediately below the red node). On exiting the loop, there is a conditional statement (group below the loop), and

finally the program exits at the blue node. For this graph, E = 9, N = 8 and P = 1, so the cyclomatic complexity of the

program is 3.

The cyclomatic complexity of a section of source code is the count of the number of linearly

independent paths through the source code. For instance, if the source code contained no decision points such

as IF statements or FOR loops, the complexity would be 1, since there is only a single path through the code. If

the code had a single IF statement containing a single condition there would be two paths through the code,

one path where the IF statement is evaluated as TRUE and one path where the IF statement is evaluated as

FALSE.

Mathematically, the cyclomatic complexity of a structured program[note 1] is defined with reference to a directed

graph containing the basic blocks of the program, with an edge between two basic blocks if control may pass

from the first to the second (the control flow graph of the program). The complexity is then defined as:[2]

M = E − N + 2P

where

M = cyclomatic complexity

E = the number of edges of the graph

N = the number of nodes of the graph

P = the number of connected components
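
As an illustration of the formula, the Python sketch below computes M = E − N + 2P from a list of nodes and directed edges. The graph is a hypothetical one built to have the same counts as the figure (E = 9, N = 8, P = 1); its node names and exact shape are assumptions, not a transcription of the figure.

```python
def connected_components(nodes, edges):
    """Count connected components, ignoring edge direction (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def cyclomatic_complexity(nodes, edges):
    """M = E - N + 2P for a control flow graph."""
    return len(edges) - len(nodes) + 2 * connected_components(nodes, edges)


# Hypothetical graph: entry, a three-node loop, a conditional, and an exit.
nodes = ["entry", "loop1", "loop2", "loop3", "cond", "then", "else", "exit"]
edges = [
    ("entry", "loop1"), ("loop1", "loop2"), ("loop2", "loop3"),
    ("loop3", "loop1"),   # back edge of the loop
    ("loop3", "cond"),    # loop exit
    ("cond", "then"), ("cond", "else"),
    ("then", "exit"), ("else", "exit"),
]

print(cyclomatic_complexity(nodes, edges))  # 9 - 8 + 2*1 = 3
```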

The same function as above, shown as a strongly-connected control flow graph, for calculation via the alternative method.

For this graph, E = 10, N = 8 and P = 1, so the cyclomatic complexity of the program is still 3.

An alternative formulation is to use a graph in which each exit point is connected back to the entry point. In this

case, the graph is said to be strongly connected, and the cyclomatic complexity of the program is equal to the

cyclomatic number of its graph (also known as the first Betti number), which is defined as:[2]

M = E − N + P

This may be seen as calculating the number of linearly independent cycles that exist in the graph, i.e. those

cycles that do not contain other cycles within themselves. Note that because each exit point loops back to the

entry point, there is at least one such cycle for each exit point.
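
The equivalence of the two formulations can be checked on a small example. In the Python sketch below, a hypothetical graph with 9 edges and 8 nodes is made strongly connected by adding one edge from the exit back to the entry. The node names and the helper cyclomatic_number are assumptions made for illustration.

```python
def cyclomatic_number(num_edges, num_nodes, num_components=1):
    """M = E - N + P for a strongly connected control flow graph."""
    return num_edges - num_nodes + num_components


# Hypothetical control flow graph with a single entry and a single exit.
edges = [
    ("entry", "loop1"), ("loop1", "loop2"), ("loop2", "loop3"),
    ("loop3", "loop1"), ("loop3", "cond"),
    ("cond", "then"), ("cond", "else"),
    ("then", "exit"), ("else", "exit"),
]
nodes = {n for edge in edges for n in edge}

# Connect the exit point back to the entry point.
augmented = edges + [("exit", "entry")]

original_m = len(edges) - len(nodes) + 2 * 1                # E - N + 2P
augmented_m = cyclomatic_number(len(augmented), len(nodes))  # E - N + P

print(original_m, augmented_m)  # 3 3
```

The extra edge raises E by one while the formula drops one P, so the result is unchanged.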

For a single program (or subroutine or method), P is always equal to 1. Cyclomatic complexity may, however,

be applied to several such programs or subprograms at the same time (e.g., to all of the methods in a class),

and in these cases P will be equal to the number of programs in question, as each subprogram will appear as a

disconnected subset of the graph.

It can be shown that the cyclomatic complexity of any structured program with only one entrance point and one

exit point is equal to the number of decision points (i.e., 'if' statements or conditional loops) contained in that

program plus one.[2][3]
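
The "decision points plus one" rule lends itself to a simple approximation. The Python sketch below counts decision nodes in Python source using the standard ast module. It is a simplified illustration: the choice of which node types count as decisions is an assumption, and boolean operators, exception handlers and comprehensions are ignored.

```python
import ast

# Node types treated as decision points in this sketch (an assumption).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp)


def decision_points_plus_one(source):
    """Approximate cyclomatic complexity as decision points + 1."""
    tree = ast.parse(source)  # parses only; the source is never executed
    decisions = sum(isinstance(node, DECISION_NODES) for node in ast.walk(tree))
    return decisions + 1


straight_line = "x = 1\ny = x + 1\n"
two_ifs = (
    "if c1():\n"
    "    f1()\n"
    "else:\n"
    "    f2()\n"
    "if c2():\n"
    "    f3()\n"
    "else:\n"
    "    f4()\n"
)

print(decision_points_plus_one(straight_line))  # 1
print(decision_points_plus_one(two_ifs))        # 3
```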

Cyclomatic complexity may be extended to a program with multiple exit points; in this case it is equal to:

π - s + 2

where π is the number of decision points in the program, and s is the number of exit points.[3][4]

Formal definition

Formally, cyclomatic complexity can be defined as a relative Betti number, the size of a relative

homology group:

M := b_1(G, t) := rank H_1(G, t)

which is read as “the first homology of the graph G, relative to the terminal nodes t”. This is a technical way of

saying “the number of linearly independent paths through the flow graph from an entry to an exit”, where:

“linearly independent” corresponds to homology, and means one does not double-count backtracking;

“paths” corresponds to first homology: a path is a 1-dimensional object;

“relative” means the path must begin and end at an entry or exit point.

This corresponds to the intuitive notion of cyclomatic complexity, and can be calculated as above.

Alternatively, one can compute this via absolute Betti number (absolute homology – not relative) by identifying

(gluing together) all terminal nodes on a given component (or equivalently, draw paths connecting the exits to

the entrance), in which case (calling the new, augmented graph G̃, which is G/t), one obtains:

M = b_1(G̃) = rank H_1(G̃)

This corresponds to the characterization of cyclomatic complexity as “number of loops plus number of

components”.

Etymology / Naming

The name Cyclomatic Complexity presents some confusion, as this metric does not only count cycles (loops)

in the program. Instead, the name refers to the number of different cycles in the program control flow graph,

after having added an imagined branch back from the exit node to the entry node.[2]

A better name for popular usage would be Conditional Complexity, as "it has been found to be more

convenient to count conditions instead of predicates when calculating complexity".[5]

Applications

Limiting complexity during development

One of McCabe's original applications was to limit the complexity of routines during program development; he

recommended that programmers should count the complexity of the modules they are developing, and split

them into smaller modules whenever the cyclomatic complexity of the module exceeded 10.[2] This practice was

adopted by the NIST Structured Testing methodology, with an observation that since McCabe's original

publication, the figure of 10 had received substantial corroborating evidence, but that in some circumstances it

may be appropriate to relax the restriction and permit modules with a complexity as high as 15. As the

methodology acknowledged that there were occasional reasons for going beyond the agreed-upon limit, it

phrased its recommendation as: "For each module, either limit cyclomatic complexity to [the agreed-upon limit]

or provide a written explanation of why the limit was exceeded."[6]

Implications for Software Testing

Another application of cyclomatic complexity is in determining the number of test cases that are necessary to

achieve thorough test coverage of a particular module.

It is useful because of two properties of the cyclomatic complexity, M, for a specific module:

M is an upper bound for the number of test cases that are necessary to achieve a complete branch

coverage.

M is a lower bound for the number of paths through the control flow graph (CFG). Assuming each test

case takes one path, the number of cases needed to achieve path coverage is equal to the number of paths

that can actually be taken. But some paths may be impossible, so although the number of paths through the

CFG is clearly an upper bound on the number of test cases needed for path coverage, this latter number

(of possible paths) is sometimes less than M.

All three of the above numbers may be equal: branch coverage ≤ cyclomatic complexity ≤ number of paths.

For example, consider a program that consists of two sequential if-then-else statements.

if( c1() )
    f1();
else
    f2();

if( c2() )
    f3();
else
    f4();

The control flow graph of the source code above; the red circle is the entry point of the function, and the blue circle is the exit

point. The exit has been connected to the entry to make the graph strongly connected.

In this example, two test cases are sufficient to achieve a complete branch coverage, while four are necessary

for complete path coverage. The cyclomatic complexity of the program is 3 (as the strongly-connected graph

for the program contains 9 edges, 7 nodes and 1 connected component, so 9 − 7 + 1 = 3).
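
These three numbers can be reproduced by enumeration. The Python sketch below models the two sequential if-then-else statements, with the boolean arguments c1 and c2 standing in for the results of c1() and c2(). The function names run and branches_covered are hypothetical.

```python
from itertools import product


def run(c1, c2):
    """Return the branches taken by the two sequential if-then-else statements."""
    path = []
    path.append("f1" if c1 else "f2")
    path.append("f3" if c2 else "f4")
    return tuple(path)


# Every distinct path through the code, found by trying all condition values.
all_paths = {run(c1, c2) for c1, c2 in product([True, False], repeat=2)}
all_branches = {"f1", "f2", "f3", "f4"}


def branches_covered(tests):
    """Collect the branches exercised by a list of (c1, c2) test inputs."""
    return {branch for c1, c2 in tests for branch in run(c1, c2)}


two_tests = [(True, True), (False, False)]

print(len(all_paths))                               # 4 paths
print(branches_covered(two_tests) == all_branches)  # True: 2 tests cover all branches
print(9 - 7 + 1)                                    # 3, the cyclomatic complexity
```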

In general, in order to fully test a module all execution paths through the module should be exercised. This

implies a module with a high complexity number requires more testing effort than a module with a lower value

since the higher complexity number indicates more pathways through the code. This also implies that a module

with higher complexity is more difficult for a programmer to understand since the programmer must understand

the different pathways and the results of those pathways.

Unfortunately, it is not always practical to test all possible paths through a program. Considering the example

above, each time an additional if-then-else statement is added, the number of possible paths doubles. As the

program grows in this fashion, it quickly reaches the point where testing all of the paths is impractical.
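
The growth can be made concrete with a short Python sketch that counts paths by enumeration and compares them with the cyclomatic complexity, taken here as the number of decision points plus one. The function names are hypothetical.

```python
from itertools import product


def count_paths(n):
    """Count paths through n sequential if-then-else statements by enumeration."""
    return sum(1 for _ in product((True, False), repeat=n))


def complexity(n):
    """Cyclomatic complexity of n sequential if-then-else statements."""
    return n + 1


for n in (1, 2, 3, 10):
    print(n, count_paths(n), complexity(n))
# 1 2 2
# 2 4 3
# 3 8 4
# 10 1024 11
```

The path count doubles with each added statement, while the complexity grows by one.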

One common testing strategy, espoused for example by the NIST Structured Testing methodology, is to use

the cyclomatic complexity of a module to determine the number of white-box tests that are required to obtain

sufficient coverage of the module. In almost all cases, according to such a methodology, a module should have

at least as many tests as its cyclomatic complexity; in most cases, this number of tests is adequate to exercise

all the relevant paths of the function.[6]

As an example of a function that requires more than simply branch coverage to test accurately, consider again

the above function, but assume that to avoid a bug occurring, any code that calls either f1() or f3() must also

call the other.[note 2] Assuming that the results of c1() and c2() are independent, that means that the function as

presented above contains a bug. Branch coverage would allow us to test the method with just two tests, and

one possible set of tests would be to test the following cases:

c1() returns true and c2() returns true

c1() returns false and c2() returns false

Neither of these cases exposes the bug. If, however, we use cyclomatic complexity to indicate the number of

tests we require, the number increases to 3. We must therefore test one of the following paths:

c1() returns true and c2() returns false

c1() returns false and c2() returns true

Either of these tests will expose the bug.
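
The scenario can be simulated directly. In the Python sketch below, calls_made records which functions a run would call, and obeys_rule encodes the assumed constraint that f1() and f3() must be called together or not at all. Both helper names are hypothetical.

```python
def calls_made(c1, c2):
    """Return the set of functions called for the given condition results."""
    calls = set()
    calls.add("f1" if c1 else "f2")
    calls.add("f3" if c2 else "f4")
    return calls


def obeys_rule(calls):
    """f1 and f3 must be called together or not at all."""
    return ("f1" in calls) == ("f3" in calls)


# The two tests that already achieve full branch coverage.
branch_tests = [(True, True), (False, False)]
# A third test, as suggested by the cyclomatic complexity of 3.
extra_test = (True, False)

print(all(obeys_rule(calls_made(*t)) for t in branch_tests))  # True: bug not exposed
print(obeys_rule(calls_made(*extra_test)))                    # False: bug exposed
```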

Cohesion

One would also expect that a module with higher complexity would tend to have lower cohesion (less than

functional cohesion) than a module with lower complexity. The possible correlation between a higher complexity

measure and a lower level of cohesion is predicated on a module with more decision points generally

implementing more than a single well defined function. A 2005 study showed stronger correlations between

complexity metrics and an expert assessment of cohesion in the classes studied than the correlation between

the expert's assessment and metrics designed to calculate cohesion.[7]

Correlation to number of defects

A number of studies have investigated cyclomatic complexity's correlation to the number of defects contained

in a module. Most such studies find a strong positive correlation between cyclomatic complexity and defects:

modules that have the highest complexity tend to also contain the most defects. For example, a 2008 study by

metric-monitoring software supplier Enerjy analyzed classes of open-source Java applications and divided

them into two sets based on how commonly faults were found in them. They found strong correlation between

cyclomatic complexity and their faultiness, with classes with a combined complexity of 11 having a probability

of being fault-prone of just 0.28, rising to 0.98 for classes with a complexity of 74.[8]

However, studies that control for program size (i.e., comparing modules that have different complexities but

similar size, typically measured in lines of code) are generally less conclusive, with many finding no significant

correlation, while others do find correlation. Some researchers who have studied the area question the validity

of the methods used by the studies finding no correlation.[9]

Les Hatton claimed in his keynote at TAIC-PART 2008 (Windsor, UK, September 2008) that McCabe cyclomatic

complexity has the same prediction ability as lines of code.