Program Slicing: Theory and Applications
Program Slicing Part II: Tools and Applications
Agenda
• Brief discussion of an open-source slicer (WET)
  – Setting up, running, and using WET
  – Short demo
• Applications of Program Slicing
  – Debugging
  – Functional cohesion
  – Program comprehension
  – Software maintenance
  – Differencing
  – Integration
  – Testing
  – Quality assurance
  – Reverse engineering
WET (Whole Execution Traces)
• Open-source software infrastructure capable of tracing and analyzing long program executions
• Gathers both data and control dependence information
• Written on top of Valgrind
• Online / offline slicing on execution traces collected by Valgrind
• Record / replay of traces for multi-threaded programs
  – Scheduling decisions for an executing program
WET Infrastructure
• Diablo
  – An optional component of WET, required only if control dependence information is desired
• gcc toolchain
  – To run Diablo on a target executable to compute static control dependence information, the target executable must first be compiled statically using a patched gcc toolchain
• Valgrind
  – A required component of WET; WET requires a modified version of Valgrind 2.2.0
Installing and Running WET
• Install Diablo, the gcc toolchain, and Valgrind (all available from the WET website)
• Running
  – Compile the C program using gcc -c -g
  – Generate the executable using the gcc toolchain
  – Invoke Diablo on the executable to populate static control dependence information
  – Collect tracing information using Valgrind
    Generates a TRACE.txt file containing the entire execution
    Can dump a limited execution history as well
WET Trace Format
• Comprehensive tracing information
  – Includes the statements exercised, the dependences exercised, and the values computed at each exercised statement instance
• Format of the trace
    N
    instruction_block 1
    instruction_block 2
    ...
    instruction_block N
WET Trace Format
• The first line (N) denotes the total number of dynamically executed instructions for which tracing information is collected. This number is also the total number of instruction_block entries.
• Each instruction_block entry contains all the tracing information for a particular executed instruction.
WET Trace Format
• Each instruction_block is laid out as follows:
    info
    dependences
    values_computed    // in hexadecimal form
• The first entry (info) is a single line containing 6 distinct kinds of information:
    id num_use_port instr_addr fileName functionName lineNum
WET Trace Format
    id num_use_port instr_addr fileName functionName lineNum
• id is a unique numerical identifier for the instruction.
• num_use_port is the number of uses (dependences) for the instruction.
• instr_addr denotes the instruction address of the current instruction.
• fileName, functionName, and lineNum respectively denote the file name, the function name, and the source code line number of the current instruction.
WET Trace Format
• Dependences show the dynamic dependence information for each use port of the current instruction
  – The first use port always denotes control dependence
• Dependences are represented as:
    SIZE N
    0:dep_id instance
    1:dep_id instance
    ...
    N-1:dep_id instance
• N denotes the number of dependences
• X:Y Z indicates that the Xth instance of the given use port is dependent upon the Zth instance of Y.
Example
1: int x, y, result;
2: int main(int argc, char **argv)
3: {
4:   if(argc >= 3)
5:   {
6:     x = atoi(argv[1]);
7:     y = atoi(argv[2]);
8:     result = x * 7;
9:     result = result + y;
10:    printf("result %d\n", result);
11:  }
12: }
• Program is run with values 2 and 7 as input
• Statement 8 has address 0x8048242 with id 2118
• Value 2 for x from input
Trace entry for statement 8:
    2118 2 8048242 foo1.c main 8
    SIZE 1
    0:1873 0
    SIZE 1
    0:2113 0
    VALUES 1
    0:2
Trace entries it depends on:
    1873 2 8048210 foo1.c main 4
    …….
    2113 2 8048225 foo1.c main 6
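The instruction_block layout described above can be read mechanically. The following is a minimal Python sketch, not part of WET: it assumes a block is exactly an info line, then one SIZE section per use port, then a VALUES section, as in the example entry for statement 8. The names InstructionBlock and parse_block are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InstructionBlock:
    """One instruction_block; field names follow the info line of the trace."""
    id: int
    num_use_port: int
    instr_addr: int
    file_name: str
    function_name: str
    line_num: int
    # One list per use port; each item is (instance, dep_id, dep_instance).
    dependences: list = field(default_factory=list)
    values: list = field(default_factory=list)


def parse_block(lines):
    """Parse a single instruction_block given as a list of text lines.

    Assumption: the layout is exactly the one shown in the example entry.
    """
    it = iter(line.strip() for line in lines if line.strip())
    ident, ports, addr, fname, func, lineno = next(it).split()
    block = InstructionBlock(int(ident), int(ports), int(addr, 16),
                             fname, func, int(lineno))
    for _ in range(block.num_use_port):
        header, count = next(it).split()
        assert header == "SIZE"
        port = []
        for _ in range(int(count)):
            left, dep_instance = next(it).split()   # "X:Y Z"
            instance, dep_id = left.split(":")
            port.append((int(instance), int(dep_id), int(dep_instance)))
        block.dependences.append(port)
    header, count = next(it).split()
    assert header == "VALUES"
    for _ in range(int(count)):
        _, value = next(it).split(":")
        block.values.append(int(value, 16))          # values are hexadecimal
    return block


EXAMPLE = """\
2118 2 8048242 foo1.c main 8
SIZE 1
0:1873 0
SIZE 1
0:2113 0
VALUES 1
0:2
"""

block = parse_block(EXAMPLE.splitlines())
control_port, data_port = block.dependences
print(block.id, hex(block.instr_addr), control_port, data_port, block.values)
```

Here the first use port (control dependence) points at instance 0 of instruction 1873 (line 4), and the second at instance 0 of instruction 2113 (line 6).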
WET: Recording / Slicing Use Model
• Slicing can be done on an instruction instance
  – Offline, using the Java code for the slicer
    Works on a comprehensive or limited-history trace
  – Online, while Valgrind runs
    Can choose to dump only the slice and no trace
  – Produces a SLICE.txt file
• Multi-threaded program record / replay
  – Can record scheduling decisions while Valgrind runs
  – Stored in a log file
  – Can be used to replay for debugging purposes
Applications of Program Slicing
• Debugging
• Functional cohesion
• Program comprehension
• Software maintenance
• Differencing
• Integration
• Testing
• Quality assurance
• Reverse engineering
What is a bug?
8/20/2013 15
“A software bug (or just "bug") is an error, flaw, mistake, … in a computer program that prevents it from behaving as intended (e.g., producing an incorrect result). … Reports detailing bugs in a program are commonly known as bug reports, fault reports, … change requests, and so forth.” --- Wikipedia
“Even today, debugging remains very much of an art. Much of the computer science community has largely ignored the debugging problem….. over 50 percent of the problems resulted from the time and space chasm between symptom and root cause or inadequate debugging tools.” -- IBM Sys Jnl
• Debugging is a time-consuming activity
• Need methods to trace back from the manifested error to the root cause of the bug
Software Debugging
Typical Debugging Steps
1. Hypothesize the cause of the error
2. Try to confirm it
[Figure: an execution passing through points A, B, C, and D, each marked "Error?", ending in the observable error]
Software Debugging
• Time consuming
• Key problem is automation
Automate Software Debugging
1. Hypothesize the cause of the error
  – Automatic step
  – Identify suspicious statements
2. Try to confirm it
  – Manual step
  – Use specification?
Slicing for Debugging
• Set criterion
  – Why is output v = 0 at the end of the program for input a = 2?
• Technique
  – Need not inspect all statements in the execution trace
  – Use slicing to find out how v got its value at the particular point of concern
  – Natural slicing criterion
• Inspect the slice
  – Could possibly lead the programmer to suspect other variables / line numbers
  – Slice again ….
The Product bug
/* Sum and Product of numbers 1…n */
1. scanf("%d",&n);
2. s=0;
3. p=0;
4. while (n>0)
5. {
6.   s = s+n;
7.   p = p*n;
8.   n = n-1;
9. }
10. printf("%d%d",p,s);
• Product incorrect, sum correctly computed
  – No point looking at the sum
  – Focus on the statements which influence the product computation
• Static slicing on <p, 10>
scanf("%d",&n);
p=0;
while (n>0)
{
  p = p*n;
  n = n-1;
}
The Product bug: Can we do better?
/* Sum and Product of numbers 1…n */
scanf("%d",&n);
s=0;
p=0;
while (n>0)
{
  s = s+n;
  p = p*n;
  n = n-1;
}
printf("%d%d",p,s);
• A static slice can assist debugging, but not always
  – Slices tend to be large
• Use dynamic slicing
  – Execute with n=0
  – Routine boundary test
  – The dynamic slice for this run: p=0;
• Dynamic slicing helps the debugging activity greatly
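The difference between the static slice on <p, 10> and the dynamic slice for n=0 can be reproduced with a small Python sketch. It is illustrative only: the dependence table is written by hand from the product program, the static slice is flow-insensitive (every definition of a used variable is assumed to reach the use), and the trace is generated by a hypothetical run_trace helper rather than by instrumenting the program.

```python
# Statement number -> (variable defined, variables used, controlling statement).
# Numbers follow the product program; this table is hand-written.
PROGRAM = {
    1: ("n", set(), None),           # scanf("%d", &n)
    2: ("s", set(), None),           # s = 0
    3: ("p", set(), None),           # p = 0   (the bug)
    4: (None, {"n"}, None),          # while (n > 0)
    6: ("s", {"s", "n"}, 4),         # s = s + n
    7: ("p", {"p", "n"}, 4),         # p = p * n
    8: ("n", {"n"}, 4),              # n = n - 1
    10: (None, {"p", "s"}, None),    # printf("%d%d", p, s)
}


def definitions_of(var):
    return [s for s, (defined, _, _) in PROGRAM.items() if defined == var]


def static_slice(var, stmt):
    """Flow-insensitive static slice for the criterion <var, stmt>."""
    result, pending = {stmt}, definitions_of(var)
    while pending:
        s = pending.pop()
        if s in result:
            continue
        result.add(s)
        _, uses, parent = PROGRAM[s]
        for used in uses:
            pending.extend(definitions_of(used))
        if parent is not None:
            pending.append(parent)
    return result


def run_trace(n):
    """Statement numbers executed for input n (hand-derived control flow)."""
    return [1, 2, 3] + [4, 6, 7, 8] * max(n, 0) + [4, 10]


def dynamic_slice(trace, var):
    """Dynamic slice for the value of `var` at the last statement of the trace."""
    last_def, last_instance, deps = {}, {}, []
    for i, stmt in enumerate(trace):
        defined, uses, parent = PROGRAM[stmt]
        d = [last_def[u] for u in uses if u in last_def]
        if parent is not None:
            d.append(last_instance[parent])   # most recent controlling instance
        deps.append(d)
        if defined is not None:
            last_def[defined] = i
        last_instance[stmt] = i
    seen = {len(trace) - 1}
    pending = [last_def[var]] if var in last_def else []
    while pending:
        i = pending.pop()
        if i not in seen:
            seen.add(i)
            pending.extend(deps[i])
    return {trace[i] for i in seen}


print(sorted(static_slice("p", 10)))               # statements 1, 3, 4, 7, 8, 10
print(sorted(dynamic_slice(run_trace(0), "p")))    # statements 3, 10
```

For n=0 the loop body never runs, so only p=0 (plus the criterion statement itself) remains in the dynamic slice, which points directly at the faulty initialization.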
Slicing for Debugging
[Figure: workflow diagram. Labels: Program, Instrument, Input, Exec. Trace, Output (OK / Unexpected, debug it), criterion, Dynamic Slice = Bug Report, Debugging]
• Can locate faults if there is a committed error
• Cannot locate errors of omission
Debugging Program Evolution
• Changes may induce failures in stable programs
  – Changes may be due to several factors
• The cost of managing software evolution is a large percentage of the total cost of software.
• Finding the root cause of regression bugs is still a major headache in large software development projects.
• Manual review isn’t enough!
A True Story
• Release 4.17 of the GNU debugger GDB brought several new features, languages, and platforms, but for some reason it no longer integrated properly with the graphical front-end DDD (Data Display Debugger)
  – The arguments specified within DDD were not being passed to the debugged program.
• Something changed within GDB such that it no longer worked
  – Something? Between the 4.16 and 4.17 releases, no less than 178,000 lines changed. How can I isolate the change that caused the failure and make GDB work again?
• The GDB example is an instance of the “worked yesterday, not today” problem: after applying a set of changes, the program no longer works as it should.
Debugging Evolution Bugs
[Figure labels: Old Stable Program P, Test Input t, New Buggy Program P’]
Change Analysis?
Two versions of the same fragment:
…
if (x > 0) {
  y = x + 1;
  z = x;
  w = x + 2;
} else {
  y = x;
}
…

…
if (x > 0) {
  y = x;
  z = x;
  w = x + 1;
} else {
  y = x + 1;
}
…
• Requires defining the set of all changes.
• Search among subsets!
Trace Comparison?
[Figure label: Root cause]
• Compare the failing test with a similar, successful test.
• Requirement: how do we find such an execution?
• Question: why ignore the evolution?
An experiment we tried
• Bug-free distribution: Linux (GNU Coreutils, net-tools)
• Buggy distribution: BusyBox
• The BusyBox distribution is 121 KLOC.
• Errors to be root-caused in some common utilities: arp, top, printf
[Figure label: Development Changes]
Trying on Embedded Linux
• The concept
  – Golden: GNU Coreutils
  – Changed version: BusyBox
    De facto distribution for embedded devices
    Aims for low code size
    Fewer checks and more errors
• The practice
  – Large bug report produced using trace comparison
A direct approach
• Characterize the observable error (obs)
  – y != 0
• Weakest pre-condition (WP) based analysis along the failing path w.r.t. obs
  – P: 2*x != 0
  – P’: 2*x + 1 != 0
• Compare the WPs and find the differing constraints.
  – Map the differing constraints to the lines contributing them.
P:
input x;
y = 2 * x;
output y
P’:
input x;
y = 2*x+1; // bug
output y
Entire failing trace is not needed
1. ... // input inp1, inp2
2. if (inp1 > 0)
3.   x = f1(inp1); // bug
4. else x = g1(inp1);
5. if (inp2 > 0)
6.   y = f2(inp2)
7. else y = g2(inp2);
8. ... // output x, y
   observe unexpected x < 0 for inp1 == inp2 == 1
• Observable error: x < 0 at line 8.
• WP along the trace of inp1 == inp2 == 1 gives us
    inp2 > 0 ∧ inp1 > 0 ∧ f1(inp1) < 0
• Bug report = { 2, 3, 5 }
• Line 5 is clearly not relevant since inp2 does not contribute to computing x.
What is the issue?
• Did not exploit the dependency relationship
  – inp1 helps compute x
  – inp2 helps compute y
• Project the “relevant” part of the trace.
  – Dynamic slicing
  – Symbolic execution (WP computation) along the dynamic slice only
• Crucial for scalability of this method!
Approach – in action (simplified)
1. ... // input inp1, inp2
2. if (inp1 > 0)
3. x = f1(inp1); // bug
4. else x = g1(inp1);
5. if (inp2 > 0)
6. y = f2(inp2)
7. else y = g2(inp2);
8. ... // output x, y observe unexpected x < 0 for inp1 == inp2 == 1
f1(inp1) < 0 (data dep.)
f1(inp1) < 0 ∧ inp1 > 0 (control dep.)
f1(inp1)< 0 ∧ inp1> 0
Approach - summary
• Set the observable error: x < 0
• Set the slicing criterion: value of x at line 8
• Simultaneously perform
  – Dynamic slicing: control and data dependencies
  – Symbolic execution along the slice (WP computation along the slice)
• The above is performed on both P and P’
  – Produces WP and WP’, each a conjunction of constraints
  – Find the differing constraints in WP and WP’
  – Map the differing constraints to the contributing LOC; this is the bug report.
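The effect of restricting the WP computation to the dynamic slice can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the tool used in the experiment: constraints are plain strings, substitution is textual, the trace is loop-free so each line executes at most once, and Step, dynamic_slice, wp_along, and report are hypothetical names.

```python
import re
from collections import namedtuple

# line: source line; var/expr: assignment "var = expr" (or None, None);
# cond: branch condition that held on this run (or None);
# uses: variables read; parent: line of the controlling branch (or None).
Step = namedtuple("Step", "line var expr cond uses parent")

# Executed steps for inp1 == inp2 == 1 (inputs at line 1 stay symbolic).
TRACE = [
    Step(2, None, None, "inp1 > 0", {"inp1"}, None),
    Step(3, "x", "f1(inp1)", None, {"inp1"}, 2),
    Step(5, None, None, "inp2 > 0", {"inp2"}, None),
    Step(6, "y", "f2(inp2)", None, {"inp2"}, 5),
]


def dynamic_slice(trace, var):
    """Lines the final value of `var` depends on (data and control)."""
    last_def, deps = {}, {}
    for step in trace:
        d = {last_def[u] for u in step.uses if u in last_def}
        if step.parent is not None:
            d.add(step.parent)
        deps[step.line] = d
        if step.var is not None:
            last_def[step.var] = step.line
    seen, pending = set(), [last_def[var]]
    while pending:
        line = pending.pop()
        if line not in seen:
            seen.add(line)
            pending.extend(deps[line])
    return seen


def wp_along(trace, obs, keep=None):
    """Backward WP pass; returns a list of (constraint, contributing lines)."""
    conjuncts = [(obs, set())]
    for step in reversed(trace):
        if keep is not None and step.line not in keep:
            continue
        if step.var is not None:
            pattern = re.compile(r"\b%s\b" % re.escape(step.var))
            updated = []
            for text, lines in conjuncts:
                new_text = pattern.sub(lambda m: step.expr, text)
                changed = new_text != text
                updated.append((new_text, lines | {step.line} if changed else lines))
            conjuncts = updated
        if step.cond is not None:
            conjuncts.append((step.cond, {step.line}))
    return conjuncts


def report(conjuncts):
    """Lines contributing to any constraint."""
    return set().union(*(lines for _, lines in conjuncts))


full = wp_along(TRACE, "x < 0")
sliced = wp_along(TRACE, "x < 0", keep=dynamic_slice(TRACE, "x"))
print(sorted(report(full)))     # lines 2, 3, 5
print(sorted(report(sliced)))   # lines 2, 3
```

Along the full trace the irrelevant constraint inp2 > 0 (line 5) enters the WP; along the slice it does not. Comparing WP and WP’ would then be done on the smaller formulas; the slides use a constraint solver for that step, which this sketch does not attempt.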
A glimpse inside the ARP bug
• Input: -Ainet (all computers connected to the host with the inet address family)
[Figure labels: Embedded Linux, GNU Coreutils, Crash]
• Crash identified as a NULL pointer access at the crash site
• hw_type unexpectedly set to NULL at the crash site
BusyBox ARP
int arp_main(int argc, char **argv) {
  ....
  option_mask32 = getopt32(argc, argv, "A:p:H:t:i:adnDsv", &protocol,
                           &protocol, &hw_type, &hw_type, &device);
  if (option_mask32 & ARP_OPT_A || option_mask32 & ARP_OPT_p) {
    ap = get_aftype(protocol);
  }
  if (option_mask32 & ARP_OPT_A || option_mask32 & ARP_OPT_p) {
    hw = get_hwtype(hw_type);
  }
  ....

const struct hwtype *get_hwtype (const char *name) {
  const struct hwtype * const *hwp;
  hwp = hwtypes;
  while (*hwp != NULL) {
    if (!strcmp((*hwp)->name, name))
      return (*hwp);
    hwp++;
  }
  return NULL;
}
• Test input: -Ainet
• Crash since name is NULL
• Slicing criterion: name == NULL
• WP along slice: the second check appears in the WP; hw = get_hwtype(hw_type) should not have been executed
• The check should have been if (option_mask32 & ARP_OPT_H)
ARP in net-tools
int main(int argc, char **argv) {
  int i, lop, what;
  while ((i = getopt_long(argc, argv, "A:H:adfp:nsei:t:vh?DNV",
                          longopts, &lop)) != EOF) {
    switch (i) {
      case 'A': case 'p': ap = get_aftype(optarg); break;
      case 'H': case 't': hw = get_hwtype(optarg); hw_set = 1; break;
      default : break;
    }
    if (hw_set==0)
      if ((hw = get_hwtype(DFLT_HW)) == NULL)
        // Error check and exit
}

const struct hwtype *get_hwtype (const char *name) {
  const struct hwtype * const *hwp;
  hwp = hwtypes;
  while (*hwp != NULL) {
    if (!strcmp((*hwp)->name, name))
      return (*hwp);
    hwp++;
  }
  return NULL;
• Test input: -Ainet
• name = DFLT_HW
• The case for 'H' / 't' is not executed
• Slicing criterion: name != NULL
• WP along slice: the 'H' / 't' check does not appear in the WP
Methodology in action
1. Trace collection with -Ainet for buggy and stable ARP
2. Identify the variable responsible for the crash and map it to stable ARP (name == NULL in BusyBox, name != NULL in net-tools)
3. WP on the BusyBox trace along the slice gives the buggy WP; WP on the net-tools trace along the slice gives the stable WP
4. WP comparison with the STP solver yields the differing WP terms
5. Map back to source to generate the bug report
Cohesion
• Cohesion of a unit, module, object, or component addresses the attribute of “degree of relatedness” within that unit, module, object, or component.
Levels of Cohesion (listed from highest to lowest; the higher the better)
1. Functional: performing one single function
2. Sequential
3. Communicational
4. Procedural
5. Temporal
6. Logical
7. Coincidental: performing more than one unrelated function
Importance in System Design
• During design, the system is decomposed into modules and the relationships among modules are indicated
• Two structural design criteria as to the “goodness” of a module
  – Cohesion: glue for intra-module components
  – Coupling: strength of inter-module connections
Cohesion
• Cohesive programs are easier to maintain, modify, and reuse
  – Encapsulation in object-oriented programs
• Syntax-preserving static slicing for measuring cohesiveness was proposed by Ott and Bieman
  – A slice captures a thread through a program concerned with the computation of a single variable
  – If we take several slices from a function, each for a different variable, and find that these slices have a lot of code in common, the variables are somewhat related
  – The function’s tasks are then strongly related and the function exhibits high cohesiveness
Cohesion Measurement
• Decide the processing elements of a function whose slices let us decide cohesiveness
  – Values printed by a function
  – Global variables
  – Reference parameters
• Assume values are printed at the end of the function
• Isolate each processing element by slicing the program for the variable whose value is output
• The cohesive section of a function is made up of the statements present in all processing elements
Example
void Marks( )
{ int Pass, Fail, Count;
  Pass = 0 ;
  Fail = 0 ;
  Count = 0 ;
  while (!eof()) {
    input(Marks);
    if (Marks >= 40)
      Pass = Pass + 1;
    if (Marks < 40)
      Fail = Fail + 1;
    Count = Count + 1;
  }
  output(Count) ;
  output(Pass) ;
  output(Fail) ;
}

void Processing_element_count()
{ int Count;
  Count = 0 ;
  while (!eof()) {
    input(Marks);
    Count = Count + 1;
  }
}

void Processing_element_pass ()
{ int Pass;
  Pass = 0 ;
  while (!eof()) {
    input(Marks);
    if (Marks >= 40)
      Pass = Pass + 1;
  }
}
Example (contd.)
void Processing_element_fail ()
{ int Fail;
  Fail = 0 ;
  while (!eof()) {
    input(Marks);
    if (Marks < 40)
      Fail = Fail + 1;
  }
}
• Little overlap between the 3 processing elements
• Only 2 of 10 lines of code are in all 3 slices
• Cohesion is 2/10 = 0.2
Statements of Marks() and the slices containing them (C = Count, P = Pass, F = Fail):
void Marks()
{ int Pass, Fail, Count;
  Pass = 0 ;                 P
  Fail = 0 ;                 F
  Count = 0 ;                C
  while (!eof()) {           C P F
    input(Marks);            C P F
    if (Marks >= 40)         P
      Pass = Pass + 1;       P
    if (Marks < 40)          F
      Fail = Fail + 1;       F
    Count = Count + 1;       C
  }
  output(Count) ;
  output(Pass) ;
  output(Fail) ;
}
Example of High Cohesion
• High overlap between the 2 processing elements
• 7 of 12 lines of code are in both slices
• Cohesion is 7/12 ≈ 0.58
Statements of MinMax() and the slices containing them (L = Largest, S = Smallest):
void MinMax()
{ int Smallest, Largest, num, i;
  for (i=0;i<10;i=i+1) {          L S
    input(num);                   L S
    NumArray[i] = num;            L S
  }
  Smallest = NumArray[0];         L S
  Largest = Smallest;             L
  i = 1;                          L S
  while (i<10) {                  L S
    if (Smallest > NumArray[i])   S
      Smallest = NumArray[i];     S
    if (Largest < NumArray[i])    L
      Largest = NumArray[i];      L
    i = i + 1;                    L S
  }
  output(Smallest);
  output(Largest);
}
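The slice-overlap ratio used in the Marks and MinMax examples can be checked with a short Python sketch. The statement indices are an assumption made for illustration: they number the counted statements of each function from top to bottom, excluding declarations and output statements, and slice_cohesion is a hypothetical helper.

```python
def slice_cohesion(slices, total_lines):
    """Return (ratio, common): the share of lines present in every slice."""
    common = set.intersection(*slices.values())
    return len(common) / total_lines, common


# Marks(): 1 Pass=0, 2 Fail=0, 3 Count=0, 4 while, 5 input, 6 if >= 40,
# 7 Pass+1, 8 if < 40, 9 Fail+1, 10 Count+1
marks_slices = {
    "Count": {3, 4, 5, 10},
    "Pass": {1, 4, 5, 6, 7},
    "Fail": {2, 4, 5, 8, 9},
}
marks_ratio, marks_common = slice_cohesion(marks_slices, 10)

# MinMax(): 1 for, 2 input, 3 NumArray[i]=num, 4 Smallest=NumArray[0],
# 5 Largest=Smallest, 6 i=1, 7 while, 8 if Smallest>, 9 Smallest=,
# 10 if Largest<, 11 Largest=, 12 i=i+1
minmax_slices = {
    "Smallest": {1, 2, 3, 4, 6, 7, 8, 9, 12},
    "Largest": {1, 2, 3, 4, 5, 6, 7, 10, 11, 12},
}
minmax_ratio, minmax_common = slice_cohesion(minmax_slices, 12)

print(marks_ratio, sorted(marks_common))
print(round(minmax_ratio, 2), sorted(minmax_common))
```

In Marks() only the loop header and the input statement are shared by all three slices, while in MinMax() most of the function is shared, which is what the two ratios express.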
Cohesion Measurement Formalism
• A few important concepts from program and data slices:
  – A data token is any variable or constant in the program
  – A slice within a program is the collection of all the statements that can affect the value of some specific variable of interest
  – A data slice is the collection of all the data tokens in the slice that will affect the value of a specific variable of interest
  – Glue tokens are the data tokens in the program that lie in more than one data slice
  – Super glue tokens are the data tokens in the program that lie in every data slice of the program
• Measure program cohesion through 2 metrics:
  – weak functional cohesion = (# of glue tokens) / (total # of data tokens)
  – strong functional cohesion = (# of super glue tokens) / (total # of data tokens)
Procedure SumAndProduct (N : Integer; Var SumN, ProdN : Integer);
Var I : Integer;
Begin
  SumN := 0;
  ProdN := 1;
  For I := 1 to N do begin
    SumN := SumN + I;
    ProdN := ProdN * I
  End;
End;
Data Slice for SumN
Procedure SumAndProduct (N : Integer; Var SumN, ProdN : Integer);
Var I : Integer;
Begin
  SumN := 0;
  ProdN := 1;
  For I := 1 to N do begin
    SumN := SumN + I;
    ProdN := ProdN * I
  End;
End;
Data Slice for SumN = N1·SumN1·I1·SumN2·01·I2·12·N2·SumN3·SumN4·I3
Data Slice for ProdN
Data Slice for ProdN = N1·ProdN1·I1·ProdN2·11·I2·12·N2·ProdN3·ProdN4·I4
Procedure SumAndProduct (N : Integer; Var SumN, ProdN : Integer);
Var I : Integer;
Begin
  SumN := 0;
  ProdN := 1;
  For I := 1 to N do begin
    SumN := SumN + I;
    ProdN := ProdN * I
  End;
End;
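Data token    SumN    ProdN
N1            1       1
SumN1         1
ProdN1                1
I1            1       1
SumN2         1
01            1
ProdN2                1
11                    1
I2            1       1
12            1       1
N2            1       1
SumN3         1
SumN4         1
I3            1
ProdN3                1
ProdN4                1
I4                    1
(The trailing digit numbers the occurrences of a token: 01 is the first occurrence of the constant 0, and 11 and 12 are the first and second occurrences of the constant 1.)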
Super Glue
[Table: token membership in three slices S1, S2, and S3. Rows marked in all three slices are labeled Super Glue; rows marked in two slices are labeled Glue.]
Functional Cohesion
• Strong functional cohesion (SFC) in this case is the same as weak functional cohesion (WFC), since with only two data slices every glue token is also a super glue token
  – SFC = WFC = 5/17 ≈ 0.294
• If we had computed only SumN or ProdN, then SFC = 17/17 = 1
• Glue tokens bind slices
  – Relative adhesiveness / stickiness
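The two metrics can be recomputed from the data slices listed for SumN and ProdN. The Python sketch below takes the token names from the data token table; functional_cohesion is a hypothetical helper written for this illustration.

```python
def functional_cohesion(data_slices, all_tokens):
    """Return (wfc, sfc, glue, super_glue) for the given data slices."""
    counts = {t: sum(t in s for s in data_slices.values()) for t in all_tokens}
    glue = {t for t, c in counts.items() if c > 1}
    super_glue = {t for t, c in counts.items() if c == len(data_slices)}
    total = len(all_tokens)
    return len(glue) / total, len(super_glue) / total, glue, super_glue


ALL_TOKENS = ("N1 SumN1 ProdN1 I1 SumN2 01 ProdN2 11 I2 12 N2 "
              "SumN3 SumN4 I3 ProdN3 ProdN4 I4").split()

DATA_SLICES = {
    "SumN": set("N1 SumN1 I1 SumN2 01 I2 12 N2 SumN3 SumN4 I3".split()),
    "ProdN": set("N1 ProdN1 I1 ProdN2 11 I2 12 N2 ProdN3 ProdN4 I4".split()),
}

wfc, sfc, glue, super_glue = functional_cohesion(DATA_SLICES, ALL_TOKENS)
print(len(ALL_TOKENS), sorted(glue), round(wfc, 3), round(sfc, 3))
```

With two slices the glue and super glue sets coincide (N1, I1, I2, 12, N2), so both metrics equal 5/17.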
Program Comprehension
• The software maintenance phase often starts with program comprehension
  – Legacy systems with sparse documentation
  – Original developers not available
• Slicing can help
  – Conditioned / constrained slicing
  – A condition can be used to identify cases of interest
  – Amorphous conditioned slicing can assist further
• A program is a set of cases, each captured by a condition
  – A set of conditions is used to construct a case
Example
for(i=0;isspace(s[i]);i++);
sign=(s[i]=='-')?-1:1;
if(s[i]=='+' || s[i]=='-') i++;
for(n=0;isdigit(s[i]);i++)
  n = 10*n + (s[i]-'0');
r=sign*n;
• atoi program (K&R)
• Effect of this code when the string begins with '+'?
• Amorphous conditioned slicing helps
  – Variable of interest is r
• Slicing condition s[0] == '+'
for(i=1,n=0;isdigit(s[i]);i++)
  n = 10*n + (s[i]-'0');
r=n;
• Slicing condition s[0] == '-'
for(i=1,n=0;isdigit(s[i]);i++)
  n = 10*n + (s[i]-'0');
r=-n;
• Slicing condition s == ""
r=0
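One way to convince yourself that the three conditioned slices are faithful is to compare them with the full routine on inputs that satisfy each condition. The Python sketch below mirrors the C fragments; char_at is a hypothetical helper that stands in for reading the terminating NUL of a C string, and the character classes are restricted to ASCII.

```python
SPACES = " \t\n\v\f\r"
DIGITS = "0123456789"


def char_at(s, i):
    """Emulate s[i] on a NUL-terminated C string."""
    return s[i] if i < len(s) else "\0"


def atoi_original(s):
    i = 0
    while char_at(s, i) in SPACES:
        i += 1
    sign = -1 if char_at(s, i) == "-" else 1
    if char_at(s, i) in "+-":
        i += 1
    n = 0
    while char_at(s, i) in DIGITS:
        n = 10 * n + int(char_at(s, i))
        i += 1
    return sign * n


def digits_from_1(s):
    """Shared loop of the '+' and '-' slices: for(i=1,n=0;isdigit(s[i]);i++)."""
    i, n = 1, 0
    while char_at(s, i) in DIGITS:
        n = 10 * n + int(char_at(s, i))
        i += 1
    return n


def slice_plus(s):     # condition s[0] == '+'
    return digits_from_1(s)


def slice_minus(s):    # condition s[0] == '-'
    return -digits_from_1(s)


def slice_empty(s):    # condition s == ""
    return 0


for text in ["+42", "+0", "+7x", "+"]:
    assert atoi_original(text) == slice_plus(text)
for text in ["-42", "-9a", "-"]:
    assert atoi_original(text) == slice_minus(text)
assert atoi_original("") == slice_empty("")
print(atoi_original("+42"), atoi_original("-42"), atoi_original(""))
```

Each slice is much shorter than the original and agrees with it only under its own condition, which is exactly the trade that conditioned slicing makes.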
Program Comprehension
• Conditioned slicing, combined with amorphous slicing, helps in program comprehension
• Conditions are used to capture cases of interest
  – Allows the program to be broken into fragments, each of which is relevant to a particular form of computation
• Amorphous slicing with respect to the condition removes parts of the program that are irrelevant, focusing on the conditions of interest
Software Maintenance
• In “Kill that Code!”, G. Weinberg describes the world’s most expensive program errors. The top three disasters were caused by a change to exactly one line of code: “Each one involved the change of a single digit in a previously correct program.”
• The argument goes that since the change was to only one line, the usual mechanisms for change control could be circumvented. The results were catastrophic.
• Weinberg offers a partial explanation: “unexpected linkages,” i.e., the value of the modified variable was used in some other place in the program.
Software Maintenance
• Maintenance is difficult: it is hard to determine when a code change will affect some other piece of code
• Inconsistencies introduced by a code change
  – Due to unexpected linkages
  – Can we pinpoint potential inconsistencies after changes are done?
  – Can we resolve these inconsistencies? (NP-hard)
• Construct a solution that implements changes within the semantic constraints
  – Prohibit linkages into code whose modification would have ripple effects
Example
• Unix utility wc
  – Slice on nw outputs the number of words in a file
  – Slice on nc outputs the number of chars in a file
  – Slice on nl outputs the number of lines in a file
• Use decomposition to guarantee no ripple effects induced by modifications in a component
  – Independent of the slicing method
#define YES 1
#define NO 0
main ( ) {
  int c, nl, nw, nc, inword ;
  inword = NO ;
  nl = 0; nw = 0; nc = 0;
  c = getchar();
  while ( c != EOF ) {
    nc = nc + 1;
    if (c == '\n') nl = nl + 1;
    if (c == '\n' || c == ' ' || c == '\t')
      inword = NO;
    else if (inword == NO) {
      inword = YES ;
      nw = nw + 1;
    }
    c = getchar();
  }
L: printf("%d \n", nl);
  printf("%d \n", nw);
  printf("%d \n", nc);
}
Slices for (nw, L) and (nc, L)
#define YES 1
#define NO 0
main ( ) {
  int c, nw, inword ;
  inword = NO ;
  nw = 0;
  c = getchar();
  while ( c != EOF ) {
    if (c == '\n' || c == ' ' || c == '\t')
      inword = NO;
    else if (inword == NO) {
      inword = YES ;
      nw = nw + 1;
    }
    c = getchar();
  }
  printf("%d \n", nw);
}

#define YES 1
#define NO 0
main ( ) {
  int c, nc;
  nc = 0;
  c = getchar();
  while ( c != EOF ) {
    nc = nc + 1;
    c = getchar();
  }
  printf("%d \n", nc);
}
Slices for (inword, L) (nl, L) (c, L)
#define YES 1
#define NO 0
main ( ) {
  int c, inword ;
  inword = NO ;
  c = getchar();
  while ( c != EOF ) {
    if (c == '\n' || c == ' ' || c == '\t')
      inword = NO;
    else if (inword == NO) {
      inword = YES ;
    }
    c = getchar();
  }
  printf("%d \n", nw);
}

#define YES 1
#define NO 0
main ( ) {
  int c, nl;
  nl = 0;
  c = getchar();
  while ( c != EOF ) {
    if (c == '\n') nl = nl + 1;
    c = getchar();
  }
  printf("%d \n", nl);
}

main ( ) {
  int c;
  c = getchar();
  while ( c != EOF ) {
    c = getchar();
  }
}
Slicing in Software maintenance
• Decompose a program "directly" into two components with respect to a variable v
  – A decomposition slice for v, which is the union of certain slices taken at certain line numbers on v
    Captures all computations on v
    Independent of program location
  – The complement of the decomposition slice for v
    Must remain fixed after any change on v
    Is also a program slice
• Allows a programmer to alter code without unintentional ripple effects
Why decomposition slice?
1 input a
2 input b
3 T = a + b
4 print T
5 T = a - b
6 print T
• Slice S(T, 4) = {1, 2, 3, 4}
• Slice S(T, 6) = {1, 2, 5, 6}
• Slicing at the last statement of a program is not enough to get all computations involving the slice variable
• A decomposition slice captures all computations on a given variable
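The two slices above, and their union, can be computed with a small Python sketch. It assumes straight-line code (no branches or loops), so a backward walk over reaching definitions is enough, and it takes the decomposition slice to be the union of the slices at every statement that outputs the variable plus the last statement. PROGRAM, backward_slice, and decomposition_slice are hypothetical names.

```python
# Line number -> (variable defined or None, variables used).
PROGRAM = {
    1: ("a", set()),          # input a
    2: ("b", set()),          # input b
    3: ("T", {"a", "b"}),     # T = a + b
    4: (None, {"T"}),         # print T
    5: ("T", {"a", "b"}),     # T = a - b
    6: (None, {"T"}),         # print T
}


def backward_slice(var, line):
    """S(var, line) for straight-line code, including the criterion line."""
    result, needed = {line}, {var}
    for current in range(line, 0, -1):
        defined, uses = PROGRAM[current]
        if defined in needed:
            result.add(current)
            needed.discard(defined)   # this definition kills earlier ones
            needed |= uses
    return result


def decomposition_slice(var):
    """Union of S(var, n) over the outputs of var and the last statement."""
    outputs = {n for n, (defined, uses) in PROGRAM.items()
               if defined is None and var in uses}
    points = outputs | {max(PROGRAM)}
    return set().union(*(backward_slice(var, n) for n in points))


print(sorted(backward_slice("T", 4)))      # 1, 2, 3, 4
print(sorted(backward_slice("T", 6)))      # 1, 2, 5, 6
print(sorted(decomposition_slice("T")))    # 1 through 6
```

Slicing only at line 6 misses statement 3, because the second assignment to T kills the first; the union over both output statements recovers every computation on T.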
Constructing a decomposition slice
• Similar to the concept of a critical instruction in dead code elimination algorithms
  – Locate instructions that are useful in some sense
  – These are declared to be critical
• Start by marking output statements critical
  – Use-definition chains are traced to mark the instructions that impact output statements
• The decomposition slice is the union of a collection of slices
Constructing the complement
• The complement is constructed in such a way that when certain statements of the decomposition slice are removed from the original program, the program that remains is the complement
• The decomposition slice is used to guide systematic statement removal
  – Cannot construct the complement by removing all statements in the slice
  – The slice has to be executable; some statements are needed in both
More about Decomposition slice
• Let Output(P, v) be the set of statements in program P that output variable v, let last be the last statement of P, and let N = Output(P, v) ∪ {last}. The statements in ⋃_{n ∈ N} S(v, n) form the decomposition slice on v, denoted S(v).
• Relationship between decomposition slices
  – Take the decomposition slice for each variable in the program and form a lattice of these decomposition slices, ordered by set inclusion
  – Output-restricted decomposition slices
More about Decomposition slice
• Output-restricted decomposition slices S(v) and S(w) are independent if S(v) ∩ S(w) = ∅
  – Only a peculiar program would have independent decomposition slices
  – They are weakly dependent if S(v) ∩ S(w) is not empty
• Let S(v) and S(w) be output-restricted decomposition slices, v ≠ w, and S(v) ⊂ S(w). Then S(v) is said to be strongly dependent on S(w)
• An output-restricted slice S(w) that is not strongly dependent on any other slice is maximal
  – Maximal slices are the ends of the lattice
• In the wc example, S(nc), S(nl), and S(nw) are maximal; S(inword) is strongly dependent on S(nw); S(c) is strongly dependent on all the others
Statement Classification
• Statements in S(v) ∩ S(w) are slice dependent
  – Dependent statements are contained in decomposition slices which are interior points of the lattice
• Slice independent statements are statements which are not slice dependent
  – Independent statements are those in a maximal decomposition slice which are not in the union of the decomposition slices properly contained in the maximal slice
• Two or more slices depend on the computation performed by dependent statements
  – Independent ones do not contribute to other slices
Statement Classification
• Statement 12 of the slice on nc is independent with respect to all other slices
• Statements 13 and 14 of the slice on nl are independent with respect to all other slices
• The decomposition slice on c is strongly dependent on all the others
• Statements 6 and 15-20 of the slice on nw are slice independent with respect to S(nc), S(nl), and S(c); statement 19 is slice independent when compared with S(inword)
• Statements 6, 15-18, and 20 of the decomposition slice on inword are slice independent with respect to S(nc), S(nl), and S(c); no statements are slice independent when compared with S(nw)
Decomposition Principle
• When modifying a program, dependent statements cannot be changed, or the effect will ripple out of the focus of interest.
• Given a maximal output-restricted decomposition slice S(v) of program P, delete the independent and output statements of S(v) from P
  – This gives the complement of S(v)
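The lattice relations and the complement construction can be tried out on the wc example with a Python sketch. Several things here are assumptions made for illustration: statements are named by short labels rather than line numbers, the slices are written down by hand from the wc slices, and the independent statements of a maximal slice are taken relative to the other maximal slices.

```python
# Statements of wc in program order, named by hypothetical short labels.
P = ["inword=NO", "nl=0", "nw=0", "nc=0", "c=getchar#1", "while",
     "nc+=1", "if-nl", "if-ws", "inword=NO#2", "else-if", "inword=YES",
     "nw+=1", "c=getchar#2", "print nl", "print nw", "print nc"]

LOOP = {"c=getchar#1", "while", "c=getchar#2"}

# Output-restricted decomposition slices, written by hand.
S = {
    "nc": LOOP | {"nc=0", "nc+=1"},
    "nl": LOOP | {"nl=0", "if-nl"},
    "inword": LOOP | {"inword=NO", "if-ws", "inword=NO#2", "else-if",
                      "inword=YES"},
    "c": set(LOOP),
}
S["nw"] = S["inword"] | {"nw=0", "nw+=1"}

OUTPUT = {"nl": {"print nl"}, "nw": {"print nw"}, "nc": {"print nc"}}


def relation(v, w):
    """How S(v) relates to S(w)."""
    if not S[v] & S[w]:
        return "independent"
    if S[v] < S[w]:
        return "strongly dependent"
    return "weakly dependent"


def maximal():
    """Variables whose slice is not properly contained in another slice."""
    return [v for v in S if not any(S[v] < S[w] for w in S if w != v)]


def independent_statements(v):
    """Statements of maximal S(v) that appear in no other maximal slice."""
    others = set().union(*(S[w] for w in maximal() if w != v))
    return S[v] - others


def complement(v):
    """P with the independent and output statements of S(v) deleted."""
    removed = independent_statements(v) | OUTPUT.get(v, set())
    return [stmt for stmt in P if stmt not in removed]


print(maximal())
print(relation("inword", "nw"), "/", relation("nc", "nl"))
print(complement("nw"))
```

The complement of nw keeps only the nl and nc computations plus the shared input loop, and the shared loop statements (the dependent statements) survive in every complement.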
Complements of nw and nl
Complement of nw:
main ( ) {
  int c, nl, nw, nc, inword ;
  nl = 0; nc = 0;
  c = getchar();
  while ( c != EOF ) {
    nc = nc + 1;
    if (c == '\n') nl = nl + 1;
    c = getchar();
  }
  printf("%d \n", nl);
  printf("%d \n", nc);
}

Complement of nl:
#define YES 1
#define NO 0
main ( ) {
  int c, nl, nw, nc, inword ;
  inword = NO ;
  nw = 0; nc = 0;
  c = getchar();
  while ( c != EOF ) {
    nc = nc + 1;
    if (c == '\n' || c == ' ' || c == '\t')
      inword = NO;
    else if (inword == NO) {
      inword = YES ;
      nw = nw + 1;
    }
    c = getchar();
  }
  printf("%d \n", nw);
  printf("%d \n", nc);
}
Slicing in Program Modification
• Statement independence can be used to build a set of guidelines for software modification
• Classify variables as dependent or independent
  – A variable that is the target of a dependent assignment statement is called a dependent variable
  – If all assignments to a variable are in independent statements, then the variable is called an independent variable
Modification Principles
• Independent statements may be deleted from a decomposition slice
• Assignment statements that target independent variables may be added anywhere in a decomposition slice
• Logical expressions (and output statements) may be added anywhere in a decomposition slice
• New control statements that surround (i.e., control) any dependent statement will cause the complement to change
• Changes to a dependent variable v in the extracted slice can be done in one of the following two ways:
  – Extend the slice so that v is independent in the slice
  – Add a new local variable (to the slice), copy the value to the new variable, and manipulate the new name only
Merging and Testing modifications
• Merging the modified slice with the complement is easy
  – The maintainer is actually editing the entire program
  – Working on a view of the program with the unneeded statements deleted and with the dependent statements restricted from modification
• The slice gives a smaller piece of code to focus on
  – The rules of the previous slide ensure that the deleted and restricted parts cannot be changed accidentally
• Changes are restricted to independent or newly created variables
  – Testing is reduced to testing the modified slice
Surgeon’s Assistant
• A differencing tool based on decomposition slicing, called the Surgeon’s Assistant, partitions a program into three parts (assume the computation of variable v is to be changed)
  – Independent part: statements in the decomposition slice taken with respect to v that are not in any other decomposition slice
  – Dependent part: statements in the decomposition slice taken with respect to v that are in another decomposition slice
  – Complement: statements that are not independent (statements in some other decomposition slice, but not v’s)
Differencing
• Need for differencing between two programs
• Algorithms for finding textual differences between programs are often insufficient
• Slicing can be applied to identify semantic differences
Solution to differencing problem
• Compare the backward slices of the vertices of old’s and new’s dependence graphs, Gold and Gnew
• Components whose vertices in Gold and Gnew have isomorphic slices have the same behavior in old and new
• The set of vertices from Gnew for which there is no vertex in Gold with an isomorphic slice safely approximates the components with different behavior
• This set is safe, as it is guaranteed to contain all the components with different behavior
• It is an approximation
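The comparison of backward slices can be sketched in Python on a tiny pair of dependence graphs. The sketch makes simplifying assumptions that the slides do not: the graphs are acyclic, a vertex is a statement label plus the list of vertices it depends on, and slice isomorphism is approximated by comparing a canonical nested tuple built from the labels. The example graphs and function names are hypothetical.

```python
def slice_signature(graph, vertex, memo=None):
    """Canonical form of the backward slice rooted at `vertex` (acyclic graphs)."""
    memo = {} if memo is None else memo
    if vertex not in memo:
        label, preds = graph[vertex]
        memo[vertex] = (label, tuple(sorted(
            slice_signature(graph, p, memo) for p in preds)))
    return memo[vertex]


def changed_vertices(g_old, g_new):
    """Vertices of g_new with no vertex in g_old having the same slice signature."""
    old_signatures = {slice_signature(g_old, v) for v in g_old}
    return {v for v in g_new if slice_signature(g_new, v) not in old_signatures}


# vertex -> (statement label, vertices it depends on)
G_OLD = {
    "in": ("input x", []),
    "y": ("y = 2*x", ["in"]),
    "z": ("z = x+1", ["in"]),
    "out_y": ("output y", ["y"]),
    "out_z": ("output z", ["z"]),
}
G_NEW = dict(G_OLD, y=("y = 2*x+1", ["in"]))

print(sorted(changed_vertices(G_OLD, G_NEW)))
```

Only the edited assignment and the output that depends on it are reported; the computation of z has an identical backward slice in both versions and is therefore classified as unchanged, even though a textual diff would not express that.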
Program Integration
• Concerns the problem of merging program variants
• Given a program Base and two variants A and B, each created by modifying a separate copy of Base
• Goal:
  – To determine whether the modifications interfere
  – If they do not, to create an integrated program that incorporates both sets of changes, as well as the portions of Base preserved in both variants
Need for integration
• When a system is customized by a user and simultaneously upgraded by a maintainer, and the user desires a customized, upgraded version
• When several versions of a program exist and the same enhancement or bug fix is to be made to all of them
Integration using slicing
• It uses program differencing to identify the changes in variants A and B with respect to Base
• Preserved components are those components that are not affected in A and B
• This set is safely approximated as the set of components with isomorphic slices in Base, A, and B
Integration using slicing
• A merged program is obtained by taking the graph union of the differences between A and Base, the differences between B and Base, and the preserved components
• The merged program captures the changed behavior of A and B along with the preserved behavior of all three programs
• While it is NP-hard, an important property of the algorithm is that it is semantics-based
Integration using slicing
• An integration tool makes use of knowledge of the programming language to determine whether the changes made to Base to create the variants have undesirable semantic interactions
• Only if there is no such interference will the tool produce an integrated program
• The algorithm also provides guarantees about how the execution behavior of the integrated program relates to the execution behaviors of the base program and the two variants
Testing
Software maintainers are also faced with the task of regression testing: retesting software after a modification.
This may require running a large number of test cases even though the changes are small.
Although the effort required to make a small change may be minimal, the effort required to retest a program after such a change may be substantial.
Slicing to reduce regression testing cost
Although decomposition slicing eliminates the need for regression testing on the complement, there may still be a substantial number of tests to be run on the dependent, independent and changed parts.
Slicing can be used to reduce the number of these tests.
Slicing to reduce regression testing cost
Here the algorithms assume programs are tested using test data adequacy criteria: a minimum standard that a test suite must satisfy
e.g. the all-statements criterion requires every statement in a program to be executed by at least one test case
Gupta et al.: an algorithm for reducing the cost of regression testing that uses slicing to determine the parts affected transitively by an edit at point p
Bates and Horwitz: test case selection for the all-vertices and all-flow-edges test data adequacy criteria
Key notion: equivalent execution patterns
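To make the idea of "parts affected transitively by an edit at point p" concrete, here is a minimal Python sketch. It is not the algorithm of Gupta et al. or of Bates and Horwitz. It assumes a precomputed dependence map from each statement to the statements that depend on it, plus per-test statement coverage. The names `forward_slice`, `select_tests`, `deps` and `coverage` are hypothetical.

```python
from collections import deque


def forward_slice(deps, p):
    """Statements transitively affected by an edit at statement p.

    deps maps a statement to the statements that directly depend on it.
    """
    affected, queue = {p}, deque([p])
    while queue:
        s = queue.popleft()
        for t in deps.get(s, ()):
            if t not in affected:
                affected.add(t)
                queue.append(t)
    return affected


def select_tests(coverage, affected):
    """Keep only the tests that execute at least one affected statement."""
    return sorted(name for name, stmts in coverage.items() if stmts & affected)


# Toy data (assumed): statement numbers and which tests cover them.
deps = {1: [3], 2: [4], 3: [5], 4: [], 5: []}
coverage = {"t1": {1, 3, 5}, "t2": {2, 4}, "t3": {1, 2, 3, 4, 5}}

affected = forward_slice(deps, 3)           # edit at statement 3
selected = select_tests(coverage, affected)
```

Here an edit at statement 3 affects statements 3 and 5, so test `t2`, which never executes either, does not need to be rerun.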
Control Slice
• Components with equivalent execution patterns are identified using a new kind of slice called a control slice
• Control slice: a slice taken with respect to the control predecessors of a vertex
– includes the statements necessary to capture when a statement is executed, without capturing the computation carried out at the statement
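The definition above can be illustrated with a small Python sketch. It assumes the data and control dependences are already available as dicts mapping a vertex to the vertices it depends on. The example program and the function names are hypothetical.

```python
def backward_slice(starts, data, control):
    """All vertices reachable backwards over data and control dependences."""
    seen, stack = set(), list(starts)
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(data.get(v, ()))
        stack.extend(control.get(v, ()))
    return seen


def control_slice(v, data, control):
    """Slice taken with respect to the control predecessors of v."""
    return backward_slice(control.get(v, ()), data, control)


# Assumed example program, one vertex per numbered statement:
#   1: read(n)   2: i = 0   3: s = 0
#   4: while i < n:
#   5:     s = s + i
#   6:     i = i + 1
#   7: print(s)
data = {4: {1, 2, 6}, 5: {2, 3, 5, 6}, 6: {2, 6}, 7: {3, 5}}
control = {5: {4}, 6: {4}}

when_5_runs = control_slice(5, data, control)
what_5_computes = backward_slice([5], data, control)
```

The control slice of statement 5 contains only the loop machinery (statements 1, 2, 4 and 6), while the ordinary backward slice also pulls in the computation of `s` (statements 3 and 5).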
Slicing in Regression testing
Program differencing can be used to further reduce the cost of regression testing by reducing the size of the program that the tests must run on
For a small change, the program produced using the program differencing techniques is considerably smaller and consequently requires fewer resources to retest, especially when run on the reduced test set
Software quality assurance
Software quality assurance auditors are faced with a myriad of difficulties, ranging from inadequate time to inadequate computer-aided software engineering (CASE) tools
One particular problem is locating safety-critical code that may be interleaved throughout the entire system
Another problem is that, once this code is located, its effects throughout the system are difficult to ascertain.
Solution to problems
• Program slicing is applied to mitigate these difficulties in two ways
• Program slicing can be used to locate all code that contributes to the value of variables that might be part of a safety-critical component
• Slicing-based techniques can be used to validate functional diversity (that is, that there is no interaction between one safety-critical component and another, and no interaction between a non-safety-critical component and a safety-critical one)
Common mode failure
A design error in hardware or software, or an implementation error in software, may result in a common mode failure (CMF)
A CMF is a failure resulting from a common cause, such as the failure of a system caused by the incorrect computation of an algorithm.
Suppose X and Y are distinct critical outputs, where X measures a rate of increase and Y measures a rate of decrease.
If the computation of both rates depends on a call to a common numerical differentiator, then a failure in the differentiator can cause a CMF of X and Y.
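One way to see how slicing exposes this kind of shared dependence is to intersect the backward slices of the two outputs. The sketch below is illustrative only. The dependence map and the component names (`rate_up`, `rate_down`, `differentiate`, the sensors) are assumptions that mirror the X and Y example above.

```python
def backward_slice(deps, v):
    """Everything that v transitively depends on, including v itself."""
    seen, stack = set(), [v]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(deps.get(n, ()))
    return seen


def shared_components(deps, x, y):
    """Components that both outputs depend on: candidate common causes."""
    return backward_slice(deps, x) & backward_slice(deps, y)


# Assumed dependence map: each entry lists what the component depends on.
deps = {
    "X": {"rate_up"},
    "Y": {"rate_down"},
    "rate_up": {"differentiate", "sensor_a"},
    "rate_down": {"differentiate", "sensor_b"},
}

shared = shared_components(deps, "X", "Y")
```

A non-empty intersection, here the common differentiator, marks code whose failure could affect both critical outputs at once. An empty intersection is evidence that the two computations are functionally diverse.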
Solution to this problem
This can be solved by combining fault tree analysis and program slicing
Once the system hazards have been identified, the objective of fault tree analysis is to mitigate the risk that they will occur.
Reverse Engineering
Reverse engineering concerns the problem of comprehending the current design of a program and the way this design differs from the original design
It involves abstracting, out of the source code, the design decisions and rationale from the initial development (design recognition) and understanding the algorithms chosen (algorithm recognition)
Slicing in Reverse Engineering
Program slicing provides a toolset for this type of re-abstraction
Example: a program can be displayed as a lattice of slices ordered by the is-a-slice-of relation
Comparing the original lattice and the lattice after (years of) maintenance can guide an engineer towards places where reverse engineering energy should be spent
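The ordering of slices and the before/after comparison can be sketched as follows. This assumes each slice is already available as a set of statement numbers keyed by its slicing criterion. It computes only the is-a-slice-of ordering and its covering edges, without checking that the result is a lattice in the strict sense. All names and data are hypothetical.

```python
def slice_order(slices):
    """Pairs (a, b) such that slice a is a proper sub-slice of slice b."""
    return {(a, b) for a in slices for b in slices if slices[a] < slices[b]}


def hasse_edges(slices):
    """Covering pairs only: a is below b with no slice strictly between."""
    order = slice_order(slices)
    return {(a, b) for (a, b) in order
            if not any((a, c) in order and (c, b) in order for c in slices)}


# Assumed slices before and after maintenance, keyed by variable name.
before = {"n": {1}, "i": {1, 2, 4, 6}, "s": {1, 2, 3, 4, 5, 6}}
after = {**before, "t": {1, 7}}

# Edges that appear in only one of the two diagrams.
changed = hasse_edges(before) ^ hasse_edges(after)
```

Edges present in only one of the two diagrams point at the parts of the program whose structure has drifted, which is where an engineer might look first.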
Interface slicing
Interface slicing can be used for reverse engineering
An interface slice is essentially a forward slice taken with respect to the entry vertices in a collection of procedures
This projection of a general software module (e.g., a set, list, or window widget) captures the particular behaviors required for a particular use
Interface slicing
An interface slice is computed from an interface dependence graph as a forward graph traversal
The dual operation uses a backward graph traversal (i.e., it traverses the dependence edges from target to source)
Starting from all calls on procedure P, this “backward” interface slice includes the public interfaces for those procedures (from other modules) that require P.
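Both traversal directions can be illustrated on a toy graph. The sketch below simplifies the interface dependence graph to a plain call map (caller to callees), which is an assumption made for illustration. The module contents and function names are hypothetical.

```python
def reachable(edges, starts):
    """Vertices reachable from starts by following edges source to target."""
    seen, stack = set(), list(starts)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(edges.get(v, ()))
    return seen


def invert(edges):
    """Reverse every edge, so traversal runs from target to source."""
    inverted = {}
    for src, targets in edges.items():
        for dst in targets:
            inverted.setdefault(dst, set()).add(src)
    return inverted


# Assumed module: a set implemented on top of a list.
calls = {
    "set_insert": {"list_find", "list_append"},
    "set_remove": {"list_find", "list_delete"},
    "set_size": {"list_length"},
}

# Forward: what a client that only calls set_insert actually needs.
forward = reachable(calls, {"set_insert"})
# Backward: which procedures require list_find.
backward = reachable(invert(calls), {"list_find"})
```

The forward result is the projection of the module needed for one particular use. The backward result lists the procedures that would be affected if `list_find` changed.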
References
[1] S. Horwitz and T. Reps. Efficient comparison of program slices. Technical Report 983, University of Wisconsin at Madison, 1990.
[2] D. Binkley, S. Horwitz and T. Reps. Program integration for languages with procedure calls. ACM Transactions on Software Engineering and Methodology, 4(1):3-35, January 1995.
[3] V. Berzins. Software merge: Models and methods for combining changes to programs. International Journal on Systems Integration, 1:121-141, August 1991.
[4] S. Horwitz, J. Prins and T. Reps. Integrating non-interfering versions of programs. ACM Transactions on Programming Languages and Systems, 11(3):345-387, July 1989.
[5] K. B. Gallagher and J. R. Lyle. Using program slicing in software maintenance. IEEE Transactions on Software Engineering, 17(8):751-761, August 1991.
[6] K. B. Gallagher. The surgeon’s assistant. In Software Engineering Research Forum, Boca Raton, FL, November 1995.
[7] S. Bates and S. Horwitz. Incremental program testing using program dependence graphs. In Conference Record of the Twentieth ACM Symposium on Principles of Programming Languages. ACM, 1993.
[8] S. Rapps and E. J. Weyuker. Selecting software test data using data flow information. IEEE Transactions on Software Engineering, SE-11(4):367-375, 1985.
[9] D. Binkley, S. Horwitz and T. Reps. Program integration for languages with procedure calls. ACM Transactions on Software Engineering and Methodology, 4(1):3-35, January 1995.
[10] R. Gupta, M. J. Harrold and M. L. Soffa. An approach to regression testing using slicing. In Proceedings of the IEEE Conference on Software Maintenance, pages 299-308, 1992.
[11] D. Binkley. Using semantic differencing to reduce the cost of regression testing. In Proceedings of the Conference on Software Maintenance 1992, pages 41-50, November 1992.
[12] K. B. Gallagher and J. R. Lyle. Program slicing and software safety. In Proceedings of the Eighth Annual Conference on Computer Assurance, pages 71-80, June 1993. COMPASS ’93.
[13] J. R. Lyle, D. R. Wallace, J. R. Graham, K. B. Gallagher, J. E. Poole and D. W. Binkley. A CASE tool to evaluate functional diversity in high integrity software. U.S. Department of Commerce, Technology Administration, National Institute of Standards and Technology, Computer Systems Laboratory, Gaithersburg, MD, 1995.
[14] K. B. Gallagher and J. R. Lyle. Using program slicing in software maintenance. IEEE Transactions on Software Engineering, 17(8):751-761, August 1991.
[15] T. Reps. Algebraic properties of program integration. Science of Computer Programming, 17:139-215, 1991.
[16] J. Beck. Program and interface slicing for reverse engineering. In Proceedings of the Fifteenth International Conference on Software Engineering, 1993. Also in Proceedings of the Working Conference on Reverse Engineering.
[17] E. Yourdon and L. Constantine. Structured Design. Prentice-Hall, Englewood Cliffs, New Jersey, 1979.
[18] J. Bieman and L. Ott. Measuring functional cohesion. IEEE Transactions on Software Engineering, 20(8):644-657, August 1994.
Thank You