PASTE 2001
Making Slicing Practical: The Final Mile
William Griswold
University of California, San Diego
in collaboration with Leeann Bent (UCSD) & Darren Atkinson (Santa Clara University)
special thanks to GrammaTech Inc.
The Conundrum
• Program slicing has been an archetype of SE analysis for 20 years [Weiser 80]
– reveal hidden or dispersed program relationships
– assist in reasoning about and changing them
• Yet program slicers are not widely used
• Why not?
– Then: implementations were too slow and imprecise
– Now: inadequate attention to essential SE needs
Static Backward Program Slice
    void setg(int a1, int a2) {
        int x = 1;
        g = a1;
        if (pred(a2)) {
            g = x;
        }
        printf("%d", g);
    }
Set of expressions and statements that may affect the value of a chosen variable reference during execution. [Weiser 81]
Background
initial slicing criterion
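As a minimal sketch of the definition above, the hypothetical function `report` below computes two independent results, and `report_slice` keeps only the statements in the backward slice on `total` at the `return`. The names `report`, `report_slice`, `total`, and `count` are assumptions made for illustration; the slice was derived by hand, not by Sprite or CodeSurfer.

```c
#include <assert.h>

/* Hypothetical original: computes a sum and, independently, a count. */
int report(const int *v, int n, int *count_out) {
    assert(n >= 0);
    int total = 0;
    int count = 0;                 /* not in slice: never flows into total */
    for (int i = 0; i < n; i++) {  /* in slice: controls total += v[i]     */
        total += v[i];             /* in slice: defines total              */
        if (v[i] > 0)              /* not in slice: only guards count      */
            count++;               /* not in slice                         */
    }
    *count_out = count;            /* not in slice                         */
    return total;                  /* slicing criterion: total at this point */
}

/* Hand-derived backward slice on `total` at the return statement. */
int report_slice(const int *v, int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

For any input, `report_slice` should return the same value as `report`, which is the subsetting property the slide describes: fewer statements, same value at the criterion.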
Benefits & Applications of Slicing
– Debugging: What subset of statements might have helped produce the wrong value for g?
– Evolution: What components are inputs to the one I’m evolving? Does this change break that feature?
– Testing: I changed this statement, what subset of the tests do I need to rerun?
Key value is its subsetting property: helps a programmer focus on the parts of the system relevant to current task
Underlying dependences are also of potential value
Freddie Krueger, Tool User
“Your tool can solve all sorts of problems for us. But it’ll have to analyze our entire 1 MLOC program, which is written in 4 languages and doesn’t compile right now. I want the results as fast as compilation, with an intuitive graphical display linked to the source and integrated into our IDE. I want to save the results, and have them automatically updated as I change the program. By the way, I use Windows, but some of my colleagues use Unix. It’s OK if the tool misses stuff or returns lots of data, as long as we can post-process. We just need a net win.”
Our most recent study involved a 500 KLOC Fortran/C app developed on SGI’s
grep
• Fast, scalable
• Easy to use
• Flexible (caps, words…)
• Language independent
• Integrated in every env.
• Imprecise
– Allows iteration & filtering
• Used in many activities
– Renaming, restructuring, finding change points…
Slicer
• Slow, doesn’t scale
• Easy to use
• Forward, backward, chop
• C; Java coming
• Stand-alone
• Precise, but slices large
– Allows iteration & filtering
• Several postulated uses
?
Our Experience on the Final Mile
• Several years designing Sprite, a fast, scalable program slicer for C
• Comparative study [2000] with GrammaTech’s CodeSurfer slicer for C, a commercial product
• Other tools: design, implementation, and user studies
Sprite 1.2.2
[Figure callout: Slicing Criterion]
CodeSurfer 1.1p1
Why We’re Still on the Final Mile
• Little understanding of programmers’ needs, tasks, or how they would use a slicer
• Slicers suffer from inflexibility & poor usability
• User interfaces are in formative stages
• Several opportunities insufficiently explored
Usability and Flexibility
• Wrote a few dozen “feature benchmarks”
– Unstructured control flow, context sensitivity, pointers...
– Expose intervening, interacting algorithmic factors
• 5 small production-quality programs
– compress, wally, ispell, ed, diff
• A few larger programs using Sprite
– gcc, emacs, FEM app
• Varied slicer options, studied results
Part 1
Uninitialized Pointers
    int *p;   // uninitialized
    ...
    *p = x;
    y = *p;
• Neither tool included x in the slice
• No warning
• Misleading (say, if you’re debugging)
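To see why omitting x is misleading, here is a small sketch of the same fragment with the pointer given a target, so that the program has defined behavior. The names `copy_through` and `cell` are hypothetical. Once p points somewhere, the store and load form an ordinary data dependence from x to y.

```c
#include <assert.h>

/* Same shape as the slide's fragment, except that p is initialized.
   `copy_through` and `cell` are hypothetical names. */
int copy_through(int x) {
    int cell = 0;
    int *p = &cell;   /* the slide's version leaves p uninitialized */
    int y;
    *p = x;           /* store: *p now depends on x */
    y = *p;           /* load: y depends on *p, and therefore on x */
    return y;         /* a backward slice on y here should include x */
}
```

A programmer debugging a wrong value of y would want x in the slice in both versions, or at least a warning that p has no known target.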
Undefined Functions
x = f(&y,z);
• Sprite: included call, but did not propagate
• CodeSurfer: also propagated to &y and z
– No propagation to z if slicing on y
• Both printed warning to terminal (not GUI)
– Easy to not notice
– Easy to forget a library or have undefined function
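The sketch below shows one possible body for the undefined `f`, to illustrate what a slicer cannot see when the definition is missing. The body and the wrapper `call_site` are assumptions, not code from either tool's benchmarks. With this body, y after the call depends on z, so a slice on y that does not propagate to z would miss a real dependence.

```c
#include <assert.h>

/* One hypothetical body for the undefined f: it writes through its first
   argument using its second, so y after the call depends on z. */
int f(int *out, int in) {
    *out = in * 2;    /* y is defined from z through the pointer */
    return in + 1;    /* x is defined from z through the return value */
}

/* Hypothetical wrapper around the slide's call x = f(&y, z). */
int call_site(int z, int *y_out) {
    int y = 0;
    int x = f(&y, z);
    *y_out = y;
    return x;
}
```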
Library Modeling
• Both modeled effects of libraries with skeletal functions
    int write(int fd, char *s) {
        return (*filesystem[fd]++ = *s);
    }
• Only libc and libm, both incomplete
– Only a few undefined functions for our programs
– CodeSurfer’s noticeably better, more complex
• Missing could impact precision, effort, or perf.
– What if need Xlib???
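A compilable version of such a skeletal function might look like the sketch below. The declarations (`MODEL_MAX_FD`, `model_storage`, `model_filesystem`) and the name `model_write` are assumptions added so the stub stands alone and does not clash with the real `write`; neither tool's actual library model is reproduced here.

```c
#include <assert.h>

#define MODEL_MAX_FD 4

/* Hypothetical backing store: one small buffer and one cursor per
   descriptor. A real model would not need real storage at all. */
static char model_storage[MODEL_MAX_FD][64];
static char *model_filesystem[MODEL_MAX_FD] = {
    model_storage[0], model_storage[1], model_storage[2], model_storage[3]
};

/* Skeletal stand-in for write(): it records only the dependence
   "file contents depend on *s" and advances the cursor by one.
   It does not check for buffer exhaustion or model real I/O. */
int model_write(int fd, const char *s) {
    assert(fd >= 0 && fd < MODEL_MAX_FD);
    return (*model_filesystem[fd]++ = *s);
}
```

The point of the stub is what it exposes to the analysis: the value written flows from `*s` into the per-descriptor state, and the return value depends on it too.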
Slicing into Callers
    setg(a, b);
    ...
    }
    void setg(int a1, int a2) {
        int x = 1;
        ...
• Sprite: could not slice into callers
• CodeSurfer: must slice into callers
Neither customizable to produce other’s results
(CodeSurfer did support single-step slicing and slicing on just control- or data-dependences)
Control Dependence Sensitivity
• “Spurious” dependence can bloat slice by x10
– Indirection through array of function pointers
    FuncType ops[N] = {&f1, &f2, …}
    ...
    (*ops[i])(a1, a2);
• Sprite’s “strongly typed function calls” feature fixed gcc problems, not emacs
– Modelling of control dependences (CodeSurfer)
    switch (ch) {
    case 'a':
        if (pred(e)) return;
        g = x + 1;
        break;
    case 'b':
        g = x + 2;
        break;
    }
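One plausible reading of a "strongly typed function calls" filter is that an indirect call is only allowed to reach functions whose type matches the call site. The sketch below illustrates that idea only; it is not Sprite's algorithm, and `struct candidate`, `typed_targets`, and the use of parameter count as a stand-in for the full type are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of an address-taken function: a name plus a
   coarse signature (parameter count stands in for the full type). */
struct candidate {
    const char *name;
    int param_count;
};

/* Copy into `out` the candidates that an indirect call passing
   `call_args` arguments could reach, and return how many were kept.
   Without the filter, every address-taken function is a possible target.
   `out` must have room for n entries. */
size_t typed_targets(const struct candidate *all, size_t n, int call_args,
                     const struct candidate **out) {
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (all[i].param_count == call_args)
            out[kept++] = &all[i];
    }
    return kept;
}
```

Fewer possible targets means fewer spurious call edges for the slice to follow, which is where the size reduction would come from.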
Statement Inclusion & Highlighting
• Sprite: no global decls, uninitialized locals, control flow keywords
– Included goto’s only if option enabled
• CodeSurfer: “executable slice” highlighting
– Includes syntactic sugar
Right choice depends on user’s task
– Omitting a statement could lead to oversight
– Overwhelming with highlighting can, too
Understanding Slicing Results
• Hard: “Why is this statement in the slice?”
– Forgotten or incomplete libraries
– “Style” of slice (into callers, declarations)
– Real dependence or algorithmic artifact?
– Control, data, pointer, ...
– Gap betw. highlighting & underlying dependences
• Querying dependence edges, points-to sets unhelpful
• Answer critical to correct software change
• We modified: comment out pointer assignment
– Must rebuild, reslice, and compare results by hand
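One way a tool could answer "Why is this statement in the slice?" is to report a shortest chain of dependences from the criterion to the statement. The sketch below is a minimal illustration over a small adjacency matrix; `struct dep_graph`, `explain`, and `MAX_STMTS` are hypothetical, and neither Sprite nor CodeSurfer is claimed to work this way.

```c
#include <assert.h>

#define MAX_STMTS 16

/* dep[a][b] != 0 means statement a directly depends on statement b. */
struct dep_graph {
    int n;
    unsigned char dep[MAX_STMTS][MAX_STMTS];
};

/* Breadth-first search from `criterion` along depends-on edges.
   On success, writes the chain criterion -> ... -> target into `chain`
   and returns its length; returns 0 if target is not in the slice. */
int explain(const struct dep_graph *g, int criterion, int target,
            int chain[MAX_STMTS]) {
    int parent[MAX_STMTS], queue[MAX_STMTS], head = 0, tail = 0;
    assert(g->n <= MAX_STMTS);
    for (int i = 0; i < g->n; i++) parent[i] = -1;
    parent[criterion] = criterion;
    queue[tail++] = criterion;
    while (head < tail) {
        int s = queue[head++];
        for (int d = 0; d < g->n; d++) {
            if (g->dep[s][d] && parent[d] < 0) {
                parent[d] = s;       /* remember how we reached d */
                queue[tail++] = d;
            }
        }
    }
    if (parent[target] < 0) return 0;
    /* Walk parents back from target, then reverse into chain. */
    int len = 0, rev[MAX_STMTS];
    for (int s = target; ; s = parent[s]) {
        rev[len++] = s;
        if (s == criterion) break;
    }
    for (int i = 0; i < len; i++) chain[i] = rev[len - 1 - i];
    return len;
}
```

Labeling each edge as control, data, or pointer-induced would let the same chain also show whether a dependence is real or an artifact of the analysis.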
Summary - The Bad News
• Tools had “hidden” behaviors on erroneous or incomplete programs
– Could mislead programmer, hampering use
• Tools had rigid notion of a slice
– Each suited for some tasks, but not others
• Tools required significant effort to use
– Completeness of libraries
– Understanding reasons for inclusion in slice
Implementations of algorithms, not full tools
Summary - The Good News
• Both tools supported (potentially) inaccurate analyses to remove uninteresting information
– Sprite: user customization of pointer properties
– CodeSurfer: I/O functions that aren’t interdependent
– Suggests generalizing slicing as an SE analysis
– Unfortunately, manipulating analysis can be costly
• Variance between tools exposes tool options
– Suggests user-customizable features
– E.g., highlighting options: decls versus no-decls
The Human–Computer Interface
• Good user interfaces are crucial to effectiveness
– Leverage & aid programmer’s cognitive processes
– A key difference between an algorithm and a tool
• Example for evolution (what I understand)
– Tremendous scale
– Wide dispersal of information: redundancy, many references to an element
– Requirement for complete and consistent change of the dispersed elements (e.g., rename)
– Invisibility of far-away information stresses recall
Part 2
Example - grep & Company
• Easy to use and fast
– Takes a textual pattern and target for search
    grep "cursor" *.java
– Lists all matching lines and highlights in editor
– Next/Previous operations traverse the matches
• Scalability issues: limited visibility, inflexible
– Can see just a few matches in context of use
– One search at a time; cannot compare results
Maps Enhance Visibility
• Highly evolved artifact for coping with scale
– Spatial organization, zooming, cursors, insets, indices, folding, itineraries
• Idea: augment grep with Seesoft-like view
– Delineated regions to represent files
– Colored symbols for matched lines in files
– Puts matches on an equal footing with modules
• See only portion of program, little of value
• Use map metaphor fully: Aspect Browser
[Figure callouts: File strips; Lines matched by pattern; Lines with two or more matches]
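The Seesoft-like view can be sketched in text form: one character per source line, marked according to how many search patterns hit that line. The function `render_strip` and its marker characters are hypothetical, and reading "two or more matches" as "matched by two or more patterns" is an assumption; Aspect Browser's own rendering is graphical and is not reproduced here.

```c
#include <assert.h>
#include <string.h>

/* Render one character per source line: '.' for no match, '*' for a line
   matched by exactly one pattern, '#' for two or more patterns.
   `strip` must have room for nlines + 1 characters. */
void render_strip(const char *const *lines, int nlines,
                  const char *const *patterns, int npatterns,
                  char *strip) {
    assert(nlines >= 0 && npatterns >= 0);
    for (int i = 0; i < nlines; i++) {
        int hits = 0;
        for (int p = 0; p < npatterns; p++) {
            if (strstr(lines[i], patterns[p]) != NULL)
                hits++;
        }
        strip[i] = hits == 0 ? '.' : (hits == 1 ? '*' : '#');
    }
    strip[nlines] = '\0';
}
```

Placing several such strips side by side, one per file, gives the "map" in which every match is visible at once instead of one at a time.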
[Figure callouts: “You are here”; “Folding” cue that files are hidden; Views updated on save]
Now extending with Atlas metaphor to accommodate vast scale
CodeSurfer GUI is half-way there
[Figure callouts: One-file “Seesoft” summary; Traversal operations; Cue of underlying hits]
Complements
• Slice explainer
– High-level analysis of why statement is in slice
• May/Must highlighting
– Overlay slice with highlighting of dynamic slices or execution coverage
– Uncovered stmts, dependences hint at imprecision
• Filtering w/ other analyses and customizations
– CodeSurfer has chopping; Sprite set operations
– grep, Ajax tools [O’Callahan]
– PIM-like customizations [Field]
• With small run-time impact
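Chopping, set operations, and may/must overlays can all be viewed as cheap operations on sets of statements. The sketch below represents a slice as a 64-bit mask; the type `stmt_set` and the functions `chop`, `may_only`, and `observed` are hypothetical names, and the intersection form of a chop is only an approximation when the underlying slices are context-sensitive.

```c
#include <assert.h>
#include <stdint.h>

/* A slice over at most 64 statements: bit i set means statement i is in. */
typedef uint64_t stmt_set;

/* Chop: statements lying between a source and a sink, approximated as the
   forward slice of the source intersected with the backward slice of the
   sink. */
stmt_set chop(stmt_set forward_from_source, stmt_set backward_from_sink) {
    return forward_from_source & backward_from_sink;
}

/* Statements the static slice says "may" matter but that a given run never
   executed: hints at imprecision, or at a missing test. */
stmt_set may_only(stmt_set static_slice, stmt_set covered) {
    return static_slice & ~covered;
}

/* Statements both in the static slice and executed in the run. */
stmt_set observed(stmt_set static_slice, stmt_set covered) {
    return static_slice & covered;
}
```

Because these are single bitwise operations, the filtering itself adds little run-time cost once the slices and coverage data exist.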
Part 3: Opportunities
Emerging Challenges
• Multi-threading [Hatcliff et al.], exceptions
– intrinsically flow-sensitive, so flow-insensitive pointer analysis is substantial loss [Rugina]
• What is a program? What is the program?
– Federated client-server apps
• Often multi-language
– Trend is to write component glue
• Less source code, but huge apps
• How treat vast, numerous packages? Sans source?
– Analysis via Java library byte codes costly [O’Callahan]
– Vast amount of stub code to write for libraries
Long-Standing Issues
• Robustness - incomplete or buggy programs
– Useful for developing and evolving systems
– Complicates the analysis
– Harder for programmer to interpret results
• Integration into IDE’s [Hayes 2000]
– Work with IDE’s GUI and AST, or else bloat
– Reuse analyses across IDE’s
• In principle not hard
• In practice, performance/precision tradeoffs can require big rewrites for “small” language change
Recommendations & Future Work
• Explore non-algorithmic aspects of precision
• Attend to users’ tasks and habits
– Giving users what they need is “precision”
• Do an observational study of programmers
– Essential problems: complex tasks of programmers
– Accidental problems: kludgy user interfaces…
• Listen to Michael’s talk tomorrow ;)
Recap
• 20 years of slicing research
– Huge strides in algorithmic precision & performance
– Little adoption
• “Non-algorithmic” factors like libraries have significant impact on precision
– Cannot ignore, now that libs dominate applications
• Remaining problem is usefulness of slicers
– Confusing behaviors, inflexibility, and need for costly “programming” are barriers to adoption
– Understanding programmer’s use of dependences