Applications of Type Constraints in Software Engineering Tools
IBM Research
© 2004 IBM Corporation
Applications of Type Constraints in Software Engineering Tools
Frank Tip, IBM T.J. Watson Research Center
This Presentation is Based on Joint Work With
Ittai Balaban (New York University)
Dirk Bäumer (IBM Zurich Research Center)
Bjorn De Sutter (Ghent University)
Julian Dolby (IBM T.J. Watson Research Center)
Robert Fuhrer (IBM T.J. Watson Research Center)
Adam Kieżun (MIT)
IBM Research
about 3000 people world-wide
– 1600 at IBM T.J. Watson Research Center
– other sites: Almaden, Austin, Zurich, Haifa, China, India
Software Technology Department
– about 70 people, director Daniel Yellin
– projects on: compiler optimization (JikesRVM), aspects, performance analysis, web services, refactoring, verification, XML, ...
– www.research.ibm.com/compsci/plansoft/index.html
ARTIST project (Advanced Refactoring Tools for Improving Software archiTecture)
– Robert Fuhrer, Mandana Vaziri, Tim Klinger, Adam Kiezun (intern), Frank Tip (project leader)
– collaboration with Eclipse JDT team at IBM Zurich
– collaboration with IBM Rational
– academic collaborations with Bjorn De Sutter (Ghent University), Ittai Balaban (NYU)
Other Research Activities
change impact analysis
– given an old and a new version of a program, and a test that fails in the new version, find the subset of the source code changes responsible for the failure
– with Barbara Ryder and Xiaoxia Ren (Rutgers), Julian Dolby (IBM), and Max Stoerzer (University of Passau)
– papers: PASTE’01, OOPSLA’04
Jax: an application extractor for Java
– apply static analysis techniques to eliminate redundant functionality from Java applications, and apply size-reducing transformations
– with Peter Sweeney, Chris Laffra, Aldo Eisma, David Streeter
– transferred to IBM product (WebSphere Studio Device Developer)
– papers: CACM’03, TOPLAS’02, OOPSLA’00, FSE’00, OOPSLA’99
Outline
background
type constraints for Java programs
– notation and terminology
– constraint generation rules
applications
– generalization-related refactorings (OOPSLA’03)
– customization of library classes (ECOOP’04)
– refactorings for introducing generics (work in progress)
related work
conclusions and future work
Scope of our Research
start with a type-correct Java program P
for a given transformation that transforms P into P’
– we would like to check/guarantee that P’ is type-correct
– we would like to check/guarantee that P’ has the same behavior as P
– (in some cases) compute “maximal” P’ for which the above properties hold
we use type constraints to establish these properties
– formalism for expressing relationships between program expressions that must hold in order for a program to be type-correct
– traditionally used for type checking and type inference
transformations under consideration
– refactorings: well-known maintenance operations, usually aimed at making code more flexible/general; proposed by the programmer
– driven by static/dynamic analysis in link-time optimizer
Refactoring
refactoring: the application of behavior-preserving transformations to a program in order to improve a program’s design
– eliminating undesirable program characteristics
– e.g., duplicated code, classes/methods that are too large,...
– making existing classes/methods usable in new contexts
– preparing for extensions
– breaking up monolithic systems into components
– introduction of design patterns
refactoring (noun): a specific program transformation. Usually identified by:
– name (e.g., “Extract Method”, “Pull Up Members”, ...)
– preconditions
– a specific set of transformations to be performed by a programmer or by an automated tool
Refactoring
pioneered by Griswold [1991], Opdyke [1992] & Johnson, leading to Smalltalk Refactoring Browser [Roberts 1992]
recently popularized by continuous-refinement methodologies such as “Extreme Programming” [Beck 2000]
catalogues of common refactorings: [Fowler 1999], [Kerievsky 2003]
Fowler describes refactorings as a series of steps to be performed by the programmer
– manual refactoring is very error-prone
– renewed interest in automated refactoring support in IDEs
– refactoring support featured in Eclipse, IntelliJ IDEA, OmniCore, ...
Categories of Refactorings (see Fowler’s book)
making method calls simpler
– Rename Method, Add/Remove Parameter, ...
composing methods
– Extract Method, Inline Method, Inline Local, ...
moving features between objects
– Move Method, Move Field, Extract Class, ...
organizing data
– Self-Encapsulate Field, Replace Data Value with Object, ...
simplifying/eliminating conditionals
– Replace Conditional with Polymorphism, ...
dealing with generalization
– Extract Interface, Pull Up Members, ...
Eclipse (www.eclipse.org)
open-source (CPL) development environment
– implemented in Java, XML
– basis for commercial offerings by IBM (WSAD, WSDD) and others
plugin-architecture
– plugins contribute views/perspectives
– plugins provide extension points
state-of-the-art development environment for Java
– quick-fixes, refactoring, type hierarchy view, call hierarchy, search facilities
– support for other languages (C, Smalltalk, AspectJ)
various IBM programs focused on Eclipse
– Eclipse Innovation Grants for academics (2002, 2003)
– Eclipse Technology Exchange meetings (ICSE, OOPSLA, ECOOP)
solid basis for research/education projects
– Penumbra, Gild, Hipikat, ECESIS, ...
– Continuous Testing, Java Traits, Ownership Types, ...
Demo: Eclipse Refactorings
Type Constraints
formalism developed in 1990s
– captures relationships between types of program constructs
original purpose: type checking/inference
– prove that certain kinds of errors cannot occur at run-time
– e.g., no “message not understood” errors
we use a variation on the formalism from a book by Palsberg & Schwartzbach
– adapted/extended to capture the semantics of Java
Type Constraints Notation
[E]          the type of expression E
[M]          the declared return type of method M
[F]          the declared type of field F
Decl(M)      the type that contains method M
Param(M,i)   the i-th parameter of method M
≤, <         subtype relation (≤) and proper subtype relation (<)
Syntax of Type Constraints
[E] = [E’]   the type of expression E must be the same as the type of expression E’
[E] < [E’]   the type of expression E is a proper subtype of the type of expression E’
[E] ≤ [E’]   either [E] = [E’] or [E] < [E’]
[E] ≜ T      the type of expression E is defined to be T
[E] ≤ [E1] or ... or [E] ≤ [Ek]
             disjunction: at least one of the subconstraints [E] ≤ [E1], ..., [E] ≤ [Ek] must hold
Generating Type Constraints
declaration C v                            [v] ≜ C
assignment E1 = E2                         [E2] ≤ [E1]
access E.f to field F                      [E.f] ≜ [F]
                                           [E] ≤ Decl(F)
return E in method M                       [E] ≤ [M]
method M in class C                        Decl(M) ≜ C
this in method M                           [this] ≜ Decl(M)
direct call E.m(E1,...,En) to method M     [E.m(E1,...,En)] ≜ [M]
                                           [Ei] ≤ [Param(M,i)]
                                           [E] ≤ Decl(M)
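The generation rules can be read as a syntax-directed translation from program constructs to constraints. The following is a minimal sketch of that reading, not the tool's implementation: the class and method names are hypothetical, constraints are plain strings, ":=" stands for "is defined to be", "<=" stands for the subtype relation, and parameter positions are assumed to be numbered from 1.

```java
import java.util.ArrayList;
import java.util.List;

class ConstraintGenerationSketch {
    // declaration "C v" gives [v] := C
    static String declaration(String type, String var) {
        return "[" + var + "] := " + type;
    }

    // assignment "E1 = E2" gives [E2] <= [E1]
    static String assignment(String target, String source) {
        return "[" + source + "] <= [" + target + "]";
    }

    // direct call "E.m(E1,...,En)" to method M (given as "Type.name")
    static List<String> directCall(String receiver, String method, String... args) {
        String name = method.substring(method.lastIndexOf('.') + 1);
        String call = receiver + "." + name + "(" + String.join(",", args) + ")";
        List<String> out = new ArrayList<>();
        out.add("[" + call + "] := [" + method + "]");
        for (int i = 0; i < args.length; i++) {
            out.add("[" + args[i] + "] <= [Param(" + method + "," + (i + 1) + ")]");
        }
        out.add("[" + receiver + "] <= Decl(" + method + ")");
        return out;
    }

    public static void main(String[] args) {
        List<String> constraints = new ArrayList<>();
        constraints.add(declaration("List", "v3"));   // List v3
        constraints.add(assignment("v2", "v3"));      // v2 = v3
        constraints.addAll(directCall("v6", "List.addAll", "v7")); // v6.addAll(v7)
        for (String c : constraints) {
            System.out.println(c);
        }
    }
}
```

A real generator would walk an abstract syntax tree and resolve each call to its target method; the string-based helpers only make the shape of each rule visible.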
Virtual Method Calls
for a call E.m(E1,...,En) to a virtual method M
[E.m(E1,...,En)] ≜ [M]
[Ei] ≤ [Param(M,i)]
[E] ≤ Decl(M1) or ... or [E] ≤ Decl(Mk) where RootDefs(M) = { M1,...,Mk }
RootDefs(M) = { M’ | M overrides M’, and there exists no M’’ (M’’ ≠ M’) such that M’ overrides M’’ }
Constraints for Virtual Method Calls
public void foo(String s1, String s2) {
  Hashtable h = new Hashtable();
  h.put(s1, s2);
}
type hierarchy: Hashtable extends Dictionary and implements Map; put() is declared in Map, Dictionary, and Hashtable
[h] ≤ Decl(Map.put(...)) or [h] ≤ Decl(Dictionary.put(...))
[h] ≤ Map or [h] ≤ Dictionary
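The root definitions for this example can be recomputed with reflection. This is only a sketch under stated assumptions: the class and helper names are hypothetical, methods are matched by name and erased parameter types, and "overrides" is treated as reflexive, so a method that overrides nothing is its own root definition.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Hashtable;
import java.util.LinkedHashSet;
import java.util.Set;

class RootDefsSketch {
    // True if the type itself declares a method with this name and these parameter types.
    static boolean declares(Class<?> type, String name, Class<?>... params) {
        try {
            type.getDeclaredMethod(name, params);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // The type together with all of its superclasses and superinterfaces.
    static Set<Class<?>> superTypes(Class<?> type) {
        Set<Class<?>> seen = new LinkedHashSet<>();
        Deque<Class<?>> work = new ArrayDeque<>();
        work.push(type);
        while (!work.isEmpty()) {
            Class<?> t = work.pop();
            if (!seen.add(t)) continue;
            if (t.getSuperclass() != null) work.push(t.getSuperclass());
            for (Class<?> i : t.getInterfaces()) work.push(i);
        }
        return seen;
    }

    // Types that declare the method and have no proper supertype that also declares it.
    static Set<Class<?>> rootDefs(Class<?> receiver, String name, Class<?>... params) {
        Set<Class<?>> roots = new LinkedHashSet<>();
        for (Class<?> t : superTypes(receiver)) {
            if (!declares(t, name, params)) continue;
            boolean hasHigherDeclaration = false;
            for (Class<?> s : superTypes(t)) {
                if (s != t && declares(s, name, params)) {
                    hasHigherDeclaration = true;
                    break;
                }
            }
            if (!hasHigherDeclaration) roots.add(t);
        }
        return roots;
    }

    public static void main(String[] args) {
        // put(K, V) erases to put(Object, Object)
        System.out.println(rootDefs(Hashtable.class, "put", Object.class, Object.class));
    }
}
```

For Hashtable.put the result contains java.util.Dictionary and java.util.Map, which corresponds to the disjunction [h] ≤ Map or [h] ≤ Dictionary.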
Constraints for Overriding & Hiding
if method M’ overrides method M, M’ ≠ M
  [Param(M’,i)] = [Param(M,i)]
  [M’] = [M]
  Decl(M’) < Decl(M)
if field F’ hides field F
  Decl(F’) < Decl(F)
Casts
for a cast (C)E
  [(C)E] ≜ C
  [E] ≤ [(C)E] or [(C)E] ≤ [E]   if C is a class and [E] is a class
the latter constraint need not be generated if C or [E] is an interface
these constraints only capture the requirements for type-correctness (not necessarily program behavior)
it is possible to avoid generating disjunctions by preserving the “directionality” of the cast
Refactoring for Generalization
several refactorings are concerned with generalization
– moving methods/fields to superclasses and subclasses
– splitting & merging of classes
– manipulating the types of declarations
Chapter 11 of Fowler’s book mentions:
– Extract Interface
– Pull Up Member(s)
– Push Down Member(s)
– Extract Subclass
– Generalize Type
Extract Interface – Recipe
select class C
select subset M of C’s methods
create interface I containing declarations of the methods in M
add inheritance “C implements I”
“Adjust client type declarations to use the interface” [Fowler, p.342]
Extract Interface: An Example
List class with methods as follows:
– add(Comparable) add an element
– addAll(List) add contents of another List
– iterator() iteration support
– sort() sorts the list
ListIterator class
– implements java.util.Iterator; methods hasNext(), next()
Client class
– create List; add some elements
– add contents of another List; sort the List
– print contents of the List
extract an interface Bag from List
– declares add(Comparable), addAll(List), iterator()
List/Bag Example (1)
interface Bag {
  public Iterator iterator();
  public List add(Comparable e);
  public List addAll(List v0);
}
class List implements Bag {
  int size = 0;
  Comparable[] elems = new Comparable[10];
  public Iterator iterator() { return new ListIterator(this); }
  public List add(Comparable e) {
    if (this.size + 1 == this.elems.length) {
      Comparable[] newElems = new Comparable[2 * this.size];
      System.arraycopy(this.elems, 0, newElems, 0, this.size);
      this.elems = newElems;
    }
    this.elems[this.size++] = e;
    return this;
  }
  public List addAll(List v1) {
    java.util.Iterator i = v1.iterator();
    for (; i.hasNext(); this.add((Comparable)i.next()));
    return this;
  }
  public void sort() { /* insertion sort */ }
}
List/Bag Example (2)
class ListIterator implements java.util.Iterator {
  private int count = 0;
  private List v2;
  ListIterator(List v3) { v2 = v3; }
  public boolean hasNext() { return this.count < this.v2.size; }
  public Object next() { return this.v2.elems[this.count++]; }
}
public class Client {
  public static void main(String[] args) {
    List v4 = createList();
    populate(v4);
    update(v4);
    sortList(v4);
    print(v4);
  }
  static List createList() { return new List(); }
  static void populate(List v5) { v5.add("foo").add("bar"); }
  static void update(List v6) {
    List v7 = new List().add("zap").add("baz");
    v6.addAll(v7);
  }
  static void sortList(List v8) { v8.sort(); }
  static void print(List v9) {
    for (Iterator iter = v9.iterator(); iter.hasNext();)
      System.out.println("Object: " + iter.next());
  }
}
Problem Statement
identify all declarations that can be updated to make use of the newly extracted interface
want to be able to reason about:
– correctness of the solution
– maximality of the solution
Using Type Constraints
declared types of variables, fields, parameters constrained by:
– field access, method calls
– assignments, parameter-passing
several other invariants must be maintained to preserve type-correctness & program behavior
Observation: all these constraints can be stated succinctly and uniformly using type constraints
List.add(), Bag.add()          [Bag.add()] = [List.add()]
List.addAll(), Bag.addAll()    [v0] = [v1], [Bag.addAll()] = [List.addAll()]
List.iterator()                List ≤ [v3]
List.add()                     List ≤ [List.add()]
List.addAll()                  [v1] ≤ Bag, List ≤ [List.addAll()]
ListIterator.ListIterator()    [v3] ≤ [v2]
ListIterator.hasNext()         [v2] ≤ List
ListIterator.next()            [v2] ≤ List
Client.main()                  [Client.createList()] ≤ [v4], [v4] ≤ [v5], [v4] ≤ [v6], [v4] ≤ [v8], [v4] ≤ [v9]
Client.createList()            List ≤ [Client.createList()]
Client.populate()              [v5] ≤ Bag, [List.add()] ≤ Bag
Client.update()                [List.add()] ≤ [v7], [List.add()] ≤ Bag, [v6] ≤ Bag, [v7] ≤ [v1]
Client.sortList()              [v8] ≤ List
Client.print()                 [v9] ≤ Bag
Observation
the constraints for the original program contain all the information we need
some declarations cannot be updated
  List ≤ [v3] ≤ [v2] ≤ List
  [v4] ≤ [v8] ≤ List
other variables are less constrained
  [v1] ≤ Bag
Algorithm for Determining “Updatable” Declarations
iterative algorithm for determining non-updatable declarations
– first determine declarations that cannot be updated because of member access (e.g., [v2] ≤ List, [v8] ≤ List)
– if x is non-updatable, and there is a type constraint [y] ≤ [x], [y] = [x], or [y] < [x], then y is non-updatable
iterate until fixed-point is reached
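The iteration can be sketched as a small worklist-free fixed-point loop. This is an illustrative sketch, not the Eclipse implementation: the class name is hypothetical, each constraint between two declarations is stored as a pair {y, x} meaning [y] ≤ [x], [y] = [x] or [y] < [x], and an equality is entered in both directions. The pairs below are the variable-to-variable constraints of the List/Bag example; the seeds are the declarations pinned to List by member access.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

class NonUpdatableSketch {
    // Propagates non-updatability from x to y for every constraint pair {y, x}.
    static Set<String> nonUpdatable(Collection<String> seeds, String[][] constraints) {
        Set<String> result = new TreeSet<>(seeds);
        boolean changed = true;
        while (changed) {                 // iterate until a fixed point is reached
            changed = false;
            for (String[] c : constraints) {
                if (result.contains(c[1]) && result.add(c[0])) {
                    changed = true;
                }
            }
        }
        return result;
    }

    static Set<String> listBagExample() {
        String[][] constraints = {
            {"Client.createList()", "v4"},
            {"v4", "v5"}, {"v4", "v6"}, {"v4", "v8"}, {"v4", "v9"},
            {"v3", "v2"},
            {"List.add()", "v7"}, {"v7", "v1"},
            {"v0", "v1"}, {"v1", "v0"}    // [v0] = [v1], both directions
        };
        // [v2] <= List and [v8] <= List come from member access
        return nonUpdatable(Arrays.asList("v2", "v8"), constraints);
    }

    public static void main(String[] args) {
        System.out.println(listBagExample());
    }
}
```

On the example constraints the loop yields Client.createList(), v2, v3, v4 and v8; every other declaration can be changed to Bag.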
Non-Updatable Declarations for the Example Program
{ v2, v3, v4, v8, Client.createList() }
(consistent with earlier result)
Justification (Details in Paper)
type-correctness
– updating the “updatable” declaration elements results in a program that satisfies all type constraints
preservation of behavior
– argument based on the fact that method dispatch, cast/instanceof behavior do not depend on declared types
maximality
– updating any non-updatable declarations will result in the violation of type constraints
Another Refactoring: Pull Up Members
class A { ... }
class B extends A {
  public B foo() { return this; }
}
[this] ≜ Decl(B.foo())
Decl(B.foo()) ≜ B
[B.foo()] ≜ B
[this] ≤ [B.foo()]   (satisfied: B ≤ B)
Pull Up Members (2)
class A {
  public B foo() { return this; }
}
class B extends A { ... }
[this] ≜ Decl(A.foo())
Decl(A.foo()) ≜ A
[A.foo()] ≜ B
[this] ≤ [A.foo()]   (violated: A ≤ B does not hold)
Other Refactorings
Generalize Type
– update the type of a declaration E
– use type constraints to determine allowable supertypes/subtypes
– may enable Pull Up Members in certain cases
Extract Subclass
– splitting of a class
– can be treated similarly to Extract Interface
Push Down Members
– the “inverse” of Pull Up Members
– similar issues
Perspective
infer from original program a system of ordering constraints between types of declaration elements
– original program is just one possible solution
Extract Interface
– declarations: variables
– locations of members: constants
Pull Up Members
– declarations: constants
– locations of members: variables
Generalize Type
– selected declaration: variable
– all other declarations & locations of members: constants
Demo: Extract Interface & Generalize Type
Class Libraries
class libraries improve programmer productivity
– programmers don’t have to waste time developing & debugging standard infrastructure
but... class libraries are often implemented with some typical/average usage pattern in mind
for example: container class implementations assume that:
– elements are accessed frequently
– a large number of elements is stored
performance loss if the actual usage of a library class differs from this typical usage pattern
– “MyHashTable”, “SmartHashtable”, ... in various benchmarks
Our Approach
derive custom versions from library classes
rewrite application to use these custom versions
ship custom library classes with application
technical foundations:
– use type constraints to determine where custom classes can be used
– use profile information to determine where introducing custom classes is profitable
– use static analysis and profile information to decide how to customize
Example Program
class Example {
  void foo(Map m) {
    Hashtable r1 = new Hashtable();
    JTree tree = new JTree(r1);
    Hashtable r2 = new Hashtable();
    Hashtable r3 = new Hashtable();
    r2.put("FOO", "BAR");
    bar(r3);
    r2 = r3;
    r2.putAll(m);
    bar("HELLO");
  }
  void bar(Object o) {
    Hashtable r4 = (Hashtable) o;
    if (r4.contains("FOO")) {…}
  }
}
the same program with abbreviated type names (M = Map, H = Hashtable, O = Object, D = Dictionary, S = String):
class Example {
  void foo(M m) {
    H r1 = new H();
    JTree tree = new JTree(r1);
    H r2 = new H();
    H r3 = new H();
    r2.put("FOO", "BAR");
    bar(r3);
    r2 = r3;
    r2.putAll(m);
    bar("HELLO");
  }
  void bar(O o) {
    H r4 = (H) o;
    if (r4.contains("FOO")) {…}
  }
}
[Figure: type hierarchy over O, D, M, H, S]
How to customize?
• update allocations of library types
• update declarations
successive rewritings of the declarations of r2 and r3 in foo() (the rest of the example program is unchanged; the type hierarchy diagram gains the custom classes H1 and H2, and later AH):
  H r2 = new H();      H r3 = new H();      (original)
  H r2 = new H1();     H r3 = new H2();     (allocations updated)
  H1 r2 = new H1();    H2 r3 = new H2();    (declarations updated)
  H1 r2 = new H1();    H1 r3 = new H1();
  AH r2 = new H1();    AH r3 = new H2();
Restrictions?
• type correctness
• interface compatibility
• preserve behavior of cast and instanceof operations
annotation on new JTree(r1) in the example program: call to javax.swing.JTree(Hashtable)
[Figure: type hierarchy over O, D, M, H, S, AH, H1, H2]
Outline of Approach
generate type constraints for program
– additional constraints generated to ensure that behavior of cast/instanceof operations is preserved
constraint simplification
– rewrite/replace all constraints to use “≤” only
solve the resulting constraint system
rewrite the program’s declarations and allocation sites to use the inferred types
Preserving the Behavior of Cast & instanceof
we want to change declarations and allocation sites
– need to ensure that cast/instanceof operations succeed and fail in exactly the same cases as before
– use points-to analysis to approximate the set of objects to which the cast/instanceof is applied
– easily expressed using a ≰ constraint (to be replaced with a ≤ constraint)
public class Example {
  void zip() {
    zap(new Hashtable()); // A1
    zap(new String());    // A2
  }
  void zap(Object o) {
    Hashtable h = (Hashtable) o; // C
  }
}
A1 ≤ C
A2 ≰ C
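The cast constraints for this example can be derived mechanically once a points-to set is available. The sketch below assumes the points-to analysis has already produced a map from allocation-site labels to allocated classes; the class name, the labels, and the ASCII notation ("<=" for ≤, "!<=" for ≰) are illustrative choices, not part of the original tool.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CastConstraintSketch {
    // For each allocation site that may reach the cast, record whether the cast
    // succeeds ("A <= C") or fails ("A !<= C") in the original program.
    static List<String> castConstraints(Map<String, Class<?>> pointsTo,
                                        Class<?> castType, String castLabel) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Class<?>> site : pointsTo.entrySet()) {
            boolean succeeds = castType.isAssignableFrom(site.getValue());
            out.add(site.getKey() + (succeeds ? " <= " : " !<= ") + castLabel);
        }
        return out;
    }

    static List<String> example() {
        Map<String, Class<?>> pointsTo = new LinkedHashMap<>();
        pointsTo.put("A1", Hashtable.class); // zap(new Hashtable())
        pointsTo.put("A2", String.class);    // zap(new String())
        return castConstraints(pointsTo, Hashtable.class, "C");
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

Any retyping of A1, A2 and C that keeps both constraints true makes the cast succeed and fail in the same cases as before.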
Type constraints
the declared types (d1–d5), allocated types (a1–a3), and the cast type (c1) of the example program are replaced by constraint variables:
class Example {
  void foo(M m) {
    d1 r1 = new a1();
    JTree tree = new JTree(r1);
    d2 r2 = new a2();
    d3 r3 = new a3();
    r2.put("FOO", "BAR");
    bar(r3);
    r2 = r3;
    r2.putAll(m);
    bar("HELLO");
  }
  void bar(d4 o) {
    d5 r4 = (c1) o;
    if (r4.contains("FOO")) {…}
  }
}
a1 ≤ d1
d1 ≤ H
a2 ≤ d2
a3 ≤ d3
d2 ≤ D or d2 ≤ M
d3 ≤ d4
d3 ≤ d2
d2 ≤ M
S ≤ d4
c1 ≤ d4 or d4 ≤ c1
c1 ≤ d5
d5 ≤ H or d5 ≤ AH
a3 ≤ c1
S ≰ c1
[Figure: type hierarchy over O, D, M, H, S, AH, H1, H2]
d5 ≤ H
c1 ≤ H
Type constraints
[Figure: the same type constraints drawn as a graph over the constraint variables a1, a2, a3, d1, d2, d3, d4, d5, c1 and the types M, H, S, next to the type hierarchy]
Constraint Solving
[Figure: animation of constraint solving on the constraint graph. Each constraint variable carries a set of candidate types, either {O,S,H,H1,H2} or {O,S,H,H1,H2,D,M,AH}; constraints highlighted during the animation include d1 ≤ H, a1 ≤ d1 and d5 ≤ HT]
Rewriting the Example Program
inferred types: a1 = H, d1 = H, a2 = H1, a3 = H2, d2 = AH, d3 = AH, d4 = O, c1 = H2, d5 = AH
the program over constraint variables
class Example {
  void foo(M m) {
    d1 r1 = new a1();
    JTree tree = new JTree(r1);
    d2 r2 = new a2();
    d3 r3 = new a3();
    r2.put("FOO", "BAR");
    bar(r3);
    r2 = r3;
    r2.putAll(m);
    bar("HELLO");
  }
  void bar(d4 o) {
    d5 r4 = (c1) o;
    if (r4.contains("FOO")) {…}
  }
}
is rewritten, one constraint variable at a time, into
class Example {
  void foo(M m) {
    H r1 = new H();
    JTree tree = new JTree(r1);
    AH r2 = new H1();
    AH r3 = new H2();
    r2.put("FOO", "BAR");
    bar(r3);
    r2 = r3;
    r2.putAll(m);
    bar("HELLO");
  }
  void bar(O o) {
    AH r4 = (H2) o;
    if (r4.contains("FOO")) {…}
  }
}
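One way to sanity-check the rewritten program is to confirm that the inferred types satisfy every constraint of the example. The sketch below does this under a loudly stated assumption: the slides show the hierarchy only as a diagram, so the code assumes that AH extends D and implements M (like H), and that H1 and H2 extend AH. The class name and data layout are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SolutionCheckSketch {
    // Direct supertypes of each type.
    static final Map<String, List<String>> SUPERS = new HashMap<>();
    static {
        SUPERS.put("O", Collections.<String>emptyList());
        SUPERS.put("S", Arrays.asList("O"));
        SUPERS.put("D", Arrays.asList("O"));
        SUPERS.put("M", Arrays.asList("O"));
        SUPERS.put("H", Arrays.asList("D", "M"));
        SUPERS.put("AH", Arrays.asList("D", "M")); // ASSUMPTION, not stated in the slides
        SUPERS.put("H1", Arrays.asList("AH"));     // ASSUMPTION
        SUPERS.put("H2", Arrays.asList("AH"));     // ASSUMPTION
    }

    // Each row is a disjunction of alternatives "x <= y", written {x, y, x', y', ...}.
    static final String[][] CONSTRAINTS = {
        {"a1", "d1"}, {"d1", "H"}, {"a2", "d2"}, {"a3", "d3"},
        {"d2", "D", "d2", "M"}, {"d3", "d4"}, {"d3", "d2"}, {"d2", "M"},
        {"S", "d4"}, {"c1", "d4", "d4", "c1"}, {"c1", "d5"},
        {"d5", "H", "d5", "AH"}, {"a3", "c1"}
    };

    // Reflexive, transitive subtype test over SUPERS.
    static boolean subtype(String a, String b) {
        if (a.equals(b)) return true;
        for (String s : SUPERS.get(a)) {
            if (subtype(s, b)) return true;
        }
        return false;
    }

    // The types inferred on the slide for each constraint variable.
    static Map<String, String> inferred() {
        Map<String, String> sol = new HashMap<>();
        sol.put("a1", "H");  sol.put("d1", "H");
        sol.put("a2", "H1"); sol.put("d2", "AH");
        sol.put("a3", "H2"); sol.put("d3", "AH");
        sol.put("d4", "O");  sol.put("c1", "H2");
        sol.put("d5", "AH");
        return sol;
    }

    // True if every disjunction has a satisfied alternative and S is not a subtype of c1.
    static boolean satisfies(Map<String, String> sol) {
        for (String[] row : CONSTRAINTS) {
            boolean ok = false;
            for (int i = 0; i < row.length; i += 2) {
                String left = sol.getOrDefault(row[i], row[i]);
                String right = sol.getOrDefault(row[i + 1], row[i + 1]);
                ok |= subtype(left, right);
            }
            if (!ok) return false;
        }
        return !subtype("S", sol.get("c1"));
    }

    public static void main(String[] args) {
        System.out.println("inferred types satisfy all constraints: " + satisfies(inferred()));
    }
}
```

Under the assumed hierarchy the inferred assignment passes the check, while changing c1 to S, for example, violates a3 ≤ c1.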
Creating Custom Classes
1. create custom “profiling” Hashtable
– determine how often allocation sites are executed
– simulate caching schemes
– number of succeeding/failing get/put operations
2. static analysis (using “gnosis” framework developed at IBM)
– construct call graph (0-CFA, distinct allocation sites for classes of interest)
– compute type estimates
– escape analysis
3. generate custom implementations: H1, H2, …
– generated from template (using C preprocessor)
4. rewrite bytecode for the program
Generating Custom Classes
1. lazy vs. eager allocation
2. synchronized vs. unsynchronized
3. optimizing edge cases
4. caching of frequently accessed objects
5. removal of unused fail-safe iteration code
6. …
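As an illustration of the first two items, the sketch below shows what a lazily allocated, unsynchronized table might look like. It is a hypothetical hand-written class, not one of the generated classes H1, H2 from the original work, and it delegates to java.util.HashMap rather than reimplementing hashing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical custom container: backing storage is allocated on the first put(),
// and no method is synchronized.
class LazyTable<K, V> {
    private Map<K, V> entries; // stays null while the table is empty

    public V get(K key) {
        return entries == null ? null : entries.get(key);
    }

    public boolean containsKey(K key) {
        return entries != null && entries.containsKey(key);
    }

    public V put(K key, V value) {
        if (entries == null) {
            entries = new HashMap<>(); // lazy allocation
        }
        return entries.put(key, value);
    }

    public int size() {
        return entries == null ? 0 : entries.size();
    }

    boolean isAllocated() {
        return entries != null;
    }

    public static void main(String[] args) {
        LazyTable<String, String> table = new LazyTable<>();
        System.out.println(table.get("FOO") + " allocated=" + table.isAllocated());
        table.put("FOO", "BAR");
        System.out.println(table.get("FOO") + " allocated=" + table.isAllocated());
    }
}
```

Lookups on a table that never receives an element touch no backing storage at all, which is the situation lazy allocation targets.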
Applied Customizations
_202_jess
– specialization of Hashtable keys (String/Integer)
– synchronization removal on frequently used Vectors
_209_db
– use caching to optimize consecutive Vector-retrievals
– synchronization removal on frequently used Vectors
_218_jack
– 99% of all search operations are on empty Hashtables
– lazy allocation, removal of bookkeeping for fail-safe iterators
– synchronization removal on Hashtables
Jax
– most containers remain small, decrease initial container size
HyperJ
– optimization of empty Hashtables, removal of bookkeeping for fail-safe iterators
– synchronization removal
Chess*
– frequent iteration over Hashtables of fixed, small size
– use smaller initial size
Pmd*
– the vast majority of a huge number of allocated HashSets remains empty
– lazy allocation, removal of bookkeeping for fail-safe iterators
*no synchronization removal because of GUI-related multi-threading in these benchmarks
Speedups
customization of:
– java.util.* containers
– StringBuffers (desynchronization only)
measurements taken on HyperThreaded Pentium 4 @ 2.8 GHz running Linux 2.4.21
Heap Consumption
significant reduction in heap consumption on _218_jack because of lazy allocation of many Hashtable-objects that remain empty
Impact on Application Size
note: original size of _209_db is only 6 KB
– 15 KB of custom container classes are added
on large benchmarks (>100 KB), the size increase is <= 12%
Java Generics
generics (parametric polymorphism) to be introduced in Java 1.5
– classes can have type parameters that have optional bounds
– reduces need for downcasts
class Hashtable<Key,Value> { ... }
class Tree<Elem extends Comparable<Elem>> { ... }
Hashtable<Integer,String> table = new Hashtable<Integer,String>();
...
String s = table.get(someInteger);
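The effect on downcasts can be seen by writing the same lookup against a raw and a parameterized Hashtable. This is a small illustrative sketch; the class and method names are made up for the comparison.

```java
import java.util.Hashtable;

class GenericsSketch {
    // Raw type: get() returns Object, so the caller needs a downcast.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String lookupRaw(Integer key) {
        Hashtable table = new Hashtable();
        table.put(key, "value");
        return (String) table.get(key);
    }

    // Parameterized type: get() returns String, no cast needed.
    static String lookupGeneric(Integer key) {
        Hashtable<Integer, String> table = new Hashtable<Integer, String>();
        table.put(key, "value");
        return table.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupRaw(1) + " " + lookupGeneric(1));
    }
}
```

Both methods behave identically at run time; only the second lets the compiler check the element type.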
Generic Collections
in most Java applications, the use of Collection classes is the main source of down-casts
the standard libraries for Java 1.5 contain generic versions of existing Collection classes
– Vector<T> instead of Vector
– HashMap<K,V> instead of HashMap
goal: refactor applications that use non-generic collections
– make them use generic collections instead
– use type inference to infer element types
– remove downcasts
Example 1
class A {
  public void foo() {
    Vector v1 = new Vector();
    String s1 = "aaa";
    this.insert(v1, s1);
    String s2 = (String) v1.get(0);
  }
  public void insert(List v2, Object o) {
    v2.add(o);
  }
}
Example 1 (refactored)
class A {
  public void foo() {
    Vector<String> v1 = new Vector<String>();
    String s1 = "aaa";
    this.insert(v1, s1);
    String s2 = (String) v1.get(0);
  }
  public void insert(List<String> v2, String o) {
    v2.add(o);
  }
}
update “collection” declarations
remove casts (the cast in foo() is now redundant)
note update of declaration of o
Example 2
public void bar() {
  List v1 = new Vector();
  v1.add(new Float(3.4));
  this.reverse(v1);
  Float f1 = (Float) v1.iterator().next();
}
public void baz() {
  List v2 = new Vector();
  v2.add(new Integer(17));
  this.reverse(v2);
  Integer i1 = (Integer) v2.iterator().next();
}
public void reverse(List v3) {
  for (int t = 0; t < v3.size()/2; t++) {
    Object temp = v3.get(v3.size()-1);
    v3.add(v3.size()-1, v3.get(t));
    v3.add(t, temp);
  }
}
Example 2 (version 1)
public void bar() {
  List<Number> v1 = new Vector<Number>();
  v1.add(new Float(3.4));
  this.reverse(v1);
  Float f1 = (Float) v1.iterator().next();
}
public void baz() {
  List<Number> v2 = new Vector<Number>();
  v2.add(new Integer(17));
  this.reverse(v2);
  Integer i1 = (Integer) v2.iterator().next();
}
public void reverse(List<Number> v3) {
  for (int t = 0; t < v3.size()/2; t++) {
    Number temp = v3.get(v3.size()-1);
    v3.add(v3.size()-1, v3.get(t));
    v3.add(t, temp);
  }
}
element types “merged” in reverse()
cannot remove casts in callers
Example 2 (version 2)
public void bar() {
  List<Float> v1 = new Vector<Float>();
  v1.add(new Float(3.4));
  this.reverse(v1);
  Float f1 = (Float) v1.iterator().next();
}
public void baz() {
  List<Integer> v2 = new Vector<Integer>();
  v2.add(new Integer(17));
  this.reverse(v2);
  Integer i1 = (Integer) v2.iterator().next();
}
public <T> void reverse(List<T> v3) {
  for (int t = 0; t < v3.size()/2; t++) {
    T temp = v3.get(v3.size()-1);
    v3.add(v3.size()-1, v3.get(t));
    v3.add(t, temp);
  }
}
obs: no flow of values between different invocations of reverse()
need for context-sensitive analysis
introduction of type parameters
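A runnable variant of version 2 shows that, once reverse() has its own type parameter, the callers compile without casts. This is a sketch rather than the slide's code: the class name is hypothetical, and the loop body swaps elements with List.set and a mirrored index, which is a substitution for the add() calls on the slide so that the list is actually reversed in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

class GenericReverseSketch {
    // Generic method: each call site picks its own element type T.
    static <T> void reverse(List<T> list) {
        for (int t = 0; t < list.size() / 2; t++) {
            int mirror = list.size() - 1 - t;
            T temp = list.get(mirror);
            list.set(mirror, list.get(t));
            list.set(t, temp);
        }
    }

    public static void main(String[] args) {
        List<Float> v1 = new Vector<Float>();
        v1.add(3.4f);
        reverse(v1);
        Float f1 = v1.iterator().next();   // no cast needed

        List<Integer> v2 = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
        reverse(v2);
        Integer i1 = v2.iterator().next(); // no cast needed

        System.out.println(f1 + " " + i1 + " " + v2);
    }
}
```

Because T is instantiated separately at each call, Float and Integer are never merged into Number.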
Outline of Approach
context inference
– use low-cost variation on Agesen’s Cartesian Product Algorithm (CPA) [Agesen:95] for inferring relevant contexts
– simultaneously computes points-to information for expressions and a set of contexts for each method
type inference
– generate type constraints for the program that explicitly encode context information
– solving the type constraints produces element types for declarations and allocations of container class types
source rewriting
– analyze (element) types inferred for different contexts, introduce type parameter if necessary
Context Inference
public void bar(){
  List v1 = new Vector(); // L1
  v1.add(new Float(3.4));
  this.reverse(v1);
  Float f1 = (Float) v1.iterator().next();
}
public void baz(){
  List v2 = new Vector(); // L2
  v2.add(new Integer(17));
  this.reverse(v2);
  Integer i1 = (Integer) v2.iterator().next();
}
public void reverse(List v3){
  for (int t=0; t < v3.size()/2; t++){
    Object temp = v3.get(v3.size()-1);
    v3.add(v3.size()-1, v3.get(t));
    v3.add(t, temp);
  }
}
contexts inferred for each method:
bar(): [●]
baz(): [●]
reverse(): [●,Lext], [●,L1], [●,L2]
call this.reverse(v1) in bar(), context [●]: callee context [●,L1]
call this.reverse(v2) in baz(), context [●]: callee context [●,L2]
Example Constraints
contexts: bar() [●]; baz() [●]; reverse() [●,L1], [●,L2], [●,Lext]
|new Vector()|[●] ≜ Vector<X1>
|new Vector()|[●] ≤ |v1|[●]
|new Float(3.4)|[●] ≜ Float
|new Float(3.4)|[●] ∈ Types[●](v1)
|v1|[●] ≤ |v3|[●,L1]
|new Vector()|[●] ≜ Vector<X2>
|new Vector()|[●] ≤ |v2|[●]
|new Integer(17)|[●] ≜ Integer
|new Integer(17)|[●] ∈ Types[●](v2)
|v2|[●] ≤ |v3|[●,L2]
|v3.get()|[●,L1] ≜ Elem[●,L1](v3)
|v3.get()|[●,L1] ≤ |temp|[●,L1]
|v3.get()|[●,L1] ≤ Elem[●,L1](v3)
|temp|[●,L1] ≤ Elem[●,L1](v3)
|v3.get()|[●,L2] ≜ Elem[●,L2](v3)
|v3.get()|[●,L2] ≤ |temp|[●,L2]
|v3.get()|[●,L2] ≤ Elem[●,L2](v3)
|temp|[●,L2] ≤ Elem[●,L2](v3)
|v3.get()|[●,Lext] ≜ Elem[●,Lext](v3)
|v3.get()|[●,Lext] ≤ |temp|[●,Lext]
|v3.get()|[●,Lext] ≤ Elem[●,Lext](v3)
|temp|[●,Lext] ≤ Elem[●,Lext](v3)
Constraint Solving
standard propagation-based solver
– computes a type for each constraint variable |E|
– in cases where multiple types can be chosen for an expression E, a heuristics-based choice is made (a least specific type for container-related expressions, a most specific type for other expressions)
– different types may be computed for the same expression in different contexts (e.g., |E|1 and |E|2)
element types are unified across ≤ constraints
processing type variables
– a type variable is bound by matching it with a concrete set of types
– matching two type variables results in their unification
– type variables may be left unbound (e.g., in incomplete programs)
– use approximate solution (e.g., element type Object) when processing programs with code like v.add(v)
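The handling of type variables described above can be sketched with a small union-find structure. This is an illustrative stand-in, not the solver from the talk: variables and types are plain strings, the class ElemTypeUnifier and its methods are hypothetical, and conflicting bindings are not merged to a common supertype as a real solver would have to do.

```java
import java.util.HashMap;
import java.util.Map;

class ElemTypeUnifier {
    private final Map<String, String> parent = new HashMap<>();
    // Maps the representative of a class of unified variables to a concrete type name.
    private final Map<String, String> bound = new HashMap<>();

    String find(String v) {
        parent.putIfAbsent(v, v);
        String p = parent.get(v);
        if (!p.equals(v)) {
            p = find(p);
            parent.put(v, p); // path compression
        }
        return p;
    }

    // Matching two type variables results in their unification.
    void unify(String a, String b) {
        String ra = find(a);
        String rb = find(b);
        if (ra.equals(rb)) return;
        parent.put(ra, rb);
        String t = bound.remove(ra);
        // Simplification: if both sides were bound, the binding of rb is kept.
        if (t != null) bound.putIfAbsent(rb, t);
    }

    // A type variable is bound by matching it with a concrete type.
    void bind(String v, String type) {
        bound.put(find(v), type);
    }

    // Variables left unbound fall back to the approximate solution Object.
    String solution(String v) {
        return bound.getOrDefault(find(v), "Object");
    }
}
```

For the running example, binding Elem(v1) to Float and unifying it with the element type of v3 in context [●,L1] gives Float for both, while a variable that is never bound (such as the one for Lext) resolves to Object.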
Constraint Solving
contexts: bar() [●]; baz() [●]; reverse() [●,L1], [●,L2], [●,Lext]
Elem[●](v1) = Float
Elem[●](v2) = Integer
Elem[●,L1](v3) = Float
Elem[●,L2](v3) = Integer
Elem[●,Lext](v3) = Object
Code Generation
public void bar(){
  List<Float> v1 = new Vector<Float>();
  v1.add(new Float(3.4));
  this.reverse(v1);
  Float f1 = (Float) v1.iterator().next();
}
public void baz(){
  List<Integer> v2 = new Vector<Integer>();
  v2.add(new Integer(17));
  this.reverse(v2);
  Integer i1 = (Integer) v2.iterator().next();
}
public <T> void reverse(List<T> v3){
  for (int t=0; t < v3.size()/2; t++){
    T temp = v3.get(v3.size()-1);
    v3.add(v3.size()-1, v3.get(t));
    v3.add(t, temp);
  }
}
Results
benchmark  LOC    #container   #container    #casts  #casts   %casts
                  allocations  declarations          removed  removed
Hanoi      4028   3            6             20      14       70
JUnit      5317   24           63            54      21       39
JLex       7841   17           45            71      53       75
JavaCup    10598  19           78            502     373      74
Mango1     2808   2            9             2       2        100
Mango2     2808   3            13            4       2        50
Mango3     2808   1            17            10      0        0
Demo: Prototype “Genericize” Refactoring
Outline
background
type constraints for Java programs
– notation and terminology
– constraint generation rules
applications
– generalization-related refactorings (OOPSLA’03)
– customization of library classes (ECOOP’04)
– refactorings for introducing generics (work in progress)
related work
conclusions and future work
Related Work on Customization
automatic data structure selection for SETL
– see [Schonberg et al. ’81]
automatic component selection
– see, e.g., [Hogstedt et al. ’01, Yellin ’03]
– purely profile-based, no static analysis
– all possible component implementations supplied up-front
automatic optimization of data structures in specific domains
– e.g., data structure selection for sparse matrix problems
optimizations applied to specific container classes
– see, e.g., [Beckmann & Wang, Friedman et al. ’01]
– e.g., prefetching, incrementalizing rehash operations
much related work on partial evaluation and program specialization
– see e.g., [Schultz, Lawall, Consel ’03]
Other Related Work
type inference and type-directed transformation have been used in the translation of large COBOL programs for Y2K compliance [Eidorff et al. 99, Ramalingam et al. 99]
informal characterization of type constraints [Opdyke’92, Seguin’00, Tokuda & Batory’01]
detecting overspecific variables [Halloran & Scherlis’02]
generating proposals for refactoring class hierarchies using concept analysis [Snelting & Tip’00]
inferring generic types in Java programs [Duggan’99, Donovan et al.’04, Von Dincklage & Diwan’04]
Future Work
in progress: support for migration between functionally equivalent classes
– e.g., from Vector to ArrayList, Hashtable to HashMap
– limitations on migration due to interaction with external code
– application: upgrading of “legacy” applications
variation on Java in which programmers only refer to interface types such as Set, Map, List instead of concrete types such as HashSet, TreeMap, ArrayList
– use customization techniques to select implementation
– similar in spirit to the SETL work at NYU by Paige, Schonberg, et al. in the 1970s and 1980s
other generics-related refactorings
– select a declaration & change its type into a type parameter
Conclusions
type constraints are a useful tool for supporting refactorings and related program transformations
– checking of preconditions
– determining allowable source-code modifications
– enables reasoning about program behavior
applications
– refactorings related to generalization
– customization of library classes
– refactorings for introducing generics
– more refactorings in the works
implemented in Eclipse
– Extract Interface, Generalize Type available now
– generics refactorings planned for Eclipse 3.1
– freely available from www.eclipse.org
EXTRA SLIDES
Typical Refactoring Scenario
user proposes a transformation by interacting with GUI/Wizards in IDE
system checks if preconditions are met
system determines necessary/allowable source code updates
system shows before/after “diff” view
user confirms
program works as before
Solving the Constraints
naive approach
– explicitly enumerate all values; each expression type in { C, I }
– for each solution, determine if constraints are satisfied
cost: O(2^n), where n is the number of declarations of type C
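The naive approach can be sketched as a brute-force enumeration. In this illustration each of the n declarations is either kept at class type C (false) or changed to interface type I (true), and the type constraints are represented by a caller-supplied predicate; the class NaiveSolverSketch and that encoding are assumptions, not part of the talk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class NaiveSolverSketch {
    // assignment[i] == true: declaration i is given type I; false: it keeps type C.
    // Assumption: n is small (well below 63), since all 2^n candidates are visited.
    static List<boolean[]> solve(int n, Predicate<boolean[]> constraintsHold) {
        List<boolean[]> solutions = new ArrayList<>();
        for (long mask = 0; mask < (1L << n); mask++) {
            boolean[] assignment = new boolean[n];
            for (int i = 0; i < n; i++) {
                assignment[i] = ((mask >> i) & 1L) == 1L;
            }
            if (constraintsHold.test(assignment)) {
                solutions.add(assignment);
            }
        }
        return solutions;
    }
}
```

The loop body runs 2^n times regardless of how restrictive the constraints are, which is the exponential cost noted above.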
Object-Oriented Type Systems
“A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute”
[Benjamin C. Pierce, 2002]
Traditional applications of type systems:
– enhance readability/understandability
– prove/guarantee that certain kinds of run-time errors will not occur during program execution (e.g., “message not understood”)
– foundation for abstractions & language features (e.g., module systems)
– enable optimizations (e.g., replace dynamic dispatch with direct call)
Some Terminology
type: set of objects that share properties (e.g., supported operations)
– in Java, there is a direct correspondence between types and classes and interfaces in the inheritance hierarchy
static typing: type information is explicit in the source code
– consistency checks can be performed by a compiler (type checking)
– note: some run-time checking may still be needed
type checking: checking certain consistency properties of programs that contain explicit type declarations
– to guarantee the absence of run-time errors
– a program that type-checks is (statically) type-correct
type inference
– in dynamically typed languages, types of expressions are inferred from their usage
– also used in statically typed languages for optimization (e.g., certain run-time checks may be proven obsolete through analysis)
type constraints
– formalism for expressing relationships between program expressions that must hold in order for a program to be type-correct
– used for type checking as well as for type inference
Observations
cannot update variable e1 because method getName() is called on e1, which is not declared in Billable
cannot update variable e2 because method getAddress() is called on e2, which is not declared in Billable
updating the return type of findEmployee() produces type mismatch in assignment to e2
updating the cast produces type mismatch in assignment to e1
Observations
Observations:
– type of v2 must be List, because of field access v2.size
– type of v3 must be List, because of assignment v2 = v3
– type of v8 must be List, because of call v8.sort()
– type of v4 must be List because it is passed as an argument to Client.sortList(), implying an assignment v8 = v4
– return type of Client.createList() must be List because of assignment v4 = Client.createList()
Conclusion:
– v0, v1, v5, v6, v7, v9, and the return types of List.add(), List.addAll(), Bag.add(), Bag.addAll() can be given type Bag
Conclusions & Future Work
customization: a technique for library-level optimizations
– use type constraints to determine where applicable
– use profile information to determine where useful
– use static analysis and profile information to select optimizations
strong results
– speedups up to 76.7% (18.8-24.1% on average)
– heap consumption reduced by up to 45.9% (11.9% on average)
– modest increase in application size (<12% on large applications)
future work:
– apply additional optimizations
– apply to additional library classes
– self-customizing classes
– incorporate into whole-program optimizers
  • e.g., Jax [Tip et al. 02], IBM WSDD SmartLinker
Detailed Speedup Results
Detailed Heap/Size Results
Implementation
implemented in Eclipse using existing refactoring framework [Baeumer et al. 01]
– Extract Interface
– Generalize Type
– Pull Up Members
– Push Down Members
determining type constraints nontrivial for several language features
– arrays
– member types (inner classes)
– exceptions
– overloading
Demonstration of Eclipse Refactoring Support
Basic Stuff:
– texthovers: JavaDoc
– ctrl-hover: Code + HyperLink
– Ctrl-T: hierarchy
– code completion
Rename Class
– remove ugly prefix: JX_RTA -> RTA
Extract Method
– method RTA.process() too long
– extract processCurrentCallSitesWrtProcessedClasses()
– estIterations()
– undo
– estIterations() with next line --- two return values
– convert local to field
– estIterations with next line OK now
Inline Method
– RTA.moveNewToCurrentClasses()
Inline Local Variable
– inline “callSite” in processCurrentCallSitesWrtProcessedClasses()
Extract Constant
– DONE_ESTIMATE at end of RTA.process()
Pull Up Members
– getIndex() in JX_MethodCallSite
Example
public class Employee {
  public String getName(){ return _name; }
  public String getAddress(){ return _address; }
  public int getRate(){ return _rate; }
  public boolean hasSpecialSkill(){ return _hasSpecialSkill; }
  private int _rate;
  private boolean _hasSpecialSkill;
  private String _name;
  private String _address;
}
public class TimeSheet {
  public double charge(Employee emp, int days){
    int base = emp.getRate() * days;
    if (emp.hasSpecialSkill()) return base * 1.05;
    else return base;
  }
}
Example taken from Fowler’s “Refactoring”, p.342
Example
public interface Billable {
  int getRate();
  boolean hasSpecialSkill();
}
public class Employee implements Billable {
  // contents of this class same as before
}
public class TimeSheet {
  public double charge(Billable emp, int days){
    int base = emp.getRate() * days;
    if (emp.hasSpecialSkill()) return base * 1.05;
    else return base;
  }
}
Example taken from Fowler’s “Refactoring”, p.342
But updating any of these references to Employee leads to compilation errors...
public class Personnel {
  public static Employee findEmployee(String name) throws NotFoundException {
    for (int t=0; t < employees.size(); t++){
      Employee e1 = (Employee)employees.elementAt(t);
      if (e1.getName().equals(name)) return e1;
    }
    throw new NotFoundException();
  }
  public static String findAddress(String name) throws NotFoundException {
    Employee e2 = findEmployee(name);
    return e2.getAddress();
  }
  private static Vector employees;
}
Context Inference
assume that allocation sites in a program are labeled
– distinct labels L1, ..., Lk for container-related allocation sites
– a single “blob” label ● used for all other allocation sites
– distinct label Lext represents collections created outside the application
for each method m, infer a set of contexts Contexts(m)
– each context represents a set of callers of a method
– identified by a list of labels, one for each parameter; e.g., [L1, L2, ●, ●]
for each expression E that occurs in the body of method m and each context α ∈ Contexts(m), infer a points-to set Objects_α(E)
– set of labels; e.g., Objects_α(E) = {L1, L2, L9, ●}
compute context-sensitive call graph
– compute for each pair <call-site, context>, a set of <method, context> pairs
– make conservative assumptions about entry point methods
Context Inference
we assume a given set of entry points
– e.g., all public methods
– to be specified by the user of the refactoring tool
conservative assumptions about objects bound to parameters of entry point methods
– depends on declared type of the parameter
conservative assumptions about calls to external methods for which source code is unavailable
use Class Hierarchy Analysis (CHA) [Grove et al. 95] to approximate behavior of dynamic dispatch
null constants, literals, primitive values modeled as objects
Auxiliary Definitions for Context Inference Rules
ExternalObjects(T): set of objects assumed to be bound to parameters of entry-point methods
ExternalObjects(T) = { Lext }     if T ≤ Collection
                     { ● }        if T is unrelated to Collection
                     { Lext, ● }  otherwise
SelectContexts(α, E0,...,Ek): construct contexts for call sites that occur in method m, for α ∈ Contexts(m)
SelectContexts(α, E0,...,Ek) = { [p0,...,pk] | pi ∈ Objects_α(Ei), 0 ≤ i ≤ k }
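SelectContexts is a Cartesian product over the points-to sets of the call's expressions, which the following sketch makes concrete. Labels are modeled as strings ("blob" stands in for ●), the points-to sets are passed in directly rather than looked up per context, and the names SelectContextsSketch, labels and selectContexts are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SelectContextsSketch {
    // Convenience helper for building a points-to set with stable iteration order.
    static Set<String> labels(String... names) {
        return new LinkedHashSet<>(Arrays.asList(names));
    }

    // objectsPerExpr.get(i) plays the role of Objects(Ei); each resulting context
    // is a list [p0,...,pk] with pi taken from the i-th set.
    static List<List<String>> selectContexts(List<Set<String>> objectsPerExpr) {
        List<List<String>> contexts = new ArrayList<>();
        contexts.add(new ArrayList<>()); // start from the single empty prefix
        for (Set<String> labelSet : objectsPerExpr) {
            List<List<String>> extended = new ArrayList<>();
            for (List<String> prefix : contexts) {
                for (String label : labelSet) {
                    List<String> ctx = new ArrayList<>(prefix);
                    ctx.add(label);
                    extended.add(ctx);
                }
            }
            contexts = extended;
        }
        return contexts;
    }
}
```

With Objects(E0) = {blob} and Objects(E1) = {L1, L2}, the sketch produces the two contexts [blob, L1] and [blob, L2]; the number of contexts is the product of the set sizes, which is why a low-cost variation of CPA matters in practice.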
Some of the Context Inference Rules
T0.m(T1,...,Tn) is an entry point, pi ∈ ExternalObjects(Ti), α = [p0,...,pn], 1 ≤ i ≤ n
--------------------------------------------------
α ∈ Contexts(T0.m(T1,...,Tn))   (C1)
pi ∈ Objects_α(Param(T0.m(T1,...,Tn), i))   (C2)

m contains assignment E1=E2, α ∈ Contexts(m)
--------------------------------------------------
Objects_α(E2) ⊆ Objects_α(E1)   (C3)

m contains call E0 ≡ new T^L(E1,...,En) to constructor m’, T ≤ Collection, α ∈ Contexts(m)
--------------------------------------------------
L ∈ Objects_α(E0)   (C4)

m contains call E0 ≡ new T^L(E1,...,En) to constructor m’, T ≰ Collection, α ∈ Contexts(m),
α’ ∈ SelectContexts(α, E0,...,En), 0 ≤ i ≤ n
--------------------------------------------------
α’ ∈ Contexts(m’)   (C5)
● ∈ Objects_α(E0)   (C6)
Objects_α(Ei) ⊆ Objects_α’(Param(m’,i))   (C7)
Constraint Generation
constraint generation rules similar to those used for generalization-related refactorings
– constraint variables annotated with subscript that identifies their “containing” context
– additional rules that model the behavior of operations on collections
constraint variable Elem(E) represents the element type of container objects in Objects(E)
– similar: Key(E), Value(E) type for Map-style collections
notation: NewType(T) denotes a parameterized version of type T with a fresh type variable
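Some of the Constraint Generation Rules
m contains assignment E1=E2, α ∈ Contexts(m)
--------------------------------------------------
|E2|_α ≤ |E1|_α   (B1)

m contains direct call E ≡ T.n(E1,...,Ek) to method m’, T ≰ Collection,
α ∈ Contexts(m), α’ ∈ SelectContexts(α, E1,...,Ek), E’i = Param(m’,i), 1 ≤ i ≤ k
--------------------------------------------------
|E|_α ≜ |m’|_α’   (B4)
|Ei|_α ≤ |E’i|_α’   (B5)

m contains call E0.add(E1) to method m’, α ∈ Contexts(m), Decl(m’) ≤ Collection
--------------------------------------------------
|E1|_α ∈ Types_α(E0)   (B16)

T ∈ Types_α(E)
--------------------------------------------------
T ≤ Elem_α(E)   (B24)

|E1|_α ≤ |E2|_α’
--------------------------------------------------
Elem_α(E1) = Elem_α’(E2)   (B27)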
Constraint Generation for new Expressions
m contains expression E0 ≡ new T(E1,...,Ek) to constructor m’, T ≰ Collection,
α ∈ Contexts(m), α’ ∈ SelectContexts(α, E0,...,Ek), E’i = Param(m’,i), 0 ≤ i ≤ k
--------------------------------------------------
|E0|_α ≜ T   (B2)
|Ei|_α ≤ |E’i|_α’   (B3)

m contains expression E0 ≡ new T(E1,...,Ek), T ≤ Collection,
α ∈ Contexts(m), T’ = NewType(T)
--------------------------------------------------
|E0|_α ≜ T’   (B14)
Code Generation
source code updating for a method m is trivial if there is one context for m, or if the types inferred for the expressions in m are the same in all contexts
if, for a given expression E in method m, different types are computed in different contexts for m, we attempt to introduce a type parameter for E
– need to determine which (if any) other expressions must have the same type as E
– a bound on a type parameter T of method m is needed if expressions of type T are constrained to be of a type X more specific than Object in some context of m
  • use a common upper bound of all such types X
in programs with failing casts, the type constraint system may not have a solution in a given context
– approach: merge all contexts for methods with failing casts, and continue solving (context-insensitive solution)
a down-cast (T)E is redundant if the inferred type for E is a subtype of T
– in all contexts for E
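The redundancy test for down-casts can be sketched with reflection-level types. Here the types inferred for E in each context are assumed to be available as java.lang.Class objects, which is a simplification of whatever representation the actual tool uses; the class RedundantCastSketch is hypothetical.

```java
import java.util.Collection;

class RedundantCastSketch {
    // A down-cast (T)E is redundant if, in every context, the type inferred
    // for E is a subtype of T. An empty collection trivially satisfies this.
    static boolean isRedundant(Class<?> castType, Collection<Class<?>> inferredTypesPerContext) {
        for (Class<?> inferred : inferredTypesPerContext) {
            // castType.isAssignableFrom(inferred) holds when inferred ≤ castType.
            if (!castType.isAssignableFrom(inferred)) {
                return false;
            }
        }
        return true;
    }
}
```

In the running example the cast (Float) in bar() is redundant once the only inferred type for v1.iterator().next() is Float, but it would have to stay if any context yielded Object.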