Refactoring with Synthesis - ETH Z · refactoring interface, in which a refactoring is achieved via...

Refactoring with Synthesis

Veselin RaychevETH Zurich

[email protected]

Max SchäferNanyang Technological University

[email protected]

Manu SridharanIBM T.J. Watson Research Center

[email protected]

Martin VechevETH Zurich

[email protected]

AbstractRefactoring has become an integral part of modern softwaredevelopment, with wide support in popular integrated devel-opment environments (IDEs). Modern IDEs provide a fixedset of supported refactorings, listed in a refactoring menu.But with IDEs supporting more and more refactorings, it isbecoming increasingly difficult for programmers to discoverand memorize all their names and meanings. Also, since theset of refactorings is hard-coded, if a programmer wants toachieve a slightly different code transformation, she has toeither apply a (possibly non-obvious) sequence of severalbuilt-in refactorings, or just perform the transformation byhand.

We propose a novel synthesis system which addressesthese limitations. Our system employs a recently proposedrefactoring interface, in which a refactoring is achieved viathree simple steps: the programmer first indicates the startof a code refactoring phase; then she performs some of thedesired code changes manually; and finally, she asks the toolto complete the refactoring.

Given the initial and modified programs, our synthesissystem completes the refactoring by first extracting the dif-ference between the starting program and the modified ver-sion, and then synthesizing a sequence of refactorings thatachieves (at least) the desired changes. To enable scalablesynthesis, we introduce local refactorings, which allow forfirst discovering a refactoring sequence on small programfragments and then extrapolating it to a full refactoring se-quence.

Permission to make digital or hard copies of all or part of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full citationon the first page. Copyrights for components of this work owned by others than ACMmust be honored. Abstracting with credit is permitted. To copy otherwise, or republish,to post on servers or to redistribute to lists, requires prior specific permission and/or afee. Request permissions from [email protected] ’13, October 29–31, 2013, Indianapolis, Indiana, USA.Copyright c© 2013 ACM 978-1-4503-2374-1/13/10. . . $15.00.http://dx.doi.org/10.1145/2509136.2509544

We implemented our approach as an Eclipse plug-in, withan architecture that is easily extendable with new refactor-ings. The experimental results are encouraging: with onlyminimal user input, the synthesizer was able to quickly dis-cover complex refactoring sequences for several challengingrealistic examples.

Categories and Subject Descriptors D.2.6 [ProgrammingEnvironments]: Integrated environments; Eclipse

Keywords Refactoring; Synthesis

1. IntroductionSoftware refactoring improves the structure of existing codethrough behavior-preserving transformations, themselvescalled refactorings. Since it was first formally describedtwenty years ago [8, 22], refactoring has become an im-portant part of modern software development, and is a stapleof agile software development techniques such as ExtremeProgramming [2].

While originally conceived as a manual activity, it did nottake long for the first refactoring tools to appear [26] that of-fered support for automating simple refactoring transforma-tions. Since then refactoring support has become de rigueurin interactive development environments (IDEs) for object-oriented languages. In particular, all major Java IDEs suchas Eclipse [4], IntelliJ IDEA [12] and NetBeans [21] comewith built-in support for many refactorings.

Recent studies, however, have shown that these refactor-ing tools are severely underused [19, 36]: while program-mers do refactor frequently, they perform up to 90% of refac-torings by hand, even where tool support is available. Onemajor issue identified by these studies is poor discoverabil-ity: in order to initiate an automated refactoring, the pro-grammer has to select it by name from a menu1 or use acorresponding keyboard shortcut, requiring the programmerto not only know which refactoring transformations the IDE

1 In Eclipse 4.2, this menu has 23 entries.

supports but also to memorize their names. A second obsta-cle is the complexity of the user interface for some refactor-ings, which are controlled by complex configuration dialogsthat disrupt the programmer’s workflow.

To alleviate these problems, several researchers have pro-posed novel user interface paradigms for refactoring tools:Lee et al. [16] present a system where refactorings are initi-ated by drag-and-drop gestures, whereas BeneFactor [7] andWitchDoctor [5] are tools for refactoring “autocompletion”that observe a programmer’s editing operations and try todiscover editing patterns suggestive of refactorings. Whensuch a pattern is discovered, the programmer is offered thechoice of completing the refactoring task using the built-inrefactoring tool.

These tools make it easier to apply individual refactoringsalready supported by the IDE. However, many very natu-ral refactoring transformations do not straightforwardly mapto these built-in refactorings. For example, the EXTRACTMETHOD refactoring implementations of present-day IDEsprovide no control over the set of parameters to be pro-vided by the extracted method. As we shall discuss furtherin Section 2, this means that programmers sometimes haveto perform a non-trivial sequence of auxiliary refactoringsin order for the method extraction to yield the required re-sult. This is tedious, particularly if some refactoring latein the sequence fails unexpectedly (e.g., due to a violatedpre-condition) and the programmer has to undo the previ-ous steps one by one. Even discovering the right sequence ofrefactorings to perform is often non-trivial, and the program-mer may ultimately fall back to performing the refactoringby hand.

To address this, we propose to combine a user interfacein the style of Steimann et al.’s ad hoc refactorings [32]with a novel synthesis engine based on synthesis from ex-amples [9]. As a result, a developer performs an automatedrefactoring using the following process:

1. Click a “Start Refactoring” button, thereby indicating theprogram Pi whose semantics should be preserved by therefactoring.

2. Manually perform a few of the edits for the desired refac-toring, resulting in a program Pm.

3. Click a “Complete” button, at which point the tool takesas input the two examples Pi and Pm and attempts todiscover a sequence of automated refactorings that, whenapplied to Pi, yields a program that includes the editsintroduced in Pm over Pi.

This refactoring interface, which we have implementedin an Eclipse plugin called RESYNTH, provides a number ofadvantages over traditional approaches where the user needsto figure out which (sequence of) refactorings to invoke:

• The user is freed from remembering a different menuitem or keyboard shortcut for each supported refactoring;

instead, all refactorings are exposed through a simple,unified interface.2

• Applying a transformation requiring a sequence of refac-torings becomes much easier, as the tool discovers the(possibly non-obvious) sequence for the user. Also, thetool ensures that all refactorings in the sequence will suc-ceed before applying any of them, removing the burdenof having to undo earlier refactorings if a later one fails.

• Similarly, performing a set of related refactorings wherethe order is unimportant (e.g., renaming both a field andits accessor methods) is simplified, since the refactoringsare applied as a unit.

• For a refactoring developer, adding a new refactoring nolonger requires cluttering the user interface with anothermenu item.

The heart of RESYNTH is a search strategy for discover-ing appropriate refactoring sequences based on a small num-ber of user edits. A naïve brute-force search is ineffectivefor even the smallest programs. Instead, RESYNTH takes thefollowing approach:

1. Program entities that were not edited by the user arediscarded to narrow the search space.

2. Over this pruned program, an A∗ search [11] is per-formed to discover refactoring sequences, guided by aheuristic function that minimizes edit distance and ex-pression distance from the user edits.

3. After a solution is discovered that works for the prunedprogram, RESYNTH attempts to execute the discoveredsequences of Eclipse refactorings on the full program.

While BeneFactor and WitchDoctor also infer refactor-ings from user edits, they differ from our system in that (1)they do not require the user to indicate a refactoring is occur-ring and (2) they cannot perform transformations requiringa sequence of refactorings. While (1) can be an advantagefor novice users, requiring the user to indicate the start ofthe refactoring enables the tool to discover more complexsequences, and allows for performing several independentrefactorings “atomically.”

RESYNTH is architected in a manner that eases the pro-cess of adding new refactorings. Beyond standard refactor-ing functionality, RESYNTH only requires that each refactor-ing provide a successors function to enumerate the possibleways to apply the refactoring to a given pruned program,thereby defining the search space. Adding a successors

function is straightforward in most cases, and otherwiserefactoring implementors need not be concerned about de-tails of the refactoring search.

To assess the effectiveness of our search strategy, weevaluated RESYNTH on a set of synthetic benchmarks and

2 Of course, the alternative interface could also be used in conjunction withstandard menu items.

on real example refactorings collected from Stack Overflow3

and other sources. We show that our search techniques aresignificantly more effective than alternate approaches, andcan handle many challenging real-world examples.

Main Contributions The contributions of this paper are:

• A new synthesis refactoring system, where the user indi-cates the desired transformation with example edits andthe tool synthesizes a sequence of refactorings that in-clude the edits.

• A novel technique for synthesizing refactoring sequencesvia heuristic search over pruned programs.

• An implementation RESYNTH, whose architecture mini-mizes the effort required to add new refactorings.

• An initial evaluation of RESYNTH, showing it can syn-thesize complex refactoring sequences for real examples.

Our paper is organized as follows. Section 2 gives a de-tailed example to motivate our techniques. Then, Section 3presents our core search techniques in detail. Section 4 de-tails the design and implementation of our tool RESYNTH,and Section 5 presents our experimental evaluation. Finally,Section 6 discusses related work, and Section 7 concludes.

2. OverviewModern Java IDEs such as Eclipse, NetBeans or IntelliJIDEA offer a large number of built-in refactorings thatcan be activated through a menu or a keyboard shortcut.The precise set of supported refactorings differs betweenIDEs, but usually includes a set of core refactorings suchas EXTRACT METHOD or ENCAPSULATE FIELD, many ofthem originally implemented in the Smalltalk RefactoringBrowser [26].

This set of refactorings, however, is fixed, and does not di-rectly accommodate some fairly simple transformations. Asan example, consider the method extraction problem shownin Fig. 1, which is taken from Fowler’s well-known book onrefactoring [6]. The original program, a fragment of a classrepresenting an account, is given on the left. As shown in therefactored program on the right, we want to extract lines 6and 7 into a new method printDetails(). Crucially, how-ever, the expression getOutstanding() on line 8 should notbe extracted into the new method, but passed as a parameterinstead.

This refactoring cannot be performed in one step usingthe refactoring tools of Eclipse, NetBeans or IntelliJ, sincetheir implementations of EXTRACT METHOD do not permitthe programmer to control the set of parameters of the ex-tracted method, which is instead determined automaticallyby examining the program’s data flow.

The transformation can be achieved by composing tworefactorings. First, the two statements including the ex-

3 http://stackoverflow.com

pression to be passed as a parameter are extracted into aprintDetails() method:

40 private void printDetails() {

41 System.out.println("name: " + name);

42 System.out.println("outstanding: " +

43 getOutstanding());

44 }

Then, the INTRODUCE PARAMETER refactoring is appliedto the getOutstanding() call, yielding the desired result.However, INTRODUCE PARAMETER is not widely known [36],and, at least in Eclipse, its implementation is buggier thanmore popular refactorings.

Another alternative is to decompose the refactoring intothree steps. First, we extract the expression to be passedas a parameter into a local variable outstanding insideprintOwing(), as follows:

45 void printOwing() {

46 printBanner();

47 double outstanding = getOutstanding();



50 outstanding);

51 }

Then, we extract lines 48–50 into the printDetails()

method. Since outstanding is used by, but not defined in,the code to be extracted, it will be turned into a parameter toprintDetails(), as desired. Finally, we can apply INLINELOCAL to outstanding, achieving the desired program fromFigure 1(b).4

Vakilian et al. [35] found that this latter sequence of refac-torings is quite frequently performed in practice, suggestingthat programmers often need to perform complex refactor-ings that exceed the capabilities of current refactoring en-gines, forcing them to manually compose several refactor-ings in order to achieve a single transformation.

While EXTRACT LOCAL is more widely known thanINTRODUCE PARAMETER [36], its use in this situation ishighly non-obvious and indeed somewhat counter-intuitive,since its effect is later partially undone by an inlining. Thisrequires the programmer to plan ahead to compensate forlimitations of the refactoring tool.

Abadi et al. [1] conducted a case study using Eclipse’sbuilt-in refactorings to conduct a complex refactoring, andfound that only three out of 13 uses of EXTRACT METHODcould be automatically performed by Eclipse. They suggestimproving Eclipse’s implementation of EXTRACT METHODto provide more fine-grained control over the code to extractand the parameters of the extracted method, but recent re-search [35] suggests that such an improved (and invariably

4 In fact, this approach does not quite work in either Eclipse, NetBeans orIntelliJ, since their implementations of EXTRACT LOCAL always put thedeclaration of the extracted local variable on the line immediately precedingthe extraction site, so outstanding ends up after line 48 and has to bemoved manually before the refactoring can continue.

1 public class Account {

2 private String name;

3


5 printBanner();



8 getOutstanding());

9 }

10

11 private double getOutstanding() {

12 //...

13 }

14

15 private void printBanner() {

16 //...

17 }

18 }

19 public class Account {

20 private String name;

21


23 printBanner();

24 printDetails( getOutstanding());

25 }

26

27 private void printDetails(double outstanding) {


29 System.out.println("amount: " + outstanding);

30 }

31

32 private double getOutstanding() {

33 //...

34 }

35

36 private void printBanner() {

37 //...

38 }

39 }

(a) (b)

Figure 1. Example of a complicated refactoring that cannot be achieved in a single step in current IDEs; changes highlighted.

much more complex) refactoring tool would not necessarilybe very popular, since programmers seem to favor compo-sition (of several simple refactorings) over configuration (ofone complex refactoring).

Our approach We propose a solution to this problem con-sisting of two components. We adopt a recently proposedinterface [32] where to achieve a refactoring, a programmersimply performs some of the desired changes by hand, yield-ing an initial and modified program. Given these two pro-grams, our tool employs a novel synthesis engine that au-tomatically searches for a suitable sequence of refactoringsthat performs (at least) those changes.

With our approach, a programmer could achieve therefactoring of Fig. 1 quite easily. After indicating to thetool that a refactoring is beginning, the user would just man-ually replace lines 6 and 7 with the desired method callprintDetails(getOutstanding()). At this point, of course,the program does not compile any more, since the methodprintDetails has not been defined yet. But, based on thissimple edit, our tool RESYNTH can determine the sequenceof refactorings needed to refactor the original program intothe form in Fig. 1(b). Note that existing tools like Bene-Factor [7] and WitchDoctor [5] cannot handle this examplebased on the edit above: they require the user to know asequence of refactorings to apply.

The same interface can, of course, also be used for trans-formations that can be achieved by a single refactoring (dis-cussed further in Section 5). In such cases, RESYNTH has

the advantage of providing a unified interface that frees theprogrammer from having to memorize refactoring namesand their associated keyboard shortcuts or menu items.

There are many real-world examples besides that ofFigure 1 in which a desired transformation can be ac-complished by applying a non-obvious sequence of basicrefactorings. For instance, previous work [25, 32] has dis-cussed swapping two names, e.g., transforming field declara-tions String s1; String s2; to String s2; String s1;,changing uses of s1 and s2 appropriately. One cannot applythe RENAME refactoring to first change s1 to s2 or vice-versa, since a name conflict arises. With our approach, onesimply swaps the names in the field declarations manually,and the tool discovers a sequence of three RENAME refactor-ings to complete the transformation: s1 to tmp, s2 to s1, andfinally tmp to s2. As with the previous example, this refactor-ing sequence is non-obvious, but with RESYNTH, the useris freed from worrying about the details. In Section 5, weshow that RESYNTH was able to handle several challengingreal-world examples, including these.

3. ApproachIn this section, we discuss our approach to synthesizingrefactoring sequences in detail. We formalize the problem,introduce the notion of local refactoring, describe the syn-thesis algorithm, and prove some of its key properties.

3.1 SettingA refactoring r ∈ R = Prog × PS ⇀ Prog is a partialfunction which takes as input a program and a sequence ofparameters and returns a transformed program. The functionis partial because the input program may not satisfy pre-conditions required for the transformation to be behavior-preserving. The set of parameters varies among refactorings.For example, a RENAME refactoring takes two parameters:a program entity e to be renamed and the new name for e.The types of parameters are not important here; they will bediscussed further in Section 4. An invocation of a refactoringfunction is denoted as r(P, ps) where each argument in psbelongs to PS.

Sequence of Refactorings We often use the term “se-quence of refactorings” to stand for a sequence of invoca-tions of refactoring functions. Formally, a finite sequence ofrefactorings rn(P ) of length n > 0 invoked with particulararguments is defined as:

rn(P, ps1, ..., psn) = rn(...(r2(r1(P, ps1), ps2)..., psn)

The problem addressed by our synthesis procedure cannow be stated informally as follows:

Given an initial program Pi and a modified programPm, the goal of the synthesizer is to discover a se-quence of refactorings Pf = rn(Pi, ps1, ..., psn)(for some n > 0) that preserves all changes intro-duced in Pm over Pi.

To solve this problem, a synthesizer must find both: i)which refactorings to use and ii) which arguments to pass tothe refactorings of step i).

Key Challenge A naïve approach to the above problemwould be to apply refactorings in sequence directly to Pi,searching for a sequence containing the changes in Pm.Unfortunately, this approach performs poorly, due to theneed to repeatedly transform and check preconditions on theentire input program. We observed that applying a singleEclipse refactoring to even a small program took roughlyhalf a second (consistent with previous work [5]), making asearch among thousands of refactoring sequences infeasiblein practice.

While in principle the Eclipse refactorings could be fur-ther optimized, precondition checking for many refactoringssuch as renaming is an inherently global problem that needsto take the whole program (including external libraries it de-pends on) into account, thus limiting the potential speedup.Furthermore, given the current architecture of Eclipse’srefactoring engine, applying two refactorings in a row in-variably entails changing the program at a textual level andreparsing affected compilation units, which would furtherslow down search.

The key challenge, then, is how to speed the search pro-cess sufficiently to discover realistic refactoring sequencesin reasonable amounts of time.

Solution Outline Next, we discuss our solution. We firstdiscuss how to capture changes between two programs.Then, we discuss local refactorings, an important compo-nent of our approach. A local refactoring restricts a refactor-ing to a program fragment, enabling the search to consideronly the changed parts of a program instead of an entireprogram, which is key to scalability. We then present oursearch algorithm, based on local refactorings, and finally wediscuss the overall guarantees provided by the synthesizer.

3.2 Capturing Program ChangeWe define T (P ) to be the abstract syntax tree (AST) ofprogram P . For programs with multiple files, we create onetree with a root node that joins the ASTs of all files. Giventrees T1 and T2, we define the following tree operations:

• T1 \ T2 is a tree where each full path in T1 which doesnot occur in T2 is kept.

• T1 ⊆ T2 is true iff every rooted path in T1 is a rootedpath in T2.

A rooted path is a path from the root of tree T to any nodein T , and a full path is a rooted path which ends in a leaf.

A program change is a pair (ci, cm) where ci is a subtreeof T (Pi) and cm is a subtree of T (Pm). Intuitively, (ci, cm)means that the tree ci from T (Pi) was changed to the treecm of T (Pm). In terms of the previously-defined tree opera-tions, (ci, cm) = (T (Pi) \ T (Pm), T (Pm) \ T (Pi)).

A program Pf is said to preserve a change (ci, cm) ifcm ⊆ T (Pf ). Note that ci and cm need not be syntactically-valid program ASTs since they may be missing subtrees thatdid not change between Pi and Pm.

Example Consider the two expressions Pi = x∗y+7 andPm = f() + 7, whose ASTs are shown in Fig. 3. As definedabove, ci is the tree x∗y+ and cm is the tree f()+, illustratedas dashed circles in the figure. In this example, neither ci norcm are well-formed ASTs, since they miss the right-handoperand of +, which is identical in both programs.

In what follows, we write P for T (P ) to avoid notationalclutter. So for instance, a change (ci, cm) is written as (Pi \Pm, Pm \ Pi) and preservation as cm ⊆ Pf .

Now that we have defined the notion of a programchange, we can re-state our goal formally:

Given an initial program Pi and a modified programPm, the goal of the synthesizer is to discover a se-quence of refactoring invocations (for some n > 0)Pf = rn(Pi, ps1, ..., psn) such that cm ⊆ Pf .

3.3 Local RefactoringThe concept of a local refactoring is key to the practicalityof our synthesis approach. A local refactoring enjoys the

Pi

ci cm

Pm

r 𝑛

Pi

ci cm

Pm Pf

Pi

ci cm

Pm Pi Pm

0. Initial Input 1. Extract Change 2. Synthesize Local Refactoring Sequence

3. Extrapolate to a Sequence of Full Refactorings r

𝑛 r 𝑛

r 𝑛

r 𝑛

Figure 2. Synthesis Steps

*

+

AST of x * y + 7

x y 7

f()

+

AST of f() + 7

7

ci

cm

Figure 3. Two ASTs and the change (ci, cm) between them. Thechange is captured with dotted lines.

following benefits: i) it can operate on a small portion ofthe program instead of the entire program; ii) it can performfewer pre-condition checks that those performed by a fullrefactoring; iii) it works directly on trees and hence does notneed to parse and generate code.

For a given refactoring r, we denote a corresponding localrefactoring as rl. A local refactoring rl is defined as follows:

rl : Tree× PS ⇀ Tree

That is, rl is a partial function which takes as input atree, a sequence of parameters and returns a tree. Similarlyto the full refactoring r, the function is partial because theinput may not always satisfy certain pre-conditions requiredfor the function to fire. The difference from the definitionof a full refactoring is the use of trees instead of programs.Sequences of local refactorings are defined similarly to se-quences of full refactorings defined earlier.

Extraction Function Given an invocation of a local refac-toring, we often need to obtain a corresponding invocation ofa full refactoring. Therefore, for each local refactoring func-tion rl, we associate a corresponding extraction function:

µrl : PS → PS

Then, given a local refactoring invocation rl(t, ps), thefunction µrl(ps) computes a new sequence of arguments ps′to be passed to the full refactoring r. Typically µrl is theidentity function, but we allow each implementation of alocal refactoring to decide what function it needs.

For a local refactoring to be useful in synthesis, it needsto satisfy the following condition w.r.t a full refactoring.

Definition 3.1 (Correct Local Refactoring). A local refac-toring rl is correct w.r.t its corresponding full refactoring riff ∀P ∈ Prog, f ∈ Tree, ps ∈ PS:

f ⊆ P ∧ f ′ = rl(f, ps) ∧ P ′ = r(P, µrl(ps))⇒ f ′ ⊆ P ′

P

f f’

P’

r

r

Figure 4. Lo-cal refactoringcorrectness

Fig. 4 gives the intuition for thedefinition: a local refactoring is cor-rect w.r.t. a full refactoring unlessboth are enabled and the result ofthe local refactoring is not includedin the result of the full one. Notethat depending on the definition ofthe extraction function µ there couldbe many different local refactoringsrl that are correct with respect to itscorresponding full refactoring r.

3.4 Modular SynthesisWe next describe the three steps of the synthesis process.These steps are also illustrated in Fig. 2.

Step 1: Extract Change The result of this step is a programchange (ci, cm) = (Pi \ Pm, Pm \ Pi) (as defined in Sec-tion 3.2).

Since we need only the change, we do not need to ex-plicitly construct the entire ASTs Pi and Pm. Indeed, if wecompare the trees of Pi and Pm and eliminate the commonparts between them, this would eliminates classes, meth-ods, type definitions, statements and expressions that remainunchanged between the two programs. This means that forlarge programs, we need not compare unmodified compila-tion units. Then, we construct the ASTs P ′i and P ′m whichinclude only the modified compilation units and compute theprogram change by using the formula (P ′i \ P ′m, P ′m \ P ′i ).

Step 2: Synthesize Local Refactoring Sequence We nextdescribe how we compute a local refactoring sequence. Thegoal of this step can be stated as follows:

Given a program change (ci, cm), the goal of the syn-thesizer is to discover a sequence of local refactoring

invocations t = rnl (ci, ps1, ..., psn) (where n > 0)such that cm = t.

We assume that at each step in the search, there is a finiteset of (refactoring, input) pairs that apply to the current tree.Some work may be required to compute this finite set, asrefactorings can have unbounded inputs (like the new nameinput for RENAME); we describe how our implementationhandles this issue in Section 4.

A naïve solution to our search problem would be to usebreadth-first search, which will find the shortest possiblerefactoring sequence (assuming the aformentioned finitiza-tion). However, our experiments indicate that such a searchrarely scales beyond sequences of more than four refactor-ings, as the search space grows exponentially in the lengthof the desired sequence.

Instead, we employ an A∗-based search [11]. A∗ itera-tively computes a distance function d from the initial tree cito every other generated tree. Our distance function is sim-ply the length of the current (partial) refactoring sequence.To speed up the search, A∗ also uses a heuristic function hthat estimates the distance from each tree to the target treecm. At every step, the tree t with minimal d(ci, t) + h(t) isprocessed. The “successors” of t, i.e., the results of applyingall possible local refactorings to t (a finite set, as discussedabove), are added to the set of candidates, and the searchcontinues.

Heuristic functions First, we define a correct heuristicfunction.

Definition 3.2 (Correct Heuristic Function). A heuristicfunction is correct if it satisfies the following properties:

• ∀t ∈ Tree, h(t) ≥ 0, and• If t = cm then h(t) = 0

We propose the following correct heuristic functions:

1. h1(t) is the edit distance between t and cm: this is theminimum number of node renames, leaf node inserts orleaf node deletes required to get to the tree cm from thetree t.

2. h2(t) is the expression distance between t and cm: thisis the number of program expressions present in one ofthe trees which are not present in the other one. Thisfunction ignores parts of programs which are not validexpressions. For example the statement x=y+1 consists ofthe expressions x=y+1, x, y+1, y and 1. If t consists ofthe statement x=y+1 and cm of x=z+1, then h2(t) = 6,because t contains x=y+1, y+1 and y which are not presentin cm, and cm contains x=z+1, z+1 and z which are notpresent in t.

A consistent heuristic function is a correct function whichalso satisfies the property ∀t1, t2, h(t1) ≤ d(t1, t2) + h(t2).If a consistent heuristic function is provided, the A∗ algo-rithm will always find the shortest possible sequences [11].

However, building such a heuristic function is dependenton the available local refactorings and hence would requirechange each time a new local refactoring is added.

The heuristic function that we choose is not consistent,but is correct, which is sufficient for our optimality guaran-tees (discussed later) and does not affect the correctness ofthe produced local sequence. The heuristic function is usedonly to improve the speed of the search and decrease thenumber of explored trees. In our implementation, we con-structed several heuristic functions by building different lin-ear combinations h(t) = a1h1(t) + a2h2(t) and tuning thecorresponding constants a1 and a2. Section 5 presents resultswith different choices for a1 and a2.

Step 3: Extrapolate to a Sequence of Full RefactoringsOnce a local refactoring sequence rnl (ci, ps1, ..., psn) iscomputed, the final step is to extrapolate from that sequenceand obtain a sequence of full refactorings. The sequenceof full refactorings is obtained by applying the extractionfunction to each local refactoring in the local refactoring se-quence. That is, for the full refactoring sequence we obtainrn(Pi, µrl1

(ps1), ..., µrln(psn)). However, rn may not ac-

tually be feasible, since local refactorings may ignore someof the pre-conditions which are checked in the sequence offull refactorings. If the sequence of full refactorings is infea-sible, our algorithm searches for a different local refactoringsequence and repeats the process. Otherwise, we obtain thedesired program Pf (from the sequence of full refactorings).

3.5 GuaranteesWe next discuss the two main guarantees provided by ourapproach. First we show that the synthesizer produces arefactoring sequence which satisfies our objective (that is,correctness). Further, we also prove a property of optimality.

The fact that our synthesizer achieves the correctnessobjective is also illustrated in Step 3 of Fig. 2, that is, withcm ⊆ Pf . We prove this below.

Lemma 3.3 (Correctness of Synthesis). For any sequenceof full refactorings Pf = rn(Pi, ps1, ..., psn) produced bythe synthesizer, cm ⊆ Pf .

Proof. Let rnl (ci, . . .) be the local refactoring sequence pro-duced in Step 2 of the algorithm, from which the given fullrefactoring sequence rn(Pi, . . .) was obtained. Here we donot list the arguments to avoid clutter and use . . . instead.

The proof proceeds by induction. For the inductionhypothesis assume that for some k < n, rkl (ci, . . .) ⊆rk(Pi, . . .) where the sequence rk(Pi, . . .) is obtained fromthe sequence rkl (ci, . . .). We need to prove that the treerk+1l (ci, . . .) ⊆ rk+1(Pi, . . .). From the requirement that

each local refactoring is correct (Definition 3.1) it fol-lows that rlk+1

(rkl (ci, . . .), . . .) ⊆ rk+1(rk(Pi, . . .), . . .).

Then, using induction we have proven that rnl (ci, . . .) ⊆rn(Pi, . . .). Step 2 of the local refactoring synthesis termi-

nates only if rnl (ci, . . .) = cm. Hence, by substitution itfollows that cm ⊆ rn(Pi, . . .) proving our objective.

Next, we define a notion of optimality:

Definition 3.4 (Optimality). Pf = rn(P, ps1, ..., psn) if:

• n = 1 and P 6= Pf , or• n > 1 and for all i, 0 ≤ i < n− 1, ro ∈ R, pso ∈ PS, it

is the case that ro(ri(P, ps1, ..., psi), pso) 6= Pf .

Intuitively, the above definition of optimality says thatwe cannot replace a suffix of the refactoring sequence withanother refactoring and obtain the same result. For example,a sequence of two rename refactorings where the first onerenames method A to B and the second one renames B toC is not optimal, because that sequence can be replacedby a single refactoring that renames A directly to C. Aconsequence of the above definition is that if a sequence isoptimal and correct, it means that no prefix of the sequenceis correct. For example, if the sequence of refactorings a ·b ·cis optimal and correct, it means that neither a nor a · b cansolve the problem, that is, no shorter correct prefix exists.

Lemma 3.5 (Optimality of Synthesis). A refactoring se-quence rn(P, ps1, ..., psn) produced by the synthesizer isoptimal if for all i, 1 ≤ i < n, µrli

is a bijection.

Proof. First, we will show that the local refactoring se-quence produced by A∗ is optimal. If the sequence cm =rnl (ci, ps1, ..., psn) was not optimal, then ∃j, 1 ≤ j < n− 1and a local refactoring rlo with parameters pso ∈ PS,such that rlo(r

jl (ci, ps1, ..., psj), pso) = cm. Let cP =

rjl (ci, ps1, ..., psj). Because cm = rlo(cP , pso), cm is asuccessor of cP . cP was processed in A∗ before cm and itscomputed distance function is d(ci, cP ) = j. Because allsuccessors of cP are added when cP is processed and cmis a successor of cP , then n = d(ci, cm) ≤ j + 1, whichmeans j ≥ n − 1, which contradicts the non-optimality ofrnl (ci, ...).

Next, assume that rn(P, ps1, ..., psn) is produced byusing µrli

(i ∈ [1, n]) from rnl (ci, ps1, ..., psn) and isnot optimal. This means that ∃j, 1 ≤ j < n − 1 anda refactoring ro ∈ R with parameters pso ∈ PS, suchthat ro(rj(P, ps1, ..., psj), pso) = Pf . Because for alli, 1 ≤ i < n, µrli

is a bijection, we must be able to ob-tain ro(r

j(P, ps1, ..., psj), pso) from a local sequencerlo(r

jl (ci, ps1, ..., psj), pso) = cm, but this contradicts to

the optimality of rnl (ci, ps1, ..., psn).

Our optimality guarantee shows that the sequence ofrefactorings which we produce cannot be trivially shortened.This is important as it ensures that we do not produce redun-dant or unnecessary steps and that our diverse solutions arenot trivially reducible to the same solution.

Example Let us illustrate the steps described in Fig. 2 onthe simple example in Fig. 5. For readability, the examplecontains only a single refactoring. Here, the user providesan initial program Pi that computes the area of a triangleusing Heron’s formula. The variable T stores the perimetersdivided by two and then the square of the area is computedin s. Assume that the user wants to rename the variable T top by changing some places where T is mentioned to p. Thechanges are in the program Pm (Pm does not compile).

Given Pi and Pm, in step 1 the algorithm extracts thechange and produces the two trees ci and cm. In step 2, alocal refactoring sequence is synthesized. In this example,the sequence consists of a single local rename. That is, onlythe occurrences T in the tree ci have their names changed to p

resulting in the tree cm. In step 3, the discovered local refac-toring sequence is applied on the full program. As a result,Pf is a program where all occurrences of T are renamed top. Note that only in this step, we can check if the rename isactually feasible. For instance, had the name p already beenused as a name of another variable, then this step would fail.

When the rename refactoring is implemented as a localrefactoring, it does not perform the check that the new name(e.g. p in our example) does not conflict with names outsideof the local tree, but this check is performed in the full refac-toring. Indeed, if we find a local refactoring sequence andits corresponding full refactoring sequence succeeds, then asshown earlier, the edits performed by the user are preserved(Lemma 3.3) and that the produced sequence cannot be triv-ially shortened (Lemma 3.5).

4. ImplementationWe have implemented the approach from Section 3 in a toolcalled RESYNTH. The tool is built as a plug-in for Eclipse4.2 [4]. Here we discuss how we designed RESYNTH tominimize the work needed to add new refactorings to thetool, and we present other salient implementation details.

4.1 ArchitectureRESYNTH consists of a core refactoring search engine thatinvokes individual automated refactorings through a simple,uniform API. Beyond the standard functionality required ofan automated refactoring (i.e., the ability to apply the refac-toring to a program and detect pre-condition violations),RESYNTH also requires that each full refactoring imple-ments its corresponding local refactoring as described ear-lier. Recall that each local refactoring takes a number of ar-guments as input. Conceptually, having an external searchprocess choose those arguments violates modularity, as thesearch engine would need to know the meaning of each in-dividual refactoring (in order to chose suitable arguments).Therefore, we require the developer to define a successors

function associated with that local refactoring. The searchengine simply invokes the function successors on all avail-able refactorings to enumerate the search space.

Pi :float T, s;

T = (a+b+c)/2

s = T*(T-a)*(T-b)*(T-c);

return Math.sqrt(s);

ci :=T*(T-)*

Pm :float T, s;

T = (a+b+c)/2

s = p*(p-a)*(T-b)*(T-c);


Pf :float p, s;

p = (a+b+c)/2

s = p*(p-a)*(p-b)*(p-c);


cm :=p*(p-)*

0. initial input: user does partial rename

3. perform the sequence (rename T to p) on the full program

2. synthesize sequence: local rename T to p

1. compute ci = Pi \ Pm ⇒ ci ⊆ Pi and cm = Pm \ Pi ⇒ cm ⊆ Pm

cm ⊆ Pf by Lemma 3.3

Figure 5. Example of synthesizing a refactoring sequence. Initially (stage 0), the user performs part of the rename (the userchange is highlighted in both programs). Then ci and cm are computed (stage 1). Then, a sequence of one local rename isdiscovered (stage 2). Finally, the rename is applied to the full program (stage 3).

Similar to a local refactoring, the successors functiontakes as input a tree. However, it produces a set of trees asoutput, by invoking the corresponding local refactoring witha set of possible arguments on the input tree. More formally,the successors function takes two parameters as input: i)the tree Ti representing the current local search state to beexplored from, and ii) the target tree cm (see Section 3.2).Given these parameters, successors must return a finite setof pairs (To, args), where To is a successor tree obtained byapplying a local refactoring to Ti with arguments args (e.g.,the original and new names for a RENAME refactoring).We keep the set of arguments args in the returned pair inorder to compute the full local refactoring sequence whenthe target tree cm is reached.

The minimal additional functionality that RESYNTH re-quires of local refactorings makes adding new local refactor-ings to RESYNTH relatively easy. In particular, refactoringimplementors need not concern themselves with details ofhow the search space is explored; this aspect is handled en-tirely within the core engine, using the techniques describedin Section 3.4. Instead, they only need to specify what is thespace to be searched, via the successors function.

Certain decisions made within the successors functionof each refactoring may require some tuning. For refactor-ings with unbounded inputs (e.g., RENAME, which can takean arbitrary string as a new name), there can be an infinitenumber of successors for a given tree. In such cases, therefactoring implementor must decide what finite subset ofthe possible successors should be returned by successors

(discussion of how our implemented refactorings handlethis issue is described in Section 4.2). The amount of pre-condition checking to be performed within successors isalso left to the refactoring implementor. In our experience,we found that performing aggressive pre-condition checkingduring successor computation was worthwhile, to reduce thesize of the search space and to discover violations cheaplyduring local refactoring whenever possible.

To implement local refactorings efficiently, the input treesto successors include symbol information for all names,e.g., whether a name references a local variable, a field, amethod, or some other entity. This information is useful dur-ing local pre-condition checking inside refactorings, e.g., todetect name conflicts. The initial information is computedfrom program Pi, and each refactoring must preserve the in-formation appropriately for each successor tree it produces.

To further simplify the task of implementing successor

functions, we provide utility functions that operate on treesand perform the following basic modifications:

1. Replace all occurences of a given symbol by a new tree.

2. Replace a node of the tree by a new subtree.

3. Insert a new statement in a tree.

4. Delete a statement from a tree.

Note that our trees are immutable, so these modifications ac-tually produce new trees. These utilities were used across therefactorings we have implemented thus far, and we believethey will be useful for implementing other refactorings aswell.

We now briefly compare our architecture to that of Witch-Doctor [5]. Like RESYNTH, WitchDoctor also aims to makeit easy to integrate new Eclipse refactorings into its search.To support a new refactoring in WitchDoctor, its effects haveto be described using declarative rules that match changes inthe AST. While this allows more high-level specificationsthan our successors function, it requires the AST result-ing from the refactoring to be known already. This is true inWitchDoctor, where only a single refactoring application isdetected at a time, but not in RESYNTH, where a refactoringmay create an intermediate AST that is further changed byother refactorings before resulting in the final AST. Our ap-proach thus is more powerful, but requires a bit more workto integrate a new refactoring.

4.2 Implementation DetailsRESYNTH currently includes implementations of successorsfunctions for RENAME, INLINE METHOD, EXTRACT METHOD,INLINE LOCAL, and EXTRACT LOCAL.5

Together, the refactorings above covered many of the in-teresting refactoring sequences we observed in real-worldexamples. Similar to the WitchDoctor system [5], our refac-torings currently rely on underlying Eclipse refactoring im-plementations to perform full refactorings. As discussed inprevious work [5], the Eclipse refactorings have not been en-gineered to run quickly in scenarios like these and our exper-iments confirm that each Eclipse refactoring needs aroundhalf a second to run. In a from-scratch implementation, morecode could be shared between the lightweight local refactor-ings and the full refactoring, and global pre-condition check-ing could be optimized in a manner that could speed up theoverall search.

As discussed in Section 4.1, our refactoring implemen-tations are required to finitize a potentially-infinite set ofsuccessors in their successors functions. For RENAME, welimited the set of new names for an identifier to the namepresent at the same tree node in the final tree cm or one ad-ditional fresh temporary name. The temporary name is use-ful for cases where the refactoring sequence requires namesto be swapped. Because multiple name swaps can be com-posed sequentially, one temporary name is sufficient. TheEXTRACT LOCAL and EXTRACT METHOD refactorings al-ways use a deterministically-generated name for the new lo-cal variable or method: if needed, a subsequent RENAMErefactoring can used to match the user’s desired name.

RESYNTH contains around 3100 lines of Java code, themajority of which is the engine, the plugin and the evaluationcode. Only 750 of the code lines are the local refactorings.

Example We briefly illustrate the work needed to addthe INLINE LOCAL refactoring. First we implement thesuccessors function. Recall that the successors functiontakes a tree Ti and returns a finite set of pairs (To, args).For the INLINE LOCAL refactoring, each local variable dec-laration v with an initializer expression e in Ti produces onesuccessor tree. Since Ti is a finite tree, the number of suc-cessors is finite. Each successor tree is computed as follows:

1. Let s be the resolved symbol of the variable v. First, wereplace all nodes in Ti that resolve to the symbol s withthe expression e. This replacement is in fact done via autility method provided by RESYNTH.

2. Second, we delete the statement that declared the vari-able.6

5 We also extended Eclipse’s implementation of this refactoring with anadditional parameter that determines the position at which to declare thenew local variable to avoid the problem mentioned in Section 2.6 Java allows several variable declarations per statement, but our local treesuse slightly modified ASTs where each variable declaration is a separatestatement.

Then, the produced tree is one successor To of Ti andthe set of corresponding arguments args to To is only thevariable v.

When we need to execute the full refactoring, we aregiven a program P , such that Ti ⊆ P and the argumentsof the local refactoring args = v. To execute the full refac-toring, all we need to do is find the AST node of the variabledeclaration v in P , and to call the InlineTempRefactoring

Eclipse refactoring with this AST node as a parameter.

4.3 LimitationsOur current prototype has several limitations. Currently, wedo not formally prove that our local refactorings performedits that are consistent with the full Eclipse refactorings.Rather than verifying each local refactoring separately, inthe future, we envision that the full refactoring is imple-mented in a way which enables one to derive a correct-by-construction local refactoring from it.

Also, some of our local refactorings do not currently han-dle the whole functionality of the full refactorings (for theproof-of-concept, we focused on the most common cases).For example, our local INLINE METHOD refactoring doesnot inline statements that are outside of the local tree. Due tothis limitation, in some test cases, RESYNTH does not findthe desired refactoring sequence, but finds a similar alter-native one. This limitation can be avoided if the user editsinclude the deletion of the inlined function.

5. EvaluationIn this section, we present an initial evaluation of theRESYNTH tool. We first illustrate simple edits that wesuccessfully used to perform individual refactorings withRESYNTH. We then show that RESYNTH was able to syn-thesize complex refactoring sequences required for severalreal-world examples. Furthermore, we show that the searchstrategy used by RESYNTH performed better than alternatestrategies, both on real examples and on a synthetic bench-mark suite.7 Finally, we performed a small user study tosee how programmers like the basic idea of synthesis-basedrefactoring, and to obtain feedback on RESYNTH and sug-gestions for improvements.

5.1 Individual RefactoringsAs discussed in Section 4, RESYNTH currently includes im-plementations of five refactorings that are commonly imple-mented in modern IDEs. Since they are frequently used,IDEs often make these refactorings easy to invoke; e.g.,Eclipse assigns each of them a direct keyboard shortcut.Nevertheless, RESYNTH provides an advantage when ap-plying these refactorings individually, as they can all beinvoked through a uniform interface. We confirmed that

7 All results were obtained on a 64-bit Ubuntu 12.04 machine with a 4-core3.5GHz Core i7 2700k processor and 16GB of RAM, running under Eclipse4.2 with a 4GB maximum heap.

Example steps SourceENCAPSULATE DOWNCAST 3 literature [6]EXTRACT METHOD (advanced) 4 literature [6]DECOMPOSE CONDITIONAL 6 literature [6]INTRODUCE FOREIGN METHOD 2 literature [6]REPLACE TEMP WITH QUERY 3 literature [6]REPLACE PARAMETER WITH METHOD 3 literature [6]SWAP FIELDS 3 literature [32]SWAP FIELD AND PARAMETER 3 literature [25]INTRODUCE PARAMETER 6 Stack Overflow9

Table 1. Realistic examples used to test RESYNTH.

RESYNTH could successfully perform individual refactor-ings when given the following edits (performed betweenpressing the “Start” and “Complete” buttons):

Rename An entity x (method, field, or local variable) can berenamed by editing the declaration of x or any referenceto use the new name.

Inline Local A local variable can be inlined simply by delet-ing its declaration.

Inline Method Similarly, a method m can be inlined at allsites by deleting m.

Extract Local To extract an expression e into a local x, onesimply replaces e with x.

Extract Method With Holes To extract an expression einto a new method m, the user can replace e with aninvocation of m, where the invocation includes the de-sired parameters for m. The final k statements in m canbe extracted in a similar fashion.8 Note that, as describedin Section 2, this refactoring is more general than thestandard EXTRACT METHOD, as it gives the user veryfine-grained control over which expressions to pass asarguments to m. Hence, a refactoring sequence may berequired to handle these edits.

Of course, a developer must learn that the edits outlinedabove will accomplish the desired refactorings before shecan use them. However, we believe the edits are fairly intu-itive, as they are a subset of the edits required to perform therefactorings manually. The simplicity of the edits, along withhaving a uniform interface instead of separate menu items /keyboard shortcuts for each refactoring, could ease the pro-cess of applying these refactorings in practice.

5.2 Refactoring SequencesBenchmarks To test RESYNTH’s effectiveness for synthe-sizing refactoring sequences, we collected a set of examplesthat require a non-trivial refactoring sequence to achieve the

8 We have not yet implemented support in our local refactoring for extract-ing multiple contiguous statements from the middle of a method.9 http://stackoverflow.com/questions/10121374/

referencing-callee-when-refactoring-in-eclipse

user’s desired transformation. The examples are listed in Ta-ble 1, along with information on where the example was ob-tained. The first five examples are transformations presentedin Fowler [6] that are not implemented in Eclipse, but canbe accomplished via a sequence of the refactorings we haveimplemented. We also include two examples of name swap-ping refactorings that have appeared in the literature [25, 32],each of which requires a non-obvious introduction of a tem-porary name in the refactoring sequence. Finally, the INTRO-DUCE PARAMETER example, found online, can be achievedvia a complex sequence of six of our implemented refactor-ings. While Eclipse provides an implementation of INTRO-DUCE PARAMETER, it fails to handle this example. The EX-TRACT METHOD (advanced) and SWAP FIELDS exampleswere previously discussed in Section 2.

Four other examples from Fowler’s book could be com-posed from additional Eclipse refactorings for which wehave not yet implemented local refactorings.

We also generated a suite of random benchmarks to per-form further stress testing of RESYNTH, as follows. For eachbenchmark, we first generated a Java class C containing asequence of static field declarations followed by a sequenceof static methods. Each static method returns a random ex-pression e, generated by using binary operators to combineconstants, local variable and field references, and method in-vocations up to some depth. Given C, we then generated anedited version C ′ by performing a random sequence of editslike those described in Section 5.1 to induce individual refac-torings. Each (C,C ′) pair is a benchmark that tests whetherRESYNTH can discover a sequence of refactorings on C thatpreserves the edits in C ′. For our experiments, we generated100 such benchmark inputs, with five fields, five methods,and expression depth of three for C, and two random editsto produce C ′. We found these parameters to create bench-marks that were somewhat more challenging than our realexamples (see below), while still remaining realistic. All ofour benchmarks (real and synthetic) are available online athttp://www.srl.inf.ethz.ch/resynth.php.

Success Rate Table 2 shows results from applying RESYNTHto our real-world and synthetic benchmarks. RESYNTH suc-cessfully generated a refactoring sequence for all but two ofthe real-world examples and for 84% of the synthetic exam-ples. We bounded the search to a maximum of 20,000 treesfor these results; we explore different bounds shortly. Fromeach explored tree, we run the successors function, whichreturns multiple successor trees and effectively explores ahigher number of refactorings than the number of trees.

On average applying a full Eclipse refactoring takes about0.5 seconds and as we explore thousands (1296 for our realworld tests) of refactorings in order to discover the desiredsequence, had we performed the search with full refactoringsinstead of local refactorings, the process would have taken atleast 10 minutes, clearly an undesirable response time whenperforming interactive edits.

DatasetMetric Real SyntheticNumber of tests 9 100Avg. number of trees searched 87 3752Avg. number of successors in a search 1296 105310Avg. search time 0.014s 1.629sAvg. Eclipse refactoring time 2.953s 1.654sRefactoring sequence length

1 refactoring 0 22 refactorings 1 453 refactorings 5 74 refactorings 1 155 refactorings 0 36 refactorings 2 27 refactorings 0 98 refactorings 0 09 refactorings 0 1Failure to find sequence

after 20000 searched trees 0 16

Table 2. Results for our refactoring sequence search. TheA∗ heuristic function weights (see Section 3.4) were a1 =0.125 and a2 = 0.25.

The two real-world examples for which RESYNTH didnot find the sequence from Fowler’s book were ENCAP-SULATE DOWNCAST and REPLACE PARAMETER WITHMETHOD. This is due to a limitation (see Section 4.3) inour successors function for INLINE METHOD. It did, how-ever, find another valid refactoring sequence that matches thesame edits.

As shown by the mean search time and mean numberof trees searched, the synthetic examples were significantlymore challenging than the real-world examples on average.Also note that the time required to apply the full Eclipserefactorings after the sequence was discovered was roughlyequal to the entire search time for synthetic benchmarks,and much larger than the search time for real examples,indicating the need for local refactorings.

Out of the 16 examples that we fail to solve, nine re-quire minimum 11 steps to accomplish, four tests requireminimum 9 steps and three tests require minimum 7 steps.These are examples where the A∗ search would need to ex-plore more than 20000 trees until it can find a refactoringsequence. Overall, the scalability results for our techniqueare quite encouraging.

Table 3 shows how our success rate on synthetic bench-marks is affected by the search bound. Increasing the searchspace leads to a slightly lower failure rate, but it also sig-nificantly increases the average search time. Further testingwith users is required to discover an appropriate bound inpractice.

Alternate Search Strategies We tested alternate strategiesfor searching for refactoring sequences by tuning the coeffi-cients for the heuristic function h(t) = a1h1(t) + a2h2(t),

Metric Search space limit (# trees)20,000 100,000 500,000

Num. failed tests 16 15 9Avg. number of searched trees 3752 15900 64392Avg. search time 1.629s 7.802s 34.473s

Table 3. Success rate on synthetic benchmarks with differ-ent search bounds. Tested with a1 = 0.125 and a2 = 0.25.

described previously in Section 3.4. Note that setting a1 = 0and a2 = 0 disables the heuristic function, making the A∗

search perform a straightforward breadth-first search.Results for different values of a1 and a2 with a search

budget of 20,000 trees appear in Table 4. The a1 and a2 val-ues we chose for our other experiments are in bold, alongwith the best values in the other columns. The naïve breadth-first search (a1 = a2 = 0) fails for 2 of the real examples and32 of the random tests, corresponding to the tests requiringfour or more refactoring steps. A linear combination of h1and h2 performs better than using each of the heuristic func-tions alone, for both real examples and the synthetic tests. InRESYNTH, we selected a1 = 0.125 and a2 = 0.25 as thesevalues had best results on the real tests and close to the bestresults on the synthetic tests.

5.3 User StudyAlthough RESYNTH is currently a research prototype andsomewhat unpolished, we conducted a small, informal userstudy to gauge how useful programmers find the concept ofrefactoring with synthesis, and to get some feedback on howto improve our tool.

We recruited six participants: two undergraduate stu-dents, three graduate students, and one professional. All ofthe participants had prior experience with Java programming(between one and five years), and all of them were familiarwith Eclipse. One of them was a proficient user of Eclipse’sbuilt-in refactoring tools, while the other participants hadlittle to no experience with tool-supported refactoring.

After a brief demonstration of RESYNTH, we gave eachparticipant a set of three refactoring tasks based on examplesfrom Fowler’s textbook [6], and asked them to complete thetasks using either RESYNTH, Eclipse’s built-in refactorings,or manual editing.10 For one of the tasks (Task 3), RESYNTHcannot, in fact, find the desired solution, but comes up witha slightly different solution.

Four of our participants were able to complete all threetasks using RESYNTH, and two of them failed on one ofthe tasks (though not on the same one): one of them triedto manually compose the transformation out of individualsmall refactorings, but became confused and ultimately gaveup; the other participant was unable to find the right manualedits to perform. In the future, we will investigate improve-

10 A complete description of the tasks is available at http://www.srl.inf.ethz.ch/resynth.php.

Search Parameters Real Examples Synthetic TestsEdit Expr. Num. Avg. Num. Avg.

distance distance failed num. failed num.weight weight tests searched tests searcheda1 a2 trees trees

0.000 0.000 2 5306 32 69430.000 0.125 1 3076 26 53730.000 0.250 0 184 26 53480.000 0.500 0 119 24 45370.000 1.000 1 2350 16 33160.125 0.000 0 1248 25 50650.125 0.125 0 115 19 46420.125 0.250 0 87 16 37520.125 0.500 0 122 15 33960.125 1.000 0 1154 14 32430.250 0.000 0 291 21 44560.250 0.125 0 281 18 38850.250 0.250 1 223 18 36940.250 0.500 1 153 17 34850.250 1.000 1 623 14 35160.500 0.000 2 358 26 44810.500 0.125 1 189 23 44010.500 0.250 1 158 22 42740.500 0.500 1 158 20 41140.500 1.000 1 465 18 40921.000 0.000 2 3033 24 47041.000 0.125 1 641 24 47961.000 0.250 1 637 25 47861.000 0.500 1 477 26 47041.000 1.000 1 768 22 4490

Table 4. Search space with different parameters of theheuristic function for the A* search (When a1 = a2 = 0,the search is a breadth-first search.)

ments to our search techniques and user interface to reducethe number of cases where intuitive edits do not lead to thedesired solution.

Only two participants noticed that the solution found byRESYNTH for Task 3 was not quite the expected solution,but one of them thought the alternative solution was goodenough.

After they had finished the tasks, we asked the partici-pants whether they thought a tool like RESYNTH could beuseful, and what improvements they would like to see. Fourparticipants responded that they did find it useful, althoughone of them qualified his response by saying that he wouldnot trust the tool on a complex code base. This was becauseduring the experiment he discovered a bug in the Eclipse im-plementation of INLINE LOCAL VARIABLE, which led himto doubt the reliability of Eclipse’s built-in refactoring toolson which RESYNTH is built. Of the two participants whowere not convinced of the usefulness of RESYNTH, one wasnot comfortable with the idea of performing long sequencesof refactorings in one go and instead preferred to composerefactorings by hand, while the other was already very fa-

miliar with the Eclipse refactoring tools and did not see theneed for another tool.

Finally, the participants suggested three improvements:(1) handling uncompilable code; (2) eliminating the “StartRefactoring” button; (3) adding support for more refactor-ings. The third suggestion is, in principle, quite easy toimplement by writing local equivalents for more built-inEclipse refactorings. The first suggestion could be addressedby using a more robust error-correcting parser, though it re-mains to be seen whether this would adversely affect thequality of refactoring suggestions.

The second suggestion is more difficult to accommodate.The participant explained that it was easy to forget to clickthe “Start Refactoring” button before initiating a refactoring,which is similar to the “late awareness dilemma” describedby Ge et al. [7]. The participant suggested that instead ofsetting an explicit checkpoint, the IDE should try to infer alikely checkpoint, such as the last version of the programthat compiled. Given that another participant specificallyasked for support for refactoring uncompilable programs,however, it seems unlikely that this heuristic would suit allusers equally well.

While it is impossible to generalize from a study withsuch a limited number of participants, we think that ourresults show that the idea of synthesis-based refactoringhas promise, and with further improvements a tool likeRESYNTH could usefully complement the built-in Eclipserefactoring tools.

6. Related WorkIn this section, we discuss some of the more closely relatedwork in the areas of program refactoring and synthesis.

Refactoring The idea of refactoring catalogs that list com-monly used refactoring operations and specify their behavioralready appeared in Opdyke’s thesis [22], one of the earli-est works in the field. Another influential refactoring cata-log was compiled by Fowler in his book on refactoring forJava [6]. Refactoring tools in current Java IDEs still tend tofollow this catalog in the repertoire of refactorings they offerand the names assigned to them.

However, Murphy-Hill et al. [19] found in a landmarkstudy on refactoring practices that while programmers refac-tor frequently, about 90% of refactorings are performed byhand, even where tool support is available. These numberswere confirmed in a recent, more detailed study by Negaraet al. [20]. To make refactoring tools more attractive to pro-grammers, some authors proposed user-interface improve-ments [18], but even such improved tools still suffer fromdiscoverability issues: the programmer may not know thatan automated implementation is available for a refactoringthey perform by hand, or they may start a manual refactor-ing before remembering that tool support is available.

To address these issues, Ge et al. [7] and Foster et al. [5]proposed systems that observe a programmer’s editing op-

erations and try to discover editing patterns suggestive ofrefactorings. If they discover such a pattern, the user is of-fered the choice of completing the refactoring task usinga refactoring tool. Similarly, Lee et al. [16] advocate anapproach in which refactoring operations can be initiatedthrough drag-and-drop gestures.

In contrast to our work, all these systems can only detectand suggest applications of a single refactoring.

Negara et al. [20] provide evidence to suggest that pro-grammers often perform several refactorings in sequence toachieve a single transformation. Vakilian et al. [35] furthershow that programmers do this even if there is tool sup-port for a single, larger refactoring that would perform thistransformation in one go, preferring the higher level of pre-dictability and control afforded by step-by-step refactorings.

Our approach gives the programmer full flexibility: theycan either only perform a small set of edits and use RESYNTHto perform the associated small-scale refactoring, or performmore edits and let RESYNTH infer a sequence of refactor-ings to achieve a large-scale transformation.

Even more flexibility is achieved by languages for script-ing refactorings such as JunGL [39] or Jackpot [15], whichallow programmers to implement their own custom refactor-ings. Given the effort involved in learning a new languageand API, however, it seems unlikely that any but the mostdetermined developers will use such systems on a regularbasis.

Currently, RESYNTH is built on top of the refactoringsprovided by Eclipse JDT. This means that it will some-times fail to infer a sequence of refactorings if some inter-mediate step cannot be performed using the available refac-torings. This problem could be alleviated by instead bas-ing RESYNTH on more fine-grained transformations as pro-posed in the literature [25, 27].

Steimann et al. [31, 32] take a more radical approachwhich abolishes the notion of individual atomic refactor-ings altogether. Instead, a given program’s static seman-tics (name binding, overriding, accessibility, etc.) is encodedas a set of constraints such that any program that fulfillsthe same constraints must be semantically equivalent to theoriginal program. Refactoring is then simply a search prob-lem in the space of all programs satisfying the constraintsand the intended changes performed by the user between“begin refactoring” and “end refactoring.” While their ap-proach is appealing for its simplicity and generality, it hasonly been shown to work for a very restricted set of refac-torings, namely refactorings for renaming and moving pro-gram elements (but not, for example, EXTRACT METHOD).In contrast to BeneFactor, WitchDoctor or RESYNTH, thisapproach cannot be implemented on top of an existing refac-toring tool and instead requires a complete reimplementationof the entire refactoring engine.

In a slightly different context, several researchers haveconsidered the problem of detecting refactorings or other

forms of systematic code changes from program revisionhistories to better understand program evolution [13, 24, 41],and to adapt clients of evolving frameworks and libraries [3,33, 42]. Vermolen et al. [40] consider the same problem ina modeling setting, where models need to be migrated whentheir metamodel changes. Since the goal of these approachesis to infer completed refactorings, they can assume that alledits arising from the refactorings have already been per-formed. RESYNTH, on the other hand, assumes that the useronly performed some edits by hand, and that further editsmay be needed to complete the intended refactorings, whichleads to a much larger search space.

Synthesis There has long been significant interest in us-ing program synthesis techniques aimed to simplify varioussoftware development tasks. For a recent survey see Gul-wani [9]. Here we describe the techniques that are mostclosely related to our work, focusing on techniques for pro-gram completion.

Prospector [17] introduced a synthesis procedure wheregiven an input type I and output type O, the tool staticallydiscovers (by examining the API specifications) a sequenceof API calls which, starting from an I object, produce anO object. PARSEWeb [34] addresses the same problem, butits search is guided by existing source code mined fromthe web, which helps eliminate many undesirable API se-quences. More recent work [10, 23] searches for suitable ex-pressions of a given type at a particular program, consideringmore of the program context around that point. These sys-tems rely on good ranking algorithms to handle large num-bers of potentially-suitable expressions. In constrast to theaformentioned static approaches, MatchMaker [43] synthe-sizes code based on observed API usage in dynamic execu-tions of real-world programs.

The concept of starting with partial programs and com-pleting them has recently been explored in various synthesisworks. The sketching approach [28] takes as input a programwith holes (a “sketch”), where the user specifies a space ofpossible expressions which can be used to fill the holes. Thesynthesizer then searches for correct program completions(completions which satisfy a given property). The idea hasbeen applied to various application domains including bit-ciphers [30] and concurrency [29]. In the context of concur-rency, recent work has also devised ways to infer varioussynchronization constructs in the context of concurrent pro-gramming. Examples include atomic sections [38], memoryfences [14] and conditional critical regions [37]. These ap-proaches complete an existing concurrent program that maybe potentially incorrect.

Similarly to the above works, our work can also be seenas solving an instance of the program completion problem.Here, our objective is to complete an intermediate programPm into another program Pf , such that Pf is a refactoringof Pi. In some sense, Pi serves as the “specification” forour search in the sense that Pi and Pf should be seman-

tically equivalent. However, unlike the previous work, wedo not aim to complete Pm into Pf directly—such an ap-proach would require partial refactoring transformations tocomplete the edits performed by the user, and it would bequite difficult to even enumerate the space of such trans-formations. Our synthesizer does not make any attempt tocomplete Pm, but instead begins its search from the “spec-ification” Pi, using the intermediate program Pm to decidewhen the search has been successful.

We believe that there are many interesting future direc-tions in combining ideas from the fields of refactoring andsynthesis. For instance, suppose that our approach cannotfind a refactoring sequence from Pi to Pm. An interestingdirection here would be to allow Pm to represent not a sin-gle program but a partial program which symbolically repre-sents a set of programs (e.g. a sketch). Then, the synthesizerwould try to discover a refactoring sequence starting fromPi where the result would be correct if it contains any of theprograms symbolically represented by Pm. If such a refac-toring sequence is discovered, it would still be an accept-able solution (if it passes the global pre-conditions). Thisapproach would give the user more freedom in expressingthe partial changes to the program Pi. In addition, in termsof the synthesis algorithm, an interesting direction could bein formulating the problem as a logical formula and then us-ing an SMT solver to perform the search.

7. Conclusion and Future WorkWe have presented a new approach to automated refactor-ing, inspired by synthesis from examples. In our system, theuser describes a desired transformation by performing someof the required edits on an initial program Pi, and the syn-thesizer searches for a sequence of refactorings of Pi that in-cludes the edits. We described a search strategy based on theconcept of local refactoring combined with a tuned heuris-tic search. We implemented our techniques in a tool calledRESYNTH and showed that it can already handle challeng-ing real-world examples.

In future work, we plan to: i) implement more local refac-torings in RESYNTH and further optimize its search tech-niques, ii) add a user interface for rejecting a proposed refac-toring sequence and continuing the search (the tool alreadyhas the option to display the discovered sequence to theuser), iii) handling code which does not compile, and iv)generalize our approach to transformations beyond refac-toring, e.g., generation of boilerplate code: modern JavaIDEs include many such transformations, and our techniquescould further ease their usage.

References[1] ABADI, A., ETTINGER, R., AND FELDMAN, Y. A. Re-

approaching the Refactoring Rubicon. In WRT (2008).

[2] BECK, K., AND ANDRES, C. Extreme Programming Ex-plained: Embrace Change (2nd Edition). Addison-WesleyProfessional, 2004.

[3] DIG, D., COMERTOGLU, C., MARINOV, D., AND JOHNSON,R. Automated Detection of Refactorings in Evolving Compo-nents. In ECOOP (2006).

[4] Eclipse 4.2. http://www.eclipse.org/eclipse4.

[5] FOSTER, S. R., GRISWOLD, W. G., AND LERNER, S.WitchDoctor: IDE Support for Real-time Auto-completion ofRefactorings. In ICSE (2012).

[6] FOWLER, M. Refactoring: Improving the Design of ExistingCode. Addison Wesley, 2000.

[7] GE, X., DUBOSE, Q. L., AND MURPHY-HILL, E. R. Rec-onciling Manual and Automatic Refactoring. In ICSE (2012).

[8] GRISWOLD, W. G. Program Restructuring as an Aid to Soft-ware Maintenance. Ph.D. thesis, University of Washington,1991.

[9] GULWANI, S. Dimensions in program synthesis. In ACMPPDP (2010).

[10] GVERO, T., KUNCAK, V., KURAJ, I., AND PISKAC, R. OnComplete Completion using Types and Weights. Tech. Rep.182807, EPFL, 2012.

[11] HART, P. E., NILSSON, N. J., AND RAPHAEL, B. A formalbasis for the heuristic determination of minimum cost paths.In IEEE Transactions on Systems Science and Cybernetics(1968), IEEE, pp. 100–107.

[12] IntelliJ IDEA 12 Community Edition. http://www.

jetbrains.com/idea.

[13] KIM, M., NOTKIN, D., AND GROSSMAN, D. AutomaticInference of Structural Changes for Matching Across ProgramVersions. In ICSE (2007).

[14] KUPERSTEIN, M., VECHEV, M., AND YAHAV, E. Automaticinference of memory fences. ACM SIGACT News (2012).

[15] LAHODA, J., BECICKA, J., AND RUIJS, R. B. CustomDeclarative Refactoring in NetBeans. In WRT (2012).

[16] LEE, Y. Y., CHEN, N., AND JOHNSON, R. E. Drag-and-DropRefactoring: Intuitive and Efficient Program Transformation.In ICSE (2013).

[17] MANDELIN, D., XU, L., BODÍK, R., AND KIMELMAN, D.Jungloid mining: helping to navigate the API jungle. In ACMPLDI (2005).

[18] MURPHY-HILL, E. R., AND BLACK, A. P. Breaking theBarriers to Successful Refactoring: Observations and Toolsfor Extract Method. In ICSE (2008).

[19] MURPHY-HILL, E. R., PARNIN, C., AND BLACK, A. P.How We Refactor, and How We Know It. TSE 38, 1 (2012).

[20] NEGARA, S., CHEN, N., VAKILIAN, M., JOHNSON, R. E.,AND DIG, D. A Comparative Study of Manual and Auto-mated Refactorings. In ECOOP (2013).

[21] NetBeans 7.0.1. http://netbeans.org.

[22] OPDYKE, W. F. Refactoring Object-Oriented Frameworks.Ph.D. thesis, University of Illinois at Urbana-Champaign,1992.

[23] PERELMAN, D., GULWANI, S., BALL, T., AND GROSSMAN,D. Type-directed completion of partial expressions. In ACMPLDI (2012).

[24] PRETE, K., RACHATASUMRIT, N., SUDAN, N., AND KIM,M. Template-based Reconstruction of Complex Refactorings.In ICSM (2010).

[25] REICHENBACH, C., COUGHLIN, D., AND DIWAN, A. Pro-gram Metamorphosis. In ECOOP (2009).

[26] ROBERTS, D., BRANT, J., AND JOHNSON, R. E. A Refac-toring Tool for Smalltalk. TAPOS 3, 4 (1997).

[27] SCHÄFER, M., VERBAERE, M., EKMAN, T., AND

DE MOOR, O. Stepping Stones over the RefactoringRubicon – Lightweight Language Extensions to EasilyRealise Refactorings. In ECOOP (2009).

[28] SOLAR-LEZAMA, A. The sketching approach to programsynthesis. In APLAS (2009).

[29] SOLAR-LEZAMA, A., JONES, C. G., AND BODIK, R.Sketching concurrent data structures. In ACM PLDI (2008).

[30] SOLAR-LEZAMA, A., TANCAU, L., BODIK, R., SESHIA,S., AND SARASWAT, V. Combinatorial sketching for finiteprograms. SIGOPS Oper. Syst. Rev. (2006).

[31] STEIMANN, F., KOLLEE, C., AND VON PILGRIM, J. ARefactoring Constraint Language and Its Application to Eiffel.In ECOOP (2011).

[32] STEIMANN, F., AND VON PILGRIM, J. Refactorings WithoutNames. In ASE (2012).

[33] TANEJA, K., DIG, D., AND XIE, T. Automated Detection ofAPI Refactorings in Libraries. In ASE (2007).

[34] THUMMALAPENTA, S., AND XIE, T. Parseweb: a program-mer assistant for reusing open source code on the web. InACM/IEEE ASE (2007).

[35] VAKILIAN, M., CHEN, N., MOGHADDAM, R. Z., NEGARA,S., AND JOHNSON, R. E. A Compositional Paradigm ofAutomating Refactorings. In ECOOP (2013).

[36] VAKILIAN, M., CHEN, N., NEGARA, S., RAJKUMAR,B. A., BAILEY, B. P., AND JOHNSON, R. E. Use, Disuse,and Misuse of Automated Refactorings. In ICSE (2012).

[37] VECHEV, M., YAHAV, E., AND YORSH, G. Inferring syn-chronization under limited observability. In TACAS (2009).

[38] VECHEV, M., YAHAV, E., AND YORSH, G. Abstraction-guided synthesis of synchronization. In ACM POPL (2010).

[39] VERBAERE, M., ETTINGER, R., AND DE MOOR, O. JunGL:A Scripting Language for Refactoring. In ICSE (2006).

[40] VERMOLEN, S., WACHSMUTH, G., AND VISSER, E. Recon-structing Complex Metamodel Evolution. In SLE (2011).

[41] WEISSGERBER, P., AND DIEHL, S. Identifying Refactoringsfrom Source-Code Changes. In ASE (2006).

[42] XING, Z., AND STROULIA, E. API-Evolution Support withDiff-CatchUp. TSE 33, 12 (2007).

[43] YESSENOV, K., XU, Z., AND SOLAR-LEZAMA, A. Data-driven synthesis for object-oriented frameworks. In ACMOOPSLA (2011).

Refactoring with Synthesis - ETH Z · refactoring interface, in which a refactoring is achieved via...

Documents

Transcript of Refactoring with Synthesis - ETH Z · refactoring interface, in which a refactoring is achieved via...