D. Fisman, S. Jacobs (Eds.): Sixth Workshop on Synthesis (SYNT 2017), EPTCS 260, 2017, pp. 97–115, doi:10.4204/EPTCS.260.9
© Alur, Fisman, Singh & Solar-Lezama. This work is licensed under the Creative Commons Attribution License.
SyGuS-Comp 2017: Results and Analysis
Rajeev Alur (University of Pennsylvania)
Dana Fisman (Ben-Gurion University)
Rishabh Singh (Microsoft Research, Redmond)
Armando Solar-Lezama (Massachusetts Institute of Technology)
Syntax-Guided Synthesis (SyGuS) is the computational problem of finding an implementation f that meets both a semantic constraint given by a logical formula ϕ in a background theory T, and a syntactic constraint given by a grammar G, which specifies the allowed set of candidate implementations. Such a synthesis problem can be formally defined in SyGuS-IF, a language that is built on top of SMT-LIB.
The Syntax-Guided Synthesis Competition (SyGuS-Comp) is an effort to facilitate, bring together and accelerate research and development of efficient solvers for SyGuS by providing a platform for evaluating different synthesis techniques on a comprehensive set of benchmarks. In this year's competition six new solvers competed on over 1500 benchmarks. This paper presents and analyses the results of SyGuS-Comp'17.
1 Introduction
The Syntax-Guided Synthesis Competition (SyGuS-Comp) is an annual competition that aims to provide an objective platform for comparing different approaches to solving the Syntax-Guided Synthesis (SyGuS) problem. A SyGuS problem takes as input a logical specification ϕ for what a synthesized function f should compute, and a grammar G providing syntactic restrictions on the function f to be synthesized. Formally, a solution to a SyGuS instance (ϕ, G, f) is a function f_imp that is expressible in the grammar G such that the formula ϕ[f/f_imp], obtained by replacing f by f_imp in the logical specification ϕ, is valid. SyGuS instances are formulated in SyGuS-IF [12], a format built on top of SMT-LIB2 [5].
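To make the format concrete, the following is a minimal SyGuS-IF sketch in the spirit of the well-known max2 example; the grammar and constraints shown here are illustrative rather than copied from a specific competition benchmark. The specification asks for a function that is at least as large as each of its two integer arguments and equal to one of them, and the grammar restricts candidates to linear integer arithmetic terms with if-then-else.

    ; illustrative SyGuS-IF (v1) problem: synthesize the maximum of two integers
    (set-logic LIA)
    (synth-fun max2 ((x Int) (y Int)) Int
      ;; candidate implementations must be generated by this grammar
      ((Start Int (x y 0 1
                   (+ Start Start)
                   (- Start Start)
                   (ite StartBool Start Start)))
       (StartBool Bool ((and StartBool StartBool)
                        (or StartBool StartBool)
                        (not StartBool)
                        (<= Start Start)
                        (= Start Start)))))
    (declare-var x Int)
    (declare-var y Int)
    ;; semantic constraint: max2 bounds both arguments and equals one of them
    (constraint (>= (max2 x y) x))
    (constraint (>= (max2 x y) y))
    (constraint (or (= x (max2 x y)) (= y (max2 x y))))
    (check-synth)

A solver is free to return any implementation expressible in the grammar that makes the constraints valid, e.g. (ite (<= x y) y x).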
We report here on the 4th SyGuS competition, which took place in July 2017 in Heidelberg, Germany, as a satellite event of CAV'17 (The 29th International Conference on Computer Aided Verification) and SYNT'17 (The Sixth Workshop on Synthesis). As in the previous competition, there were four tracks: the general track, the conditional linear integer arithmetic track, the invariant synthesis track, and the programming by examples track. We assume most readers of this report are already familiar with the SyGuS problem and the tracks of SyGuS-Comp and thus refer the unfamiliar reader to the report on last year's competition [3].
The report is organized as follows. Section 2 describes the participating benchmarks. Section 3 lists the participating solvers and briefly describes the main idea behind their strategies. Section 4 provides details on the experimental setup. Section 5 gives an overview of the results per track. Section 6 provides details on the results from the perspective of individual benchmarks. Section 7 concludes.
2 Participating Benchmarks
In addition to last year's benchmarks, we received 4 new sets of benchmarks this year, which are shown in Table 1.
Program Repair The 18 program repair benchmarks correspond to the task of generating small expression repairs that are consistent with a given set of input-output examples [8]. These benchmarks were extracted from real-world Java bugs by manually analyzing the developer commits that involved changes to fewer than 5 lines of code. The key idea of the program repair approach is to first localize the fault location in a buggy program and generate the corresponding input-output example behavior for the buggy expression from passing test cases. In the second phase, the task of repairing the buggy expression is framed as a SyGuS problem, where the goal is to synthesize an expression that belongs to a family of expressions defined by a context-free grammar and that satisfies the input-output example constraints.
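As a rough illustration of the second phase (a hypothetical sketch, not one of the actual benchmarks; the function name guard, the variables x and y, and the example values are our own), a repaired boolean guard over two integer program variables would be requested from a solver roughly as follows:

    ; hypothetical repair query: find a replacement guard consistent with the
    ; input-output behavior recorded for the buggy expression on passing tests
    (set-logic LIA)
    (synth-fun guard ((x Int) (y Int)) Bool
      ((Start Bool ((<= Atom Atom) (>= Atom Atom) (= Atom Atom)
                    (and Start Start) (or Start Start) (not Start)))
       (Atom Int (x y 0 1 (+ Atom Atom) (- Atom Atom)))))
    (constraint (= (guard 0 1) true))   ; examples recorded from passing tests
    (constraint (= (guard 2 2) true))
    (constraint (= (guard 5 3) false))
    (check-synth)

Here (<= x y) is one admissible repair; the real benchmarks use grammars and examples extracted from the analyzed Java commits.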
Crypto Circuits The Crypto Circuits benchmarks comprise tasks of synthesizing constant-time circuits that are cryptographically resilient to timing attacks [6]. Consider a circuit C with a set of private inputs I0 and a set of public inputs I1, such that if an attacker changes the values of the public inputs and observes the corresponding output, she is unable to infer the values of the private inputs (under standard assumptions about computational resources in cryptography). An attacker can nevertheless gain information about the private inputs by analyzing the time the circuit takes to compute the output values on public inputs. For example, when a public input bit changes from 1 to 0, a specific output bit may be guaranteed to change from 1 to 0 independent of whether a particular private input bit is 0 or 1, but may change faster when this private input is 0, thus leaking information. The timing attack can be prevented if the circuit satisfies the constant-time property: a constant-time circuit is one in which all input-to-output paths have the same length, measured in number of gates.
The problem of synthesizing a new circuit C′ that is functionally equivalent to a given circuit C and such that C′ is a constant-time circuit can be formalized as a SyGuS problem. A context-free grammar can be used to define the set of all constant-time circuits whose input-to-output path lengths lie within a given bound, and the functional equivalence constraint can be expressed as a Boolean formula [6].
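The following toy sketch conveys the encoding idea; it is our own illustration (the circuit, the input names pub and priv, and the grammar are hypothetical), not one of the contributed benchmarks. The grammar is stratified by depth so that every input-to-output path of a derivable candidate passes through exactly two gates, and the single constraint states functional equivalence to the reference circuit.

    ; hypothetical sketch: synthesize a depth-2, constant-time equivalent of
    ; the reference circuit (xor (and pub priv) pub)
    (set-logic BV)   ; illustrative; the actual benchmarks fix their own logic
    (synth-fun C2 ((pub Bool) (priv Bool)) Bool
      ((D2 Bool ((and D1 D1) (or D1 D1) (xor D1 D1)))   ; gates at depth 2 (start symbol)
       (D1 Bool ((and D0 D0) (or D0 D0) (xor D0 D0)))   ; gates at depth 1
       (D0 Bool (pub priv))))                           ; inputs only
    (declare-var pub Bool)
    (declare-var priv Bool)
    ;; functional equivalence to the (non-constant-time) reference circuit
    (constraint (= (C2 pub priv) (xor (and pub priv) pub)))
    (check-synth)

One admissible answer is (xor (and pub pub) (and pub priv)), in which both inputs pass through the same number of gates on every path.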
Instruction Selection The Instruction Selection benchmarks consist of tasks for synthesizing a “Bit Test and Reset” instruction from the set of basic bitvector operations, in a way similar to the implementations supported by x86 processors (a hedged SyGuS-IF sketch of the simplest, register-based variant is given after the list below). These benchmarks comprise 4 different addressing variants with increasing levels of complexity:
• btr*: Read from register.
• btr-am-base*: Load from memory address base.
• btr-am-base-index*: Load from memory address base with indexing.
• btr-am-base-index-scale-disp*: Load from memory address base with index shifted with scale.
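For the register variant, the intended semantics is essentially clearing the i-th bit of a value x. The sketch below is our own hypothetical rendering (the 32-bit width, the names btr, x and i, and the grammar are assumptions), not one of the contributed files:

    ; hypothetical sketch of the btr* (register) variant over 32-bit values
    (set-logic BV)
    (synth-fun btr ((x (BitVec 32)) (i (BitVec 32))) (BitVec 32)
      ((Start (BitVec 32) (x i #x00000001 #xFFFFFFFF
                           (bvand Start Start) (bvor Start Start)
                           (bvxor Start Start) (bvnot Start)
                           (bvshl Start Start) (bvlshr Start Start)))))
    (declare-var x (BitVec 32))
    (declare-var i (BitVec 32))
    ;; reference semantics: clear bit i of x
    (constraint (= (btr x i) (bvand x (bvnot (bvshl #x00000001 i)))))
    (check-synth)

The memory-addressing variants additionally require computing the effective address (base, base plus index, and base plus scaled index plus displacement) before the bit manipulation.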
Invariant Generation The invariant generation benchmarks comprise the task of generating a loop invariant (as a conditional linear arithmetic expression) given the pre-condition, the post-condition and the transition function corresponding to the loop body. The 7 new benchmarks [10] correspond to loop invariant tasks adapted from several recent invariant inference papers, including generating path invariants, abductive inference, and the NECLA static analysis benchmarks.
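The invariant track has dedicated SyGuS-IF commands for this pre/trans/post structure. The following is a minimal hypothetical sketch (a single counter x and made-up pre-, trans- and post-conditions), meant only to show the shape of such an instance:

    ; hypothetical invariant-synthesis instance for a loop "while (x < 10) x := x + 1"
    (set-logic LIA)
    (synth-inv inv-f ((x Int)))
    (declare-primed-var x Int)
    (define-fun pre-f ((x Int)) Bool (= x 0))
    (define-fun trans-f ((x Int) (x! Int)) Bool (and (< x 10) (= x! (+ x 1))))
    (define-fun post-f ((x Int)) Bool (<= x 10))
    ;; inv-f must hold initially, be inductive under trans-f, and imply post-f
    (inv-constraint inv-f pre-f trans-f post-f)
    (check-synth)

An acceptable invariant here is, e.g., (<= x 10).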
Benchmark Set           # of benchmarks   Contributors
Invariant Generation      7               Saswat Padhi (UCLA)
Program Repair            18              Xuan Bach D. Le (SMU), David Lo (SMU) and Claire Le Goues (CMU)
Crypto Circuits           214             Chao Wang (USC)
Instruction Selection     28              Sebastian Buchwald (KIT) and Andreas Fried (KIT)

Table 1: New Contributed Benchmarks
3 Participating Solvers
Six solvers were submitted to this year's competition: EUSOLVER2017, an improved version of EUSOLVER; CVC42017, an improved version of CVC4; EUPHONY, a solver built on top of EUSOLVER; DRYADSYNTH, a solver specialized for conditional linear integer arithmetic; LOOPINVGEN, a solver specialized for invariant generation problems; and E3SOLVER, a solver specialized for the bitvector category of the PBE track, built on top of the enumerative solver. Table 2 lists the submitted solvers together with their authors, and Table 3 summarizes which solver participated in which track.
EUSOLVER2017 is based on a divide-and-conquer strategy [2]. The idea is to find different expressions that work correctly on different subsets of the input space, and to unify them into a solution that works for the entire space of inputs. The sub-expressions are typically found using enumeration techniques and are then unified into the overall expression using machine-learning methods for decision trees [4].
The CVC42017 solver is based on an approach for program synthesis that is implemented inside an SMT solver [13]. This approach extracts solution functions from unsatisfiability proofs of the negated form of synthesis conjectures, and uses counterexample-guided techniques for quantifier instantiation (CEGQI) to make finding such proofs practically feasible. CVC42017 also incorporates enumerative and symmetry-breaking techniques [14].
The EUPHONY solver leverages statistical program models to accelerate EUSOLVER. The underlying statistical model is called a probabilistic higher-order grammar (PHOG), a generalization of probabilistic context-free grammars (PCFGs). The idea is to use existing benchmarks and their synthesized results to learn a weighted grammar, and to give priority to candidates that are more likely according to the learned weighted grammar.
The DRYADSYNTH solver combines enumerative and symbolic techniques. It targets benchmarks in the conditional linear integer arithmetic theory (LIA), and can therefore assume every solvable instance has a solution in a pre-defined decision-tree normal form. It first tries to determine the correct height of a normal-form decision tree, and then tries to synthesize a solution of that height. It makes use of parallelization, using as many cores as are available, and of optimizations based on solutions of typical LIA SyGuS problems.
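To give a rough picture (our phrasing of such a normal form, not DRYADSYNTH's exact definition), a height-two candidate is a term of the shape ite(p1(x), t1(x), ite(p2(x), t2(x), t3(x))), where each guard pi is a Boolean combination of linear inequalities and each leaf ti is a linear term; the solver then searches over increasing heights and fills in the guards and leaves at the chosen height.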
Solver          Authors
EUSOLVER2017    Arjun Radhakrishna (Microsoft) and Abhishek Udupa (Microsoft)
CVC42017        Andrew Reynolds (Univ. of Iowa), Cesare Tinelli (Univ. of Iowa), and Clark Barrett (Stanford)
EUPHONY         Woosuk Lee (Penn), Arjun Radhakrishna (Microsoft) and Abhishek Udupa (Microsoft)
DRYADSYNTH      Kangjing Huang (Purdue Univ.), Xiaokang Qiu (Purdue Univ.), and Yanjun Wang (Purdue Univ.)
LOOPINVGEN      Saswat Padhi (UCLA) and Todd Millstein (UCLA)
E3SOLVER        Ammar Ben Khadra (University of Kaiserslautern)

Table 2: Submitted Solvers
Tracks          EUSOLVER2017   CVC42017   EUPHONY   DRYADSYNTH   LOOPINVGEN   E3SOLVER
LIA                  1             1          1          1            0            0
INV                  1             1          1          1            1            0
General              1             1          1          0            0            0
PBE Strings          1             1          1          0            0            0
PBE BV               1             1          1          0            0            1

Table 3: Solvers participating in each track (1 = participated, 0 = did not participate)
The LOOPINVGEN solver [10] for invariant synthesis extends the data-driven approach to inferring sufficient loop invariants from a collection of program states [11]. Previous approaches to invariant synthesis were restricted to using a fixed set of features or a fixed feature template; e.g., ICE-DT [9, 7] requires the shape of constraints (such as octagonal) to be fixed a priori. Instead, LOOPINVGEN starts with no initial features and automatically grows the feature set as necessary using program synthesis techniques. It reduces the problem of loop invariant inference to a series of precondition inference problems, and uses a counterexample-guided inductive synthesis (CEGIS) loop to revise the current candidate.
The E3SOLVER solver for PBE bitvector programs is built on top of the enumerative solver [1, 16]. It improves on the original ENUMERATIVE solver by applying unification techniques [2] and by avoiding calls to an SMT solver: in the PBE tracks there are no semantic constraints other than the input-output examples, which can be checked without invoking an SMT solver.
4 Experimental Setup
The solvers were run on the StarExec platform [15] with a dedicated cluster of 12 nodes, where each node consisted of two 4-core 2.4GHz Intel processors with 256GB RAM and a 1TB hard drive. The memory usage limit of each solver run was set to 128GB. The wallclock time limit was set to 3600 seconds (thus, a solver that used all cores could consume at most 14400 seconds of CPU time). The solutions that the solvers produce are checked for both syntactic and semantic correctness. That is, a first post-processor checks that the produced expression adheres to the grammar specified in the given benchmark, and, if this check passes, a second post-processor checks that the solution adheres to the semantic constraints given in the benchmark (by invoking an SMT solver).
5 Results Overview
The combined results for all tracks are given in Figure 1. The figure shows the number of benchmarks solved by each solver in each track. We can observe that EUSOLVER2017 solved the highest number of benchmarks over the combined tracks, with the EUPHONY and CVC42017 solvers solving almost as many.
Figure 1: The overall combined results for each solver on benchmarks from all five tracks.
The primary criterion for winning a track was the number of benchmarks solved, but we also analyzed the time to solve and the size of the generated expressions. Both were classified using a pseudo-logarithmic scale as follows. For time to solve the scale is [0,1), [1,3), [3,10), [10,30), [30,100), [100,300), [300,1000), [1000,3600), >3600. That is, the first “bucket” refers to termination in less than one second, the second to termination in one to three seconds, and so on. We say that a solver solved a certain benchmark among the fastest if the time it took to solve that benchmark falls in the same bucket as that of the solver that solved it fastest. For expression sizes, the pseudo-logarithmic scale we use is [1,10), [10,30), [30,100), [100,300), [300,1000), >1000, where expression size is the number of nodes in the SyGuS parse tree. In some tracks there was a tie, or almost a tie, in the number of solved benchmarks, but the differences in the time to solve were significant. We also report the number of benchmarks solved uniquely by a solver (meaning the number of benchmarks for which that solver was the only one that managed to solve them).
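For example, under this bucketing a benchmark solved by one solver in 2.5 seconds and by another in 95 seconds places the two solvers in the buckets [1,3) and [30,100), respectively; only solvers whose time falls in [1,3), the bucket of the absolute fastest solver, are counted as having solved that benchmark among the fastest.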
Figure 2 shows the percentage of benchmarks solved by each of the solvers in each of the tracks (in the upper part), and the number of benchmarks solved among the fastest by each of the solvers in each of the tracks (in the lower part).
General Track In the general track, EUSOLVER2017 solved more benchmarks than all others (407), CVC42017 came second, solving 378 benchmarks, and EUPHONY came third, solving 362 benchmarks. The same order appears in the number of benchmarks solved among the fastest: EUSOLVER2017 with 276, CVC42017 with 236, and EUPHONY with 135. In terms of benchmarks solved uniquely by a solver, EUSOLVER2017 solved 34 uniquely, CVC42017 solved 9 uniquely, and EUPHONY solved 2 uniquely.
We partition the benchmarks of the general track into categories, where each category consists of related benchmarks. The results per category are given in Table 4. We can see that EUSOLVER2017 performed better than the others in the program repair, ICFP and cryptographic circuits categories. The CVC42017 solver performed better than the others in the let and motion planning, invariant generation with bounded and unbounded integers, arrays, integers and hacker's delight categories. The EUPHONY solver performed better than the others in the multiple functions, compiler optimizations and bitvectors categories. We can also observe that none of the solvers could solve any of the instruction selection benchmarks.
Figure 2: Percentage of benchmarks solved by the different solvers across all tracks, and the percentage of benchmarks a solver solved among the fastest for that benchmark (according to the logarithmic scale).
Figure 3: Percentage of benchmarks solved by the different solvers across all categories of the general track, and the percentage of benchmarks a solver solved among the fastest for that benchmark (according to the logarithmic scale).
                          CO&BV  Let&MP  InvB  InvU  MultF  Arrays  HD   Int  Repair  ICFP  Crypto  InstSel  Total
Number of benchmarks        32     30     28    28     32     31    44   34     18     50    214      28      569
Solved
  EUSOLVER2017              16     10     24    24     18     31    35   33     14     50    152       0      407
  CVC42017                  15     15     24    24     12     31    44   34     14     48    117       0      378
  EUPHONY                   19     10     24    24     18     31    44   33     14     50     95       0      362
Fastest
  EUSOLVER2017               7      2     12    14      6      5    20   14     13     40    143       0      276
  CVC42017                  11     15     18    19      9     31    44   33      7     19     30       0      236
  EUPHONY                   16      2      8    13     13      4    27   14      9     29      0       0      135
Uniquely
  EUSOLVER2017               0      0      0     0      0      0     0    0      0      0     34       0       34
  CVC42017                   1      5      0     0      1      0     0    1      1      0      0       0        9
  EUPHONY                    2      0      0     0      0      0     0    0      0      0      0       0        2

(Column abbreviations: CO&BV = Compiler Optimizations and BitVectors, Let&MP = Let and Motion Planning, InvB/InvU = Invariant Generation with Bounded/Unbounded Ints, MultF = Multiple Functions, HD = Hacker's Delight, Int = Integers, Repair = Program Repair, Crypto = Cryptographic Circuits, InstSel = Instruction Selection.)

Table 4: Solver performance across all categories of the general track
Conditional Linear Arithmetic Track In the CLIA track, CVC42017 solved all 73 benchmarks, EUPHONY and EUSOLVER2017 solved 71 benchmarks each, and DRYADSYNTH solved 32 benchmarks. The CVC42017 solver solved 72 benchmarks among the fastest, followed by EUSOLVER2017, which solved 29 among the fastest, EUPHONY, which solved 11 among the fastest, and DRYADSYNTH, which solved 3 among the fastest. None of the benchmarks were solved uniquely.
Invariant Generation Track In the invariant generation track, both the LOOPINVGEN solver and the CVC42017 solver solved 65 out of 74 benchmarks, the DRYADSYNTH solver solved 64 benchmarks, EUPHONY solved 48 benchmarks and EUSOLVER2017 solved 40 benchmarks. In terms of the time to solve, the differences were more significant. The LOOPINVGEN solver solved 54 benchmarks among the fastest, followed by CVC42017, which solved 38 among the fastest, and DRYADSYNTH, which solved 36 among the fastest. One benchmark, hola07.sl, was solved by only one solver, namely LOOPINVGEN.
Programming By Example BV Track In the PBE track using BV theory, the E3SOLVER solver solved all 750 benchmarks, EUPHONY solved 747 benchmarks, CVC42017 solved 687 benchmarks, and EUSOLVER2017 solved 242 benchmarks. The E3SOLVER solver solved 692 among the fastest, EUSOLVER2017 solved 211 among the fastest, CVC42017 solved 169 among the fastest, and EUPHONY solved 117 among the fastest. Three benchmarks were solved uniquely by E3SOLVER: 13_1000.sl, 40_1000.sl and 89_1000.sl.
Programming By Example Strings Track In the PBE track using SLIA theory, CVC42017 solved 89 out of 108 benchmarks, EUPHONY solved 78, and EUSOLVER2017 solved 69. The EUPHONY solver solved 66 benchmarks among the fastest, CVC42017 solved 49 among the fastest, and EUSOLVER2017 solved 23 among the fastest. Nine benchmarks were solved by only one solver, namely CVC42017.
6 Detailed Results
In this section we show the results of the competition from the benchmarks' perspective. For a given benchmark we would like to know: how many solvers solved it, what the minimum and maximum times to solve are, what the minimum and maximum sizes of the produced expressions are, which solver solved the benchmark the fastest, and which solver produced the smallest expression.
We present the results per benchmark in groups organized by tracks and categories. For instance, Fig. 8 at the top presents details of the program repair benchmarks. The black bars show the range of the time to solve among the different solvers on a pseudo-logarithmic scale (as indicated on the upper part of the y-axis). Inspect for instance benchmark t_2.sl. The black bar indicates that the fastest solver to solve it used less than 1 second, and the slowest used between 100 and 300 seconds. The black number above the black bar indicates the exact number of seconds (floor-rounded to the nearest second) it took the slowest solver to solve the benchmark (and ∞ if at least one solver exceeded the time bound). Thus, we can see that the slowest solver to solve t_2.sl took 141 seconds. The white number at the lower part of the bar indicates the time of the fastest solver to solve that benchmark. Thus, we can see that the fastest solver to solve t_2.sl required less than 1 second to do so. The colored squares/rectangles next to the lower part of the black bar indicate which solvers were the fastest to solve that benchmark (according to the solvers' legend at the top). Here, fastest means in the same logarithmic bucket as the absolute fastest solver. For instance, we can see that EUPHONY and EUSOLVER2017 were the fastest to solve t_2.sl, solving it in less than a second, and that among the 2 solvers that solved t_3.sl only EUSOLVER2017 solved it in less than 1 second.
Similarly, the gray bars indicate the range of expression sizes on a pseudo-logarithmic scale (as indicated on the lower part of the y-axis), where the size of an expression is determined by the number of nodes in its parse tree. The black number at the bottom of the gray bar indicates the exact expression size of the largest solution (or ∞ if it exceeded 1000), and the white number at the top of the gray bar indicates the exact expression size of the smallest solution (when the smallest and largest expression sizes fall in the same logarithmic bucket, as is the case for t_2.sl, we provide only the largest expression size, and thus there is no white number on the gray bar). The colored squares/rectangles next to the upper part of the gray bar indicate which solvers (according to the legend) produced the smallest expression (where smallest means in the same logarithmic bucket as the absolute smallest expression). For instance, for t_20.sl the smallest expression produced had size 3, and 2 of the 3 solvers that solved it managed to produce an expression of size less than 10.
Finally, at the top of the figure, above each benchmark, there is a number indicating the number of solvers that solved that benchmark. For instance, one solver solved t_14.sl, two solvers solved t_12.sl, three solvers solved t_2.sl, and no solver solved t_6.sl. Note that the reason t_6.sl has 2 as the upper time bound is that this is the time to terminate rather than the time to solve. Thus, all solvers aborted within less than 2 seconds, but either they did not produce a result, or they produced an incorrect result. When no solver produced a correct result, there are no colored squares/rectangles next to the lower parts of the bars, as is the case for t_6.sl.
Figure 4: Evaluation of Compiler Optimizations, Bitvectors, Let and Motion Planning Categories of the General Track.
Figure 5: Evaluation of Invariant Category of the General Track.
Figure 6: Evaluation of Multiple Functions and Arrays Categories of the General Track.
Figure 7: Evaluation of Hacker’s Delight and Integers Categories of the General Track.
Figure 8: Evaluation of Program Repair and ICFP Categories of the General Track.
Figure 9: Evaluation of CLIA track benchmarks.
Figure 10: Evaluation of Invariant track benchmarks.
Figure 11: Evaluation of PBE Strings track benchmarks.
7 Summary
This year's competition consisted of over 1500 benchmarks, 250 of which were contributed this year. Six solvers competed this year, out of which four were submitted by developers entering a tool in SyGuS-Comp for the first time. All tools performed remarkably well, on both existing and new benchmarks. In particular, 65% of the new benchmarks were solved.
Impressive progress was shown this year in solving the strings benchmarks of the programming-by-example track. Analyzing the features of benchmarks that are still hard to solve, we see that these include benchmarks where (i) there are multiple functions to synthesize, (ii) the specification invokes the functions with different parameters, (iii) the let expression is used for specifying auxiliary variables, (iv) the grammar is very general, consisting of many more operators than needed, or (v) the specification is partial, in the sense that the domain of semantic solutions is not a singleton.
References
[1] Rajeev Alur, Rastislav Bodík, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak & Abhishek Udupa (2013): Syntax-guided synthesis. In: Formal Methods in Computer-Aided Design, FMCAD 2013, Portland, OR, USA, October 20-23, 2013, pp. 1–8.
[2] Rajeev Alur, Pavol Černý & Arjun Radhakrishna (2015): Synthesis Through Unification. In: Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part II, pp. 163–179, doi:10.1007/978-3-319-21668-3_10.
[3] Rajeev Alur, Dana Fisman, Rishabh Singh & Armando Solar-Lezama (2015): Results and Analysis of SyGuS-Comp’15. In: SYNT, EPTCS, pp. 3–26, doi:10.4204/EPTCS.202.3.
[4] Rajeev Alur, Arjun Radhakrishna & Abhishek Udupa (2017): Scaling Enumerative Program Synthesis via Divide and Conquer. In: Tools and Algorithms for the Construction and Analysis of Systems - 23rd International Conference, TACAS 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April 22-29, 2017, Proceedings, Part I, pp. 319–336, doi:10.1007/978-3-662-54577-5_18.
[5] Clark Barrett, Aaron Stump & Cesare Tinelli: The SMT-LIB Standard Version 2.0.
[6] Hassan Eldib, Meng Wu & Chao Wang (2016): Synthesis of Fault-Attack Countermeasures for Cryptographic Circuits. In: Computer Aided Verification - 28th International Conference, CAV 2016, Toronto, ON, Canada, July 17-23, 2016, Proceedings, Part II, pp. 343–363, doi:10.1007/978-3-319-41540-6_19.
[7] Pranav Garg, Daniel Neider, P. Madhusudan & Dan Roth (2016): Learning invariants using decision trees and implication counterexamples. In: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2016, St. Petersburg, FL, USA, January 20-22, 2016, pp. 499–512, doi:10.1145/2837614.2837664.
[8] Xuan-Bach D. Le, Duc-Hiep Chu, David Lo, Claire Le Goues & Willem Visser (2017): S3: syntax- and semantic-guided repair synthesis via programming by examples. In: FSE, pp. 593–604, doi:10.1145/3106237.3106309.
[9] Daniel Neider, P. Madhusudan & Pranav Garg (2015): ICE DT: Learning Invariants using Decision Treesand Implication Counterexamples. Private Communication.
[10] Saswat Padhi & Todd D. Millstein (2017): Data-Driven Loop Invariant Inference with Automatic FeatureSynthesis. CoRR abs/1707.02029. Available at http://arxiv.org/abs/1707.02029.
[11] Saswat Padhi, Rahul Sharma & Todd D. Millstein (2016): Data-driven precondition inference with learned features. In: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2016, Santa Barbara, CA, USA, June 13-17, 2016, pp. 42–56, doi:10.1145/2908080.2908099.
[12] Mukund Raghothaman & Abhishek Udupa (2014): Language to Specify Syntax-Guided Synthesis Problems.CoRR abs/1405.5590.
[13] Andrew Reynolds, Morgan Deters, Viktor Kuncak, Cesare Tinelli & Clark W. Barrett (2015): Counterexample-Guided Quantifier Instantiation for Synthesis in SMT. In: Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part II, pp. 198–216, doi:10.1007/978-3-319-21668-3_12.
[14] Andrew Reynolds & Cesare Tinelli (2017): SyGuS Techniques in the Core of an SMT Solver. To appear inthis issue.
[15] Aaron Stump, Geoff Sutcliffe & Cesare Tinelli (2014): StarExec: A Cross-Community Infrastructure for Logic Solving. In: Automated Reasoning - 7th International Joint Conference, IJCAR 2014, Held as Part of the Vienna Summer of Logic, VSL 2014, Vienna, Austria, July 19-22, 2014, Proceedings, pp. 367–373, doi:10.1007/978-3-319-08587-6_28.
[16] Abhishek Udupa, Arun Raghavan, Jyotirmoy V. Deshmukh, Sela Mador-Haim, Milo M. K. Martin & Rajeev Alur (2013): TRANSIT: specifying protocols with concolic snippets. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '13, Seattle, WA, USA, June 16-19, 2013, pp. 287–296, doi:10.1145/2462156.2462174.