Implicit Programming Viktor Kuncak Swiss Federal Institute of Technology Lausanne (EPFL) Joint work...

Post on 26-Dec-2015

222 views 2 download

Tags:

Transcript of Implicit Programming Viktor Kuncak Swiss Federal Institute of Technology Lausanne (EPFL) Joint work...

Implicit ProgrammingViktor Kuncak

Swiss Federal Institute of Technology Lausanne(EPFL)

Joint work with: Tihomir Gvero, EPFL Ali Sinan Köksal, now grad student at UC Berkeley Ruzica Piskac, now at Max-Planck Inst. for Software Systems Philippe Suter, EPFL (graduating)

http://lara.epfl.ch/

Three related activities:• Development within an IDE

(Eclipse, Visual Studio, emacs)• Compilation and static checking

(optimizing compiler,static analyzer, contract checker)

• Execution on a (virtual) machine

Programming Activities and Toolsrequirements

def f(x : Int) = { y = 2 * x + 1}

iload_0iconst_1iadd

42

More compute power available for each step use it to improve programmer productivity

Advance techniques and tools for Development within an IDE What can we prove in 0.5 s?

Compilation and verification Eliminate unknowns from programs Constraint diagnostics

Execution When is unfolding enough?

to improve software productivity.

Implicit Programming Agendarequirements

def f(x : Int) = { y = 2 * x + 1}

iload_0iconst_1iadd

42

Introduced by Martin Odersky, EPFLUnifies functional and object-oriented programmingRuns on the Java Virtual Machine (and .NET)Now used by over 100’000 developers (incl. Twitter, UBS, LinkedIn)DSL embedding mechanisms. IDE support (e.g. Eclipse)

Background: Scala Programming Language

sealed abstract class Treecase class Leaf() extends Treecase class Node(left:Tree, data:Int, right:Tree) extends Treedef elems(t:Tree) :Set[Int] = t match { case Leaf() ⇒ Set.empty case Node(l,d,r) ⇒ elems(l) ++ Set(d) ++ elems(r)} http://www.scala-lang.org

Kaplan: Executable Spec for Sortingsealed abstract class Listcase class Nil() extends Listcase class Cons(head : Int, tail : List) extends List

((lst: List) isSorted(lst) && elems(lst) == Set(0, 1, -3))⇒ .find.get

> Cons(-3, Cons(0, Cons(1, Nil())))

def elems(ls : List) : Set[Int] = …def isSorted(lst : List) : Boolean = lst match { case Nil() true⇒ case Cons(_, Nil()) true⇒ case Cons(x, Cons(y, ys)) x < y && isSorted(Cons(y,ys))⇒}

Koksal, Kuncak, Suter: Constraints as Control, POPL 2012

Minimizing Solutionsval (a, b) = (12, 8)val (lcm, _, _) = ((m, fa, fb) ⇒ m > 0 && m == fa * a && m == fb * b) .minimizing((m,fa,fb) ⇒ m).find.get

> 24

((t:Tree)⇒ isBinarySearchTree(t) && elems(t)==Set(1,2,3)).findAll.toList

> List(Node(Node(L, 1, L), 2, Node(L, 3, L)), Node(Node(Node(L, 1, L), 2, L), 3, L), Node(L, 1, Node(L, 2, Node(L, 3, L))))

Enumerating Complex Values

Red-Black Trees

invariants

Implementation:next 30 pages

Formalize Invariants (in Scala)sealed abstract class Treecase class Empty() extends Treecase class Node(color: Color, left: Tree, value: Int, right: Tree) extends Tree

def bBalanced(t : Tree) : Boolean = t match { case Node(_,l,_,r) => bBalanced(l) && bBalanced(r) && bHeight(l) ==bHeight(r) case Empty() => true}

def bHeight(t : Tree) : Int = t match { case Empty() => 1 case Node(Black(), l, _, _) => bHeight(l) + 1 case Node(Red(), l, _, _) => bHeight(l)}def isRBT(t : Tree) : Int = …

Insertion in a few linesdef insert(x : Int, t : Tree) = ((t1:Tree) => isRBT(t1) && elems(t1) = elems(t) ++ Set(x)).find

Are invariants together with declarative shorter than functional implementation?– not necessarily,– but we know the result satisfies the invariants– RBT only makes sense with invariants

Extending the Program

Suppose we wish to implement ‘remove’ as well

def remove(x : Int, t : Tree) = ((t1:Tree) => isRBT(t1) && elems(t1) = elems(t) -- Set(x)).find

Declarative remove is again just a few lines (keep the same invariants!)

declarative knowledge can be more reusable

void RBDelete(rb_red_blk_tree* tree, rb_red_blk_node* z){ rb_red_blk_node* y; rb_red_blk_node* x; rb_red_blk_node* nil=tree->nil; rb_red_blk_node* root=tree->root; y= ((z->left == nil) || (z->right == nil)) ? z : TreeSuccessor(tree,z); x= (y->left == nil) ? y->right : y->left; if (root == (x->parent = y->parent)) { /* assignment of y->p to x->p is intentional */ root->left=x; } else { if (y == y->parent->left) { y->parent->left=x; } else { y->parent->right=x; } } if (y != z) { /* y should not be nil in this case */#ifdef DEBUG_ASSERT Assert( (y!=tree->nil),"y is nil in RBDelete\n");#endif /* y is the node to splice out and x is its child */ if (!(y->red)) RBDeleteFixUp(tree,x); tree->DestroyKey(z->key); tree->DestroyInfo(z->info); y->left=z->left; y->right=z->right; y->parent=z->parent; y->red=z->red; z->left->parent=z->right->parent=y; if (z == z->parent->left) { z->parent->left=y; } else { z->parent->right=y; } free(z); } else { tree->DestroyKey(y->key); tree->DestroyInfo(y->info); if (!(y->red)) RBDeleteFixUp(tree,x); free(y); } #ifdef DEBUG_ASSERT Assert(!tree->nil->red,"nil not black in RBDelete");#endif}

void RBDeleteFixUp(rb_red_blk_tree* tree, rb_red_blk_node* x) { rb_red_blk_node* root=tree->root->left; rb_red_blk_node* w; while( (!x->red) && (root != x)) { if (x == x->parent->left) { w=x->parent->right; if (w->red) {

w->red=0;x->parent->red=1;LeftRotate(tree,x->parent);w=x->parent->right;

} if ( (!w->right->red) && (!w->left->red) ) {

w->red=1;x=x->parent;

} else {if (!w->right->red) { w->left->red=0; w->red=1; RightRotate(tree,w); w=x->parent->right;}w->red=x->parent->red;x->parent->red=0;w->right->red=0;LeftRotate(tree,x->parent);x=root; /* this is to exit while loop */

} } else { /* the code below is has left and right switched from above */ w=x->parent->left; if (w->red) {

w->red=0;x->parent->red=1;RightRotate(tree,x->parent);w=x->parent->left;

} if ( (!w->right->red) && (!w->left->red) ) {

w->red=1;x=x->parent;

} else {if (!w->left->red) { w->right->red=0; w->red=1; LeftRotate(tree,w); w=x->parent->left;}w->red=x->parent->red;x->parent->red=0;w->left->red=0;RightRotate(tree,x->parent);x=root; /* this is to exit while loop */

} } } x->red=0;#ifdef DEBUG_ASSERT Assert(!tree->nil->red,"nil not black in RBDeleteFixUp");#endif}

140 lines of tricky C,reusing existing functionsUnreadable without pictures(from Emin Martinian)

Imperativeremove

Key Question

When will such declarative programming work?Which constraints can we solve predictably?

Need algorithms for solving constraints overintegers, lists, trees, sets, maps, multisets, …

( ? ).find

Executing Constraints: Then and NowLong tradition

– Prolog (inspired by Robinson’s resolution theorem proving)

– CLP(R): added linear constraints

– Functional Logic Programming (e.g.Curry): narrowing technique

then came SAT and Kodkod

– Aleksandar Milicevic, Derek Rayside, Kuat Yessenov, Daniel Jackson:

Unifying execution of imperative and declarative code. ICSE 2011

– Hesam Samimi, Ei Darli Aung, Todd D. Millstein: Falling Back on

Executable Specifications. ECOOP 2010

and SMT - Satisfiability Modulo TheoriesImplicit Programming : SMT = Prolog : Resolution

SMT

SMT = SAT + Cooperating Decision Procedures

x < y+1 && y < x+1 && x’=f(x) && y’=f(y) && x’=y’+1

x < y+1y < x+1 x’=y’+1

x’=f(x)y’=f(y)

x=y

x’=y’0=1

linear programing solver

congruence closureimplementation

e.g. Z3 theorem prover (N. Bjorner, L. de Moura)

SAT solver cleverlyenumerates conjunctions

Leon: Constraint Solver of Kaplan

• Supports recursive functions and data-types.• Originally developed for program verification with

counter-examples. [SAS 2011]

• Algorithm is a semi-decision procedure; always terminates when there is a counter-example.

• Works as a decision procedure for a well-defined class of functions. [POPL 2010]– tree fold functions f : Tree Set such that |f-1(x)| is large enough

= +recursivefunction

definitions

“SMT modulo Recursive Functions”Algorithm to find solution for constraint with recursive functions:

– Disable recursive call branches. If z3-sat, report SAT– Enable recursive branches, replace the result with fresh constant;

if z3-unsat, report UNSAT– If none gives answer, replace call with function body (unroll) and

also add function contracts (k-induction)

Results for Computing with Constraints

send + more = money 1.17 sAll red-black trees up to size 7: 27.45 s (comparable to JPF)

last list element:

Note: domains not bounded

upfront

Example Results for VerificationUnrolling depth

Same property also esnures that enumeration often terminates.http://lara.epfl.ch/leon/

AdvancingDevelopment within an IDE

Compilation and static checking

Execution on a (virtual) machine

Automated reasoning is key enabling technology

Implicit Programming Agendarequirements

def f(x : Int) = { y = 2 * x + 1}

iload_0iconst_1iadd

42

1)

2)

3)

def secondsToTime(totalSeconds: Int) : (Int, Int, Int) = ((h: Int, m: Int, s: Int) (⇒ h * 3600 + m * 60 + s == totalSeconds && h ≥ 0 && m ≥ 0 && m < 60 && s ≥ 0 && s < 60)).find

Synthesis for Arithmetic

def secondsToTime(totalSeconds: Int) : (Int, Int, Int) = val t1 = val t2 = val t3 = val t4 = Some(t1, t3, t4)

?

def secondsToTime(totalSeconds: Int) : (Int, Int, Int) = choose((h: Int, m: Int, s: Int) (⇒ h * 3600 + m * 60 + s == totalSeconds && h ≥ 0 && m ≥ 0 && m < 60 && s ≥ 0 && s < 60))

Choose Notation

def secondsToTime(totalSeconds: Int) : (Int, Int, Int) = val t1 = val t2 = val t3 = val t4 = Some(t1, t3, t4)

?

Starting point: quantifier elimination (QE)

• A specification statement of the form

• Corresponds to constructively solving the quantifier elimination (QE) problem

where a are parameters• Witness terms from QE are the synthesized code

r = choose(x F( a, x ))⇒

∃ x . F( a, x )

“let r be x such that F(a, x) holds”

Methodology QE Synthesis

Quantifier Elimination: x. S(x,a) P(a)Synthesis: x. S(x,a) S(t(a),a)• For all QE procedures we examined, we were

able to find the corresponding witness terms• One-point rule immediately gives a term

x. (x = t(a) && S(x,a)) S(t(a),a)Example for other domains:

$ x. (a1 < x & x < a2) a1 < a1 + 1 & a1 + 1 < a2 t(a1,a2)=a1+1

Synthesis Procedure: EqualitiesProcess equalities first:• compute parametric description of solution set• replace n variables with n-1

h * 3600 + m * 60 + s == totalSeconds s = totalSeconds - h * 3600 - m * 60

In general we obtain divisibility constraints– use Extended Euclid’s Algorithm,

matrix pseudo inverse in Z

Synthesis Procedure: Inequalities

• Solve for one by one variable:– separate inequalities depending on polarity of x:

Ai ≤ αix

βjx ≤ Bj

– define terms a = maxi A⌈ i/αi and b = min⌉ j B⌈ j/ βj ⌉

• If b is defined, return x = b else return x = a• Further continue with the conjunction of all

formulas A⌈ i/αi ≤ B⌉ ⌈ j/ βj⌉

• Similar to Fourier-Motzkin elimination (remove floor and ceiling using divisibility)

26

2y − b − a ≤ 3x 2x ∧ ≤ 4y + a + b 4y − 2b − 2a ≤ 6x ≤ 12y + 3a + 3b (4y − 2b − 2a) / 6 ≤ x ≤ (12y + 3a + 3b) / 6 (4y − 2b − 2a) / 6 ≤ (12y + 3a + 3b) / 6⌊ ⌋two extra variables: (4y − 2b − 2a) / 6 ≤ l ∧ 12y + 3a + 3b = 6 l + k∗ ∧ 0 ≤ k ≤ 5 4y − 2b − 2a ≤ 6 * l ∧ 12y + 3a + 3b = 6 l + k∗ ∧ 0 ≤ k ≤ 5 pre: 6|3a + 3b − k 4y − 2b − 2a ≤ 12y + 3a + 3b – k − 5b − 5a +k ≤ 8y y = (⌈ k − 5a − 5b)/8⌉

2y − b ≤ 3x + a 2∧ x − a ≤ 4y + b

27

Generated Code Contains Loops

val (x1, y1) = choose(x: Int, y: Int => 2*y − b =< 3*x + a && 2*x − a =< 4*y + b)

val kFound = false for k = 0 to 5 do { val v1 = 3 * a + 3 * b − k if (v1 mod 6 == 0) { val alpha = ((k − 5 * a − 5 * b)/8).ceiling val l = (v1 / 6) + 2 * alpha val y = alpha val kFound = true break } } if (kFound) val x = ((4 * y + a + b)/2).floor else throw new Exception(”No solution exists”)

28

def secondsToTime(totalSeconds: Int) : (Int, Int, Int) = choose((h: Int, m: Int, s: Int) (⇒ h * 3600 + m * 60 + s == totalSeconds && h ≥ 0 && m ≥ 0 && m < 60 && s ≥ 0 && s < 60 ))

Result of Synthesis

def secondsToTime(totalSeconds: Int) : (Int, Int, Int) = val t1 = totalSeconds div 3600 val t2 = totalSeconds -3600 * t1 val t3 = t2 div 60 val t4 = totalSeconds - 3600 * t1 - 60 * t3 (t1, t3, t4)

Implemented as an extension of the Scala compiler.

Example: Date Conversion in CKnowing number of days since 1980, find current year and dayBOOL ConvertDays(UINT32 days) { year = 1980; while (days > 365) { if (IsLeapYear(year)) { if (days > 366) { days -= 366; year += 1; } } else { days -= 365; year += 1; } ... }

Enter December 31, 2008All music players (of a major brand) froze in the boot sequence.

Implicit Programming for Date Conversion

val origin = 1980@spec def leapsTill(y : Int) = (y-1)/4 - (y-1)/100 + (y-1)/400

let (year1, day1)=choose( (year:Int, day:Int) => { days == (year-origin)*365 + leapsTill(year)-leapsTill(origin) + day && 0 < day && day <= 366print(year1, day1)})

Analysis and termination simpler than with loop

Knowing number of days since 1980, find current year and day

Properties of Synthesis Algorithm

• For every formula in linear integer arithmetic– synthesis algorithm terminates– produces the most general precondition

(assertion saying when result exists)– generated code gives correct values whenever

correct values exist• If there are multiple or no solutions for some

parameters, we get a warning

Compile-time warningsdef secondsToTime(totalSeconds: Int) : (Int, Int, Int) = choose((h: Int, m: Int, s: Int) (⇒ h * 3600 + m * 60 + s == totalSeconds && h ≥ 0 && h < 24 && m ≥ 0 && m < 60 && s ≥ 0 && s < 60 ))

Warning: Synthesis predicate is not satisfiable for variable assignment: totalSeconds = 86400

Compile-time warningsdef secondsToTime(totalSeconds: Int) : (Int, Int, Int) = choose((h: Int, m: Int, s: Int) (⇒ h * 3600 + m * 60 + s == totalSeconds && h ≥ 0 && m ≥ 0 && m ≤ 60 && s ≥ 0 && s < 60 ))

Warning: Synthesis predicate has multiple solutions for variable assignment: totalSeconds = 60Solution 1: h = 0, m = 0, s = 60Solution 2: h = 0, m = 1, s = 0

Arithmetic pattern matching

• Goes beyond Haskell’s (n+k) patterns• Compiler checks that all patterns are reachable

and whether the matching is exhaustive

def fastExponentiation(base: Int, power: Int) : Int = { def fp(m: Int, b: Int, i: Int): Int = i match { case 0 m⇒ case 2 * j fp(m, b*b, j)⇒ case 2 * j + 1 fp(m*b, b*b, j)⇒ } fp(1, base, p)}

Synthesis for parametrized arithmetic

def decomposeOffset(offset: Int, dimension: Int) : (Int, Int) = choose((x: Int, y: Int) (⇒ offset == x + dimension * y && 0 ≤ x && x < dimension ))

• The predicate becomes linear at run-time• Synthesized program must do case analysis on

the sign of the input variables• Some coefficients are computed at run-time

Synthesis for (multi)sets (BAPA)

def splitBalanced[T](s: Set[T]) : (Set[T], Set[T]) = choose((a: Set[T], b: Set[T]) (⇒ a union b == s && a intersect b == empty && a.size – b.size ≤ 1 && b.size – a.size ≤ 1 ))

def splitBalanced[T](s: Set[T]) : (Set[T], Set[T]) = val k = ((s.size + 1)/2).floor val t1 = k val t2 = s.size – k val s1 = take(t1, s) val s2 = take(t2, s minus s1) (s1, s2) a

b

s

NP-Hard Constructs

• Divisibility combined with inequalities:– corresponding to big disjunction in q.e. ,

we will generate a for loop with constant bounds(could be expanded if we wish)

• Disjunctions– Synthesis of a formula computes program and exact

precondition of when output exists– Given disjunctive normal form, use preconditions

to generate if-then-else expressions (try one by one)

Synthesis for Disjunctions

More on this Synthesis Approach

• Software Synthesis Procedures, Communications of the ACM, February 2012

• STTT 2012• PLDI 2010Ongoing work– Use it to compile certain Kaplan specifications– Synthesis for further theories

Advancing• Development within an IDE

• Compilation and static checking

• Execution on a (virtual) machine

Automated reasoning is key enabling technology

Implicit Programming Agendarequirements

def f(x : Int) = { y = 2 * x + 1}

iload_0iconst_1iadd

42

1)

2)

3)

Type-Driven Synthesis within an IDE

Type-Driven Synthesis within an IDE

Program point(cursor)

pre-computedweights

Create type environment:- encode subtypes- assign initial weights

Extract:- visible symbols- expected type

Search algorithmwith weights

(generates intermediate expressions and their types)

…………………………………………………………………………………………………………………………………………………………………………………………………………………………………

5 suggested expressions

Synthesis Approach Outline

sourcecode in editor

Ranking

w/ Tihomir Gvero and Ruzica Piskac, CAV’11

Search Algorithm

• Semi-decidable problem (cf. Leon)• Related to proof search in intuitionistic logic• Weights are an important additional aspect:– in our application we find many solutions quickly– must select interesting ones– interesting = small weight– algorithm with weights preserves completeness

• Identified decidable (even polynomial) cases, in the absence of generics and subtyping

Results for Expression SuggestionBenchmark Length #Initial #Derived #Snip.Gen. Rank Time [ms]

ByteArrayInputStreambytebufintoffsetintlength 4 22 4049 102 3 546

CharArrayReadercharbuf 3 26 782 343 1 546

HashSetiterator 2 60 1832 201 1 546

Hashtableelements 2 32 869 445 1 546

HashtableentrySet 2 31 874 441 1 546

HashtablekeySet 2 32 968 492 3 546

Hashtablekeys 2 30 818 477 2 515

PriorityQueuepoll 2 27 1208 363 1 562

Examples demonstrating API usageRemove 1-2 lines; ask tool to synthesize itTime budget for search: 0.5 secondsGenerates up to 1000s of well-formed and well-typed expressionsDisplays top 5 of themThe one that was removed appeared as rank 1, 2, or 3Similarly good behavior in over 50% of the 120 examples evaluated

Other Topics

• Specification decision procedures, including MAPA, BAPA, extensions, term powers (Piskac, Yessenov)

• Synthesis based on automata (Jobstmann, Spielmann)• Static analysis and verification

– Jahob (Rinard, Zee)– Static analysis of PHP (Kneuss, Suter),

Scala (Kneuss,Suter), C (Vujosevic-Janicic)– Predicate abstraction with interpolation

(Hojjat, Ruemmer, Iosif)• Numerical computation analysis (Darulova)• Collaborations on distributed systems (Kostic), speculative

linearizability (Guerraoui), testing (Marinov)

Implicit Programming, Explicit DesignExplicit = written down, machine readableImplicit = omitted, to be (re)discoveredCurrent practice:– explicit program code– implicit design (key invariants, properties, contracts)A lot of hard work exists on verification and analysis: checking design against given code and recovering (reverse engineering) implicit invariants.

Our goal:– explicit design– implicit program

Total work of developer is not increased! Moreover:– can be decreased for certain types of specifications– confidence in correctness higher – program is spec

AdvancingDevelopment within an IDE: synthesize entire expressionsCompilation and static checking:transform spec into program

Execution on a (virtual) machine:use and extend SMT solvers

Conclusionsrequirements

choose x,y,z,n=> xn

+ yn = zn

SMT solver

421)

2)

3)

4) more human-oriented programming, empoweringuser to program

3)

2)

1)

http://lara.epfl.ch/~kuncak