Conditional Program Generation for Bimodal Program...

22
Conditional Program Generation for Bimodal Program Synthesis (Joint work with Chris Jermaine, Vijay Murali, and Letao Qi) Swarat Chaudhuri Rice University www.cs.rice.edu/~swarat

Transcript of Conditional Program Generation for Bimodal Program...

Page 1: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Conditional Program Generation for Bimodal Program Synthesis

(Joint work with Chris Jermaine, Vijay Murali, and Letao Qi)

Swarat ChaudhuriRice University

www.cs.rice.edu/~swarat

Page 2: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Program synthesis[Simon 1963, Summers 1977, Manna-Waldinger 1977, Pnueli-Rosner 1989]

Specification

Synthesizer

Program +

Correctness Certificate

Specification: Logical constraint that must be satisfied exactly

Algorithm: Search for a program that satisfies the specification.

Page 3: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

“Bimodal” program synthesis

3

An idealized program

Candidate implementations

Prior distribution

Synthesizer

Posterior distribution over programs

Neural Sketch Learning for Conditional Program Generation. Murali, Qi, Chaudhuri, and Jermaine. Arxiv 2017.

Learnedfrom a real-world

code corpus

Ambiguous “evidence” +Logical requirements

Page 4: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

“Bimodal” program synthesis

4

Ambiguous “evidence” +Logical requirements

Candidate implementations

Prior distribution

Learnedfrom a real-world

code corpus

Synthesizer

Posterior distribution over programs

• API calls or types that the program uses • “Soft” I/O examples or constraints• Natural language description of what the program does• ...

An idealized program

Page 5: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

The Bayou synthesizer: A demo

5

http://bit.ly/2zgP5fj

Page 6: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Conditional program generation

Assume random variables 𝑋 and 𝑃𝑟𝑜𝑔, over labels and programsrespectively, following a joint distribution 𝑄(𝑋, 𝑃𝑟𝑜𝑔).

Offline: • You are given a set 𝑋*, 𝑃𝑟𝑜𝑔* of samples from 𝑄(𝑋, 𝑃𝑟𝑜𝑔). From

this, learn a function 𝑔 that maps evidence to programs.

• Learning goal: maximize 𝐸 ,,-./0 ∼2 𝐼 , where

Online: Given 𝑋, produce 𝑔(𝑋).

6

I =

⇢1 if g(X) ⌘ Prog

0 otherwise.

Page 7: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

In what we actually do

The map g is probabilistic.

Learning is maximum conditional likelihood estimation:• Given {(𝑋*, 𝑃𝑟𝑜𝑔*)}, solve argmax

;∑ log 𝑃 𝑃𝑟𝑜𝑔* 𝑋*, 𝜃)�* .

7

Page 8: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Programs

Language capturing the essence of API usage in Java.

8

Prog ::= skip | Prog1;Prog2 | call Call |let x = Call |if Exp then Prog1 else Prog2 |while Exp do Prog1 | try Prog1 Catch

Exp ::= Sexp | Call | let x = Call : Exp1Sexp ::= c | xCall ::= Sexp0.a(Sexp1, . . . , Sexpk)

Catch ::= catch(x1) Prog1 . . . catch(xk) Progk

API method name

API call

Page 9: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Labels

Set of API calls• readline, write,…

Set of API datatypes• BufferedReader, FileReader,…

Set of keywords that may appear while describing program actions in English • read, file, write,…• Obtained from API calls and datatypes through a camel case

split

9

Page 10: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Challenges

Directly learning over source code simply doesn’t work

• Source code is full of low-level, program-specific names and operations.

• Programs need to satisfy structural and semantic constraints such as type safety. Learning to satisfy these constraints is hard.

10

Page 11: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Language abstractions to the rescue!

Learn not over programs, but typed, syntactic models of programs.

11

Page 12: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Sketches

The sketch of a program is obtained by applying an abstraction function 𝛼.

From sketch 𝑌 to program 𝑃𝑟𝑜𝑔: a fixed concretization distribution 𝑃(𝑃𝑟𝑜𝑔|𝑌).

Learning goal changes to• Given {(𝑋*, 𝑌*)}, solve argmax

;∑ log 𝑃 𝑌* 𝑋*, 𝜃)�* .

12

Page 13: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

13

Abstract API call

Sketches

Y ::= Call | skip | while Cond do Y1 | Y1;Y2 |try Y1 Catch | if Cond then Y1 else Y2

Catch ::= catch(⌧1) Y1 . . . catch(⌧k) Yk

Cond ::= {Call1, . . . ,Callk}Call ::= a(⌧1, . . . , ⌧k)

Page 14: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Program synthesis

14

Combinatorial “concretization”

Samplesketches

Sketch →Executable code

Implementationssatisfying 𝜑

𝑃 𝑌 𝑋)Evidence 𝑋

Logical requirement 𝜑

Type-directed,compositionalsynthesizer

End-to-enddifferentiableneuralarchitecture

Learned from 𝑿𝒊, 𝒀𝒊 pairs

Page 15: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Program synthesis

15

Combinatorial “concretization”

Samplesketches

Sketch →Executable code

Implementationssatisfying 𝜑

𝑃 𝑌 𝑋)Evidence 𝑋

Logical requirement 𝜑

Type-directed,compositionalsynthesizer

End-to-enddifferentiableneuralarchitecture

Learned from 𝑿𝒊, 𝒀𝒊 pairs

Not all sketches may be realizable as

executable programs

Page 16: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Learning using a probabilistic encoder-decoder

16

𝑋: Evidence𝑌: Sketches𝑍: Latent “intent”

Representation of hidden intent

Prior for regularization

Encoderf

Decoderg

𝑍

𝑓(𝑋) Y𝑋

𝑍

𝑋 𝑌

Page 17: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Learning using a probabilistic encoder-decoder

17

𝑋: Evidence𝑌: Sketches𝑍: Latent “intent”

Encoderf

Decoderg

𝑍

𝑓(𝑋) Y𝑋

𝑃 𝑍 = 𝑁𝑜𝑟𝑚𝑎𝑙 0, 𝐼

𝑃 𝑓(𝑋) 𝑍) = 𝑁𝑜𝑟𝑚𝑎𝑙 𝑍, 𝜎S𝐼

During learning, use Jensen’s inequality to get smooth loss function

Page 18: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Learning using a probabilistic encoder-decoder

18

𝑋: Evidence𝑌: Sketches𝑍: Latent “intent”

Encoderf

Decoderg

𝑍

𝑓(𝑋) Y𝑋

𝑃 𝑍 = 𝑁𝑜𝑟𝑚𝑎𝑙 0, 𝐼𝑃 𝑓(𝑋) 𝑍) = 𝑁𝑜𝑟𝑚𝑎𝑙 𝑍, 𝜎S𝐼

During inference,get P(Z | X) using normal-normal conjugacy

Page 19: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Neural decoder

19

…0.3 0.7

Distribution on rules that can be fired at a point, given history so far.

History encoded as a real vector.

Page 20: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Concretization

20

Ruled out by type system

Page 21: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Results

• Trained method on 100 million lines of Java/Android code. ~2500 API methods, ~1500 types.

• Synthesis of method bodies from scratch, given 2-3 API calls and types.

• Sketch learning critical to accuracy.

• Good performance compared to GSNNs (state of the art conditional generative model).

• Good results on label-sketch pairs not encountered in training set.

21

Page 22: Conditional Program Generation for Bimodal Program Synthesiscrest.cs.ucl.ac.uk/cow/55/slides/cow55_Chaudhuri.pdf · Program synthesis [Simon 1963, Summers 1977, Manna-Waldinger1977,

Thank you!

Questions?

22

[email protected]://www.cs.rice.edu/~swarat

(Research funded by the DARPA MUSE award #FA8750-14-2-0270)