Mining Specifications (lots of) code specifications of correctness

Glenn Ammons (Univ. of Wisconsin)
Ras Bodík (Univ. of Wisconsin)
Jim Larus (Microsoft Research)


Motivation: why specifications?

Verification tools
• find bugs early
• make guarantees
• scale with programs
• need specifications

[Figure: a verifier takes programs and specifications as input]

Language-usage specifications

• array accesses
• memory allocation
• type safety
• ...

Easy to write, big payoff

[Figure: verifier and the programs it checks]

Library-usage specifications

• cut-and-paste (X11)
• network server (socket API)
• device drivers (kernel API)
• ...

Harder to write, smaller payoff

[Figure: verifier and the programs it checks]

Program specifications

• symbol table well-formed
• IR well-formed
• ...

Hardest to write, smallest payoff

[Figure: verifier and the program it checks]

Solution: specification mining

Specification mining gleans specifications from artifacts of program development:

• From programs (static)?
• From executions of test cases (dynamic)?
• From other artifacts?

Mining from traces

Advantages:
• No infeasible paths
• Pointer/alias analysis is easy
• Few bugs, as program passes its tests
• Common behavior is correct behavior

...
socket(domain = 2, type = 1, proto = 0, return = 7)
accept(so = 7, addr = 0x40, addr_len = 0x50, return = 8)
write(so = 8, buf = 0x100, len = 23, return = 23)
read(so = 8, buf = 0x100, len = 12, return = 12)
close(so = 8, return = 0)
close(so = 7, return = 0)
...

Output: a specification

[Figure: specification automaton with these transitions]
socket(return = X)
accept(so = X, return = Y)
read(so = Y)
write(so = Y)
close(so = Y)
close(so = X)

Specification says what programs should do:
• Temporal dependences (accept follows socket)
• Data dependences (accept input is socket output)

How we mine specifications

[Figure: mining pipeline]
Traces → extract scenarios → Scenarios (dep. graphs) → standardize → Strings → PFSA learner → PFSA → postprocess → Specification

Example trace event: socket(domain = 2, type = 1, proto = 0, return = 7)

Outline of the talk

• The specification mining problem
• Our specification mining system
  • Annotating traces with dependences
  • Extracting and standardizing scenarios
  • Probabilistic learning and postprocessing
• Experimental results
• Related work

An impossible problem

Find a Turing machine that generates C, given T.

[Figure: sets I (all traces), C (all correct traces), T (training traces)]

Unsolvable:
• No restrictions on C
• No connection between C and T
• Simple variants are also undecidable [Gold67]

A simpler problem

Find a PFSA that generates an approximation of P.

[Figure: P, a probability distribution (0 to 1) over all scenarios, split into correct scenarios and noise]

Tractable, plus
• Scenarios are small
• Noise handled
• Finite-state
• Weights useful for postprocessing

Outline of the talk

• The specification mining problem
• Our specification mining system
  • Annotating traces with dependences
• Verifying traces
• Experimental results
• Related work

Dependence annotation

[Figure: Traces → dependence annotator → Annotated traces]

socket(domain = 2, type = 1, proto = 0, return = 7)
accept(so = 7, addr = 0x40, addr_len = 0x50, return = 8)
write(so = 8, buf = 0x100, len = 23, return = 23)
close(so = 8, return = 0)
close(so = 7, return = 0)

Definers:
• socket.return
• accept.return
• close.so

Users:
• accept.so
• read.so
• write.so
• close.so
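
The definer and user lists can be read as a recipe for linking events. Below is a minimal sketch, assuming a trace is a list of (call, attributes) pairs and that a user attribute depends on the most recent definer of the same value. The names DEFINERS, USERS and annotate are illustrative and are not taken from the original system.

```python
# Hypothetical sketch: link each "user" attribute to the latest "definer" of the same value.
DEFINERS = {("socket", "return"), ("accept", "return"), ("close", "so")}
USERS = {("accept", "so"), ("read", "so"), ("write", "so"), ("close", "so")}


def annotate(trace):
    """Return dependence edges as (definer_index, user_index, value) triples."""
    last_def = {}  # value -> index of the event that most recently defined it
    edges = []
    for i, (call, attrs) in enumerate(trace):
        # Resolve uses before recording this event's own definitions.
        for name, value in attrs.items():
            if (call, name) in USERS and value in last_def:
                edges.append((last_def[value], i, value))
        for name, value in attrs.items():
            if (call, name) in DEFINERS:
                last_def[value] = i
    return edges


trace = [
    ("socket", {"domain": 2, "type": 1, "proto": 0, "return": 7}),
    ("accept", {"so": 7, "addr": 0x40, "addr_len": 0x50, "return": 8}),
    ("write", {"so": 8, "buf": 0x100, "len": 23, "return": 23}),
    ("close", {"so": 8, "return": 0}),
    ("close", {"so": 7, "return": 0}),
]
edges = annotate(trace)
for definer, user, value in edges:
    print(f"{trace[definer][0]} -> {trace[user][0]} (value {value})")
```

Because close.so appears in both lists, a close both uses the handle and redefines it, which is one way to obtain def-def as well as use-def edges.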


Extracting scenarios

[Figure: Annotated traces → scenario extractor → Abstract scenarios]

Simplifying scenarios

socket(domain = 2, type = 1, proto = 0, return = 7) [seed]
accept(so = 7, addr = 0x40, addr_len = 0x50, return = 8)
write(so = 8, buf = 0x100, len = 23, return = 23)
close(so = 8, return = 0)
close(so = 7, return = 0)

Drops attributes not used in dependences:

socket(return = 7) [seed]
accept(so = 7, return = 8)
write(so = 8)
close(so = 8)
close(so = 7)

Standardizing scenarios

[Figure: Simplified scenarios → Standardization → Abstract scenarios; equivalent scenarios are grouped together]

Two transformations:
• Naming: foo(val = 7) → foo(val = X)
• Reordering: foo(); bar(); → bar(); foo();

Finds the least standardized scenario, in lexicographic order.

Standardizing scenarios

[Figure: Annotated traces → scenario extractor → Abstract scenarios]

Simplified scenario:

socket(return = 7) [seed]
accept(so = 7, return = 8)
write(so = 8)
read(so = 8)
close(so = 8)
close(so = 7)

Reorder (respecting use-def and def-def dependences):

socket(return = 7) [seed]
accept(so = 7, return = 8)
read(so = 8)
write(so = 8)
close(so = 8)
close(so = 7)

Name:

socket(return = X) [seed]
accept(so = X, return = Y)
read(so = Y)
write(so = Y)
close(so = Y)
close(so = X)

Each interaction is a letter to the PFSA learner.
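
The naming transformation can be sketched as follows. This is a minimal illustration, assuming a simplified scenario is a list of (call, attributes) pairs and that names are handed out in order of first appearance; the function name_values and the name pool are assumptions made for this example.

```python
import string


def name_values(scenario):
    """Replace each concrete value with a symbolic name, in order of first appearance."""
    names = {}  # concrete value -> symbolic name
    fresh = iter("XYZ" + string.ascii_uppercase[:23])  # X, Y, Z, then A..W
    out = []
    for call, attrs in scenario:
        renamed = []
        for attr, value in attrs:
            if value not in names:
                names[value] = next(fresh)
            renamed.append((attr, names[value]))
        out.append((call, tuple(renamed)))
    return out


scenario = [
    ("socket", (("return", 7),)),
    ("accept", (("so", 7), ("return", 8))),
    ("read", (("so", 8),)),
    ("write", (("so", 8),)),
    ("close", (("so", 8),)),
    ("close", (("so", 7),)),
]
named = name_values(scenario)
for call, attrs in named:
    print(call, dict(attrs))
```

Two scenarios that differ only in their concrete handles (say 7 and 8 versus 3 and 4) map to the same named scenario, so the learner sees them as the same string.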


PFSA learning

[Figure: Abstract scenarios → automaton learner → Specification]

Algorithm due to Raman et al.:
1. Build a weighted retrieval tree
2. Merge similar states
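
Step 1 can be pictured with a small sketch. It assumes a retrieval tree is a prefix tree (trie) whose edges count how many training strings pass through them; the state-merging step is not shown. The function build_weighted_trie, the dictionary layout and the shortened letter names are illustrative assumptions, not the learner's actual data structures.

```python
def build_weighted_trie(strings):
    """Build a prefix tree whose edges count how many input strings traverse them."""
    root = {"edges": {}, "ends": 0}
    for s in strings:
        node = root
        for letter in s:
            # Reuse the existing child if present, otherwise start a fresh one at count 0.
            child, count = node["edges"].get(letter, ({"edges": {}, "ends": 0}, 0))
            node["edges"][letter] = (child, count + 1)
            node = child
        node["ends"] += 1  # number of strings that stop at this node
    return root


# Each letter stands for one standardized interaction.
strings = [
    ("socket", "accept", "read", "close"),
    ("socket", "accept", "write", "close"),
    ("socket", "accept", "read", "close"),
]
trie = build_weighted_trie(strings)
after_accept = trie["edges"]["socket"][0]["edges"]["accept"][0]
print({letter: count for letter, (_, count) in after_accept["edges"].items()})
```

The edge counts are the raw material for the weights of the PFSA.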

Postprocessing: coring

[Figure: specification automaton]

1. Remove infrequent transitions
2. Convert PFSA to NFA
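
The two steps can be sketched as a single pass over the transitions. This is a simplified illustration that assumes a fixed weight threshold decides what counts as infrequent; the actual coring heuristic may differ. The function core, the edge layout and the weights are made up for the example.

```python
def core(pfsa_edges, min_weight):
    """Drop transitions whose weight is below min_weight, then discard the weights.

    pfsa_edges: iterable of (source, letter, destination, weight).
    Returns an NFA as {(source, letter): set of destinations}.
    """
    nfa = {}
    for src, letter, dst, weight in pfsa_edges:
        if weight >= min_weight:
            nfa.setdefault((src, letter), set()).add(dst)
    return nfa


# Illustrative weights only; the last edge models a rare scenario that skips close(so = Y).
pfsa_edges = [
    (0, "socket(return = X)", 1, 10),
    (1, "accept(so = X, return = Y)", 2, 10),
    (2, "read(so = Y)", 2, 6),
    (2, "write(so = Y)", 2, 5),
    (2, "close(so = Y)", 3, 9),
    (3, "close(so = X)", 4, 9),
    (2, "close(so = X)", 4, 1),
]
nfa = core(pfsa_edges, min_weight=2)
print(sorted(nfa))
```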


Where to find bugs?

• in programs (static verification)?

• or in traces (dynamic verification)?

How we verify specifications

[Figure: verification pipeline]
Traces → extract scenarios → Scenarios (dep. graphs) → standardize → Strings → check automaton membership

Example trace event: socket(domain = 2, type = 1, proto = 0, return = 7)

Verifying traces

OK (both sockets closed):

...
socket(return = 7)
accept(so = 7, return = 8)
write(so = 8)
read(so = 8)
close(so = 8)
close(so = 7)
...

Bug! (socket 7 not closed):

...
socket(return = 7)
accept(so = 7, return = 8)
write(so = 8)
read(so = 8)
close(so = 8)
...

[Figure: specification automaton with transitions socket(return = X) [seed], read(so = Y), write(so = Y), close(fd = Y), close(fd = X)]
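
Checking a standardized scenario against the specification is an NFA membership test. The sketch below assumes the NFA is stored as a transition dictionary; the automaton shape (a read/write loop between accept and the two closes) is an assumption chosen to mirror the socket example, and the names accepts and nfa are illustrative.

```python
def accepts(nfa, start, accepting, letters):
    """Return True if the NFA accepts the given sequence of letters."""
    current = {start}
    for letter in letters:
        current = {dst for state in current for dst in nfa.get((state, letter), ())}
        if not current:
            return False  # no transition matches this letter
    return bool(current & accepting)


# Assumed automaton for the socket example.
nfa = {
    (0, "socket(return = X)"): {1},
    (1, "accept(so = X, return = Y)"): {2},
    (2, "read(so = Y)"): {2},
    (2, "write(so = Y)"): {2},
    (2, "close(so = Y)"): {3},
    (3, "close(so = X)"): {4},
}
ok = [
    "socket(return = X)", "accept(so = X, return = Y)", "write(so = Y)",
    "read(so = Y)", "close(so = Y)", "close(so = X)",
]
buggy = ok[:-1]  # the listening socket is never closed

print(accepts(nfa, 0, {4}, ok))
print(accepts(nfa, 0, {4}, buggy))
```

The second scenario stops in a non-accepting state, which corresponds to the "socket 7 not closed" bug.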

Experimental results

Attempted to mine and verify two published X11 rules.

Challenge: small, buggy training sets (16 programs)

Learning by trial and error

Start with a rule learned from one, trusted trace. Then:

1. Randomly select an unused trace.
2. Trace obeys rule? If yes, go back to step 1.
3. If no, ask the expert: is trace buggy?
   • yes: report bug
   • no (rule too specific): add trace to training set; learn a new rule
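
The loop can be written down as a short sketch. The learner, the rule check and the expert are passed in as callbacks; all of these names (learn, obeys, expert_says_buggy, trial_and_error) are hypothetical, and the toy rule used in the example (a set of allowed traces) only stands in for a mined automaton.

```python
import random


def trial_and_error(trusted, unused, learn, obeys, expert_says_buggy, rng=random):
    """Grow the training set one trace at a time, asking the expert only on violations."""
    training = [trusted]
    rule = learn(training)
    bugs = []
    pool = list(unused)
    rng.shuffle(pool)  # visit the unused traces in random order
    for trace in pool:
        if obeys(rule, trace):
            continue  # no inspection needed
        if expert_says_buggy(trace):
            bugs.append(trace)  # report bug
        else:
            training.append(trace)  # rule was too specific
            rule = learn(training)
    return rule, bugs


# Toy stand-ins: a "rule" is just the set of traces seen so far.
trusted = ("open", "close")
unused = [("open", "close"), ("open", "use", "close"), ("open",)]
rule, bugs = trial_and_error(
    trusted,
    unused,
    learn=lambda training: set(training),
    obeys=lambda rule, trace: trace in rule,
    expert_says_buggy=lambda trace: trace[-1] != "close",
    rng=random.Random(0),
)
print(sorted(rule), bugs)
```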

Results

1. A timestamp-passing rule
• 4 traces did not need inspection
• learned the rule! (compact: 7 states)
• bugs in 2 out of 16 programs (ups, e93)
• English specification was incomplete (3 traces)
• expert and corer agreed on 81% of the hot core

2. SetOwner(x) must be followed by GetSelection(x)
• failed to learn the rule (very small learning set) but
• bugs in 2 out of 5 programs (xemacs, ups)


Related work

Arithmetic pre/post conditions
• Daikon [Ernst et al], Houdini [Flanagan and Leino]
• properties orthogonal to ours
• eventually, we may need to include and learn some arithmetic relationships

Temporal relationships over calls
• intrusion detection: [Ghosh et al], [Wagner and Dean]
• software processes: [Cook and Wolf]
• error checking: [Engler et al SOSP 2001]
  • lexical and syntactic pattern matching
  • user must write templates (e.g., <a> always follows
• design patterns: [Reiss and Renieris]

Conclusion

• Introduced specification mining, a new approach for learning correctness specifications

• Refined the problem into a problem of probabilistic learning from traces

• Developed and demonstrated a practical specifications miner

End of talk

[Figure: Program → tracer → Instrumented program; Instrumented program + Test inputs → run → Traces → dependence annotator → Annotated traces]

...
socket(domain = 2, type = 1, proto = 0, return:T0 = 7)
[SETUP socket:T0 7]
accept(so:T0 = 7, addr = 0x40, addr_len = 0x50, return:T0 = 8)
[USE socket:T0 8]
close(so:T0 = 8, return = 0)
close(so:T0 = 7, return = 0)
...

[Figure: Program → tracer → Instrumented program]

int s = socket(AF_INET, SOCK_STREAM, 0);
[DO SETUP]
while (cond1) {
    int ns = accept(s, &addr, &len);
    while (cond2) {
        [USE NS]
        if (cond3) return;
    }
    close(ns);
}
close(s);

[Figure: Program → tracer → Instrumented program; Instrumented program + Test inputs → run → Traces → dependence annotator]

...
socket(domain = 2, type = 1, proto = 0, return = 7)
[SETUP socket 7]
accept(so = 7, addr = 0x40, addr_len = 0x50, return = 8)
[USE socket 8]
close(so = 8, return = 0)
close(so = 7, return = 0)
...

[Figure: Program → tracer → Instrumented program; Instrumented program + Test inputs → run → dependence annotator; annotated traces + Scenario seeds → scenario extractor → Abstract scenarios]

socket(return = X) [seed]
[SETUP socket X]
accept(so = X, return = Y)
[USE socket Y]
close(so = Y)
close(so = X)

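[Figure: Program → tracer → Instrumented program; Instrumented program + Test inputs → run → dependence annotator; annotated traces + Scenario seeds → scenario extractor → Abstract scenarios → automaton learner → Specification]

[Figure: specification automaton with transitions [SETUP X], [USE Y], close(fd = Y), close(fd = X)]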

Reducing the problem

The problem: find an automaton that generates C, given T.

[Figure: sets I (all traces) and T (training traces)]

Issues:
• What if C is not r.e.?
• Checkers and learners need finite specs.

The problem: find an automaton that generates C, given T. Assume that C is regular.

[Figure: sets I (all traces), C (all correct traces, regular), T (training traces); step label: Unrestricted]

Issue:
• What if the program is not regular?

The problem: find an automaton that generates CS, given TS. Assume that the size of scenarios is bounded.

[Figure: sets IS (all scenarios, bounded size), CS (all correct scenarios, regular), TS (training scenarios); step labels: Unrestricted, Regular]

Issue:
• No connection between CS and TS!

The problem: find an automaton that generates CS, given TS. Assume that TS presents each element of CS at least once.

[Figure: sets IS (all scenarios, bounded size), CS (all correct scenarios, regular), TS = c0, c1, ...; step label: Scenarios]

Issue:
• Undecidable [Gold67]

The problem: find a PFSA that generates P’, where P and P’ are close (by some distance metric). Assume P is generated by a PFSA.

[Figure: P, a probability distribution (0 to 1) over IS (all scenarios), generated by a PFSA; TS = c0, c1, ...; step labels: Scenarios, Complete presentation]

Digression: postprocessing

• PFSA = NFA with weights
• Specification = NFA
• Convert PFSA to specification:
  1. Find hot core (that is, drop noise)
     • drop infrequent scenarios
     • drop infrequent parts of scenarios
  2. Drop weights

Preparing input traces

socket(domain = 2, type = 1, proto = 0, return:T0 = 7)
[SETUP socket:T1 7]
accept(so:T2 = 7, addr = 0x40, addr_len = 0x50, return:T3 = 8)
[USE socket:T4 8]
close(so:T5 = 8, return = 0)
close(so:T5 = 7, return = 0)

Extracting scenarios

socket(domain = 2, type = 1, proto = 0, return:T0 = 7) [seed]
[SETUP socket:T0 7]
accept(so:T0 = 7, addr = 0x40, addr_len = 0x50, return:T0 = 8)
[USE socket:T0 8]
close(so:T0 = 8, return = 0)
close(so:T0 = 7, return = 0)

Drop untyped attributes:

socket(return:T0 = 7) [seed]
[SETUP socket:T0 7]
accept(so:T0 = 7, return:T0 = 8)
[USE socket:T0 8]
close(so:T0 = 8)
close(so:T0 = 7)

Standardization puts equivalent scenarios into a canonical abstract form.

[Figure: Simplified scenarios → Standardization → Abstract scenarios; equivalent scenarios are grouped together]

A search using two transformations:
• Naming: foo(val = 7) → foo(val = X)
• Reordering: foo(); bar(); → bar(); foo();

socket(return:T0 = 7) [seed]
[SETUP socket:T0 7]
accept(so:T0 = 7, return:T0 = 8)
[USE Y]
close(so:T0 = 8)
close(so:T0 = 7)

Simplified scenario:

socket(return:T0 = 7) [seed]
[SETUP socket:T0 7]
accept(so:T0 = 7, return:T0 = 8)
write(so:T0 = 8)
read(so:T0 = 8)
close(so:T0 = 8)
close(so:T0 = 7)

Reorder:

socket(return:T0 = 7) [seed]
[SETUP socket:T0 7]
accept(so:T0 = 7, return:T0 = 8)
read(so:T0 = 8)
write(so:T0 = 8)
close(so:T0 = 8)
close(so:T0 = 7)

Name:

socket(return:T0 = X) [seed]
[SETUP socket:T0 X]
accept(so:T0 = X, return:T0 = Y)
read(so:T0 = Y)
write(so:T0 = Y)
close(so:T0 = Y)
close(so:T0 = X)

Each interaction is a letter to the PFSA learner.

Coring

Coring removes PFSA transitions that occur infrequently and converts the PFSA into an NFA.

[Figure: specification automaton with transitions [SETUP X], [USE Y], close(fd = Y), close(fd = X)]

Verification

The question is narrowed in three steps:
• Do all traces of a program P satisfy a specification A?
• Does a trace T satisfy A?
• Does a scenario S satisfy A?

Definition: T satisfies A if every seed in T is surrounded by a scenario that satisfies A.

[Figure: Language of A = abstract scenarios satisfying A; Standardization links these to simplified scenarios satisfying A; Simplification links those to concrete scenarios satisfying A]

Experiments

• What we wanted to find out
  • Hypothesis 1: the process will find bugs and reduce the number of traces that the expert must inspect.
  • Hypothesis 2: the miner’s final specification will match the English rule.
  • Hypothesis 3: the corer and the human will agree on the hot core.
• Gathered traces from 16 programs:
  • 5 programs in the X11 distribution and
  • 11 contributed programs

Testing vs. verification

testing:
[Figure: inputs → program → is the output correct?]

verification:
[Figure: program + properties (X11, sockets) → checker → does property hold?]

sample properties:
• allocated memory is freed.
• locks are released.
• …

Testing vs. verification

Completeness (“coverage”):
• verification (if sound) guarantees that program contains no bugs of a well-specified class.

          testing   verification
aspects   all       some
control   some      all
data      some      all

our focus

Verification: recent successes

Recent successes:
• specification languages: temporal logics, automata, …
• abstractors: SLAM, FeaVer
• checkers: model checking, theorem proving, type systems

What’s still missing?
• specifications

[Figure: program → abstractor → abstract program; abstract program + property (formal specification of correctness) → checker → property holds?]

So who formulates specifications?

Programmers? Probably not.

Why they won’t:
• too busy: yet another language to learn?
• specifications aren’t cool.
• specification languages are hard: LTL, anyone?

Why they shouldn’t:
• may misunderstand usage rules.
• may not know all usage rules.

Mining Specifications:
• Convenient and easy: anyone can do it
• Like in data mining, discover surprise rules.

Advantages of mining

Exploits the massive programmer effort reflected in the code.
• Programmers resolved many problems:
  • incomplete system requirements.
  • incomplete API documentation.
  • implementation-dependent API rules.
• Want redundancy? (without redundant programming)
  • ask multiple programmers (and vote).

Exploits the testers’ effort in devising test inputs.

Our output: a specification

[Figure: specification automaton with these transitions]
x = socket()
bind(x)
listen(x)
y = accept(x)
read(y)
write(y)
close(y)
close(x)

How do we mine?

Underlying premise: Even bad software is debugged enough to show hints of correct behavior.

Maxim: Common usage is the correct usage.

Mining = machine learning
Reduce the problem to the well-known problem of learning regular languages.

Obstacles:
1. source code is too detailed and hard to analyze
2. what is “common” behavior?

Solutions:
1. learn from dynamic behavior
2. learn probabilistically

In short: learn from traces into probabilistic FSMs.

Input: trace(s)

7 = socket(2, 1, 0);
bind(7, 0x400120, 16);
listen(7, 5);
8 = accept(7, 0x400200, 0x400240);
read(8, 0x400320, 255);
write(8, 0x400320, 12);
read(8, 0x400320, 255);
write(8, 0x400320, 7);
close(8);
10 = accept(7, 0x400200, 0x400240);
read(10, 0x400320, 255);
write(10, 0x400320, 13);
close(10);
close(7);
…

[Figure: the trace shown next to the specification automaton with transitions x = socket(), bind(x), listen(x), y = accept(x), read(y), write(y), close(y), close(x)]

The mining algorithm

[Figure: the mining algorithm]
dynamic execution (traces) → trace abstraction → usage scenarios (strings) → RegExp learner (off-the-shelf) → generalized scenarios (probabilistic FSA) → user: extract heavy core (and approve) → specification (NFA)

specification (NFA) + dynamic execution to be checked (trace) → dynamic checker → OK/bug

Trace abstraction: 4 challenges
• Traces interleave useful and useless events.
  • sockets created by accept are independent, …
• Specifications must include both temporal and value-flow constraints.
• Only some of API calls’ arguments impose “true” dependences.
  • accept does not alter the state of the bound socket,
• Specifications may impose only partial order.
  • filling in fields of a structure before a call, …

Finding dependences

7 = socket(2, 1, 0);
bind(7, 0x400120, 16);
listen(7, 5);
8 = accept(7, 0x400200, 0x400240);
read(8, 0x400320, 255);
write(8, 0x400320, 12);
read(8, 0x400320, 255);
write(8, 0x400320, 7);
close(8);
10 = accept(7, 0x400200, 0x400240);
read(10, 0x400320, 255);
write(10, 0x400320, 13);
close(10);
close(7);
…

Some args and return values are handles to data structures. Calls may
• write through the handle
• read through the handle
• read and write

Def-use dependences connect writers to readers.

Trace abstraction

Call templates: h(_, ) a( , ) d( , ) b(_, )

Trace:
h(3, 5) c(10) a(4, 5) d(4, 7) b(0, 5) f(10) h(8, 11) e(7) f(50) d(15, 1) c(7) a(9, 11) b(6, 7) d(9, 14) f(20) e(7) …

Scenarios:
h(_, X) a(Y, X) b(_, X) d(Y, Z)
h(_, X) a(Y, X) b(_, X) d(Y, Z) e(Z)

Trace:
h(_, 5) c(10) a(4, 5) d(4, 7) b(_, 5) f(10) h(_, 11) e(7) f(_) d(_, _) c(7) a(9, 11) b(_, 11) d(9, _) e(_) f(_) …

Scenarios:
h(_, X) a(Y, X) d(Y, Z) b(_, X)
h(_, X) a(Y, X) b(_, X) d(Y,

The output PFSA

[Figure: PFSA with transitions h(_, X), a(Y, X), b(_, X), d(Y, Z), e(Z), edge weights 2, 2, 2, 1, 1, and a second d(Y, Z) transition]

Renaming and reordering the chop

Outline of the algorithm

input: a chop (a dag of data dependences)
output: the canonical chop

1. reorder: list all possible chop schedules
   • trick: only list those with calls in lexicographic order
2. rename: abstract arguments in each schedule
3. select lexicographically least schedule

lexicographic order:
a(…) b(…) < b(…) b(…)
a(X) b(…) < a(Y) b(…)
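
The three steps can be sketched by brute force. This illustration enumerates every ordering of the events, keeps those that respect the dependences, renames each one and returns the lexicographically least; it does not implement the "trick" that prunes the enumeration. The function canonical_chop, the event encoding and the dependence edges in the example are assumptions for illustration, and names are integers (0 plays the role of X, 1 of Y).

```python
from itertools import permutations


def canonical_chop(events, deps):
    """Return the lexicographically least renamed schedule of a chop.

    events: list of (call, args) with concrete argument values.
    deps: set of (i, j) pairs meaning event i must come before event j.
    """
    best = None
    for order in permutations(range(len(events))):
        position = {event: k for k, event in enumerate(order)}
        if any(position[i] > position[j] for i, j in deps):
            continue  # this schedule violates a dependence
        names = {}  # concrete value -> name index, in order of first appearance
        schedule = []
        for event in order:
            call, args = events[event]
            renamed = []
            for value in args:
                if value not in names:
                    names[value] = len(names)
                renamed.append(names[value])
            schedule.append((call, tuple(renamed)))
        if best is None or schedule < best:
            best = schedule
    return best


events = [
    ("socket", (7,)), ("accept", (7, 8)), ("write", (8,)),
    ("read", (8,)), ("close", (8,)), ("close", (7,)),
]
# Assumed dependence edges; write and read are left unordered relative to each other.
deps = {(0, 1), (1, 2), (1, 3), (2, 4), (3, 4), (4, 5)}
print(canonical_chop(events, deps))
```

With these edges only two schedules are legal, and the one that puts read before write wins because "read" sorts before "write".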

Checking: the meaning of the spec

[Figure: specification a(x) → b(x) → seed(x) → c(x)]

means: whenever seed(x) is executed, it must be preceded by a(x), b(x) and followed by c(x).

does not mean: a(x) must be followed by b(x), seed(x), c(x) (because a is not a seed).

Dynamic checking

• Used in our experiments
• checker mirrors the learner:

for each seed in the trace
    extract a chop
    if some substring from chop in NFA
        seed verified!
    else
        extract a larger chop (up to a bound)
fail if no chop verifies

[Figure: specification (NFA) + dynamic execution to be checked (trace) → dynamic checker → OK/bug]

Static checking

Conversion to a “checkable” specification:

seed(x) c(x)b(x)a(x)

seed(x)

c(x)b(x)a(x)

^seed(x)

^c(x) | end

seed(x)

Related work

Arithmetic pre/post conditions
• Daikon, Houdini
• properties orthogonal to ours
• eventually, we may need to include and learn some arithmetic relationships

Temporal relationships over calls
• intrusion detection: [Ghosh et al], [Wagner and Dean]
• software processes: [Cook and Wolf]
• error checking: [Engler et al SOSP 2001]
  • lexical and syntactic pattern matching
  • user must write templates (e.g., <a> always follows

Summary
• Semi-automatically formulating well-formed, non-trivial specifications is an important part of the verification tool chain.
• Contributions:
  • introduced specifications mining
  • phrased it as probabilistic learning from dynamic traces
  • decomposed it into a sequence of subproblems (using an off-the-shelf learner)
  • developed dynamic checker
  • found bugs

The supply/demand pyramids

Visual Basic

javascript, html, XML

skill(supply)

effort(demand)

s/w development

requirements

analysis

verification and testing
