Post on 03-Jan-2016
Static and Runtime Verification: A Monte Carlo Approach
Radu Grosu
State University of New York at Stony Brook
grosu@cs.sunysb.edu
Talk Outline
1. Embedded Software Systems
2. Automata-Theoretic Verification
3. Monte Carlo Verification
4. Monte Carlo Model Checking
5. Static Verification of Software Systems
6. Dynamic Verification of Software Systems
Embedded Software Systems
• Systems with ongoing interaction with their environment.
  – Termination is an error rather than expected behavior.
• Becoming an integral part of nearly every engineered product.
  – They control: commercial aircraft, medical devices, household devices, telecommunication, nuclear power plants, automobiles.
Boeing 777: Super Computers with Wings
• Has:
  – > 4M lines of code,
  – > 1K embedded processors,
• in order to:
  – control subsystems,
  – aid pilots in flight management.
• A great challenge of software engineering:
  – hard real-time deadlines,
  – mission- and safety-critical,
  – complex and embedded within another complex system,
  – interacts with humans in a sophisticated way.
Embedded Software Systems
• Difficult to develop & maintain:
  – Concurrent and distributed (OS, ES, middleware),
  – Complicated by DS improving performance (locks, RC, ...),
  – Mostly written in the C programming language.
• Have to be high-confidence:
  – Provide the critical infrastructure for all applications,
  – Failures are very costly (business, reputation),
  – Have to protect against cyber-attacks.
Temporal Properties
• Safety (something bad never happens):
  – Airborne planes are at least 1 mile apart,
  – The nuclear reactor core never overheats,
  – The gamma knife never exceeds the prescribed dose.
• Liveness (something good eventually happens):
  – The core eventually reaches nominal temperature,
  – The dishwasher tank is eventually full,
  – The airbag inflates within 5ms of a collision.
Linear Temporal Logic
• An LTL formula is made up of atomic propositions p, boolean connectives ¬, ∧, ∨, and the temporal modalities X (neXt) and U (Until).
• Safety: “nothing bad ever happens.” E.g. G(¬(pc1=cs ∧ pc2=cs)), where G is a derived modality (Globally).
• Liveness: “something good eventually happens.” E.g. G(req → F serviced), where F is a derived modality (Finally).
LTL Semantics
• Semantics given in terms of the inductively defined entailment relation w ⊨ φ:
  – w is an infinite word (execution) over the power set of the set of atomic propositions,
  – φ is an LTL formula.
LTL Semantics
• w ⊨ X p iff p holds at the next position (position 1),
• w ⊨ F p iff p holds at some position,
• w ⊨ p U q iff q holds at some position i and p holds at every position before i,
• w ⊨ G p iff p holds at every position.
What is High-Confidence?
Ability to guarantee that the system software S satisfies the LTL property φ:
S ⊨ φ ?
Talk Outline
1. Embedded Software Systems
2. Automata-Theoretic Verification
3. Monte Carlo Verification
4. Monte Carlo Model Checking
5. Static Verification of Software Systems
6. Dynamic Verification of Software Systems
Checking if S ⊨ φ
• Statically (at compile time)
– Abstract interpretation (sequential infinite-state programs),
– Model checking (concurrent finite-state programs),
• Dynamically (at run time)
– Runtime analysis (sequential program optimization).
• Basic Idea:
– Intelligently explore S’s state space in an attempt to establish that S ⊨ φ.
Automata-Theoretic Approach
• Büchi automaton: an NFA over ω-words whose acceptance condition requires a final state to be visited infinitely often.
• Every LTL formula φ can be translated to a Büchi automaton Bφ such that L(φ) = L(Bφ).
• The state transition graph of S can also be viewed as a Büchi automaton BS.
Automata-theoretic approach
• Satisfaction reduces to language emptiness:
  S ⊨ φ  iff  L(BS) ⊆ L(Bφ)  iff  L(BS) ∩ L(B¬φ) = ∅  iff  L(BS ⊗ B¬φ) = ∅
Büchi Automata
• Finite automata over infinite words.
• Checking non-emptiness is equivalent to finding a reachable accepting cycle (lasso).
(Figure: two example two-state Büchi automata A and B over {a, b}; L(A) = (ab)ω.)
Checking Non-Emptiness
• Unroll B into its computation tree (CT); the required depth is the recurrence diameter.
• Explore all lassos in the CT:
  – DDFS, SCC: time efficient; DFS: memory efficient.
Talk Outline
1. Embedded Software Systems
2. Automata-Theoretic Verification
3. Monte Carlo Verification
4. Monte Carlo Model Checking
5. Static Verification of Software Systems
6. Dynamic Verification of Software Systems
Randomized Algorithms
• Huge impact on CS: (distributed) algorithms, complexity theory, cryptography, etc.
• The next step an algorithm takes may depend on a random choice (coin flip).
• Benefits of randomization include simplicity, efficiency, and symmetry breaking.
Randomized Algorithms
• Monte Carlo: may produce incorrect result but with bounded error probability.
– Example: election-result prediction
• Las Vegas: always gives correct result but running time is a random variable.
– Example: Randomized Quick Sort
Monte Carlo Approach
• Random walk in the computation tree (CT) of B: at each state, flip a k-sided coin to pick a successor; the walk length is bounded by the recurrence diameter.
• Explore N(ε, δ) independent lassos in the CT.
• ε: error margin; δ: confidence ratio.
Lassos Probability Space
• Sample space: lassos in BS ⊗ B¬φ
• Bernoulli random variable Z (coin flip):
– Outcome = 1 if the randomly chosen lasso is accepting,
– Outcome = 0 otherwise.
• pZ = E[Z] = ∑i pi Zi (probability of an accepting lasso),
  where pi is the probability of lasso i under the uniform random walk.
Example: Lassos Probability Space
(Figure: a four-state automaton whose computation tree contains four lassos, chosen by the uniform random walk with probabilities ½, ¼, ⅛ and ⅛; exactly one lasso, of probability ⅛, is accepting.)
pZ = 1/8, qZ = 7/8
Geometric Random Variable
• Value of a geometric RV X with parameter pz:
  – number of independent trials (lassos) until the first success.
• Probability mass function:
  – p(N) = P[X = N] = qz^(N−1) pz, where qz = 1 − pz.
• Cumulative distribution function:
  – F(N) = P[X ≤ N] = ∑ᵢ₌₁..N p(i) = 1 − qz^N.
How Many Lassos?
• Requiring 1 − qz^N ≥ 1 − δ yields:
  N ≥ ln(δ) / ln(1 − pz)
• Lower bound on the number of trials N needed to find a success with confidence ratio δ.
What If pz Unknown?
• Requiring pz ≥ ε yields:
  M = ln(δ) / ln(1 − ε)  ≥  N = ln(δ) / ln(1 − pz)
  and therefore P[X ≤ M] ≥ 1 − δ.
• Lower bound on the number of trials M needed to find a success with confidence ratio δ and error margin ε.
Statistical Hypothesis Testing
• Null hypothesis H0: pz ≥ ε.
• The inequality becomes: P[X ≤ M | H0] ≥ 1 − δ.
• If no success after M trials, i.e., X > M, then reject H0.
• Type I error: α = P[X > M | H0] < δ.
Monte Carlo Verification (MV)
input: B = (Σ, Q, Q0, δ, F), ε, δ
N = ln(δ) / ln(1 − ε)
for (i = 1; i ≤ N; i++)
  if (RL(B) == 1) return (1, error-trace);
return (0, “reject H0 with α = P[X > N | H0] < δ”);

RL(B) performs a uniform random walk through B, storing the states encountered in a hash table, in order to obtain a random sample (lasso).
Correctness of MV
Theorem: Given a Büchi automaton B, error margin ε, and confidence ratio δ, if MV rejects H0, then
its type I error has probability
α = P[ X > M | H0 ] < δ
Complexity of MV
Theorem: Given a Büchi automaton B having diameter D, error margin ε, and confidence ratio δ, MV runs in time O(N∙D) and uses space O(D), where N = ln(δ) / ln(1- ε)
Cf. DDFS, which runs in O(2^(|S|+|φ|)) time for B = BS ⊗ B¬φ.
Talk Outline
1. Embedded Software Systems
2. Automata-Theoretic Verification
3. Monte Carlo Verification
4. Monte Carlo Model Checking
5. Static Verification of Software Systems
6. Dynamic Verification of Software Systems
Monte Carlo Model Checking [ISOLA’04, TACAS’05]
• Implemented DDFS and MV in jMocha model checker for synchronous systems specified using Reactive Modules.
• Performance and scalability of MV compare very favorably to DDFS.
Dining Philosophers
         DDFS              MC2
ph   time      entr    time   mxl    cxl   N
 4   0.02      31      0.08   10     10    3
 8   1.62      512     0.20   25     8     7
12   3:13      8191    0.25   37     11    11
16   >20:0.0   -       0.57   55     8     18
20   -         oom     3.16   484    9     20
30   -         oom     35.4   1478   11    100
40   -         oom     11:06  13486  10    209
(Deadlock freedom)
DPh: Symmetric Unfair Version
         DDFS            MC2
ph   time     entr   time   mxl   cxl   N
 4   0.17     29     0.02   8     8     2
 8   0.71     77     0.01   7     7     1
12   1:08     125    0.02   9     9     1
16   7:47:0   173    0.11   18    18    1
20   -        oom    0.08   14    14    1
30   -        oom    1.12   223   223   1
40   -        oom    1.23   218   218   1
(Starvation freedom)
DPh: Symmetric Unfair Version
         DDFS               MC2
phi  time    entries    time      max    avg
 4   0:01    178        0:20      49     21
 6   0:03    1772       0:45      116    42
 8   0:58    18244      2:42      365    99
10   16:44   192476     7:20      720    234
12   -       oom        21:20     1665   564
14   -       oom        1:09:52   2994   1442
16   -       oom        3:03:40   7358   3144
18   -       oom        6:41:30   13426  5896
20   -       oom        19:02:00  34158  14923
DPh: Asymmetric Fair Version (Deadlock freedom)
δ = 10⁻¹, ε = 1.8·10⁻³, N = 1278
         DDFS               MC2
phi  time    entries    time      max    avg
 4   0:01    538        0:20      50     21
 6   0:17    9106       0:46      123    42
 8   7:56    161764     2:17      276    97
10   -       oom        7:37      760    240
12   -       oom        21:34     1682   570
14   -       oom        1:09:45   3001   1363
16   -       oom        2:50:50   6124   2983
18   -       oom        8:24:10   17962  7390
20   -       oom        22:59:10  44559  17949
DPh: Asymmetric Fair Version (Starvation freedom)
δ = 10⁻¹, ε = 1.8·10⁻³, N = 1278
Related Work
• Random walk testing: – Heimdahl et al: Lurch debugger
• Random walks to sample the system state space:
  – Mihail & Papadimitriou (and others).
• Monte Carlo model checking of Markov chains:
  – Herault et al: LTL-RP, bounded MC, zero/one ET,
  – Younes et al: time-bounded CSL, sequential analysis,
  – Sen et al: time-bounded CSL, zero/one ET.
• Probabilistic model checking of Markov chains:
  – ETMCC, PRISM, PIOAtool, and others.
Talk Outline
1. Embedded Software Systems
2. Automata-Theoretic Verification
3. Monte Carlo Verification
4. Monte Carlo Model Checking
5. Static Verification of Software Systems
6. Dynamic Verification of Software Systems
Checking for High-Confidence (in principle)
(Diagram: the LTL property φ is translated into a Büchi automaton B¬φ; the instrumenter builds the product BA BS ⊗ B¬φ from it and the system automaton BS; the execution engine explores the product, returning either “all lassos non-accepting” or an accepting lasso L.)
• Combine static & runtime verification techniques:
  – Abstract interpretation (sequential infinite-state programs),
  – Model checking (concurrent finite-state programs),
  – Runtime analysis (sequential program optimization).
• Make scalability a priority:
  – Open-source compiler technology has started to mature,
  – Apply techniques to source code rather than models,
    • Models can be obtained by abstraction-refinement techniques,
  – Probabilistic techniques trade off precision against effort.
Checking for High-Confidence (in practice)
GCC Compiler
• Early stages: a modest C compiler.
  – Translation: source code translated directly to RTL,
  – Optimization: at the low RTL level,
  – High-level information lost: calls, structures, fields, etc.
• Nowadays: a full-blown, multi-language compiler generating code for more than 30 architectures.
- Input: C, C++, Objective-C, Fortran, Java and Ada.
- Tree-SSA: added GENERIC, GIMPLE and SSA ILs.
- Optimization: at GENERIC, GIMPLE, SSA and RTL levels.
- Verification: Tree-SSA API suitable for verification, too.
GCC Compilation Process
(Diagram: C/C++/Java files → language parsers → parse trees → Genericize → GENERIC AST → Gimplify → GIMPLE AST → Build CFG → SSA/GIMPLE CFG → rest of compilation → RTL code → Code Gen → object code.)
GCC Compilation Process (with plug-in)
(Diagram: the same pipeline as above, extended with an API plug-in attached at the SSA/GIMPLE CFG stage.)
C Program and its GIMPLE IL
int main() {
int a,b,c;
a = 5;
b = a + 10;
c = a + foo(a,b);
if (a > c)
c = b++/a + b*a;
bar(a,b,c); }
int main() {
  int a, b, c;
  int T1, T2, T3, T4;
  a = 5;
  b = a + 10;
  T1 = foo(a,b);
  T2 = a + T1;
  c = T2;
  if (a <= c) goto fi;
  T3 = b / a;
  T4 = b * a;
  c = T3 + T4;
  b = b + 1;
fi:
  bar(a,b,c);
}
Gimplify
Associated GIMPLE CFG
(Diagram: CFG with basic block A: “a = 5; b = a + 10; T1 = foo(a,b); T2 = a + T1; if (a > T2) goto B;”, block B: “T3 = b / a; T4 = b * a; c = T3 + T4; b = b + 1;”, block C: “bar(a,b,c); return;”, plus Entry/Exit nodes, the true/false edges out of A, the FUNCTION_DECL with the int declarations of a, b, c, T1..T4, and the GIMPLE expression trees for each statement.)
MC Static Verification of ESS [SOFTMC’05, NGS’06]
(Diagram: GCC gimplifies the source S into a CFG BS; the instrumenter composes it with the LTL property ¬φ to obtain the product CFG BS ⊗ B¬φ, which the static GAM verifier explores.)
Monte Carlo Algorithm
• Input: a set of CFGs.
  – Main function: a specifically designated CFG.
• Random walks in the Büchi automaton are generated on-the-fly.
  – Initial state: that of the main routine + bookkeeping information.
  – Next state: choose a process + call GAM on its CFG.
  – Processes: created using the fork primitive.
  – Optimization: GAM returns only upon a context switch.
• Lassos: detected using a hierarchic hash table.
  – Local variables: removed upon return from a procedure.
Program State
• Control state: a list of process states p1, p2, p3, …; each process state records a CFG name and statement number, together with a frame stack f1, f2, … of (return control state, local variables valuation) entries.
• Data state: shared variables valuation (channels & semaphores), global variables valuation, and the heap.
GIMPLE Abstract Machine (GAM)
• Interprets GIMPLE statements according to their semantics. Interesting cases:
  – Inter-procedural: call(), return(). Manipulate the frame stack.
• Catches and interprets function calls to various modeling and concurrency primitives:
  – Modeling: toss(), assert(). Nondeterminism and checks.
  – Processes: fork(), … Manipulate the process list.
  – Communication: send(), recv(). Manipulate shared vars. May involve a context switch.
                                        GMV
Property                    rule  bugs  time  sampl
Safe Advisory Selection       1   no    0.23  1278
                              2   yes   0.03   147
Best Advisory Selection       1   no    0.23  1278
                              2   yes   0.04   206
Avoid Unnecessary Crossing    1   yes   0.01    36
                              2   yes   0.03   180
No Crossing Adv. Selection    1   yes   0.01    27
                              2   yes   0.01     8
Optimal Advisory Selection    1   no    0.23  1278
                              2   yes   0.06   217
Results: TCAS
         GMV                       Verisoft
ph   time     sampl  ce.len   time     states  trans
 4   0:00.07  2      12       0:00.61  16      37
 6   0:00.11  4      12       0:16.60  773     1171
 8   0:00.78  11     20       2:57.29  5431    8449
10   0:02.17  31     24       10:41    17908   31433
12   0:04.82  24     27       >2hr     N/A     N/A
14   0:06.22  22     44       >2hr     N/A     N/A
16   0:11.56  14     32       >2hr     N/A     N/A
(Deadlock freedom)
DPh: Symmetric Fair Version
     GMV                   Verisoft        Genetic
time     sampl        time   states   time    errors
6h 37'   10,682,639   >8h    N/A      2h 33'  3
Needham-Schroeder Protocol
• Quite sophisticated C implementation.
• However, of a sequential nature:
  – essentially executes only one round of a reactive system.
Related Work
• Software model checkers for concurrent C/C++: – VeriSoft, Spin, Blast (Slam), Magic, C-Wolf. Bogor?
• Cooperative Bug Isolation [Liblit, Naik & Zheng]:– Compile-time instrumentation. Distribute binaries/collect bugs.
– Statistical analysis to isolate erroneous code segments.
• Random interpretation [Gulwani & Necula]:
  – Execute random paths and merge with random linear operators.
• Monte Carlo and abstract interpretation [Monniaux]: – Analyze programs with probabilistic and nondeterministic input.
Talk Outline
1. Embedded Software Systems
2. Automata-Theoretic Verification
3. Monte Carlo Verification
4. Monte Carlo Model Checking
5. Static Verification of Software Systems
6. Dynamic Verification of Software Systems
MC Runtime Verification of ESS [MBT’06, NGS’06]
(Diagram: as in the static case, GCC gimplifies the source S and the instrumenter builds the product CFG BS ⊗ B¬φ for the static GAM verifier; in addition, the rest of compilation and the linker produce an instrumented binary whose events are caught at runtime by a dispatcher on the target HW/M.)
Runtime Verification Challenges
• Inserting instrumentation code
• Verifying states and transitions
• Reducing overheads
Inserting Instrumentation Code
struct inode *my_inode;
atomic_t *my_atomic;
...
my_atomic = &my_inode->i_count;
if (instrument)
    log_event(ATOMIC_INC, INODE, my_atomic);
atomic_inc(my_atomic);
Instrumentation Plug-Ins
• Ref-Counts: detects misuse of reference counts.
  – Instruments: inc(rc), dec(rc),
  – Checks: state invariant (rc ≥ 0), transition invariant (|rc′ − rc| = 1), leak invariant (rc > 0 ~> rc = 0),
  – Maintains: a list of reference counts and their container type.
• Malloc: detects allocation bugs at runtime.
  – Instruments: malloc() and free() function calls,
  – Checks sequences: free() free(), $ free() and malloc() $,
  – Maintains: a list of existing allocations.
• Bounds: checks for invalid memory accesses.
  – Instruments: malloc(), free() and f(a),
  – Checks: accesses to non-allocated areas,
  – Maintains: heap, stack and text allocations,
  – Higher accuracy than ElectricFence-like libraries.
• Lasso concept weakened (abstracted):
  – Execution where the RC varies 0 ↗ … ↘ 0,
  – State may include FS caches, HW regs, etc.
• Lasso sampling used to reduce overhead:
  – Check for acceptance (error),
  – Dynamically adjust the sampling rate.
RC Runtime Verification
Sampling Granularity
(Plot: reference count vs. accesses; one sampled stretch of the trajectory is highlighted.)
State and Transition Invariants
(Plot: reference count vs. accesses; violations flagged where a transition changes the count by more than 1, by less than 1, or drives its value below 0.)
The Leak Invariant
(Plot: reference count vs. time; timeouts fire when the count fails to return to zero.)
Proof of Concept
• Check Linux file system cache objects
– inodes: on-disk files
– dentries: name-space nodes
• Optionally, log all events
• Simple per-category sampling policy
– Initially: sample all objects,
– Hypothesize: error rate ε > 10⁻⁵ with confidence ratio δ = 10⁻⁵,
– Stop sampling if the hypothesis is rejected.
Benchmarks
• Directory traversal benchmark
  – Create a directory tree (depth 5, degree 6),
  – Traverse the tree,
  – Recursively delete the tree.
• Also tested GNU tar compilation
• Testbed:
– 1.7GHz Pentium 4 (256Kb cache)
– 1Gbyte RAM
– Linux 2.6.10
Results
(Plot: time in seconds vs. run number, comparing configurations; logging incurs ~10x overhead, the others ~3x and ~1.33x.)
Results
(Plot: time in seconds vs. run number; checking incurs ~2x overhead, reduced to ~1.33x and ~1.1x with sampling.)
Sampling-Policy Automata
• Specify how to respond to events
– Violating trajectories
– Invalidations of violation rate estimates
• Control trajectory sampling rate
• A simple SPA:
(Diagram: two SPA states, cs = n and cs = n + 1; the transition from cs = n to cs = n + 1 is guarded by ε > pz.)
Related Work: SWAT
• Chilimbi & Hauswirth: – Low-Overhead Memory Leak Detection Using Adaptive
Statistical Profiling
• Instrument heap accesses
• Block-level dynamic instrumentation
• Reduce instrumentation based on number of times a block has been hit
• No formal measure of confidence provided
Conclusions
• GSRV is a novel tool suite for randomized:
  – static and runtime verification of ESS (growing).
• General-purpose tools (plug-ins):
  – Code instrumenter: constructs the product BA,
  – Intra/inter-procedural slicer: in work.
• Static verification tools (plug-ins):
  – GAM: CFG-GIMPLE abstract machine,
  – Monte Carlo MC: statistical algorithm for LTL-MC.
• Runtime verification tools (static libraries):
  – Dispatcher: catches and dispatches events to RV,
  – Monte Carlo RV: statistical algorithm for LTL-RV.