Distributed Symbolic Model Checking
Tamir Heyman
Advisors: Orna Grumberg and Assaf Schuster
Technion, Haifa
The Size Problem
Model checking takes a model and a specification
This presentation focuses on the subproblem known as reachability analysis (RA)
The number of states (vertices) is exponential in the number of model variables
The Sequential Solution
Symbolic model checking
– Computation is done over sets of states, usually represented as BDDs
– Representation size may be polynomial
Memory requirements are still a problem
– They limit model size to ~300 state variables (bits)
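
To make the exponential growth concrete, the short Python sketch below counts the states of a model with n Boolean state variables. The figure of 300 variables comes from the slide; the function name `state_count` is hypothetical and the rest is plain arithmetic.

```python
def state_count(n_vars: int) -> int:
    """Number of states of a model with n_vars Boolean state variables."""
    return 2 ** n_vars

n = 300  # approximate limit quoted on the slide
digits = len(str(state_count(n)))
print(f"{n} state variables -> a state count with {digits} decimal digits")
```

An explicit enumeration of that many states is out of reach, which is why the computation works on sets of states rather than on individual states.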
Distributed Method
The goal is to solve verification problems that cannot fit into the memory of a single machine
We use a large cluster of nodes as if they were one big node
– Each node contributes a local memory and a processor
Distributed Challenges
What does distribution have to do with NP problems?
– We keep the representation as efficient as in the sequential algorithm, so the method works on polynomial problems
Why not a single node with larger memory?
– The cluster's memory capacity is proportional to the cluster's CPU power
What is required in order to handle any size?
– Keep the efficiency while the system is growing
Distributed Symbolic Method
A complete set of window functions $W_1, \ldots, W_n$ defines, for each process, the part of the state space it owns
S is partitioned into $S_i = S \wedge W_i$
– The parts $S_i$ are smaller than the whole set S
[Figure: the set S divided among windows W1, W2, and W3]
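
The partition into windows can be sketched in Python with explicit sets of states standing in for BDDs. Everything here is an illustrative assumption: the 3-bit state space, the three example windows, and the helper names `is_complete_and_disjoint` and `partition` do not come from the presentation.

```python
from itertools import product

# Hypothetical 3-bit state space; a state is a tuple of bits (x0, x1, x2).
N_VARS = 3
ALL_STATES = set(product((0, 1), repeat=N_VARS))

# Window functions as predicates over states (assumed example windows).
windows = [
    lambda s: s[0] == 0,                # W1: x0 = 0
    lambda s: s[0] == 1 and s[1] == 0,  # W2: x0 = 1 and x1 = 0
    lambda s: s[0] == 1 and s[1] == 1,  # W3: x0 = 1 and x1 = 1
]

def is_complete_and_disjoint(windows, universe):
    """True if every state of the universe is owned by exactly one window."""
    return all(sum(1 for w in windows if w(s)) == 1 for s in universe)

def partition(states, windows):
    """Return [S_1, ..., S_n] where S_i holds the states of S inside W_i."""
    return [{s for s in states if w(s)} for w in windows]

S = {(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1)}
parts = partition(S, windows)
```

Each element of `parts` plays the role of one process's share of S. In a symbolic implementation the filtering step would be a BDD conjunction rather than a loop over states.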
Elements of Distributed Symbolic Model Checking [HGGS CAV00]
Developed for reachability analysis, extended to full model checking
Slicing algorithm
Exchange algorithm
Balance algorithm
Slicing algorithm
Given a set S, the slicing algorithm computes window functions
[Figure: the set S divided by windows W1 and W2]
Slicing S according to the window functions
[Figure: slices S1 and S2 assigned to processes P1 and P2]
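
One way to picture slicing is the following Python sketch, which repeatedly splits the fullest window on the variable that balances its two halves best. It is a simplified stand-in and not the algorithm from [HGGS CAV00]: the cost here counts explicit states, whereas a symbolic implementation would measure BDD size, and the names `owns`, `split_once`, and `slice_set` are hypothetical.

```python
def owns(window, state):
    """A window is a partial assignment {variable index: bit}."""
    return all(state[v] == b for v, b in window.items())

def split_once(states, window, n_vars):
    """Split `window` on the free variable that balances the halves best."""
    mine = [s for s in states if owns(window, s)]
    free = [v for v in range(n_vars) if v not in window]
    if not free:
        raise ValueError("window is fully assigned and cannot be split")

    def cost(v):
        # Assumed cost: number of states in the larger of the two halves.
        ones = sum(s[v] for s in mine)
        return max(ones, len(mine) - ones)

    v = min(free, key=cost)
    return {**window, v: 0}, {**window, v: 1}

def slice_set(states, n_vars, k):
    """Compute k windows by repeatedly splitting the fullest window."""
    windows = [{}]  # the empty assignment owns every state

    def load(w):
        return sum(owns(w, s) for s in states)

    while len(windows) < k:
        fullest = max(windows, key=load)
        windows.remove(fullest)
        windows.extend(split_once(states, fullest, n_vars))
    return windows

S = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1)]
windows = slice_set(S, n_vars=3, k=2)
slices = [[s for s in S if owns(w, s)] for w in windows]
```

Because every split replaces one window by two complementary halves, the resulting windows always cover the whole state space without overlapping.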
Exchange algorithm
During a calculation, states may be found that belong to another window
Exchange the sets according to the window functions
[Figure: processes exchanging S1 and S2 states with each other]
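
The exchange step can be sketched as follows, simulated inside a single Python process so that it runs without a cluster. The function `exchange`, the `owner` callback, and the integer states are illustrative assumptions; a real implementation would send BDDs between machines over a message passing layer.

```python
def exchange(local_sets, owner):
    """One exchange round, simulated in a single process.

    local_sets[i] is the set currently held by process i.
    owner(state) returns the index of the process whose window owns the state.
    """
    n = len(local_sets)
    # outbox[i][j]: states found by process i that belong to process j.
    outbox = [[set() for _ in range(n)] for _ in range(n)]
    for i, states in enumerate(local_sets):
        for s in states:
            outbox[i][owner(s)].add(s)
    # Process j keeps its own states and receives the rest from its peers.
    return [set().union(*(outbox[i][j] for i in range(n))) for j in range(n)]

# Hypothetical example: two processes, even states owned by 0, odd by 1.
before = [{0, 1, 2}, {3, 4}]
after = exchange(before, owner=lambda s: s % 2)
```

After the round, every state sits with the process whose window owns it, and no state has been lost or duplicated.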
Memory balance
During a calculation, the sets distributed according to the current window functions may become unbalanced
Balance the window functions and exchange the sets accordingly
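
A rough Python sketch of rebalancing follows. It is a greedy stand-in, not the balance algorithm of the presentation: each window is modelled as a set of hypothetical "buckets" of states with a known memory cost, and buckets move from the heaviest process to the lightest one as long as that narrows the gap. The names `rebalance` and `bucket_load` are assumptions.

```python
def rebalance(windows, bucket_load):
    """Move buckets from the heaviest window to the lightest one.

    windows: list of sets of bucket ids, one set per process.
    bucket_load: dict mapping bucket id -> memory cost (assumed to be known).
    """
    windows = [set(w) for w in windows]  # work on a copy

    def load(w):
        return sum(bucket_load[b] for b in w)

    while True:
        heavy = max(windows, key=load)
        light = min(windows, key=load)
        gap = load(heavy) - load(light)
        # Moving a bucket helps only if its cost is strictly below the gap.
        movable = [b for b in heavy if 0 < bucket_load[b] < gap]
        if not movable:
            return windows
        b = max(movable, key=lambda x: bucket_load[x])
        heavy.remove(b)
        light.add(b)

# Hypothetical example: process 0 owns everything, process 1 owns nothing.
bucket_load = {0: 5, 1: 3, 2: 2, 3: 1}
balanced = rebalance([{0, 1, 2, 3}, set()], bucket_load)
loads = [sum(bucket_load[b] for b in w) for w in balanced]
```

Once the windows have changed, the states must be redistributed so that each process again holds exactly the states its new window owns.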
What Does a Researcher Need?
Get a sequential model checker, implement a message passing interface, implement transmission of objects, implement transmission of sets of states represented as BDDs
Or
Use the Division system, under construction
– By Tamir Heyman and Amnon Heyman
What is in the Division?
Open source
Platform for research
General system
Supporting distributed model checking
Special support for distributed symbolic model checking
The Division's Structure
Infrastructure
Standard Building Blocks
Distributed Tool Kit
Basic Model Checking Operations
Model Checking Mu-Calculus
Infrastructure
Operating system
Communication
Distributed file system
Standard Building Blocks
Message Passing Interface (MPI)
Standard Template Library (STL)
Symbolic Model Checker (SMC)
Interface implemented by the SMC
[Diagram: MPI, STL, and SMC in the Standard Building Blocks layer, with the DTK interface]
Division tool kit
Collection of independent tools for:
– Distributed computation
– Distributed model checking
– Distributed symbolic model checking
Basic Model Checking Operations
Exchange
Termination detection
Split
Model Checking Mu-Calculus
Distributed fixpoint
Distributed reachability analysis
Distributed full mu-calculus
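
The way a distributed fixpoint combines image computation, exchange, and termination detection can be sketched as a reachability loop. The Python code below simulates all processes in one interpreter and uses explicit sets in place of BDDs. The function name, the `owner` and `successors` callbacks, and the termination rule (stop when every frontier is empty) are simplifying assumptions, not the system's actual protocol.

```python
def distributed_reachability(init, successors, owner, n_procs):
    """Simulated distributed reachability; returns one reached set per process."""
    reached = [set() for _ in range(n_procs)]
    frontier = [set() for _ in range(n_procs)]
    for s in init:
        frontier[owner(s)].add(s)
    # Simplified termination detection: stop when every frontier is empty.
    while any(frontier):
        inbox = [set() for _ in range(n_procs)]
        for p in range(n_procs):
            reached[p] |= frontier[p]
            # Image step: successors of the local frontier.
            image = {t for s in frontier[p] for t in successors(s)}
            # Exchange step: route every new state to the process that owns it.
            for t in image:
                inbox[owner(t)].add(t)
        # Only states not seen before form the next frontier.
        frontier = [inbox[p] - reached[p] for p in range(n_procs)]
    return reached

# Hypothetical model: states 0..9, one transition s -> (s + 2) mod 10.
reached = distributed_reachability(
    init={0},
    successors=lambda s: {(s + 2) % 10},
    owner=lambda s: s % 3,
    n_procs=3,
)
```

In a real cluster no process can see the other frontiers directly, so deciding that all of them are empty at the same time needs its own protocol; that is the role of the termination detection operation listed among the basic operations.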
Focus on DTK
Infrastructure
Standard Building Blocks
Distributed Tool Kit
Basic Model Checking Operations
Model Checking Mu-Calculus
DTK for Distributed Algorithms
Distributed output
– Collected from many processes
– Filtered
Transmission of objects
– As in CORBA
Transmission of commands
– Executing remote code
DTK for Model Checking
Interface for the model checking engine
– Simple and short, hides the complexity
Manager for a pool of processes
– Responds to partners' requests
– Collects calls from idle processes
DTK for Symbolic MC
Transmitting BDDs
Save/load BDDs from disk
Set of states that uses BDDs
– Implicit mark/release of BDDs
– Implementation of operators: +, -, *, ==, !, =
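
A set-of-states class with overloaded operators might look like the Python sketch below. The slide lists only the operator symbols, so the meanings chosen here (`+` as union, `*` as intersection, `-` as difference, `!` as complement) are assumptions based on the usual Boolean reading, and the class `StateSet` is hypothetical. Python has no overloadable `!` or `=`, so complement is spelled `~` and assignment is ordinary name binding on an immutable object.

```python
class StateSet:
    """Explicit-state stand-in for a BDD-backed set of states."""

    def __init__(self, states, universe):
        self.universe = frozenset(universe)
        self.states = frozenset(states) & self.universe

    def __add__(self, other):   # + : union (assumed meaning)
        return StateSet(self.states | other.states, self.universe)

    def __mul__(self, other):   # * : intersection (assumed meaning)
        return StateSet(self.states & other.states, self.universe)

    def __sub__(self, other):   # - : set difference (assumed meaning)
        return StateSet(self.states - other.states, self.universe)

    def __invert__(self):       # ! : complement, written ~ in Python
        return StateSet(self.universe - self.states, self.universe)

    def __eq__(self, other):    # == : same set of states
        return isinstance(other, StateSet) and self.states == other.states

    def __hash__(self):
        return hash(self.states)

    def __repr__(self):
        return f"StateSet({sorted(self.states)})"

U = range(4)
a = StateSet({0, 1}, U)
b = StateSet({1, 2}, U)
```

With such a class, a fixpoint iteration reads much like its mathematical definition, for example `new = image - reached` followed by `reached = reached + new`, while memory management of the underlying BDD nodes stays hidden from the caller.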
Results
Slicing is effective with at least 512 slices
Model checking is effective using at least 32 machines
Finds bugs that could not be found by a single machine running the sequential algorithm
Future work
Massive parallelism using hundreds of nodes
Including known orthogonal optimizations to further reduce memory requirements
Improve speedup through further optimizations
Future Development
Distributed reorder
– Force the same order in all processes
– Let each process choose locally
– Do something in between
New fixpoint algorithm
– To better utilize O(100) nodes