Formal Methods Diffusion: Past Lessons and Future Prospects

Version 1.0 September 2000

Adelard, Coborn House, 3 Coborn Road, London E3 2DA
Tel: +44 (0)181 983 1708, Fax: +44 (0)181 983 1845
achieving dependable systems

Bundesamt für Sicherheit in der Informationstechnik (BSI), Postfach 20 03 63, 53133 Bonn, Germany
Formal Methods Diffusion: Past Lessons and Future Prospects Page 3 of 71
Version 1.0 September 2000
Foreword by the Sponsor
The idea to initiate this study was born during the discussion of a 'Formal Methods Road Map' at the workshop "Current Trends in Applied Formal Methods" in Boppard (Germany) in 1998. There the participants of the workshop were unable to agree on a common view of the future role of formal methods in practice; judgements varied widely. Many experts were optimistic about the increasing use of formal methods in safety and security critical applications in the future. On the other hand, quite a few experts did not share this optimistic view, pointing to the fact that there has been a lot of financial support for formal methods during the last decade - without real success. The result of the discussion during the workshop was not a statement but a question:

'What are the identifying factors that lead to success or failure of the application of formal methods in software development?'

In this study we carefully try to find an answer to this question.
This report is the result of a study by Adelard plc, London, United Kingdom for the Bundesamt für Sicherheit in der Informationstechnik (BSI), Bonn, Germany. The BSI has supplemented it with selected German perspectives and past programmes.
Summary
The objective of this study is to identify factors leading to the success or failure of the application of formal methods as exemplified by their use by industry and through past R&D programmes. The overall aim is to inform future formal methods dissemination activities and other initiatives. The objective has been achieved through the review of existing surveys, the review of programme evaluations, and interviews with formal methods practitioners, sponsors and other (past or present) technology stakeholders.
The application of formal methods has a long history, but the software engineering community at large has not substantially adopted them, and we have identified numerous reasons for this failure to adopt. However, there is significant take-up of formal methods in critical industries. Large hardware vendors are either developing in-house capabilities in formal verification (e.g. Intel, Siemens) or adopting externally developed technologies, and in the area of safety there is active research and serious application of formal methods throughout the safety lifecycle. We calculate that about $1-1.5B is spent annually on formal methods activities world-wide.
We identify factors to increase the adoption of formal methods. One major recommendation is that, unlike the other R&D programmes we have investigated, any future programme should adopt a systematic technology adoption framework, of which we provide two examples, and take a more explicit view of how the market in high technology products actually develops. We consider this the single factor most likely to increase the chance of successful adoption. We also identify the need for sustained investment in tools and continued R&D.
Authors
R E Bloomfield D Craigen (ORA Canada)
3.3.1 Introduction..........................................................................................19
3.3.2 Safety related systems...........................................................................19
3.3.3 Security applications.............................................................................22
3.3.4 Hardware and microcode verification....................................................25
Appendix A What are Formal Methods?..................................................................................51
Appendix B Future Programmes in US....................................................................................55
Appendix E Trust in cyberspace ..............................................................................................63
Appendix F European R&D projects .......................................................................................65
F.1 Esprit projects......................................................................................................65
F.2 ESSI....................................................................................................................67
Appendix G Selected German R&D Programmes ....................................................................68
1 Introduction

The application of formal methods has a long history but they have not been substantially adopted by the software engineering community at large. To gain a perspective on what is working and what is not in the formal methods area, we have reviewed their use by industry and the results of past R&D programmes. The objective is to identify crucial factors leading to the success or failure of the application of formal methods and, in doing so, to provide a perspective on the current formal methods landscape. The overall aim is to inform future formal methods dissemination activities and other initiatives.
The report is organised as follows. The overall approach to the study is outlined in Section 2. The results of the reviews are presented in Section 3 as follows:
• from the perspective of European and US R&D programmes (Section 3.1)
• from the viewpoints provided from the current conference circuit (Section 3.2)
• from a review of applications in the key industrial areas of safety, security and chip manufacture (Section 3.3)
We then provide some further analysis, from a market point of view, of the size and nature of the formal methods market (Section 4.1), and in Section 4.2 an analysis based on technology diffusion models and the technology adoption lifecycle. This analysis is then drawn together in Section 5 into a set of conclusions and recommendations.
Appendix A provides a brief introduction to formal methods.
2 The study approach
The study was based on a review of existing surveys (especially [Survey]), a review of programme evaluations (e.g. [SPRU91]), a proposal for a new programme [AFM], and interviews with formal methods practitioners, sponsors and other (past or present) technology stakeholders.
An interview brief was developed to act as an “aide memoire” for those conducting the interviews. In our experience it is not appropriate to conduct this type of interview through a rigid question-and-answer format: we were dealing with senior and technically sophisticated interviewees. Instead it is more productive to have a number of topics to cover, using these to trigger lines of discussion and revisiting them at the end of the interview. Often the interviewees had a particular story to tell, and we wished to hear it. The topics raised in the interviews are shown in Table 1 below.
Table 1: Interview brief
Preliminaries
Describe role of interviewee in past and present including role in formal methods.
Outline the objectives of the study.
People
What is the team and the interviewee doing now? E.g., have they left formal methods work for other R&D, for industry as a software engineer, or for a different discipline?
Tools and methods
What is used now?
Ideas
What persists? What key ideas have flourished, where have they come from?
Impact
Research policy
Has the right balance been struck between competition on the one hand, and the dilution of ideas and loss of focus on the other?
What about the idea of picking winners?
We augmented the interview results with the formal methods page of the World Wide Web Virtual Library [FMVL] and our observations from a cross-section of conferences:
• The 5th and 6th International SPIN Workshops on Practical Aspects of Model Checking [SPIN99].
• Computer Aided Verification, 11th International Conference, CAV’99 [CAV99].
• Formal Methods for Trustworthy Computer Systems [FM89].
• FM’99: World Congress on Formal Methods [FM99].
• First and Second Conferences on Formal Methods in Computer-Aided Design [FMCAD96] [FMCAD98].
• Applied Formal Methods: FM-Trends 98 [FMTrends].
• Formal Software Verification, Intel Research Symposium [Intel98].
• Theorem Proving in Higher Order Logics [TPHOL99].
• ZUM’98: The Z Formal Specification Notation [ZUM98].
We also considered three key industrial areas: safety, security and chip development.
The intention is that in addressing these different viewpoints, we provide a sufficiently broad and accurate picture without resorting to an exhaustive survey.
We then developed a more market-oriented analysis of the results, with a discussion of formal methods adoption through two key models: the first is the generic technology diffusion work of Rogers, and the second the high technology marketing work of Moore.
3 Results

3.1 European and US R&D programmes

3.1.1 Alvey
The Alvey programme was a five year programme of pre-competitive R&D in IT that started in 1983 as the UK’s response to the Japanese 5th Generation computing project. It supported 192 collaborative projects involving a mix of academic and industrial partners and about 117 “Uncled” projects, and ran in parallel to ESPRIT1.
Government funding was £200M with about £27M in software engineering. The software engineering part of the programme had a strong academic flavour and many of the small projects were expected to lead to tools for in-house use or commercial exploitation. There were also some large industrially led projects. The official evaluation [SPRU91] was that exploitation performance was low. It identifies barriers to uptake as:
• lack of skilled user base
• high investment costs
The reasons for projects failing were also assessed: changes to objectives and over-ambition were seen as more common than technical problems. The turbulence in the UK IT industry often meant that changes in the ownership of companies, and subsequent changes in strategy, occurred during the programme. Some of the lessons, such as the need to involve product groups from large companies, not just R&D groups, have fed through to other programmes since Alvey. In terms of the evaluation of the programme [SPRU91], the formal methods component had little success in promoting widespread adoption or developing lasting tools. However, it was successful in:
• Raising awareness and expectation of formal methods around the time Def Stan 00-55 was being planned and developed.
• Raising the perception of the UK strength in formal methods and the perceived formal methods gap between the US and the UK (a gap debated at FM89 [FM89]).
• Leading to a significant number of people with some research experience in formal methods.
3.1.2 Esprit
Esprit, the European Strategic Programme on Research on Information Technology, was a large multi-annual programme in four phases:

Esprit 1: 1984-1988
Esprit 2: 1988-1991
Esprit 3: 1990-1994
Esprit 4: 1994-1998
There was also significant IT activity within programmes on advanced telematics (ACTS), in which the European Infosec work was supported and stimulated. The R&D policy underpinning Esprit aimed to provide for the new information infrastructure:
• providing and demonstrating the building blocks for information society applications
• led by user and market needs
• emphasising access to information and technologies, usability and best practice
• focusing on applicability.
Esprit focused on eight intertwined areas of research:
Long-Term Research aimed to ensure that, at any one time, the potential for the next wave of industrial innovation was maintained and that the expertise underpinning European information technology R&D was replenished in those areas where it was most needed. This area was open to new ideas and people, responsive to industrial needs, and proactive with respect to technologies that would shape future markets.
A further three areas dealt with underpinning technologies:
Software Technologies aimed to maintain a strong base of high quality and relevant skills and key technologies within all sectors of the European economy for which software development formed an important component of business activity.
Technologies for Components and Subsystems concerned the development and broad exploitation of a wide range of microelectronics solutions for electronic systems. Work encompassed equipment, materials and processes used in manufacturing semiconductors, through to electronic design tools, packaging and interconnect solutions. The area included work on peripheral subsystems such as storage and displays, and work on microsystems.
Multimedia Systems encouraged the development of the technologies and tools necessary for industry to implement multimedia end-user systems.
The other four areas were “focused clusters”— sets of projects and complementary measures combined and managed in order to achieve particular research and industrial objectives.
The Open Microprocessor Systems Initiative’s strategic goal was to provide Europe with a recognised capability in microprocessor and microcontroller systems, and to promote their worldwide use.
The High-Performance Computing and Networking cluster emphasised areas that are only now nearing wide applicability, for example the use of parallel systems for the substitution of simulation for experimentation and testing.
Technologies for Business Processes aimed to support the change and transformation of enterprises to take best advantage of information technologies, business process re-engineering and human resources.
Integration in Manufacturing aimed to accelerate and enhance the ability of European manufacturing industry to capitalise on the emergence of a powerful global information infrastructure.
Participation in ESPRIT was approximately equally divided between R&D organisations (30%), SMEs (30%) and large companies (40%). About 1100 organisations were selected in the first 6 calls of ESPRIT, with a budget of 1089 Mecu and, on average, later projects being smaller. In all, 1 in 4 proposals was accepted for funding. A condition of most industrial R&D projects was that they brought together companies and research institutions from at least two EU/EEA countries.
Domain                                        Number   Funds/Mecu
Software Technologies                            376          186
Multimedia Systems                                64           90
Open Microprocessor Systems Initiative            59           97
High Performance Computing and Networking        114          125
Technologies for Business Processes               88           85
Integration in Manufacturing (IiM)                79          111
A significant part of the programme was devoted to measures designed to increase interaction between users and developers, disseminate results more widely, build trial applications, and boost product and process adoption in the market. These complementary measures represent around 20% of overall funding in Esprit 4.
Formal methods aspects
Formal methods related projects and activities have been funded throughout Esprit. We have searched a number of Commission databases to try to establish the scope and extent of the funding for formal methods R&D. Searching on “formal methods”, “formal verification”, “correctness” and “model checking” produced a list of some 150 projects. Filtering out those which use “formal methods” in a less specific sense, or for which we judge the formal aspect was not a large component, leaves about 60 projects. In all, 9 projects mention “model checking”. A selected list of projects that mention formal methods in their description is provided in Appendix F.
The type of project varies enormously, from fundamental work in computer science (e.g., the independent discovery in Europe of model checking, work on concurrency) through to application experiments. The earlier programmes contain more on methods and tools, with a much more specific and application-driven approach in the last phase (Esprit 4). Of the 60 projects, about 5–10 are basic research funding university groups, and about another 10–15 are small process improvement experiments or dissemination activities, leaving about 40 that one might consider core formal methods projects. If we take the average funding to be 1 Mecu, this gives an R&D spend by the Commission of 40 Mecu, rising to about 60 Mecu if the industry contribution is included. Overall, formal methods are a very small part of the Esprit programme, representing some 0.6–1% of the projects funded.
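The back-of-the-envelope estimate above can be reproduced in a few lines. The inputs (about 40 core projects, an assumed average of 1 Mecu of Commission funding each, and an industry contribution of roughly half as much again) are the report's own rough figures, not precise accounting:

```python
# Rough reconstruction of the Esprit formal methods funding estimate.
# All inputs are the report's approximate figures, not audited data.
core_projects = 40            # of ~60 formal methods projects, ~40 are "core"
avg_commission_mecu = 1.0     # assumed average Commission funding per project

commission_spend = core_projects * avg_commission_mecu   # Commission R&D spend
total_spend = commission_spend * 1.5                     # with industry contribution

print(commission_spend)   # 40.0 (Mecu)
print(total_spend)        # 60.0 (Mecu)
```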
There is also a database of success stories (Promosa) and a search using keywords of formal verification, formal methods and model checking produced the following results:
A surface compiler for temporal logic (SCTL)
Temporal surface language representations can now be translated into a classical logic language and elaborated using fully automatic tools, for easier formal verification of systems.

Automatic Shiploading Optimisation System (ANKO)
A ship load planning and optimisation software package which uses formal methods in its design and specification, so as to achieve very high levels of accuracy and reliability.

Automating Safety Analysis (SAM)
Based on standard fault analysis techniques, a systematic and automated methodology for analysing safety-critical software reduces total development costs and improves overall safety levels.

Development Environment for PLC-based applications (HERACLES)
HERACLES helps suppliers and producers of automated systems to save time, and thus cut costs, from the design phase onwards. HERACLES shortens the engineering process and reduces the number of errors. The path from the design stage to the final implementation of the automated system becomes simpler, faster, more predictable and economical.

EDA tools verify hardware designs (CHECK-OFF)
Using formal methods, the CheckOff family of tools provides levels of verification of digital hardware designs which would be impossible using traditional simulation-based techniques.

SPECTRUM shows feasibility of the integration of two industrially relevant formal methods, VDM and B (VDM+B)
VDM and B are two mature formal methods currently in use by industry and supported by commercial tools. Though the methods are basically similar, the coverage of their supporting tools differs significantly. The SPECTRUM project has shown the feasibility of integrating support for the two methodologies.

The right tools for design and development (TOOLS)
By using a single common design framework that integrates existing and new system design tools, designers will be able to reduce design cycles and time-to-market drastically for complex embedded systems.

Toolset for the development of Safety Critical Real-time Embedded Systems (SACRES)
The results of the SACRES project are a set of tools and the supporting methodologies for designers of embedded systems. The emphasis within the project is on formal development of systems, providing formal specification, model checking technology and validated code generation.
The search also highlighted a number of awareness activities that have a formal methods component: the European Network of Clubs for Reliability and Safety of Software, Industrial Seminars on Formal Methods, and Information Resources on Formal Methods.
The “success” stories and the focus of the later Esprit projects highlight the themes of:
• domain specificity (safety critical, PLCs)
• the transition of model checking into the software verification area
• use of B and VDM
• hardware verification
However, overall the investment in formal methods R&D, especially in the earlier projects, has not led to any significant industrial take-up of formal methods. In the description of the LACOS (Large-Scale Correct Systems Using Formal Methods) project it is claimed that “Generally speaking, industry still needs evidence that formal methods can be used in large applications in practice. The aim of the LACOS project is to establish and demonstrate formal methods, specifically RAISE, as a viable industrial technique in the scalable production of large, correct IT systems”. This does not appear to have been achieved. It would also appear that the key tools now in use have come not from part-funded industrial collaborative projects but from basic research or industrial investment.
In order to improve the exploitation of later Esprit projects, projects were required to develop exploitation and business plans to demonstrate a return on their investment. Interestingly, the guidance on these projects did not reference the work of Moore [Moore1] or provide a specific model for the diffusion of high technology products. All too often, we suspect, companies would seek to capture a small percentage of a large market through incremental increases in sales or services, without any real credibility that they would achieve this.
3.1.3 NASA and US Programmes
The U.S. National Aeronautics and Space Administration’s (NASA) past programmes on formal methods have been well regarded and have produced substantial technical results. For example, see [NASA], which summarises the NASA Langley Research Center research programme. NASA Ames has also been involved with formal methods R&D (as in its use of model checking to discover errors in the Deep Space 1 probe software). NASA’s positive technical results with formal methods have been only mildly reflected in technology transfer. There has been some limited uptake in places such as Rockwell-Collins and Union Switch and Signal, but there is little doubt that, within NASA, formal methods are still very much in a missionary market. The general approach at NASA appears to be:
1. building up a corpus of formal methods successes
2. participating in various RTCA standards committees with the aim of including formal methods in relevant standards (usually as a complementary technology)
3. encouraging teaming between formal methods experts and researchers with industrial players
NASA’s strategy is long term and has funded or undertaken a wide variety of work. NASA Langley has been a significant funder of SRI’s PVS system and has supported the development of the fundamental concepts that led to abstract datatypes, a tabular notation, and the formal semantics of the language. NASA Langley also funded the creation of a PVS validation suite. Other tools are also used in the programme for such things as hardware verification and the reverse engineering of Ada programs.
Other programmes, depending upon perspective, have been less satisfactory. The discussion below on Computational Logic Incorporated (CLI) is indicative of a programme where there were successful results at a technical level, but limited (perhaps negative) results with regard to mission orientation.
For some programmes, the sponsoring organisations had a vague sense as to why investment into formal methods could be of importance to them, but the organisations did not clearly enunciate their interests in terms of mission orientation, diffused research responsibilities through various departments, and often used relative newcomers to monitor professional calibre R&D groups. Without the technical knowledge and mission orientation, it was next to impossible to usefully direct sponsored R&D groups and, consequently, the R&D groups became primarily technically focused with little consideration of technology transfer and market realities.
3.1.4 R&D funding
World-wide, governments have been investing in technology development and transfer. While we would not wish to paint ourselves as experts in the various international investment strategies, loosely, one can identify two types of investment: Programme Driven and Company Driven. By Programme Driven, we mean that sponsoring organisations define the primary themes and research directions. The definition may range from loose, defining a general area of interest (e.g., Critical Infrastructure), to quite specific (e.g., apply Model Checking to Authentication Protocols). By Company Driven, we mean that the primary R&D agenda is provided by commercial entities, with Government sponsorship through various tax easements or proportionate investments/subsidies. Programme Driven investment often requires collaboration between organisations, at least in Europe.
It should be noted that there is a grey area in which the distinction between Programme Driven and Company Driven is blurred, as most programmes (such as Esprit) define themes within which a particular consortium has to convince the funders that they have a commercial, yet pre-competitive, interest in the work.
Some example programmes are:
• Industrial Research Assistance Programme (IRAP) [Canada]. This programme provides proportionate investment/subsidy for pre-commercialisation demonstrations of innovative technologies. However, IRAP restricts the areas of interest to environmental technologies, enabling technologies (in advanced manufacturing, biotechnology, information technology and advanced materials), and Aerospace and Defence. [Niche market driven.]
• Scientific Research and Experimental Development (SR&ED) [Canada]. Depending upon whether the commercial entity is Canadian controlled or foreign controlled, and upon the size of the entity’s revenues, the Government will either provide proportionate refundable tax credits (i.e., cheques) or non-refundable tax credits (credits against tax owing). The programme provides tax incentives based on scientific and technical research defined by the corporation. However, the company must identify the research objectives, the likely advancements, and the scientific and technological uncertainty, and summarise the work performed. Revenue Canada makes use of technical experts to review submissions so as to determine whether the purported research is actual research. [Market driven.]
• Small Business Innovation Research (SBIR) [U.S.] and Small Business Technology Transfer (STTR) [U.S.] These two programmes are used to harness the innovative talents of small U.S. technology companies. SBIRs fund early-stage R&D whereas STTRs fund co-operative industry/research organisation R&D. The second programme is primarily a technology transfer vehicle. Funding is phased, with a first phase focusing on testing the scientific, technical and commercial merit and feasibility of a particular concept. If the first phase proves successful, then a company can apply for second phase funding that further develops the concept to a prototype stage. SBIRs are competitive. [Programme driven; niche marketing.]
• ESSI. This European programme provides funding for software process improvement experiments, as defined by the needs of the user company.
• NASA Research Announcements (NRAs) [U.S.] and Broad Area Announcements (BAAs) [U.S.] Though there may be some technical distinctions, we have combined these two forms of U.S. government procurement. Both types of announcement describe generalised research goals (perhaps with some indicative research approaches) and solicit proposals to achieve the goals. For example, the recent NASA Langley NRA for research, development, prototyping and implementation of flight critical systems design and validation technologies includes proposals related to design correctness and certification, fault-tolerant integrated modular avionics and operational malfunction mitigation. Formal methods R&D is to be supported under this NRA. [Programme driven.]
It is not within our purview or expertise to pass judgement on these programmes, but a few observations can be made. A number of these programmes can be used to accelerate technology adoption and to provide funding leverage for prototype development and experimentation. IRAP, SBIRs and STTRs all provide potentially useful commercialisation channels if the supported market niche is one in which true commercialisation opportunities exist. However, the success or failure of such efforts depends not only upon the actual concept being developed, but also upon the commercial acumen and interests of the commercial company. There are companies that seem to exist on SBIRs and are unable to move anything to a true commercial opportunity. There may well be cultural forces that make it difficult for a company with the R&D culture that seeks to win SBIRs and Esprit projects to break out and commercialise the technologies being developed more widely.
3.2 The formal methods landscape - the conference circuit
The formal methods page of the World Wide Web Virtual Library [FMVL] has a reasonably comprehensive listing of notations, methods, tools, publications and individuals involved with formal methods. There is, as one would suspect, something of an academic bias to the overview, but it is still a good starting point.
As of late August 1999, there is a listing of eighty notations, methods and tools in FMVL. (Collectively, we will call the notations, methods and tools “formal artefacts”.) Geographically, the substantial majority of the formal artefacts are from either Europe or North America. Many have their roots in either academic organisations or corporate R&D groups. Few have had much impact on industry. The main theme underlying the formal artefacts is their developers’ view of the importance of mathematics to the development of computer systems.
In 1989, a survey of formal methods tools was conducted among the attendees of a workshop on formal methods for trustworthy systems (FM’89) [FM89]. While biased towards work performed in Canada, the United Kingdom and the United States, it provided a good indication of where significant R&D resources were being spent (especially by those interested in computer security). Of all the systems mentioned, few appear to have survived the 1990s or been used extensively outside the core tool development group(s). Those that have appear to be Asspegique, NQTHM (or its successor ACL2), EHDM (as manifested by its successor PVS), EVES (primarily as manifested by its successor Z/EVES), HOL, Larch, OBJ, SPARK and CADiZ.
What is particularly striking about FM’89 is the lack of any real discussion of model checking (which is also true of the international survey [Survey] discussed below), arguably the most successful commercial application of formal methods to date. Tools and techniques for developing and analysing critical systems were a predominant theme of the workshop.
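For readers unfamiliar with the technique, the essence of explicit-state model checking (exhaustively exploring the reachable states of a finite model and checking a property in every state) can be sketched in a few lines. The two-process locking protocol below is a hypothetical toy chosen for illustration; it is not drawn from the report or from any tool surveyed here:

```python
from collections import deque

def check(init, next_states, safe):
    """Breadth-first search over reachable states; return a counterexample
    trace ending in an unsafe state, or None if the property always holds."""
    seen = {init}
    queue = deque([(init, [init])])
    while queue:
        state, trace = queue.popleft()
        if not safe(state):
            return trace                  # shortest path to a violation
        for succ in next_states(state):
            if succ not in seen:
                seen.add(succ)
                queue.append((succ, trace + [succ]))
    return None

# Hypothetical toy: two processes share a lock but test it and take it in
# two separate, interleavable steps ('want' -> 'ok' -> 'critical').
def next_states(state):
    succs = []
    for i in (0, 1):
        pc = list(state)
        if state[i] == 'idle':
            pc[i] = 'want'                # request the lock
        elif state[i] == 'want' and state[1 - i] != 'critical':
            pc[i] = 'ok'                  # observe the lock as free
        elif state[i] == 'ok':
            pc[i] = 'critical'            # act on the (possibly stale) observation
        elif state[i] == 'critical':
            pc[i] = 'idle'                # release
        else:
            continue
        succs.append(tuple(pc))
    return succs

mutual_exclusion = lambda s: s != ('critical', 'critical')
trace = check(('idle', 'idle'), next_states, mutual_exclusion)
```

Because each process tests the lock and then takes it in two separate steps, the search finds an interleaving in which both end up critical at once. Finding exactly this kind of subtle concurrency bug automatically is what made model checking commercially attractive.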
The recent Formal Methods in Computer-Aided Design (FMCAD) conferences [FMCAD96, FMCAD98] are particularly noteworthy for their mix of industry and academic participants and the real excitement (especially in 1996) that formal verification techniques (primarily as embodied by various model checking technologies) were being seriously adopted by major hardware vendors. Conference sponsorship by companies such as Cadence, Hewlett-Packard, Synopsys and Intel was suggestive of corporate interest. So too was the presence of corporate headhunters looking for scarce formal verification talent. (Apparently, there is an ongoing shortage of talent well versed in both hardware design and formal analysis.) Much like the CAV (Computer Aided Verification) conferences, FMCAD is a good mix of academic research motivated by realistic industrial concerns. However, one does have the impression that the complexity of today’s chips is growing faster than the analysis technologies used to validate chip behaviour. This is also true for software, with each generation being larger than the one before, as exemplified by the purported 60 million lines of code estimated for Windows 2000.
A significant amount of research work on Binary Decision Diagrams (BDDs), their optimisation and applications has been reported at FMCAD. Theorem proving, however, was not absent as indicated by papers on the application of PVS and ACL2. There were also research papers discussing how an industry standard hardware description language (VHDL) could be used formally.
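The central idea behind BDD-based tooling can be illustrated compactly. For a fixed variable order, a reduced ordered BDD (ROBDD) is a canonical form, so checking two Boolean functions for semantic equivalence reduces to comparing node identities. The Python sketch below is a teaching toy of our own, not any of the FMCAD tools, and all names in it are invented:

```python
# Minimal reduced ordered BDD (ROBDD) sketch: hash-consed nodes give a
# canonical form for a fixed variable order, so semantic equivalence of
# Boolean functions reduces to comparing node ids.

FALSE, TRUE = 0, 1           # terminal node ids
nodes = {}                   # (var, low, high) -> node id
table = [None, None]         # node id -> (var, low, high)

def mk(var, low, high):
    """Create (or reuse) the node for 'if var then high else low'."""
    if low == high:                      # redundant test: eliminate node
        return low
    key = (var, low, high)
    if key not in nodes:                 # share structurally equal nodes
        nodes[key] = len(table)
        table.append(key)
    return nodes[key]

def var(i):
    """The BDD for the single variable x_i."""
    return mk(i, FALSE, TRUE)

def apply_op(op, u, v):
    """Combine two BDDs with a Boolean operator via Shannon expansion."""
    if u in (FALSE, TRUE) and v in (FALSE, TRUE):
        return TRUE if op(u == TRUE, v == TRUE) else FALSE
    vu = table[u][0] if u > TRUE else float('inf')
    vv = table[v][0] if v > TRUE else float('inf')
    top = min(vu, vv)                    # expand on the topmost variable
    u0, u1 = (table[u][1], table[u][2]) if vu == top else (u, u)
    v0, v1 = (table[v][1], table[v][2]) if vv == top else (v, v)
    return mk(top, apply_op(op, u0, v0), apply_op(op, u1, v1))

AND = lambda a, b: a and b
OR  = lambda a, b: a or b

def neg(u):
    """Negation as XOR with the TRUE terminal."""
    return apply_op(lambda a, b: a != b, u, TRUE)
```

De Morgan’s law, for instance, produces the identical node: `apply_op(AND, a, b)` and `neg(apply_op(OR, neg(a), neg(b)))` return the same id, and a tautology such as `a OR NOT a` reduces to the `TRUE` terminal. A real implementation would also memoise `apply_op`; without it the recursion can be exponential.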
Formal Methods Diffusion: Past Lessons and Future Prospects Page 17 of 71
Version 1.0 September 2000

Industrial interest in formal “software” verification was further demonstrated through a recent Intel Research Symposium held in Santa Clara [Intel98]. At this symposium a collection of leading experts in formal methods (John Rushby, Dan Craigen, Ed Clarke, Gunnar Stalmarck, Michael Norrish, J Moore and David Stringer-Calvert) were invited to give presentations relating to software verification. Interestingly, complexity is a prime motivation for Intel’s (and, we suspect, others’) interest in new technologies. Recent hirings by Microsoft Research also suggest that the increasing complexity of the software underlying their products requires new development and verification technologies; Microsoft has recently started a joint project with Oxford University PRG.
As discussed in [CAV99], the Computer-Aided Verification conferences are dedicated to the advancement of the theory and practice of computer-assisted formal analysis methods for software and hardware systems. The conference covers a spectrum from theoretical results to concrete applications. Contributions have come from both academia and industry, and from researchers and practitioners. According to one of the conference series founders (Ed Clarke, CMU), the principal CAV ethic is one of solving real problems. The ideal CAV paper would consist of theoretical advances plus supporting experimental data.
Presentation topic areas were processor verification, protocol verification and testing, infinite state space, theory of verification, linear temporal logic, modelling of systems, symbolic model-checking, theorem proving, automata-theoretic methods, abstraction, and tool presentations.
The World Congress on Formal Methods (FM99) [FM99] was probably the largest formal methods conference held to date with over 700 attendees. The conference consisted of a technical symposium featuring 92 scientific papers, a tools exhibition featuring commercial and experimental tools, twelve user group meetings and workshops (Abstract State Machines, B, CoFI (Common Framework Initiative for Algebraic Specification and
Development of Software), Coq, Larch, Obj/CafeOBJ/Maude, ProCos (Provably Correct Systems), PVS, SPIN, TLA+ (Temporal Logic of Actions), VDM and the Z User Group Meeting), and twelve industrial tutorials (Avionics, B, Embedded safety critical systems development as supported by the ESPRESS approach, formal methods and human computer interaction, Petri Nets, PVS, Railway Systems, Requirements elicitation and specification, security, telecommunications, TLA+, and testing). As summarised by the FM’99 general chair (Dines Bjorner), the technical symposium consisted of
• 20 papers on software engineering issues
• 19 papers on theoretical aspects of formal methods
• 22 papers on the application areas of avionics, safety, security and telecommunications
• 31 papers on tools, notably model checking and notations including ASM, B, CSP-OZ, Esterel, Larch, OBJ/CafeOBJ/Maude, Statecharts, VSPEC and Z
Though the tools exhibition was advertised as having a mix of commercial and academic tools, the tools were predominantly from academic institutions. Interestingly, most of the commercial companies involved in the tools exhibition were small (e.g., B-Core, Formal Systems Ltd., Prover Technology and IFAD).
Summarising the associated user group meetings and workshops:
• ASM, focusing on the practical aspects of the Abstract State Machine method. (Note that Microsoft Research has recently
hired one of the main ASM researchers: Yuri Gurevich).
• The B-Method, focusing on tools and applying B in an industrial context.
• CoFI: Primarily an introduction to CoFI, focusing on the features of the CASL language, methodology and tools.
• Coq: Recent developments with the Coq proof environment, exchange of experiences, and general discussions.
• Larch: Focusing on discussion of interface specification languages.
• OBJ/CafeOBJ/Maude: General presentations on various topics pertaining to OBJ, CafeOBJ and Maude.
• ProCoS: Basically a discussion of the results (including the Duration Calculus) arising from the ESPRIT funded ProCoS project.
• PVS: Recent uses of PVS and planned enhancements.
• 6th SPIN International Workshop: Focusing on the practical aspects of using automata-based model checking in the systems engineering process.
• TLA+ (Temporal Logic of Actions): Discussion of the application of TLA+ and various tools (e.g., VSE-II).
• VDM: Language and methodology issues motivated by industrial experiences and recent industrial applications.
• Z: Educational issues, Z/EVES and an update on the proposed ISO standard.
The two SPIN Workshops [SPIN99] provide a sense of the active work ongoing in model checking. The 5th International SPIN Workshop on “Theoretical” aspects of model checking was held in July 1999 as part of the FLoC 1999 conference in Italy. The 6th International Workshop on “practical” aspects of model checking was held in September 1999 as part of the FM99 conference (see above).
The 5th International Workshop looked at such issues as runtime-efficient state compaction in SPIN; distributed-memory model checking; partial order reduction in the presence of rendezvous communications; adding active objects to SPIN; model checking of manifold applications; and divide, abstract and model-check. An overview of a new release of SPIN was also presented. The 6th International Workshop had presentations on model checking for managers, integrated validation management for XSPIN, analysing mode confusion, analysing feature interactions, the JAVA PathFinder, and a visual interface for Promela. It included an invited presentation of the work at Lucent on applying model checking to the development of commercial systems written in C.
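The working principle of an explicit-state model checker in the style of SPIN can be sketched in a few lines: enumerate every interleaving of the processes by breadth-first search and test a safety property in each reachable state. The toy below is our own illustration in Python rather than Promela, using Peterson’s classic mutual-exclusion algorithm as the model (it is not drawn from any of the workshop papers):

```python
# A toy explicit-state model checker in the style of SPIN: breadth-first
# search over all interleavings of two processes, checking safety
# properties (mutual exclusion, absence of deadlock) in every state.
from collections import deque

def step(state, i, use_turn):
    """Successor state when process i takes one step (None if blocked)."""
    flags, turn, pcs = state
    flags, pcs = list(flags), list(pcs)
    j = 1 - i
    pc = pcs[i]
    if pc == 0:                       # flag[i] := true
        flags[i] = True; pcs[i] = 1
    elif pc == 1:                     # turn := j
        turn = j; pcs[i] = 2
    elif pc == 2:                     # await !flag[j] or turn == i
        if flags[j] and (not use_turn or turn != i):
            return None               # blocked at the await
        pcs[i] = 3                    # enter critical section
    else:                             # leave critical section
        flags[i] = False; pcs[i] = 0
    return (tuple(flags), turn, tuple(pcs))

def check(use_turn=True):
    """BFS over the reachable state space; report the first violation."""
    init = ((False, False), 0, (0, 0))
    seen, frontier = {init}, deque([init])
    while frontier:
        state = frontier.popleft()
        if state[2] == (3, 3):        # both in the critical section
            return ('mutex violation', state)
        succs = [step(state, i, use_turn) for i in (0, 1)]
        if all(s is None for s in succs):
            return ('deadlock', state)
        for nxt in succs:
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return None                       # whole state space explored, no violation
```

`check()` explores Peterson’s full state space and finds no violation, while `check(use_turn=False)`, a naive flags-only variant that drops the turn variable, is caught in a reachable deadlock; the offending state is returned much as SPIN returns an error trail.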
The International Workshop on Current Trends in Applied Formal Methods [FMTrends] aimed at focusing on key technologies that might broaden the application of formal methods in industry. In addition there were discussions on developing a “roadmap for formal methods.” From an assessment of the current status of the technology, the aim was “to identify intermediate goals in order to obtain industrial strength formal methods and a routine application in the software engineering process.” Both industrial and
academic representatives gave presentations. The invited talks covered substantial territory with Wolfram Buttner’s talk being particularly noteworthy for its focus on formal methods adoption (mainly model checking and equivalence checking) at Siemens.
Verification tools from North America and Europe were demonstrated at the tools fair (including Uniform, Z/EVES, FDR2, VSE-II and the IFAD Toolset).
The TPHOL (Theorem Proving in Higher Order Logics) [TPHOL] [TPHOL99] conference series focuses on all reasoning tools for higher order logics. The conference series evolved from the informal HOL users’ meetings. According to the TPHOL web page [TPHOL] the main theorem proving systems supporting higher order logics that have formed the subject matter of the conferences are Coq, HOL, Imps, Isabelle, Lambda, Lego, Nuprl, ProofPower, PVS and TPS. Pointers to all these systems are available at [TPHOL]. Though the affiliations of the authors are not listed (on the TPHOL website), it appears that only two papers have authors with industrial backgrounds (with Intel): Harrison’s paper on “A Machine-Checked Theory of Floating Point Arithmetic,” and Carl-Johan Seger’s co-authoring of “Lifted-FL: A Pragmatic Implementation of Combined Model Checking and Theorem Proving.”
One of the significant notations of formal methods is Z. Z has been broadly disseminated and supported by numerous books and tools. It is used in education, research and industry. A series of conferences has been held on Z, with the 11th International Conference of Z Users [ZUM98] held in Berlin, Germany, in September 1998. In general, [ZUM98] primarily consisted of presentations from Z researchers, with only a limited application and industrial perspective. Topic areas included concurrency, tools, Z and HOL, safety-critical and real-time systems,
semantic theory, theory and standards, reasoning and consistency issues, refinement and object orientation. The next conference will expand from a purely Z perspective to include presentations relating to the B tool.
3.3 Key industrial areas and applications
3.3.1 Introduction
In this section we review the application of formal methods in three key industrial areas: safety related systems, security critical systems and hardware development. The review does not attempt to be exhaustive but does hope to cover the major applications with an emphasis on real projects.
3.3.2 Safety related systems
In this section we review some of the applications of formal methods to safety related systems. We consider nuclear, railways, defence, aerospace and avionics applications.
One should perhaps note the scale of safety related systems. The French SACEM train control system has about 20,000 lines of code; the primary protection system (PPS) for the Sizewell B nuclear reactor has 100,000 lines of code (plus about the same amount of configuration data). While challenging for formal methods, these are modest in size when compared with the 10M lines of code in a typical telephone switch.
Nuclear industry
The software-based safety system we review is the primary protection system (PPS) of the Sizewell B nuclear reactor, a system which aroused widespread interest in the UK scientific and engineering communities, as well as in the world at large. The independent analysis and testing
of the software involved a “searching and complete” examination of the software and required that any errors found should be insignificant. For Sizewell B this involved an extensive retrospective application of static analysis of the PLM source code using Malpas, involving about 200 person years of effort [Ward93], as well as a comparison of the object and source code [Pavey]. While the sponsors of this work are careful not to present it as “formal verification”, it comes within our broad definition of formal methods. There was also some inspired unofficial analysis of one of the protocols used [Anderson]. While the effort to do the initial analysis was considerable, the costs of maintaining and modifying the analysis in the maintenance phase are reported to be very favourable when compared to the costs of retesting. This is due to the modularity of the analytical evidence and the elapsed time required for testing.
In Canada there has been considerable focus on the issue of safety critical software following the licensing delays arising from the assessment of the Darlington shutdown systems. In 1987 the Atomic Energy Control Board (AECB) identified the shutdown system software as a potential licensing impediment, concluding that it could not be demonstrated that the software met its requirements. Despite an intense and largely manual effort devoted to demonstrating that the code satisfied its table-based specifications, the system was judged to be unmaintainable and the systems are being reimplemented. To do this, considerable effort has been expended in developing a coherent set of software engineering standards and using these in trial applications. These standards set clear principles for the development of safety critical software based on:
• good software engineering process
• statistically valid testing
• mathematically formal verification
One of the motivations for moving towards mathematical methods is that they provide a more rigorous notation for review [Joannou90]. More recently Ontario Hydro have developed PVS theorem proving support for this work (winning an internal prize) and the table-based approach has diffused into other areas.
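The rigour that tabular notations bring to review can be made concrete: a tool can mechanically check that the rows of a condition table are disjoint (no two rows fire on the same input) and complete (every input matches some row). The sketch below is a hypothetical Python illustration; the trip-logic table is invented for the example and has no connection to the Darlington specifications:

```python
# Sketch of the kind of check a tabular-specification tool performs:
# rows of a condition table must be disjoint and complete over the
# input domain. The trip-logic table is a made-up example.
from itertools import product

# Each row: (predicate over (pressure_high, temp_high), output)
trip_table = [
    (lambda p, t: p and t,             'trip'),
    (lambda p, t: p and not t,         'trip'),
    (lambda p, t: (not p) and t,       'alarm'),
    (lambda p, t: (not p) and (not t), 'normal'),
]

def check_table(rows, domain):
    """Exhaustively check disjointness and completeness over a finite domain."""
    for inputs in domain:
        matches = [out for cond, out in rows if cond(*inputs)]
        assert len(matches) >= 1, f'incomplete: no row covers {inputs}'
        assert len(matches) <= 1, f'overlap: several rows fire on {inputs}'

def evaluate(rows, *inputs):
    """Look up the unique matching row; the checks above make this total."""
    (out,) = [out for cond, out in rows if cond(*inputs)]
    return out

# The table passes both checks over its two-variable Boolean domain.
check_table(trip_table, product([False, True], repeat=2))
```

Once the table is checked, evaluation is unambiguous, e.g. `evaluate(trip_table, True, False)` yields `'trip'`. Real tabular methods perform the disjointness and completeness checks symbolically (by proof) rather than by enumeration, but the obligations are the same.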
Railway applications
Although in many railway projects the software is developed to good industry practice, using structured design and limited static analysis for control and data flow analysis, there is a growing body of examples of the use of formal methods. For example, GEC-Alsthom in France has made extensive use of formal methods and machine assisted mathematical verification. The use of formal methods originated in 1988 and has since grown in scope, with greater machine support achieved by the development of the B Method and the Atelier B toolset, and more recently the integration of the formal methods with the requirements engineering process. Formal methods have been applied to a number of train protection products (e.g. those used on the Cairo and Calcutta Metros and recently on the new Paris metro line [Behm99]). The costs of applying formal methods have decreased by about two orders of magnitude since the early projects reported in [Survey], with productivity today of 0.5 to 2 lines per hour using B [Chapront99]. The development of the B process and tools has been supported by the French Railways and RATP. Interestingly, the regulators require the delivery of the proof file from Atelier B. Also, the use of Atelier B obviates the need for module testing. This use of machine-checked formal methods for verification is some of the most advanced in the world in terms of rigour and scale of application.
There have also been examples where formal methods have been applied to the
modelling and validation of signalling requirements and specifications. For example, special purpose theorem provers have been used in Sweden to model specifications [Erikson96], and some protocols have been analysed in detail with the HOL theorem prover as part of academic research [Morley93]. Network SouthEast also commissioned a large formal specification (~300 pages of Z) of a signalling control system. The ability to model the railway signalling philosophy and demonstrate with high confidence that changes to it are consistent and safe may be an important tool in facing the future challenges of new and hybrid control regimes. There are examples of model checking using SPIN [Cimatti98] and Prover [Stalmark]. There are also some significant applications of RAISE as reported in [Haxthausen99]. Developments in the nuclear industry in France follow a different paradigm. The tool SAGA, based on the language LUSTRE, is an industrial tool used in France by Merlin-Gerin; it carries out the familiar type-checking activity but also performs automatic code generation.
Defence (UK)
In the UK the Ministry of Defence standard Def Stan 00-55 sets out requirements for a formal methods based approach. The application of this standard has been progressive. One of the first applications is reported in [King99]. This system checks whether it is safe for helicopters to take off or land on a naval ship. Praxis Critical Systems is responsible for developing the application software and this includes formal specification, implementation in SPARK Ada, flow analysis and proof. Praxis Critical Systems has also worked with Lockheed on the verification of the safety critical software used in the Hercules C130J aircraft. This includes implementation in SPARK, flow analysis and proof [Croxford96]. This project was originally seen as a retrospective addition of assurance but the techniques were found
to be so effective that Lockheed introduced them into the forward development path.
While the UK MoD sponsored the development of early static analysis tools more recent policy, at least from the Defence Evaluation and Research Agency (DERA), has been to concentrate on the use of CSP and the FDR toolset as their strategic technology. (Note that FDR2 evolved from interaction with mission critical groups within the defence industry.) There is also work continuing on extending static analysis to deal with pointers.
Avionics
The avionics industry approach is represented by the requirements of DO-178B: formal methods are not mandated in DO-178B, but there are static analysis requirements. On the commercial side GEC-Marconi Avionics are convinced of the benefits of both static and dynamic analysis. They applied formal methods to certain critical algorithms in the Boeing 777 Primary Flight Computer. Formal specification was beneficial in some of the simpler algorithms, and problems were identified and resolved. However, formal proof was less beneficial in the same algorithms and the task was labour intensive, requiring the highest grade of effort. Attempts to apply formal methods to more complex algorithms also required high-grade effort and produced no useful result (as reported in [SOCS98]). Their general conclusion based on this experience was that formal methods have limited applicability. However there are strong economic drivers in the avionics industry to reduce the test effort required by DO-178B, and this is driving both R&D and the revisions to DO-178B.
Moreover work on other flight critical aspects has been sponsored by NASA and is described in [Butler98, Owre99]. Rushby’s work on formal methods and assurance is widely read in the safety critical community [Rushby93]. The parallel made between proof and
calculation is an important insight underlining the need for capable tools. The development of PVS and its associated application in requirements, hardware verification and fault tolerance is a significant step in the maturing of formal methods: over 1200 sites have installed PVS. However most of the work reported is either research or initial industrial applications. It remains to be seen whether the use of PVS will take hold after these first projects.
Earlier work in the aerospace sector [SIFT] provides an example of over claiming, or at least a lesson in the care needed when presenting the results of this type of work. An independent investigation into the project provided some detailed criticism and later work showed flaws in the proofs that were undertaken.
Other applications
There is also considerable work in Europe on the development of drive by wire automotive systems. These are leading to the use of deterministic time triggered architectures with formally verified hardware and protocols.
The automotive industry has also sponsored the development of a verified compiler for PLCs [DaimlerBenz]. There is also related work in the defence industry [Stepney98].
There are also emerging new applications of model checking to operator procedures in aerospace, air traffic control and nuclear applications [Lüttgen99, Zhang99]. Work is being carried out in both air traffic control and nuclear plant control in which the complex procedures that have to be followed by the operator are formally modelled and desirable properties proved.
While the process industry does not appear to be innovative in its use of formal methods, there is an example of a small verified hardware design, motivated by the difficulties of exhaustive testing. The application of model checking to PLC programs also seems a promising area, and one that has been picked up by the SACRES project led by Siemens.
There is also perhaps the world’s only shrink-wrapped formally specified product, DUST-EXPERT, an advisory system on avoiding dust explosions [Froome99]. This was developed by Adelard and involved a 20k-line VDM specification. The product was sponsored and approved by the UK Health and Safety Executive.
3.3.3 Security applications
Since the 1970s there has been significant investment into formal methods by agencies concerned with national security. Investment by these agencies has helped to accelerate certain aspects of formal methods, but has also retarded evolution through, for example, restrictions on the dissemination of research results and on the distribution of leading-edge formal methods systems. As of the late 90s, many of these restrictions have been significantly eased.
In North America, early security-related formal methods research was aimed at tool-supported processes, with definite emphasis on relating formal security models, through top level specifications, to code. Systems such as the Gypsy Verification Environment, HDM, Ina Jo, and m-EVES all aimed at handling the above, with varying degrees of success. Early research emphasised processes and tools that led to "code verification." Some of these systems (or their ilk) were used, again with varying degrees of success, on security products of some importance to national governments. Deployed security-related systems exist that have benefited from these tool-supported analyses.
Assurance concerns about security-related products were not only manifested by R&D in formal methods, security modelling, etc., but also in significant initiatives to produce effective information technology security criteria and to assess commercial security
products against these criteria. In the U.S., this led to early work on the Trusted Computer System Evaluation Criteria (TCSEC) and the Federal Criteria for Information Technology Security. Similar criteria (ITSEC) were also developed in a number of other countries, including Canada, Germany, the Netherlands and the United Kingdom. Recognising that the burgeoning number of national criteria would impede the acceptance of security products in other countries because of the numerous evaluations that would be required, a number of countries (in particular, Canada, France, Germany, the Netherlands, the U.K. and the U.S.) harmonised their criteria into an international, standards-based "Common Criteria".
Though at the time of writing this report it appears that the number of products that have been evaluated against the Common Criteria is small, one should not underestimate the importance some governments are placing on the Common Criteria and, in general, on evaluation. For example, the formal signing of the international agreement, in October 1998, was given rather extensive publicity. Furthermore, in the U.S., NSA and NIST jointly formed the "National Information Assurance Partnership," (NIAP) which is an initiative designed to meet the security testing needs of both information technology producers and users. As appears to be the case in other signatory countries, the U.S. is privatising the evaluation process by setting up commercial testing laboratories.
Though specific to the U.S., the goals of the NIAP partnership (extracted from niap.nist.gov/howabout.html) appear to generalise to other signatory countries:
• Promote the development and use of security-enhanced IT products and systems.
• Demonstrate and increase the value of independent testing and certification as a measure of security and trust in information technology.
• Foster research and development to advance the state-of-the-art in security test methods and metrics.
• Move current government-conducted evaluation and testing efforts to accredited, private sector laboratories.
• Help establish the elements of a robust commercial security testing industry.
• Establish the basis for international mutual recognition and acceptance of security products test results.
As of October 1999, NIAP is initially focusing on three initiatives:
• Security Requirements: NIAP services to aid interested parties in specifying robust, testable security requirements.
• Security Product Testing: Efforts focusing on the evaluation of products against specified security requirements.
• Security Testing Research and Development.
The dearth of evaluated products appears to be a result of the preliminary nature of the Common Criteria and the ongoing process of establishing evaluation laboratories. For example, in Canada there are two evaluation labs. According to the Canadian Communications Security Establishment web pages (http://www.cse.dnd.ca/) there are 13
certified products ranging from evaluation level EAL1 to EAL4. Note that a formal model of a product's security policy is required only at level EAL5 or above.
From a formal methods perspective, there appears to have been a change in the application of the technology from the 70s/80s to the current day. While much of the early focus (at least in North America) was on tools and processes that handled the entire development process (from security models to code), much of the current work appears to be at a security modelling level. A review of the literature suggests substantial work on cryptographic/security protocols and pointwise applications elsewhere.
As an example of the current work with formal methods on security protocols, consider a July 1999 workshop on the subject [FMSP99]. Papers and presentations included:
• A survey of formal analysis of cryptographic protocols.
• The CAPSL intermediate language, supporting a security protocol front-end to FDR.
• Analysis of a library of security protocols using Casper and FDR.
• Undecidability of bounded security protocols.
• Towards the formal verification of ciphers, logical cryptanalysis of DES.
FM99 [FM99] contained a track on security and formal methods. Seven papers were accepted for publication including:
• Secure interoperation of secure distributed databases.
• A formal security model for microprocessor hardware.
• Abstraction and testing.
• Formal analysis of a secure communication channel: Secure core-email protocol.
• A uniform approach for the definition of security properties.
• Group principals and the formalisation of anonymity.
In addition to the formal papers, there were two tutorials:
• Formal methods for smart cards based systems.
• Automated validation and verification of security protocols.
Other conferences of note are the IEEE Symposium on Security and Privacy (held annually in Berkeley, California), the IEEE Computer Security Foundations Workshop (last held in Italy, June 1999), and the National Information Systems Security Conference (Arlington, Virginia, October 1999) (NISSC). These conferences cover a broad range of topics. For example, NISSC tracks included the assurance, criteria and testing; electronic commerce; networking and the internet (with substantial interest in PKI); and R&D.
It appears that most efforts at evaluating security products are occurring at the EAL4 Common Criteria level or below, consequently limiting the application of formal methods technology. It also appears that there is a much improved understanding within the security community of how to make use of formal methods technology in a manner that is much more effective than the earlier efforts of the 70s/80s, when formal methods tools were prematurely applied to significantly
complex tasks. The use of formal methods by the security community seems consistent with the use of the technology by other communities: formal modelling is used to improve understanding of the artefacts being developed, while formal analysis is used both to "formally debug" formal models and to provide assurance of consistency (e.g., between a security policy and a top level specification).
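The kind of consistency check mentioned above, between a security policy and a top level specification, can be illustrated with a deliberately small example. The sketch below is hypothetical: the levels and operations are invented, and the policy is the classic Bell-LaPadula pair of rules, "no read up" and "no write down". The check simply tests every operation the specification permits against the policy:

```python
# Hypothetical illustration of checking a top-level specification
# against a security policy: every operation permitted by the spec
# must respect the Bell-LaPadula rules 'no read up', 'no write down'.
LEVELS = {'unclassified': 0, 'confidential': 1, 'secret': 2}

# Toy top-level spec: operations as (subject level, action, object level)
spec = [
    ('secret',       'read',  'confidential'),   # read down: allowed
    ('confidential', 'write', 'secret'),         # write up: allowed
    ('secret',       'read',  'secret'),         # same level: allowed
]

def violates_policy(subject, action, obj):
    """True if one operation breaks either Bell-LaPadula rule."""
    s, o = LEVELS[subject], LEVELS[obj]
    if action == 'read' and o > s:     # no read up
        return True
    if action == 'write' and o < s:    # no write down
        return True
    return False

def check_spec(operations):
    """Return the operations that break the policy (empty list = consistent)."""
    return [op for op in operations if violates_policy(*op)]
```

Here `check_spec(spec)` returns an empty list, while a specification permitting `('confidential', 'read', 'secret')` would be flagged. Real top-level specifications are state machines rather than flat lists, so the corresponding check becomes a proof obligation over all reachable states, which is exactly where theorem proving and model checking enter.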
In conclusion, the broad range of formal methods usage (electronic commerce, internet, security protocols, operating systems) by international R&D and commercial groups, strongly suggests maturation. However, the lack of formal methods in endorsed products also suggests the existence of adoption barriers.
3.3.4 Hardware and microcode verification
There is an enormous amount of work on the application of formal methods to the design and verification of hardware designs. Formal methods appear to be
having a (potentially) major impact in the hardware industry. Some of the large hardware vendors are either developing in-house capabilities in formal verification (e.g., Intel) or are adopting externally developed technologies (e.g., AMD with ACL2). The forthcoming Merced chip from Intel may be the first mass-produced artefact to have formally verified parts of the hardware and microcode design (verified using HOL), although Intel have also retrospectively verified the floating point arithmetic of the Pentium Pro processor [Intel99]. In doing so they have developed a framework that combines advanced model-checking technology, user-guided theorem-proving software, and decision procedures that facilitate arithmetic reasoning.
Figure 1 Levels of hardware verification
Model checking and related automatic decision procedures are the most widespread verification technology used in
hardware verification.
Wolfram Buttner, Siemens AG, gave an outline of their work on hardware verification at the FM-Trends 98 conference in Boppard [FMTrends]. Equivalence Checking (EC) is used to validate VHDL refinements and can routinely handle up to a million gates (see, for example, Chrysalis Symbolic Design, who have "checked the formal equivalence of nine, extremely large, complex ASICs. Each of these ASICs comprises between one and two million gates. The equivalence checker typically takes between 10 minutes and 3 hours to formally verify each chip"). At least 800 Siemens engineers are required to use EC. One attraction of EC is its substantial ease of use. Model Checking (MC) is used to validate circuits and software (in SDL); in the latter case, MC is used at all levels of the development process. Buttner noted that MC is still in the missionary selling stage and that there are outstanding technical issues. In particular, while MC can handle a few hundred state bits, ASICs often have a few thousand state bits. Buttner concluded that the technology is difficult and has high adoption barriers. However, Siemens has had encouraging experiences and feels that the commercial prospects are good.
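The core of equivalence checking is easy to state: two combinational designs are equivalent if and only if the "miter" formed by XORing their outputs is identically false. The sketch below uses invented netlists (two structurally different full adders), and exhaustive simulation stands in for the BDD- or SAT-based engines commercial checkers actually use:

```python
# Equivalence checking in miniature: two gate-level descriptions are
# equivalent iff their miter (XOR of corresponding outputs) is
# identically false over all inputs. For a handful of inputs,
# exhaustive simulation suffices; real tools use BDDs or SAT.
from itertools import product

def full_adder_v1(a, b, cin):
    """Reference design: sum and carry of a 1-bit full adder."""
    s = a ^ b ^ cin
    cout = (a and b) or (cin and (a ^ b))
    return s, cout

def full_adder_v2(a, b, cin):
    """A structurally different refinement (propagate/generate form)."""
    p, g = a ^ b, a and b
    return p ^ cin, g or (p and cin)

def equivalent(f, g, n_inputs):
    """Exhaustively simulate the miter over all input vectors."""
    for vec in product([False, True], repeat=n_inputs):
        if f(*vec) != g(*vec):          # miter output true: a mismatch
            return False
    return True
```

Here `equivalent(full_adder_v1, full_adder_v2, 3)` holds, while a design that drops a carry term is rejected with a concrete counterexample vector. The exponential cost of enumeration is precisely why the BDD and SAT techniques discussed above matter: they decide the same question symbolically for circuits of a million gates.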
At higher levels of abstraction there are many examples of the application of theorem proving.1 The work at CLINC on the FM9001 is notable as part of their approach to developing a verified "stack" of products, and the verification of aspects of the Viper processor in HOL [Cohn89] resulted in a high profile for the claims for assurance that can be made following a proof of correctness. More recently the work with PVS at SRI and Rockwell Collins shows the continuing benefits of this form of semantic modelling and proof [Butler98].

1 Of the 386 papers in the HOL bibliography (http://www.dcs.glasgow.ac.uk/~tfm/hol-bib.html) about 30% relate to hardware verification.
3.4 Preliminary conclusions
The application of formal methods has a long history but they have not been substantially adopted by the software engineering community at large. Failure to adopt has arisen for numerous reasons:
• Research that was successful but required heroic efforts to get results (e.g., from lack of theory base, tools not usable).
• Tools being developed but not transferred to other groups (e.g., due to a combination of platform, and people issues).
• Large investments in tools without a corresponding advance in theory or method, premature attempts to increase rigour or increase scale.
• Results that did not scale from case studies: a common problem in the applied area.
• Not meeting customer requirements, e.g., sponsors’ concerns shifted but the research community did not listen. There have been some notable shifts in emphasis by government agencies that in the past were the prime sponsors of formal methods.
• Over ambition and failure to meet expectations (e.g., from overselling by the research community or from failing to meet unreasonable expectations of the sponsors).
Yet in some respects these could be argued to be the normal turbulence in any area of technological research and deployment. The overall perception of the current formal methods landscape is that it consists of:
• Techniques and tools mostly with a long history of usage and a community of use behind them. They have not been the product of relatively short term (3 year) R&D programmes.
• Smallish organisations (primarily university research groups and small commercial companies) with little in the way of widespread commercial adoption of core concepts and tools.
• Generally commercial companies making money from services not tools.
• Stable core groups but no major break out, some fission.
However there is significant take-up of formal methods in critical industries. Large hardware vendors are either developing in-house capabilities in formal verification (e.g. Intel, Siemens) or are adopting externally developed technologies. In the area of safety there is active research, case studies and serious application of formal methods throughout the safety lifecycle. The key points to emerge are:
• There are significant applications of formal verification in the railway and nuclear industries, with varying degrees of rigour: static analysis, interactive theorem proving and model checking are used, together with domain-specific solutions.
• There is a need to integrate formal methods with the existing system and software engineering process. This has led to engineer-friendly notations – variants of table-based notations and existing PLC notations – trading expressiveness for ease of analysis.
• There are examples of verified compilers for special purpose languages and rigorous back translation. There is significant work on verified hardware in flight and drive-critical applications.
• The use of model checking and equivalence checking has been adopted by hardware chip manufacturers. The forthcoming Merced chip from Intel may be the first mass produced artefact to have formally verified parts of the hardware and microcode design.
There are also examples of "over claiming", and the need for careful presentation of results – not an easy balance to maintain in commercially and academically competitive environments.
While not particularly scientific, it is instructive to identify the present limits of various FM-based technologies. Note that with judicious abstraction and modularisation, larger systems can be analysed piecewise. Furthermore, even small systems may have substantial intellectual depth and challenge technological limits. With these caveats the present limits appear to be:
• Equivalence checking of 1 million gate ASICs.
• Model checking of 1000 latches at a time, using techniques for modularising and looking at components; e.g. there are claims of verifying large (10^20) state spaces.
• Software verification from design to code of ~80 Kloc.
• Simple formally verified compilers for special purpose languages.
• Static analysis of >150 Kloc.
• Specification and modelling of >30,000 lines of specification.
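To make the latch-count limit concrete, the following toy explicit-state safety check runs breadth-first reachability over a hypothetical counter circuit. With n latches there are up to 2^n states, which is why explicit exploration gives out long before the symbolic figures quoted above; this is a sketch, not a production model checker.

```python
from collections import deque

# Toy explicit-state model checker: breadth-first reachability over the
# valuations of a hypothetical 4-latch circuit (a counter that wraps at 10).
def step(state):
    # Hypothetical next-state function for the counter.
    return (state + 1) % 10

def check_safety(init, bad, next_fn):
    """Return True if no state reachable from `init` satisfies `bad`."""
    seen, frontier = {init}, deque([init])
    while frontier:
        s = frontier.popleft()
        if bad(s):
            return False
        t = next_fn(s)
        if t not in seen:
            seen.add(t)
            frontier.append(t)
    return True

print(check_safety(0, lambda s: s == 12, step))  # True: 12 is unreachable
print(check_safety(0, lambda s: s == 7, step))   # False: 7 is reachable
```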
Theorem proving is a specialised activity, and there is some evidence of a skills shortage in hardware verification and proof techniques. Also, despite some evidence that engineers can read and write formal specifications, mass uptake will depend on derived, partially automated solutions and on the use and adaptation of familiar notations (e.g. tables, diagrams).
One should also note that investment in formal methods tools needs to be continual: they need to keep pace with the general industry rate of change in implementation technology. While these changes often facilitate new functionality, they also drive obsolescence. We have already mentioned hardware and operating systems, but the same applies to databases, GUI builders and languages (Lisp, Smalltalk).
4 Analysis
4.1 The formal methods market
4.1.1 Characteristics of the market
As indicated in the preliminary conclusions above, the overall perception is that the formal methods landscape consists of a variety of techniques and tools. While the tools often tackle different problem areas, their number would normally be taken as a sign of an immature market. However the formal methods market is not really a single market, more a particular technical viewpoint on a host of disparate activities. The landscape primarily consists of smallish organisations (university research groups and commercial companies) with no widespread commercial adoption of core concepts and tools. There are a number of small service companies using formal methods as a leading edge service. They are generally technically motivated to grapple with new tools and put up with being the first to use the theories or tools on industrial scale projects.
Presently, the only places where formal methods appear to be having a (potentially) major impact are the hardware industry and parts of the safety critical industry. Some of the large hardware vendors are either developing in-house capabilities in formal verification (e.g., Intel) or are adopting externally developed technologies (e.g., AMD with ACL2). There are indications that the telecommunications industry may also be adopting the technology. However, security critical systems show very little adoption – although such work is not particularly visible – even though security was originally one of the prime motivations for significant R&D funding.
It is also instructive to speculate on the extent of the formal methods activities world-wide. Taking the number of attendees at FM99 as indicative of 10% of those involved in the software related activities, adding a similar number in the hardware fields and calculating the salaries and overheads of those involved we arrive at a figure of $1-2B per annum.
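The estimate above can be reconstructed as simple arithmetic. The attendee count and cost figures below are purely illustrative assumptions – the report does not state them – chosen only to show how a figure in the quoted $1-2B range arises.

```python
# Back-of-envelope reconstruction of the world-wide activity estimate.
# All numbers here are illustrative assumptions, not figures from the report.
fm99_attendees = 500                   # assumed headcount at FM99
software_people = fm99_attendees * 10  # attendees taken as 10% of the field
hardware_people = software_people      # "a similar number in the hardware fields"
cost_per_person = 150_000              # assumed salary plus overheads, USD/year

total = (software_people + hardware_people) * cost_per_person
print(f"${total / 1e9:.1f}B per annum")  # $1.5B, within the quoted $1-2B range
```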
4.1.2 Economic and other drivers
Both the hardware and telecommunications industries have a compelling reason to adopt the technology. Intel’s Pentium Flaw (FDIV) cost the company hundreds of millions of dollars and was a wake-up call that the then existing process for developing increasingly complex chips was inadequate. With the only option being to recall the chips (unlike the software industry’s general, but not ubiquitous, capability for releasing patches), new scalable assurance arguments were necessary. Model checking was a good fit. (Model checking is discussed further below.)
Both the hardware and telecommunication industries are spending substantial resources on simulation and testing to achieve (or approximate) relevant coverage criteria. On some projects, the cost is nearing 50% of overall project funding. Increasing the efficiency of the analysis (through either increased coverage or more efficient and cheaper exploration) is a compelling economic argument. This is also true in avionics and parts of the safety critical industry, where the direct costs and time of achieving test coverage are driving the adoption or investigation of formal methods.
There are similar economic drivers in the aerospace industry. Lucas Aerospace report, in their justification for their ESSI project PCFM, that V&V typically accounts for 41% of total software production costs and code corrections for 32% of total maintenance costs. We have already noted (Section 3.3) that the use of B reduced the need for module testing in a railway application.
4.1.3 The impact of development process failure
The Intel Pentium Flaw demonstrated that the complexity of contemporary chips was exceeding the engineering processes then in place. The failure of Ariane 5, caused by a software fault, indicated a failure in the software architecture and process. The failure of a U.S. Titan rocket was traced to a decimal placement error in its software.
Thomas Kuhn in his book “The Structure of Scientific Revolutions” [Kuhn] defines and discusses why paradigm shifts occur. In effect, a technical community will buy into (or start developing) a new theory/paradigm if the old theory is refuted or if the new theory extends scientific prediction in material ways. Often, the shift will occur when a significant infelicity with the current belief system is found. The Pentium Flaw was an indicator of such infelicities in the chip design processes.
The inability of hardware vendors and telecommunications companies to achieve an appropriate scope of validation with current technology may very well be another crisis point requiring a paradigm shift. Indeed the continual pressure on all companies to reduce costs and development times keeps new technologies under scrutiny. Companies are searching for order of magnitude improvements in cost and not just minor polishing of their processes: process improvement runs out of steam without a technology shift.
There also appears to be a lack of business models for assessing the impact of new technologies on process. There need to be supporting, investment-oriented, risk-based models of the engineering process that allow the business case for new methods or techniques to be properly assessed.
4.1.4 The impact of communities
As suggested by the sociology group of Donald Mackenzie at the University of Edinburgh, community belief systems play a role in the adoption of new technologies. Hardware developers generally come with a strong engineering ethos. Consequently, they are receptive to new technologies that
are strongly grounded technically and demonstrate economic potential. This community appears to be less accepting of high failure rates than the software community.
However, even a solid engineering community may be slow to adopt. For example, the avionics/astrionics community is highly self-referential (i.e., closed). Even high-cost software-related failures (e.g., at least one of the recent Titan failures, at a cost of over $1B) have not resulted in an apparent change in perspective. Over the decades of aviation they have developed particular processes with which they are comfortable. Breaking into such a community not only requires a compelling reason to buy/adopt, but also requires vendors to commit substantial resources to participate in the domain and to modify their technology accordingly.
Sometimes, the belief system of the R&D community differs from that of the sponsors and potential consumers. Perhaps a cautionary tale along these lines can be told with respect to a well-respected U.S. company, Computational Logic Incorporated (CLI). CLI received substantial funding from various security-related agencies and was well respected for its work on the development of NQTHM and ACL2, the application of these tools (especially to hardware), and other FM-related activities. One of the driving tenets at CLI was the so-called “CLI Stack”, in which R&D targeted rigorous mathematical development and proof from application programs, through compilation, down to verified chips. The CLI vision was certainly comprehensive. However, ultimately, this was not a vision shared by a major customer, as it was perceived not to meet the customer’s mission requirements. Apparently, CLI did not adapt to the new market reality and R&D funding diminished.
Though some other R&D organisations have been somewhat more responsive to market realities, in general terms one wonders whether R&D service companies and university research groups are an effective means of technology transfer and industrial adoption. Many of the organisations involved in such R&D do not have a true commercial product perspective nor, in fact, are they interested in developing one. For some organisations, providing R&D services is their main source of revenue and technical enjoyment.
Instead of investigating means of taking solid R&D results and adapting them to niche markets (through whole product development) and then aiming for commoditisation, the R&D stays in an early market in which only highly gifted individuals can participate. Consequently, a question that sponsors must ask is whether they are expecting an R&D group to perform, essentially, pure R&D, or whether they are expecting the group to take a more market oriented view. Either option is reasonable, but it is crucial that there is a clear statement as to intentions.
We also note that this self-referencing – who talks to whom – defines what we mean by a market in formal methods: it is not just a case of using the same technology. Of course markets have fuzzy edges, but the technical communities considered here are quite closed with respect to other sectors. Avionics suppliers look towards other similar suppliers and to organisations like NASA for leadership. Process industry players look to key multinationals with strong safety records (Shell, Dupont) and to process industry working groups (such as ISA SP84). Individuals in these groups do of course keep abreast of the wider picture, and regulatory pressures can lead to some diffusion between sectors (it is an obvious question for a regulator to ask whether you have considered technique X that is used in sector Y). There are examples of some diffusion between sectors, normally outwards from those with development funding. The use of the static analysis
tools and their successors (Spade, Malpas and Spark) has migrated from origins in the defence sector. Some would also claim that the whole area of formal methods grew out of the needs of national security.
Another important aspect of communities is the separation between research communities. For example, the self-appointed main formal methods conferences, even as late as 1998, had no representation from the model checkers. Only with FM99 have these been brought together in some form. The communities of model checking, theorem proving, abstract state machines and protocol verification are often too distinct.
Different communities tend to have different research agendas, and this can lead to a self-perpetuating separation. For example, the perception in NASA-sponsored work [Rushby93] that the front end of the lifecycle is what matters influenced a large body of work and, importantly, the tools – hence the lack of code verification features in PVS, which has meant that this aspect has been downplayed. Yet there are significant economic drivers in the aerospace industry for replacing the lengthy and costly test coverage required by the relevant sector standard DO-178B. Another example is the railway industry and the use of B, where the emphasis on code verification has led to a quite different toolset, Atelier B.
In Moore’s terms [Moore1], there is a need to present solutions as a “whole product” for that community. For example, a new and efficient method for solving propositional logic might accurately describe the technical basis, but as a product it would need to be seen as “a tool for checking that the shutdown logic meets its specification” or, in the railways, “a tool for checking that interlock specifications satisfy the signalling requirements”. It would need to interface to the tools, notations and processes in use in these different application areas.
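As a minimal sketch of the gap between the two framings: underneath, such a tool is just deciding a propositional property, but to the plant engineer it is "checking that the shutdown logic meets its specification". The sensor names and logic below are hypothetical, and the check is brute force where a real tool would use a SAT solver.

```python
from itertools import product

def shutdown_logic(high_temp, high_pressure, manual_trip):
    # Hypothetical implemented interlock: trip on a manual command,
    # or when both alarms are raised together.
    return manual_trip or (high_temp and high_pressure)

def requirement(high_temp, high_pressure, manual_trip):
    # Hypothetical safety requirement: a manual trip must always cause shutdown.
    return (not manual_trip) or shutdown_logic(high_temp, high_pressure, manual_trip)

def holds_for_all(prop, n_inputs):
    """Brute-force validity check of a propositional property; a production
    tool would hand the negated property to a SAT solver instead."""
    return all(prop(*v) for v in product((False, True), repeat=n_inputs))

print(holds_for_all(requirement, 3))  # True: the logic meets this requirement
```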
The community viewpoint highlights a number of important points:
1. Referencing matters: different sectors have different key players.
2. Perceived needs and opportunities reflect community values.
3. Need to establish credentials within a community – entry costs can be high.
4. Research agendas can be shifted by external factors.
5. Technology and companies must be sensitive to shifts in community values. As always there is a need to listen to the customer.
4.1.5 Impact of government and standardisation bodies
Government and other large organisations can influence the adoption of technology through their impact on the regulatory process in regulated industries, and through shaping the market with their significant purchasing power. The development by the UK Ministry of Defence of the standard Def Stan 00-55 caused considerable international comment, as a major purchaser of safety critical systems set out its requirements for a formal methods based approach. The standard was updated and issued as a full UK Defence Standard in 1996 and is notable for its emphasis on formal methods. One might also note that the emerging generic international standard for safety related systems, IEC 61508, requires the non-use of formal methods to be justified for systems in the higher safety integrity levels.
In the security area the agencies in the USA and the UK have had a far reaching
influence on the development of formal methods. Although they have been significant sponsors of formal methods R&D and applications, they have also distorted the market through the earlier US embargo on verification technology and through their prescriptive approach to the methods and tools that should be used. The development of PVS was in part a response to the potential problems of the technology embargo. However, the development of national security requirements prescribing different techniques for different security evaluation levels has had a stabilising effect on the market, and there is some evidence that this is now driving formal methods application as more higher-level security evaluations come through. This is partly a result of the demands of electronic cash and e-commerce applications.
NASA is currently an important sponsor of formal methods. The plans of NASA and other future US programmes are detailed in Appendix B.
4.2 Technology adoption models
4.2.1 Introduction
In this section we introduce two technology adoption models. The purpose of presenting these models, within a formal methods context, is to provide a systematic means of estimating the likely adoption trajectories of new formal methods technologies. Consideration of these models may suggest how various research programmes and projects could be altered to heighten the likelihood of success.
We will discuss the technology diffusion model of Everett Rogers [Rogers] and a high technology model developed by Moore [Moore1, Moore2]. If one is to stimulate the adoption of formal methods into the development organisations of critical products (e.g., within electronic commerce), then one must focus not only on a strong R&D agenda but also on a comprehensive technology adoption agenda. Our discussion of Rogers and Moore is meant to provide a backdrop for considerations relating to adoption. It is our belief that these models will improve the odds of adoption; however, if the developed technology is inadequate to the problems at hand, an ideal adoption plan will still fail. The reader should note that, in effect, we are taking a business perspective on adoption and view successful adoption as business success.
In this report we can only provide a brief introduction to the underlying concepts of the Rogers and Moore models. Readers are directed to [Rogers, Moore1, Moore2] for in-depth discussion. The work of Rogers is quite general and concerned with technology diffusion across a number of cultures (e.g. of health care in South America), whereas that of Moore is primarily focused on mass market adoption and the associated marketing of high technology products. For the Rogers model we provide two examples of its application. The first example demonstrates how the Rogers model was successfully used in ORA Canada's EVES project, resulting in substantially enhanced adoption of the EVES technology. The second example concerns the Swedish company Prover Technology. We do not provide examples of the Moore model; however, [Moore1, Moore2] are replete with high-tech examples of both successful and failed adoptions.
4.2.2 Technolog