
Omnibus

Towards a framework to allow software components to be reused safely

Thomas Wilson


Honours Dissertation

April 2003

Supervisor: Dr Savi Maharaj

Dissertation submitted in partial fulfilment for the degree of BSc (Honours) Computing Science

Department of Computing Science and Mathematics

University of Stirling


Abstract

The reuse of software components could lead to the next great leap forward in software development. In current software development, programmers continually waste vast amounts of time re-inventing a common set of components. Attempts to reuse components using current state-of-practice languages and frameworks frequently end in failure. More formal languages can help by bringing the clarity of mathematical rigour. However, such formal languages have yet to be adopted in the real world.

This dissertation starts by establishing the reasons why attempts to reuse software components using current languages and frameworks so frequently end in failure. A new framework called Omnibus is then presented, which attempts to provide the basic facilities that are needed to allow the safe reuse of software components. The Omnibus language has a relatively simple semantics and is amenable to formal analysis. Reuse of software components in the language is based around the concepts of Contracts and Certificates. A Contract formally defines the public interface of a component, while a Certificate provides a guarantee that an implementation of a component is consistent with its Contract. Certified components are guaranteed to behave as described by their associated Contracts and hence can be reused safely.

The dissertation is divided into three parts. The first part consists of five chapters which give the reader an overview of the Omnibus language. The second part consists of two chapters which present techniques for executing and analysing programs written in the Omnibus language. The third part consists of three chapters which present the conclusions of the work.



Acknowledgements

I would like to start by thanking the whole Computing Science department at the University of Stirling. They have shown tremendous confidence in me and I hope this work helps justify that confidence to some extent. In particular, I would like to thank Savi Maharaj, Bob Clark, Ken Turner and Carron Shankland. Without the support these people gave me, this would be a far lesser piece of work. Without my supervisor, Savi, this dissertation would say a lot less using a lot more words.

I have been working towards this dissertation for some time now. However, the person who has been working towards it for the longest time is my father. When I was still a relatively young child in primary school he would take me along to Computing clubs and encourage me to pursue my interest in the area. He has continued to support and encourage me throughout my academic career and remains my biggest influence. The importance of his contribution to my development as a Computer Scientist cannot be overstated. He has helped me get into the position to write a document such as this. Thank you, Dad.

The rest of my family have also been wonderfully supportive in all my work, especially my mother and brother. My mum has been a great friend and an even greater mother.

I would also like to thank my friends for helping keep me a relatively sane and balanced individual whilst I worked on this dissertation. Thank you to Ali, Jennifer, Lloyd, Craig and all my other friends.

I learnt many things in the year that I studied at the University of California, Santa Barbara. Lectures given by Tevfik Bultan, Dick Kemmerer and Oscar Ibarra were worth travelling halfway around the world for on their own.

Finally, I would like to thank the readers of this dissertation. I hope it is as rewarding to read as it was to write.

Thomas Wilson (April 2003)



Summary of Contents

Introduction to the Dissertation

Part I: The Omnibus Language
    Chapter 1: Introducing Omnibus
    Chapter 2: The Basics of Programming in Omnibus
    Chapter 3: Contracts
    Chapter 4: Advanced Aspects of Programming in Omnibus
    Chapter 5: Certificates and Reusable Components

Part II: The Omnibus Tools
    Chapter 6: Translation to Java
    Chapter 7: Symbolic Execution

Part III: Evaluation
    Chapter 8: Related Work
    Chapter 9: Future Work
    Chapter 10: Final Conclusions

Appendices
    Appendix A: Language Syntax
    Appendix B: Omnibus Standard Libraries
    Bibliography
    Home Pages



Contents

Introduction to the Dissertation
    The Problem
    Scope and Objectives
    Achievements and Contributions
    What end-users will gain
    Structure of the dissertation

Part I: The Omnibus Language

Chapter 1: Introducing Omnibus

1.1 State-of-the-art
    1.1.1 Object-Oriented Programming
        Key concepts
            Encapsulation/Abstraction
            Inheritance
            Polymorphism
        Assisting technologies
            Interface documentation
            Testing
        Promises of components
        Reality of the current use of components
    1.1.2 More Formal Methods
        Specification and Verification Techniques – a historical perspective
        Design By Contract
        Extended Static Checking
        Complete Formal Specification and Verification
1.2 Omnibus
    1.2.1 What is Omnibus?
        Omnibus is Simple
        Omnibus is Object-Oriented
        Omnibus is functional
        Omnibus is a Specification and Programming Language
        Omnibus has Contracts
        Omnibus has Certificates
        Omnibus is for developing robust real world software applications
    1.2.2 Comparing Omnibus to other types of languages
        Omnibus versus Java
        Omnibus and Java versus other commercial programming languages
        Omnibus versus Z
        Omnibus Verification versus Testing
        Omnibus versus closely related research projects



Chapter 2: The Basics of Programming in Omnibus

2.1 The basics of expressions, classes and objects
    2.1.1 Primitive expressions and types
        Primitive types and values
            Operators
                Boolean operators
                Integer operators
                Generic operators
            Variables
    2.1.2 Classes
        Attributes
        Constants
        Functions
        Constructors
        Creators
        Operations
        Putting it all together in class
        Methods
    2.1.3 Objects
        Using methods
        Object equality
    2.1.4 Core language classes
        String
        Collection
        Map
        Optional
2.2 Writing code
    2.2.1 Basic statements
        Declaration statement
        Assignment statement
        Operation call statement
        Construct statement
    2.2.2 Branching statements
        If statement
        Select statement
    2.2.3 Repetition statements
        While statement
        Repeat statement
        For statement
        ForEach statement
    2.2.4 Writing a first class
2.3 Structuring applications
    2.3.1 Managing packages
        Package directive
        Uses clauses
    2.3.2 Starting applications
        Traditional approaches to starting applications
        Starting OO applications in a manner becoming of an OOPL
    2.3.3 Writing a first Omnibus application



Chapter 3: Contracts

3.1 Basic principles of contracts
    3.1.1 Contracts in the real world
    3.1.2 Introducing a running example
3.2 Specifying contracts in Omnibus
    3.2.1 Requires and Ensures clauses
    3.2.2 Changes clause
    3.2.3 Producing the correct contract
        Invariants
            Formalising Invariant laws
        Concrete Constraints
        Symbolic constraints and quantifiers



Chapter 4: Advanced Aspects of Programming in Omnibus

4.1 Implementations for Specifications
    4.1.1 Invariant-based repetition statements
        While statement
        Repeat statement
        For statement
        ForEach statement
    4.1.2 Specification as a problem in its own right
    4.1.3 Inductive assertions
    4.1.4 Assert statement
4.2 More on Methods and Attributes
    4.2.1 Attribute parameters
    4.2.2 Complex operations
        Value-returning operations
            Calling value-returning operations
            Using value-returning operations in assertions
        Var/Out parameters
            Calling operations with var/out parameters
            Using operations with var/out parameters in assertions
    4.2.3 Modifiers
        Private
            Private methods
            Private attributes
        Symbolic
4.3 Templates
    4.3.1 The Java approach to genericity
    4.3.2 The Omnibus approach to genericity
        Declaring templates
        Using templates
4.4 Algebraic classes
    4.4.1 Historical background
    4.4.2 Defining algebraic classes in Omnibus
4.5 Inheritance
    4.5.1 Basic principles of behavioural inheritance
    4.5.2 Using behavioural inheritance
    4.5.3 Polymorphism
    4.5.4 The laws of behavioural inheritance
    4.5.5 Formalising the laws of behavioural inheritance
    4.5.6 Implementation Inheritance
    4.5.7 A classification of the different uses of inheritance
        Different forms of inheritance
            Inheritance for specialisation
            Inheritance for specification
            Inheritance for construction
            Inheritance for generalisation
            Inheritance for extension
            Inheritance for limitation
            Inheritance for variance
            Inheritance for combination
        Evaluation of the classification results



Chapter 5: Certificates and Reusable Components

5.1 Trust
    5.1.1 Trust in the real world
    5.1.2 Building trust in the real world
5.2 Components and component-based development
    5.2.1 Components
    5.2.2 Classes as Components
    5.2.3 The need for components
    5.2.4 What components need
5.3 Formal Methods and component-based development
    5.3.1 Why formal methods are rarely used
    5.3.2 A perfect marriage: Formal methods and components
    5.3.3 Mixing formality and informality
5.4 Certificates
    5.4.1 A framework for certificates
    5.4.2 Certificates as a basis for trust in component reuse
    5.4.3 Issues with Certificates
5.5 The Omnibus Certification Framework
    5.5.1 Producing certificates
    5.5.2 Validating Certificates
    5.5.3 Differing levels of Theorem Prover Technology
5.6 A vision for the Omnibus Component Libraries



Part II: The Omnibus Tools

Chapter 6: Translation to Java

6.1 Approaches to the generation of executable code
    6.1.1 Compiling to machine code
    6.1.2 Translation to intermediate form
    6.1.3 Translation to high-level languages
6.2 Translating classes from Omnibus to Java
    6.2.1 Object immutability in a mutable world
    6.2.2 Translating classes
        Package directive and the uses list
        Class header
        Attributes
        Constants
        Protected constructor
        Constructors
        Creators
        Functions
        Operations
        Other things in the generated Java class
6.3 Translating code from Omnibus to Java
    6.3.1 Translating blocks of statements
    6.3.2 Translating individual statements
        Assignment statements
        Declaration statements
        Operation call statements
            Local operation call statements
            Object operation call statements
        If statements
        Select statements
        While loops
        For loops
        Repeat loops
        ForEach loops
        Assert statement
6.4 Translating Templates from Omnibus to Java
    6.4.1 Why the problem is non-trivial
    6.4.2 Generating multiple Java files
6.5 Modelling the core language classes
    6.5.1 Collection
    6.5.2 Map
    6.5.3 Optional
    6.5.4 String



Chapter 7: Symbolic Execution

7.1 Basic concepts of Symbolic Execution
    7.1.1 King’s simple programming language
    7.1.2 Symbolically Executing King’s simple language
    7.1.3 Effigy
    7.1.4 The power of Symbolic Execution
7.2 Symbolically Executing Omnibus methods
    7.2.1 Terminology
        Symbolic Executor
        State
        Symbol
        Symbolic Expressions
        Fresh symbolic value
        Evaluating a variable
        Knowledge Expression/Path Condition
        Verification Condition/Proof Obligation
        Proof
        Soundness
        Completeness
        Initialising
        Re-assigning
        Assuming Additional Knowledge
        Checking Theorems
        Clearing Knowledge
        Symbolic execution rule
        Scope Levels
    7.2.2 Symbolic execution rules for the Omnibus language
        Symbolically Executing Methods
            Constructors
            Operations
            Functions
        Symbolically Executing Statements
            Assignment statement
            Declaration statement
            Assert statement
            While statement
            Local operation call statement
            Object operation call statement
            If statement
        Evaluating Method Calls
            Attribute accessor methods
            Constant accessor methods
            Functions
            Constructors
            Operations
    7.2.3 Applying Symbolic Execution to verify Omnibus implementations
        Notation
            Notation for the state of the symbolic executor
            Notation for Verification Conditions
            Notation for executing statements
            Notation for objects
            Notation for evaluating expressions
        Using Symbolic Execution to verify absolute
7.3 Checking contracts via Symbolic Execution
    7.3.1 Contract check symbolic execution rules
        Constructor
        Operation
        Constraint
    7.3.2 Applying Symbolic Execution to verify the contract of class
        Verifying constraints
            Verifying concreteTestCase1
            Verifying symbolicTestCaseA
            Verifying symbolicTestCaseB
        Verifying Invariants
            Verifying the zero constructor establishes the invariant
            Verifying the withValue constructor establishes the invariant
            Verifying the inc operation maintains the invariant
            Verifying the dec operation maintains the invariant
7.4 Proving Theorems
    7.4.1 Rewrite rules for primitive operators
        Simplifications
        Equivalences
    7.4.2 Sequent Calculus
        Sequents
        Inference rules
        Omnibus sequent calculus inference rules
            Initial sequents
            Conjunction
            Implication
            Negation
            Disjunction
            Truth
            Falsehood
            Universal quantification
            Existential quantification
            Weakening
            Contraction
            Cases
            Cut
            Symbol elimination
    7.4.3 Some examples of proving theorems in Omnibus
        Proving VC#1
        Proving VC#2c
        Proving VC#4
        Proving VC#7



Part III: Evaluation

Chapter 8: Related Work

8.1 Classification
    8.1.1 A classification scheme
    8.1.2 Classifying a range of approaches
    8.1.3 Classification table for Java-related approaches
    8.1.4 Classification diagram
8.2 Comparison with similar approaches
    8.2.1 JML/LOOP
        What is JML?
        What is LOOP?
        Comparing JML/LOOP to Omnibus
    8.2.2 B
        What is B?
        Comparing B to Omnibus
    8.2.3 SPARK
        What is SPARK?
        Comparing SPARK to Omnibus
    8.2.4 PerfectDeveloper
        What is PerfectDeveloper?
        Comparing PerfectDeveloper to Omnibus
8.3 Detailed comparison between PerfectDeveloper and Omnibus
    8.2.1 Introducing Perfect from an Omnibus perspective
        Similarities
            Value semantics
            Polymorphism
            Null
            Global variables
            Not quite Design By Contract
            Expressions
            Function/attribute equivalence
            Object equality
            Invariants
        Syntactic differences
            Operators
            Changing objects
            Separating public and private elements
            Making attributes publicly accessible
            Functions
            Pre- and Post-conditions
            In-out parameters
            Maps
            Modelling critical behaviour
        Differences
            Handling of collections, sets and other parameterised classes
            Constructors
            Handling of complex operations/schemas
    8.2.2 Examples
        ContractKiller
        PosCounter
    8.2.3 Java code generation
        Similarities
            Object equality
            One-to-one translation of assignment statements
        Differences


xix

Attributes..................................................................... 220 Schemas/Operations .................................................... 222 Constructors ................................................................ 226 Var/Out parameters ..................................................... 226 8.2.4 Verification.................................................................. 231 Verifying ContractKiller in PerfectDeveloper........................... 231 Making the properties more complicated .................... 232 Verifying PosCounter in PerfectDeveloper ............................... 234 8.2.5 Theoretical Foundations .............................................. 238 Theory of computation .............................................................. 238 Consequences of theoretical results ........................................... 239 8.2.6 Conclusions ................................................................. 240

Chapter 9: Future Work

9.1 Under development
    9.1.1 A fully Object-Oriented Symbolic Execution scheme
    9.1.2 Formal Semantics of Omnibus
        Informal Semantics
        Formal Semantics
        A Formal Semantics of Omnibus
    9.1.3 Omnibus IDE
    9.1.4 GUIs
    9.1.5 Facilities
    9.1.6 Exceptions
9.2 Proposed Future Extensions
    9.2.1 Termination
    9.2.2 Support for facilities needed by commercial projects
    9.2.3 Networking and Concurrency

Chapter 10: Final Conclusions

10.1 Review of the dissertation
    10.1.1 Review of Chapter 1: Introduction
    10.1.2 Review of Chapter 2: Basic Introduction to Omnibus
    10.1.3 Review of Chapter 3: Contracts
    10.1.4 Review of Chapter 4: More Advanced Aspects of Omnibus
    10.1.5 Review of Chapter 5: Certificates and Reusable Components
    10.1.6 Review of Chapter 6: Translation to Java
    10.1.7 Review of Chapter 7: Symbolic Execution
    10.1.8 Review of Chapter 8: Related Work
    10.1.9 Review of Chapter 9: Future Work
    10.1.10 Summary of the dissertation
10.2 Main Contributions
10.3 Concluding Remarks


Appendices

Appendix A: Language Syntax

A.1 Construct rules
A.2 Statement rules
A.3 Expression rules
A.4 Miscellaneous rules
A.5 Lexical rules

Appendix B: Omnibus Standard Libraries

B.1 Structure of the Libraries
B.2 The omni.app package
        The omni.app.Application class
B.3 The omni.lang package
    B.3.1 Wrapper classes
        The omni.lang.Boolean class
        The omni.lang.Integer class
    B.3.2 Integer sub-ranges
        The omni.lang.Positive class
        The omni.lang.Magnitude class
        The omni.lang.IntegerRange class
    B.3.3 Core language classes
        The omni.lang.Object class
        The omni.lang.String class
    B.3.4 Core template classes
        The omni.lang.Collection class
        The omni.lang.Optional class
        The omni.lang.Map class
        The omni.lang.Array class
B.4 The omni.adt package
    B.4.1 Standard ADTs
        The omni.adt.Stack class
        The omni.adt.Queue class
        The omni.adt.Set class
    B.4.2 Variations of the standard ADTs
        The omni.adt.FiniteQueue class
        The omni.adt.TheoreticalSet class
        The omni.adt.ResizableArray class
        The omni.adt.FiniteStack class
        The omni.adt.InfiniteArray class
B.5 The omni.nodes package
    B.5.1 Linked lists
        The omni.nodes.ListNode class
        The omni.nodes.LinkedList class
B.6 The omni.shape package
        The omni.shape.Shape class
        The omni.shape.Rectangle class
        The omni.shape.Square class
        The omni.shape.Circle class


Bibliography

Algebraic Specification
Alloy
Aslan and ASTRAL
B
Compilers
Components
Design By Contract
    Eiffel
    Java DBC
Extended Static Checking
Formal Semantics
JML
    LOOP
Language Design
Miscellaneous
Object-Oriented Programming
PerfectDeveloper
Proof-Carrying Code
Programming Languages
    C++
    C#
    Java
PVS
Sequent Calculus
Software Engineering
SPARK
SRS
Symbolic Execution
UML
Z


Home Pages

Alloy
Atelier B
The B method
The B-tool
Project Bali
C#
C++
ChAsE
Coq
Daikon invariant detector
Eiffel
ESC/Java
iContract
Jass
Java
JavaCard
jContractor
JML
Joose
jUnit
KeY
LOOP
PerfectDeveloper
Proof-Carrying Code
PVS
Simplify
SPARK
UML
VeriCard
Z


Introduction to the Dissertation

This section will give a brief introduction to the dissertation.

The Problem

The software development industry is in the midst of a crisis and has been for some considerable time now. The vast majority of modern software projects are delivered late and over budget. Frequently, even when they are finally delivered, they are seriously defective. Users have come to expect software to have bugs. The following quote taken from “Software’s Chronic Crisis” sums up the situation as it was in September 1994. Indications show that what was said is as true today as it was then.

Studies have shown that for every six new large-scale software systems that are put into operation, two others are cancelled. The average software development project over-shoots its schedule by half; larger projects generally do worse. And some three quarters of all large systems are “operating failures” that either do not function as intended or are not used at all.

Clearly, there are serious problems with the industry that need to be addressed if software systems are to continue to take on an increasingly important role in modern society.

The same paper describes a possible solution to the problem. It likens software development to the manufacture of goods. Just as the industrial revolution brought major advances in the manufacturing world by introducing interchangeability in place of craftsmanship, so too could a similar revolution have huge benefits for software development. In current software development practice, programs are hand-crafted in a manner reminiscent of that used in the manufacturing world before the industrial revolution. If, instead, software could be constructed from libraries of interchangeable components then we could expect similar rewards. So, constructing software from libraries of reusable components could help address some of the issues of the software crisis.

The advantages of building software programs from libraries of software components have been known for some time. However, in the current software development world, software components are reused in an extremely limited fashion. When software component reuse does occur, it frequently ends in disaster. The Ariane disaster is one of the most prominent examples. In this project, a software reuse error caused the Ariane 5 rocket to crash 40 seconds after take-off. The cost of this failure was estimated at $500 million. This project, while far from typical in the amounts of money involved, is not so far from typical in its general result. Software projects that reuse software components often end the same way: in failure. Such experiences discourage software developers from software reuse. As a result, software components are only reused to a limited extent in the world today.

We may ask why software reuse is so unsuccessful. Is the software development field fundamentally different in such a way that reuse could never be successful? In his paper entitled “The next software breakthrough”, Bertrand Meyer argues that the major issues which have thus far prevented software reuse from becoming a reality are of a technical nature. He argues that the current languages and frameworks for reuse of software components are insufficient to achieve the goal of safe reuse. The failings of the current languages and frameworks for enabling software components to be reused are the problem on which this dissertation focuses.

Scope and Objectives

The dissertation has three main aims:

1. To carry out an investigation into the failings of current languages and reuse frameworks.

2. To propose a language and framework which make the reuse of software components possible without incurring the difficulties from which current approaches suffer.

3. To place the proposed language and framework within the context of related work.

Achievements and Contributions

The dissertation has achieved each of the aims set out in the previous section. The corresponding major contributions and achievements of the dissertation are:

1. A clear presentation of the problems with current mainstream component-reuse practice is given. This includes discussions of the theoretical foundations of interest to any project of this nature.

2. A language called Omnibus is presented. This language is built around the concepts of Contracts and Certificates. Together, the application of these concepts gives the theoretical basis required to make software reuse possible. A framework for the analysis and execution of Omnibus programs is also presented. This gives the practical basis for the application of the language.

3. The Omnibus language, together with twenty-seven other specification, implementation and verification projects, is classified into related groups. Omnibus is then compared to the five most relevant of these. Finally, the one adjudged most relevant is the focus of an in-depth comparison with Omnibus.

What end-users will gain

The dissertation proposes a language and framework which allow software components to be reused without incurring the difficulties which currently dog attempts at software reuse. The application of software reuse would herald a major breakthrough in software development theory and practice. By supporting safe software reuse, an approach such as the one presented here could help save millions in development costs and allow the world to safely depend on software applications such as those used in safety-critical situations.

Structure of the dissertation

This dissertation has ten chapters and two appendices. The chapters are divided into three parts. The first part contains the first five chapters and gives an overview of the Omnibus language. The second part contains two chapters that present techniques for executing and analysing Omnibus code. The final part presents an evaluation of the work. The appendices include reference material on the project.

Part I: The Omnibus Language

Chapter 1: Introducing Omnibus. This chapter gives a high-level introduction to the Omnibus language. It starts with a detailed discussion of the problem to be solved. This includes discussions of the details of a range of different techniques for reusing components. It then presents the principles of the Omnibus language and places it within the context of the approaches discussed earlier in the chapter.

Chapter 2: The Basics of Programming in Omnibus. This chapter introduces the fundamentals of the Omnibus language. It starts by introducing the basics of expressions, classes and objects. These are the basic building blocks of the language. It then goes on to look at writing code in Omnibus. It concludes by looking at how applications are structured in the language. This chapter will give the reader everything they need to write a first, simple Omnibus application.

Chapter 3: Contracts. This chapter presents the system by which contracts are specified in the Omnibus language. It starts by discussing the basic principles underlying contracts and then looks at the augmentations required to allow the language to express contracts. It then focuses on the details of defining contracts for methods and of defining the requirements of a component. Finally, it looks at the issues involved in checking the requirements of a component.


Chapter 4: Advanced Aspects of Programming in Omnibus. This chapter introduces some of the more advanced aspects of the Omnibus language. The features discussed in this chapter are implementations for specifications, complex operations, templates, algebraic classes and inheritance. These facilities allow the programmer to write real-world applications more easily.

Chapter 5: Certificates and Reusable Components. This chapter looks at the Omnibus certification system. It starts by discussing the problem of trust in the programming world and then goes on to describe the proposed Omnibus certification framework. The chapter concludes by presenting a vision for the Omnibus libraries built upon trust through certificates.

Part II: Technical Details

Chapter 6: Translation to Java. This chapter outlines the details of the process whereby Omnibus files are converted to equivalent Java files. The chapter starts by looking at the scheme used to translate Omnibus classes into Java classes. It then presents the details of the translation of Omnibus statements to Java statements. The next section looks at the handling of Omnibus templates, for which there is no directly equivalent Java system to map to. The chapter concludes by looking at how the core classes in the Omnibus language are translated to Java.

Chapter 7: Symbolic Execution. This chapter presents Symbolic Execution, which is the primary technique used in the Omnibus analyser. The chapter begins by outlining the historical background of the approach. It then introduces the basic concepts and notations which the reader will need in order to follow the technical details in the remainder of the chapter. The chapter then presents the details of the Symbolic Execution process. The chapter concludes by discussing the management of proofs of the theorems resulting from Symbolic Execution.

Part III: Conclusions

Chapter 8: Related Work. This chapter attempts to place the work within the context of related work carried out by others. The chapter is divided into three parts. The first part classifies twenty-seven different approaches into different groups. The second part then briefly compares Omnibus to the five most closely related projects. The third part compares Omnibus to the project adjudged most relevant, PerfectDeveloper.

Chapter 9: Future Work. This chapter looks at potential future extensions of the work presented in the dissertation. A number of extensions to Omnibus are currently under development and these are discussed. Finally, a number of future extensions on which no significant work has yet been undertaken are proposed.

Chapter 10: Final Conclusions. This chapter presents the final conclusions of the work. It starts with a review of the conclusions of each chapter. It then discusses the major contributions of the dissertation. The chapter concludes with a collection of succinct concluding remarks.


Appendices

Appendix A: Language Syntax. This appendix presents the syntax for the current version of the Omnibus language in EBNF notation.

Appendix B: Language Reference Manual. This appendix presents the contracts of a number of the classes in the current version of the Omnibus Standard Library.


Part I: The Omnibus Language

Chapter 1: Introducing Omnibus
Chapter 2: The Basics of Programming in Omnibus
Chapter 3: Contracts
Chapter 4: Advanced Aspects of Programming in Omnibus
Chapter 5: Certificates and Reusable Components

Chapter 1

Introducing Omnibus

Overview:

This chapter will introduce Omnibus. It will start by illustrating some of the problems with commercial programming using the current generation of commercial programming languages such as Java. It will then look at the use of formal methods to help solve these problems. The chapter will then introduce Omnibus and compare Omnibus to a number of related languages.

Contents:

1.1 State-of-the-art
    1.1.1 Object-Oriented Programming
    1.1.2 More Formal Methods
1.2 Omnibus
    1.2.1 What is Omnibus?
    1.2.2 Comparing Omnibus to other languages

1.1 State-of-the-art

This section gives an overview of the state of the art. We start by looking at the currently dominant paradigm for programming: Object-Oriented Programming.

1.1.1 Object-Oriented Programming

The main idea behind Object-Oriented Programming is to group together related data and code. More specifically, data and the code which operates on that data are together encapsulated inside a class. A complex system is then constructed from a collection of classes, each of which encapsulates a collection of data variables and associated operations. All access to the data variables within a class should be made through the operations provided by the class. If this is the case, the class implementer can strictly control the manipulation of the data. The data variables within a class are referred to as attributes and the operations that query and manipulate their values are called methods. The interface of a class defines everything that a client of the associated component needs to know in order to use the class.

Instances of classes can be created and manipulated. We refer to these class instances as objects. Special methods called constructors are used to instantiate objects. Each object has associated with it its own copy of the internal data defined in the class. When the operations of the object are called, the operations manipulate this copy of the data.

There are huge advantages to the OO approach. The main one is that it allows the components of a complex system to be reasoned about somewhat independently. In an Object-Oriented application, each component is represented using a separate class. The programmer can think about each component abstractly in terms of the operations offered through the class's interface. They need not consider the low-level implementation details in order to use the component. In fact, the low-level details are purposefully hidden from their view. This information hiding has great benefits for software development. It allows the low-level implementation details to be altered without affecting other components in the system.
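The vocabulary introduced above (class, attribute, method, constructor, object) can be made concrete with a few lines of Java. This is only a minimal sketch: the class name Counter and its members are hypothetical and are not taken from the dissertation or from any library.

```java
// Hypothetical example class; the names are illustrative only.
final class Counter {
    // Attribute: private, so clients can only reach it through the methods below.
    private int value;

    // Constructor: instantiates an object that holds its own copy of the data.
    Counter(int initialValue) {
        if (initialValue < 0) {
            throw new IllegalArgumentException("initial value must not be negative");
        }
        this.value = initialValue;
    }

    // Method that manipulates the attribute.
    void increment() {
        value = value + 1;
    }

    // Method that queries the attribute.
    int getValue() {
        return value;
    }
}

final class CounterDemo {
    public static void main(String[] args) {
        Counter a = new Counter(0);
        Counter b = new Counter(10);
        a.increment();                     // changes only a's copy of the data
        System.out.println(a.getValue()); // prints 1
        System.out.println(b.getValue()); // prints 10
    }
}
```

Because value is private, its representation could later be changed (for example to a long) without any change to code that uses only increment and getValue, which is the information-hiding benefit described above.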
Key concepts

Object-Oriented Programming is built around a number of key concepts. To be considered Object-Oriented, a language should provide support for all of these concepts. They are Encapsulation/Abstraction, Inheritance and Polymorphism. We consider each in more detail here.

Encapsulation/Abstraction

We have already introduced Encapsulation and Abstraction briefly. Data and the associated operations to manipulate that data are together encapsulated behind an interface within a class. We manipulate objects of this class abstractly in terms of its interface without ever referring to the low-level implementation details behind the interface. We do this on a day-to-day basis when, for example, we think of a car in terms of how we interact with it, and not in terms of how it actually works. When we press the accelerator pedal, we do not consider the low-level details of what has happened (e.g. how the engine uses petrol to generate power to drive the wheels within the vehicle’s bodywork) but, instead, focus on the high-level consequences (e.g. the car accelerates).

Inheritance

One of the most distinctive aspects of Object-Oriented Programming is inheritance. Inheritance is essentially a way of defining a particular type of relationship between the classes in a system. This relationship is commonly referred to as the “is a” relationship [1]. Inheritance is a way of specialising or extending the facilities provided by another class. This concept exists in the real world too.

Polymorphism

Polymorphism means “many forms” and is concerned with the substitutability of objects of different types within a class hierarchy. The principle of substitutability forms the basis of polymorphism. It states the following.

If S is a subclass of T, then wherever the system expects an object instance of the class T, an object instance of the class S can be substituted.
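A minimal Java sketch of this principle follows. The classes Shape, Rectangle and Square are hypothetical and written only for illustration. The method totalArea is written purely against the superclass (playing the role of T), yet it accepts instances of any subclass (playing the role of S).

```java
// Hypothetical classes for illustration; they are not part of any real library.
abstract class Shape {
    abstract double area();
}

class Rectangle extends Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    double area() {
        return width * height;
    }
}

// A Square "is a" Rectangle whose sides are equal, so it specialises Rectangle.
class Square extends Rectangle {
    Square(double side) {
        super(side, side);
    }
}

final class SubstitutionDemo {
    // Knows only about Shape; any subclass instance may be substituted.
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Rectangle(2.0, 3.0), new Square(2.0) };
        System.out.println(totalArea(shapes)); // prints 10.0
    }
}
```

Note that the Java compiler checks only that the subclass offers the same method signatures. Nothing in the language checks that an overriding area method still means "the area of this shape", so the consistency of meaning between subclass and superclass is left to the programmer.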

For the principle of substitutability to be satisfied, the subclass and superclass should also be consistent, in as much as applying some operation to the subclass has the same meaning as applying the same operation to the superclass. This is hard to quantify until we meet contracts and start to formally specify what methods should do.

[1] In fact, implementation inheritance can be used even when the is-a relationship doesn’t hold.

Assisting technologies

In this section, we will consider some of the technologies present in modern Object-Oriented languages.

Interface documentation

As was explained earlier, the principle of OOP is based around the concept of encapsulating the components which make up a system within classes and hiding the details of the inner workings of the component. The interface of the class then defines everything that the outside world needs to know to use the class. The obvious question here is: what does the outside world need to know about the component? It certainly needs to know about what methods it can call, but is this enough? It would be nice if the interface could also describe what each method does. Does it change the internal state of the object? What is the meaning of the value that it returns? These would all be useful things to know and it would be impossible to be sure that you were using a component properly without knowing these sorts of things.

The current generation of commercial Object-Oriented Programming Languages (OOPLs), which includes languages like Java, C++ and C#, do not provide the facilities to answer these sorts of questions in their interface descriptions. However, these questions need to be answered in some way to use a component properly. How can we use a component properly without an understanding of what it does in some (albeit abstract) way?

The solution given by these languages is to provide a framework for the definition of special comments within the source text and a pre-processor which can parse source files, pick out these comments and construct textual documentation from them. Java's javadoc tool generates HTML documents from such comments, tools such as Doxygen do the same for C++, and the C# compiler can generate XML. The developer of the component can then informally answer some of the questions posed above via these natural language descriptions. It is important to note that the comments are not really part of the language and are certainly not mandatory. There are also no strict conventions governing how best to use these comments or describing what to say and what not to say. The system is completely ungoverned and left to the individual programmer to manage.

Practical experience with these languages exposes a number of problems with this approach to describing interfaces. Firstly, because the comments are not required, they are frequently omitted entirely. In such cases, the only information provided to explain the component is the method signatures provided by the class. This is insufficient information to use the component properly. Other times, comments are provided but they have been written purely so that the author can claim to have them.
In such cases, the comments are often brief and frequently provide very little additional information to the user of the component. This is a by-product of the informality of the descriptions used: there is no way to formally differentiate between a good comment and a bad comment, and all we can formally say is whether a comment is present or not. The two cases we have considered so far are certainly extremes of bad practice but, nonetheless, they are still a possibility and do occur in practice. A more typical scenario is where comments describing the interface are given and the author makes some genuine attempt to explain what the method does. This will involve stating answers to some of the questions we mentioned earlier. Frequently the information from these comments, together with knowledge from the domain and the common sense of the user of the component, is enough to build up a fairly good idea of what the component does. Unfortunately, however, the text of the descriptions is unavoidably ambiguous due to the ambiguous nature of natural language itself. Thus, the user of the component can never be absolutely certain that they understand the comments in the way that the author intended them to be understood. Besides, even if the user of the component and the author of the component are able to synchronise their conceptions of what the component should do, there are still no guarantees that this is indeed what the component does. The implementor of the component may have made some mistake or mishandled some special case. So, even when the two parties do agree on what the component should do, the user still has to trust the programmer to have implemented the solution properly. Unfortunately, the user has no way of knowing who to trust and who not to trust.


Testing

Once a programmer has produced a component for a system, it is necessary to determine whether the component is correct. In other words, does the component that the programmer has produced function as it should? The programmer will have some idea of what the component should do. They will have attempted to convey this in the documentation of the class’s interface. This model of what the component should do will include an idea of what the component should do when it is in some internal state and some particular method is called with some particular parameter values. The programmer (either through their own knowledge or by interacting with the client) is in a sense an oracle that can say what the right thing to do is in any given circumstance. One way of determining whether the component is correct is to execute the component in some situation and then compare the results produced with those that the oracle expected. This process is called testing and is still the major technique used to verify the correctness of software components.

The problems with this approach as described above are quite apparent. Firstly, there is no guarantee that the oracle is worthy of the name. The decision about what should be done is a potentially very arbitrary process in the system described above, and the model which is used to determine the correct response exists only inside the programmer’s consciousness. The interface documentation will try to give some sort of indication of this model but it can do little more than provide hints to the reader. If we had a formal description of the model being used by the oracle then we could reason about it formally in an attempt to reassure ourselves that it corresponds to our informal conception of the right thing to do. However, we have no such formal model to examine using this approach. The other key problem with the testing process as described earlier is how to select the data to execute the system with.
Ideally, we would like to test the system in all possible situations but for non-trivial systems this is not practical. Thus, we must select some subset of all the possible scenarios within which to test the operation of the component. There are many different techniques for the selection of such test cases but whatever scenarios are selected, they will still not cover all possible scenarios and so we cannot make statements like “the component works correctly in all possible situations” on the basis of these tests alone. This is a shame because we would dearly love to make such a statement.

Promises of components

A huge number of programming projects use many of the same components within them. All programming basically comes down to combining standard data structures and algorithms together with a limited amount of problem-specific code. All over the world, people are writing programs to do the same things. As such, it makes an awful lot of sense to work together. There is no need for people to “re-invent the wheel”. Before starting to write a component, the programmer’s first action should be to check to see if someone else has written such a component. If they have, then they can simply plug this component into their system. If there is no such component, then they can write one and then the next time someone needs such a component, the newly created one can be reused.


Using such a framework, programming could be reduced to the selection and combination of components from some central library with only the occasional need to write a new component. Newton once famously said: “If I have seen farther than others, it is because I was standing on the shoulders of giants.” So too could such a statement be applied to a programmer producing a ground-breaking new software application by combining components developed by brilliant programmers, possibly many, many years previously. Is this what currently happens in the programming world and, if not, why not?

Reality of the current use of components

Sadly, the world described in the previous section is not the same world in which we live. To explain why this is the case, we will outline a real experience which came from development using an OOPL in a commercial setting.

“Standing on the toes of Boris”:

I was working with a commercial software company who were producing a suite of programs targeted specifically at small business users. This was in the time before Windows XP, when windows still looked rather ugly and had square corners. We wanted to give our software a more visually impressive look. We wanted to be able to “skin” all the windows in the application and give them nice borders and perhaps even round corners. Following the steps described above for what should happen, we started by searching on the Internet for a component which did this kind of thing already. We found a few. We downloaded trial versions of each and started testing them. Most didn’t work as we desired and so we discarded them. Finally, we found one that did what we wanted. We built a few test applications with it and it worked brilliantly for as much as the test version let you do. The documentation was reasonably good too and, from experience, we took this as a good sign. We then investigated how to integrate it into the existing applications in the suite. We developed a system for easily converting our existing ugly windows into the new pretty windows and tasked the designer with the job of developing a visually pleasing “skin”. We produced a prototype, which, again, did as much as the trial version allowed. The decision was then taken to purchase the component. The component was written by a programmer named Boris who lived somewhere in Russia, I forget exactly where. After the payment cleared, we were sent the unrestricted version of the component and we proceeded to integrate it into a clone of the suite. It was not long until we hit our first problem. In using the component, we misunderstood what the purpose of one of the methods of the component was. We had developed our own model of what we thought it did (or should do) from the documentation and our common sense, but the model we had come up with did not fit what the component was doing.
We returned to the documentation and found that, given the knowledge of what the component was doing, we read what was being said in the documentation differently than we had previously. We revised our model of what it should do and it now fit together with what the component was doing once more. Sadly, the second
problem we hit could not be resolved in a similarly pleasing manner. We noticed that in certain circumstances, the skin was disappearing and the window was returning to its normal appearance. This certainly could not fit into anyone’s model of what the component should be doing. This was a bug, a mistake in the implementation. We emailed Boris to inform him of the bug and shelved the project. The next week, pretty windows went off the agenda and the project was shelved for good.

The (true) story above illustrates a number of interesting points. In this example, component re-use did not work despite the best efforts of all those involved. It was essentially the two problems that ended the progress of the project so we should start our post-mortem by looking at each of these. The first problem was a consequence of the ambiguity of the component interface documentation. The documentation was certainly not sub-standard by any means. In fact, it was considerably better than that which comes with many components. Despite this, we still managed to incorrectly interpret its meaning. Once we had seen what the component did, we could understand what the documentation had been trying to say, but there was no flaw in the reasoning we had previously used; the statement could simply be read in different ways. This is an inescapable problem with the use of natural language as a method for describing the interface of a component. Such ambiguous statements are always going to be possible in such an untamed language.

The second problem was more serious in that it proved to be the fatal blow. This was not a problem with the model or the interface description; it was an implementation bug. When we started using Boris’ component, we put our trust in his abilities to correctly implement, if not our model, then at least his own model of the solution. It appears that he failed to do this. Does this mean we were wrong to place our trust in him? What other choice did we have? It is certain that using a buggy component is not an option. As a project manager, you are responsible for the product in its entirety including all the code written by your own programmers and all the third party components. If the product gets shipped and has such a bug in it, explaining to the boss that it was Boris’ mistake and not your own does nothing to help. As a result of experiences similar to those described in the earlier story, many programmers become very wary of software re-use.
They are quite justified to feel this way about software re-use in such languages. Most of these people will have been brought down to Earth by some equally infuriating series of experiences except perhaps with far worse consequences. This fear of software re-use is extremely damaging because it encourages the affected programmers to waste vast amounts of time re-inventing components instead of re-using existing ones. The significance of this should not be underestimated since this act accounts for huge proportions of a programmer’s time and millions of pounds, dollars and other denominations. From the preceding discussion, clearly the systems provided by such languages are inadequate for safely supporting component re-use. It should also be clear that software re-use is highly desirable and the achievement of a system that would support it would be nothing short of a breakthrough. This dissertation is an attempt to move towards the development of such a system. Note: For reference, of the two problems encountered in the story described earlier, the first could have been avoided through the use of Contracts and the second could have been avoided through the use of Certificates. These central concepts of Omnibus are


explained briefly later in this chapter and in detail in sections of chapters three and five, respectively.

1.1.2 More Formal Methods

In this section we will look at some of the techniques which have been developed to bring more formality to the software development process. Without introducing more formality to the development process, there is no way that we can provide unambiguous interface descriptions or verify that they are obeyed by the implementation of a component. Both of these things are highly desirable because they would make safe component re-use possible.

Specification and Verification Techniques – a historical perspective

The problem that we have formulated in the chapter up to this point was recognised long ago and many have worked on solutions. One of the pioneers of the area was Sir Tony Hoare. Hoare wrote a paper entitled “An axiomatic basis for computer programming” in which he outlined an approach for formally specifying the interface of a block of code together with a technique for verifying that the provided code was consistent with this specification. The interface for a block of code was formally described using a pair of assertions, an assertion being a special Boolean expression which evaluates to true for some subset of the possible states of a system. For example, consider a system with two integer variables x and y. The state space of the system is then the cross-product of the domains of the two variables. In other words, there is a distinct state for every distinct pair of values (x,y). The assertion “x > 0” holds over some subset of these states. For example, it holds over (3,5), (300,-5) and (50,999) but not over (-5,60), (0,88) or (-78,23). The first of the pair of assertions used to describe the interface of a block of code is called a pre-condition and describes the valid states within which the code can be called. The second of these assertions is called a post-condition and describes the states which the system should be in after execution of the code. Specification via this pair of assertions is extremely useful.
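To make the idea of an assertion concrete, the following is a minimal sketch in Python (a language chosen here purely for illustration; it is not used in the dissertation). It models a state as a pair (x, y) and the assertion “x > 0” as a Boolean function over states. The block of code `double_into_y` and its pre- and post-conditions are hypothetical examples, not taken from Hoare's paper.

```python
# A state of the example system is a pair of integers (x, y).

def x_positive(state):
    """The assertion "x > 0": true for some subset of all states."""
    x, _y = state
    return x > 0

# States over which the assertion holds, and states over which it does not.
holding_states = [(3, 5), (300, -5), (50, 999)]
failing_states = [(-5, 60), (0, 88), (-78, 23)]

# A hypothetical block of code together with its specification.
def pre(state):
    """Pre-condition: the block may only be called when x > 0."""
    return x_positive(state)

def post(state):
    """Post-condition: after the block, y holds twice the value of x."""
    x, y = state
    return y == 2 * x

def double_into_y(state):
    """The block being specified: it overwrites y with 2 * x."""
    x, _y = state
    return (x, 2 * x)

start = (3, 5)
if pre(start):
    print(post(double_into_y(start)))
```

The specification says nothing about how `double_into_y` achieves its result; it only relates the states in which the block may be called to the states in which it must finish.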
The desired action of a block of code can be abstractly described in an unambiguous manner. This formal description provides all the information that the user of the code need know: it explicitly describes what the code assumes about the environment in which it is invoked and what the user, in turn, can assume about the actions of the code. Such a technique could be used to answer all the questions posed in the last section about what an implementation does. In many situations where an informal contract is used and a component fails, it is unclear who is responsible for the failure, i.e. whose fault it is. Should the implementer have handled the circumstance better or was the user making some nonsensical request that had no sensible response? Frequently it is not possible to apportion blame exclusively to either party. One approach to this problem is to use defensive programming where implementers aim to handle all situations as gracefully as they can so that their component does something sensible (which may simply be
report the cause of the problem and return an error) in all circumstances. However, this places a big additional burden on the implementers of the component. Much of the implementer’s time would be wasted writing this code to check for frequently ridiculous circumstances. This adds lines of code to the project (thus complicating it) and lowers the efficiency of the code (since it takes time to check for these ridiculous circumstances which may never arise). This is clearly an undesirable state of affairs. Pre- and post-conditions have the advantage of clearly dividing responsibility between the user and implementer of a component. It is the responsibility of the user of the component to ensure that the system is in a state satisfying the pre-condition of the code before the code is executed and it is the responsibility of the implementer of the component to ensure that (when the code is executed in a state satisfying the pre-condition) the system ends up in a state satisfying the post-condition. The implementer can use the pre-condition to formally define what constitutes acceptable and unacceptable situations to execute the code in. Once they have done this, they no longer need to handle any unacceptable situations in any sensible manner. This saves them development time, lines of code and run-time seconds. It is also advantageous for the user of the component because it provides them with an unambiguous description of their responsibilities and, with an appropriate supporting system such as the ones we will look at shortly, will give them informative feedback as to the cause of the failure (i.e. it will allow them to work out what they did wrong). In his paper, Hoare gives us a way of specifying interfaces for pieces of code unambiguously via pre-conditions and post-conditions. He also presents a system for developing proofs that a particular piece of code is consistent with one of these specifications.
Of course, this is the other piece of the puzzle that we are lacking. Once we have a formal description of what a component should do, the next task is to verify that it does it. The technique used to verify implementations in this project is called Symbolic Execution and is described in great detail in chapter 7. The formal methods we will now go on to look at all use Hoare-style pre- and post-conditions to specify blocks of code but involve different systems for verifying that a component satisfies them.

Design By Contract

The approach of Design By Contract (DBC) was developed by Bertrand Meyer. It is the most practically applicable of the approaches we will consider here for verifying that an implementation is consistent with its specification. The DBC approach involves no sophisticated compile-time analysis system and does not require the programmer to have comprehensive Mathematical training. Yet the approach still manages to allow interfaces to be specified formally using pre- and post-conditions. Thus, it sounds fairly ideal! The principle of DBC is to make the appropriate interface checks at run-time as a program is being executed. For example, before the code of a method is executed, the pre-condition assertion for the method is evaluated and a failure is reported if it yields the value false. Similarly, after the code of a method is executed, the post-condition assertion for the method is evaluated and a failure is reported if it yields the value false.
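The run-time checking just described can be sketched in a few lines. The following Python fragment is an illustrative assumption of how such checks might be wired up; the names `contract`, `ContractViolation` and `integer_sqrt` are hypothetical and do not come from Meyer's work or from any particular DBC library.

```python
class ContractViolation(Exception):
    """Raised when a pre- or post-condition evaluates to false."""

def contract(pre, post):
    """Wrap a function so that its contract is checked on every call."""
    def decorate(method):
        def checked(*args):
            # The caller is responsible for establishing the pre-condition.
            if not pre(*args):
                raise ContractViolation(
                    f"pre-condition of {method.__name__} failed: caller at fault")
            result = method(*args)
            # The implementer is responsible for establishing the post-condition.
            if not post(result, *args):
                raise ContractViolation(
                    f"post-condition of {method.__name__} failed: implementer at fault")
            return result
        return checked
    return decorate

@contract(pre=lambda n: n >= 0,
          post=lambda r, n: r * r <= n < (r + 1) * (r + 1))
def integer_sqrt(n):
    """Largest integer r such that r * r <= n."""
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r

print(integer_sqrt(10))
```

Calling `integer_sqrt(-1)` raises a `ContractViolation` naming the pre-condition, which is the kind of informative feedback about responsibility discussed earlier. Note that the checks only run for the inputs actually supplied, so they say nothing about inputs that are never tried.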


This approach gives all the benefits associated with having a formal description of what a component should do. The use of DBC makes component re-use a lot more feasible a proposition than it is with comment-based interfaces. The approach can be very simply implemented in virtually any programming language. To do this, the pre- and post-conditions are translated to boolean expressions and evaluated at run-time at the start and end of the method, respectively. A simple conditional statement (such as an if statement) can then be used to test the value of this expression and do something like report the error if the value is false. The reader may now ask: can I stop reading here and start re-using components safely by applying this concept? DBC helps but, unfortunately, it is not a perfect solution. The approach has disadvantages and limitations too. As always, it is a case of weighing the advantages against the disadvantages. Of course, it should be clear that DBC requires a certain run-time overhead to operate. The checks that a contract’s assertions are met require time to evaluate. You may be forgiven for thinking that such overheads are negligible but, in fact, they quickly become significant. How long it takes to evaluate an assertion clearly depends on the complexity of the corresponding expression. Complex specifications will take considerable time to be verified in a DBC system and may result in a system that is noticeably slower than an identical program written in an unasserted language such as Java. The next limitation of DBC is closely related to the previous point. The contracts used in DBC differ fundamentally from those that Hoare used in one crucial way. The contracts Hoare used characterised the code they described completely. These contracts defined everything that someone needs to know to use the code.
They define everything that the user must ensure holds before the call and everything that the implementer must ensure holds after the call. This type of a contract can be referred to as a complete contract since it completely defines the encapsulated code. The contracts used in DBC have no such completeness to them. Instead, DBC uses partial contracts which simply describe some of what the user of the component must ensure and some of what the implementer of the component must ensure. One of the reasons for this is that not every assertion can be expressed efficiently as a boolean expression to be checked at run-time. The implementer has some control over the run-time overhead of the evaluation of the associated assertions since they have control over the selection of the assertions themselves. When faced with a complex assertion, they can either include it in the interface description and incur the associated run-time penalties for its evaluation or they can omit it and sacrifice the completeness of the contract for run-time efficiency. As is common, a compromise is usually the best solution. The implementer could simplify the assertion in some way so that it checks some crucial aspect of the assertion but not all of it. Another crucial limitation of DBC is that it provides no way of verifying that a component is consistent with its specification in all cases. An approach such as testing is still required to attempt to discover the scenarios where a component may differ from


its contract. There is no guarantee simply by the association of a contract with a component that the component obeys the contract and there is no way to attain such a guarantee through any means within DBC. Thus, we cannot trust DBC components any more than we could trust components in, say, Java. We can be confident that when DBC components fail they will explain the cause of the failure far more clearly but we cannot be confident that they will never fail. As a result, we can view DBC as a step towards safe software re-use but not the complete solution.

Note: If Boris had used DBC then the first problem we had would probably not have occurred since DBC interfaces are far less ambiguous than natural language interfaces. However, the second problem would not have been avoided. There is no good reason to think that Boris would not still have made the same coding mistake, resulting in the same bug. The only thing DBC would have given us would have been a guarantee that it was not in any way our fault. In this example, the post-condition of one of Boris’ methods would have failed and the system would have told us something like “Boris made a mistake”, albeit in a rather more cryptic form!

Extended Static Checking

Extended Static Checking (ESC) is an interesting next step from DBC. As the name suggests, ESC is an extension of the type checking performed on source code by standard compilers. ESC incorporates Hoare-style pre- and post-conditions much like DBC except it analyses them differently. Whereas DBC converts the assertions to expressions and evaluates them at run-time, ESC analyses them in conjunction with the associated code at compile-time. This process is far more complicated than the relatively simple process involved in DBC but its use has many additional rewards. It may be initially hard to understand how ESC manages to do this compile-time analysis. It is certainly harder to get your head around than DBC was.
The DBC approach seemed relatively obvious and was straightforward to implement. In DBC it was easy to assign meaning to the references to variables in the assertions since each variable had a concrete value at the point where the assertions were being evaluated. This is not the case in ESC. It cannot be. At compile-time, there is no executing system (yet) and so there are no values currently assigned to the variables. ESC gets around this by assigning symbolic values to the variables in the system and then manipulating them according to the statements in the code being considered. Before executing the code, the pre-condition can be assumed to hold over the symbolic values of the variables in the system. After the code is executed, the system must verify that the current symbolic values of the variables are related in the way outlined in the post-condition of the code. If this is the case then the code is consistent with its specification. Probably the biggest advantage of this form of symbolic analysis is that it enables the system to reason about all possible scenarios and not just a particular test scenario. This enables statements to be made about a component in all cases, something which testing alone failed to give us. Clearly, this is a huge leap forward.
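The idea of executing code over symbolic rather than concrete values can be illustrated with a deliberately tiny sketch. The Python fragment below is an assumption made for illustration only: real ESC tools hand the resulting formulae to a theorem prover, whereas this sketch only handles linear integer expressions by normalising them, and it assumes ideal unbounded integers. The class `Lin` and the three-assignment swap being analysed are hypothetical examples.

```python
class Lin:
    """A linear symbolic value: const + sum(coefficient * symbol)."""

    def __init__(self, const=0, terms=None):
        self.const = const
        # Drop zero coefficients so equal expressions have equal representations.
        self.terms = {s: c for s, c in (terms or {}).items() if c != 0}

    @staticmethod
    def symbol(name):
        return Lin(0, {name: 1})

    def __add__(self, other):
        other = other if isinstance(other, Lin) else Lin(other)
        terms = dict(self.terms)
        for s, c in other.terms.items():
            terms[s] = terms.get(s, 0) + c
        return Lin(self.const + other.const, terms)

    def __sub__(self, other):
        other = other if isinstance(other, Lin) else Lin(other)
        negated = Lin(-other.const, {s: -c for s, c in other.terms.items()})
        return self + negated

    def same_as(self, other):
        """True when both expressions denote the same value for every input."""
        other = other if isinstance(other, Lin) else Lin(other)
        return self.const == other.const and self.terms == other.terms

# Initial symbolic values; the pre-condition here is simply "true".
x0, y0 = Lin.symbol("x0"), Lin.symbol("y0")
x, y = x0, y0

# Symbolically execute the code under analysis: a swap without a temporary.
x = x + y
y = x - y
x = x - y

# Post-condition: the final x equals the initial y and vice versa.
post_holds = x.same_as(y0) and y.same_as(x0)
print(post_holds)
```

Because `x0` and `y0` stand for arbitrary integers, a successful check covers every possible pair of starting values at once, which is exactly what a finite set of test cases cannot do.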


The reader may now ask: does specification via ESC allow us to unambiguously describe what a component should do and allow us to provide a guarantee that it does it? Is this the system we are after? Again, there is another side to the story. ESC too has a number of crucial limitations. The symbolic reasoning employed by ESC has a Mathematical basis to it. Algebraic symbols are assigned to variables and then manipulated as the code is executed. At the end of the code, the post-condition needs to be checked. This involves an analysis of the current symbolic values of the variables in the system being analysed. Such an analysis requires Mathematical manipulation of the symbols involved. This involves rearrangements of the expressions, appropriate application of axioms and other operations that make up the art of Mathematics. Many of these operations can be carried out automatically by a computer system, however, in problems of the order required to specify real-world problems, the problem has been shown to be undecidable in general. In other words:

No computer program can automatically perform all the required Mathematical manipulations for us without some form of external assistance.

This is a fundamental problem with such forms of symbolic analysis. How such a system handles this limitation is of crucial importance and, indeed, the main way in which ESC differs from the approaches in the next section is in its handling of this problem. When faced with the limitations of automated symbolic reasoning, ESC takes the strategy of working within these limitations. More specifically, ESC takes the task of proving that a symbolic formula is true and turns it on its head giving a problem of trying to prove that a symbolic formula is false. At first this may seem like an irrelevant action because surely the two formulae are logically equivalent representations of the same thing but, in fact, this action has crucial consequences. For example, consider the expressions (x-y)+z and (x+z)-y. We may think that the expressions are equivalent and that it is irrelevant which way we write the expression. However, when making such a statement, we are thinking about expressions in terms of an ideal computational model such as pure Mathematics. Unfortunately, the computational models in the world of programming practice are far from ideal and have many limitations. For example, in an ideal computational model, the integer variables x, y and z would be unbounded, however, in a practical programming model they are likely to be bounded. If they are bounded then there must be some largest representable value. Now, consider the case where x holds this largest representable value, y holds the value 10 and z holds the value 5. Evaluation of the first form of the expression yields x-5, a perfectly valid and representable value. However, evaluation of the second form of the expression will result in an overflow error and the same answer will not be successfully calculated. In this example, the limitations of the computational model mean that the way we evaluate something is significant. 
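The overflow example above can be reproduced with a short sketch. Python integers are unbounded, so the fragment below simulates a bounded integer type by checking every intermediate result; the 32-bit bounds and the helper names `checked`, `add` and `sub` are assumptions chosen for illustration.

```python
# Assumed bounds of a 32-bit signed integer type.
INT_MAX = 2**31 - 1
INT_MIN = -2**31

def checked(value):
    """Reject any intermediate result that is not representable."""
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowError(f"{value} is not representable")
    return value

def add(a, b):
    return checked(a + b)

def sub(a, b):
    return checked(a - b)

x, y, z = INT_MAX, 10, 5

# First form: (x - y) + z stays within range at every step.
first = add(sub(x, y), z)

# Second form: (x + z) - y overflows when computing x + z.
try:
    second = sub(add(x, z), y)
except OverflowError:
    second = None

print(first == INT_MAX - 5, second)
```

The two forms are equal over ideal integers, yet only the first can be evaluated in the bounded model.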
As in the above example, the limitations of the system we are using make the way we formulate the problem significant. ESC reformulates the problem of proving something
to be correct as attempting to prove that it is false. The system attempts to do this proof in a finitely bounded amount of time since, because of the undecidability of the problem, otherwise it could continue forever without finding a proof or contradiction. If it finds a proof then the original formula was false. If it finds a contradiction then the original formula was true. The interesting case is when it “times out”. In this case, it has not been able to find a proof or contradiction in a finite amount of time. The system assumes that since it could not find a proof that there probably exists no such proof and, thus, the original formula is probably true. This may be correct but, clearly, it may also be incorrect. There may have been a proof but it was just unable to find it. In fact, the undecidability proof of the problem implies that there will exist such a scenario. How acceptable this handling of unresolvable formulae is depends on a number of things. Firstly, the better the theorem prover, the less damaging this course of action would be. In the extreme, if you took a theorem prover which was unable to prove anything automatically, then every contract-component pair would be verified regardless of whether it was, indeed, correct. The theorem prover used in conjunction with the ESC project is widely recognised as one of the most powerful theorem provers currently available but, of course, it still suffers from the fundamental undecidability limitation. Secondly, whether this is acceptable depends on what you are trying to achieve. If you are trying to prove that the code is consistent with its specification in all cases then this approach is inappropriate. However, ESC is not about proving correctness, it is about catching some common errors. The ESC approach used in conjunction with a powerful theorem prover will certainly catch some errors and so would be adequate for such a goal. 
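The three possible outcomes of a time-bounded refutation attempt can be sketched as follows. This Python fragment is only an analogy under stated assumptions: it replaces the theorem prover with a brute-force search for a counterexample, and the function `bounded_refute`, its step budget and the example formulae are all hypothetical.

```python
from itertools import count

def bounded_refute(formula, candidates, budget):
    """Try to show `formula` false by finding a counterexample in `budget` steps."""
    steps = 0
    for value in candidates:
        if steps == budget:
            # Nothing found in time: the formula is merely assumed to be true.
            return "timed out: assumed true"
        steps += 1
        if not formula(value):
            return "refuted"        # counterexample found: formula is false
    return "verified"               # every candidate was examined: formula is true

# A false formula whose counterexample is found quickly.
print(bounded_refute(lambda n: n < 10, count(), 1000))

# A true formula over a finite domain that can be examined exhaustively.
print(bounded_refute(lambda n: n * n >= 0, range(100), 1000))

# A false formula whose counterexample (n = 5000) lies beyond the budget.
print(bounded_refute(lambda n: n < 5000, count(), 1000))
```

The last call shows the unsound case described above: the formula is false, but the search gives up before reaching the counterexample and reports it as probably true.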
The next two limitations are connected to the limitations of the theorem prover in a similar way to that which related the DBC limitations. ESC, like DBC, uses partial contracts. DBC uses partial contracts because they are easier to reason about (in that they use up less execution time). How completely to specify the interface is a programmer-led compromise. Similarly, ESC uses partial contracts because they are easier for its theorem prover to reason about (they result in less unresolvable symbolic formulae). Again, the programmer can decide how completely to specify the contract in a given situation. ESC also provides facilities to directly override its built-in reasoning facilities. This may seem a little surprising. The developer could easily abuse these facilities to tell the ESC reasoning system to assume the truth of some symbolic formulae without justification. Thus, one way of satisfying the post-condition of a method (as far as ESC could tell) would be to assume the truth of the post-condition just before the end of the method. Obviously, this destroys the basis for any real trust you can have in the verification system so we may ask why the designers of ESC have chosen to do such a thing. Again, we must consider the decision in relation to the goals of ESC. As was noted earlier, ESC does not aim to provide a guarantee that a particular component is consistent with its specification in all cases. All it aims to do is catch some common programming errors. Clearly, overriding the built-in reasoning may cause the system to miss some errors that it may otherwise have caught though so there must be some other reason. The designers of the system explain that this facility is included to cover situations where the programmers knows the truth of some assertion but they cannot easily justify this knowledge to the analyser. Without the ability to override the



reasoning facilities, in this case, the only way to get rid of the error from the analyser would be to justify the knowledge to the theorem prover. The designers of ESC explain that if they simply allow programmers to override the reasoning facilities, then programmers can do so at this point and get on with writing code and finding bugs. If “number of bugs found” is taken as the metric for success, this is probably the best decision.

ESC is another step towards a system supporting safe software component re-use but it still falls short of what we are looking for. The main problem is that while ESC, like DBC, gives us an unambiguous description of what a component should do, it also, like DBC, fails to give us a way of guaranteeing that a component corresponds with the specification in all cases. ESC is closer to achieving this than DBC was because it is able to symbolically reason about all possible data values. However, the facilities to override the reasoning system and the approach taken by the theorem prover mean that a program which passes through the analyser is not necessarily correct.

Note: DBC and ESC are effectively different techniques for verifying the same kinds of asserted components. A component could be defined with a partial contract and then analysed using either a DBC or ESC approach, whichever was appropriate.

Complete formal specification and verification

The ESC approach has brought us very close to the system we are after, one which will allow us to safely re-use software components. It allows us to unambiguously describe what a component should do and to verify (albeit in some incomplete way) whether the component complies with this contract in all situations. However, we need to address the limitations of the ESC approach before we will have the system we are seeking. The limitations of the ESC system all stem from the designers’ handling of the undecidability of the theorem proving problem.
In an earlier section, we established that:

No computer program can automatically perform all the required Mathematical manipulations for us without some form of external assistance.

The ESC system overcame this limitation by trying to resolve the problem as well as it could without external assistance while leaning towards the assumption that the component worked. This allows the ESC system to catch many common errors but means it potentially misses some. The problem is further compounded by the facilities to override the analyser’s logical reasoning system and by the fact that ESC contracts are only partial. The ESC handling of this problem is not the only option. The statement from the earlier section states the nature of the limitation but also mentions a crucial point, that the program cannot solve everything “without some form of external assistance”. So, if we provided some form of appropriate external assistance to the analyser then it could, in fact, solve all the problems it needs to be able to. The obvious next question is what



constitutes appropriate external assistance? Clearly, it cannot simply be another computer program. Suppose that we have such a computer program which can assist the analyser in appropriate ways and, hence, allow it to verify any symbolic formula. Now, let us package this program together with the analyser to give us a new application called the “super-analyser”. From the hypotheses, our “super-analyser” is able to verify any symbolic formula. However, we know that this problem is undecidable and so there can exist no program to do this. Thus, we have reached a contradiction. We followed a logical system of reasoning based upon well-founded assumptions, apart from our assumption that we had a program which could help the analyser in appropriate ways. Therefore, the fault must have lain in this assumption and so, via a proof by contradiction, there can exist no computer program which can provide all of the required assistance.

One way to get around the undecidability problem is to involve a human operator in some way. This is the approach taken by many systems which attempt to tackle this problem. Indeed, perhaps the best known theorem prover of all, PVS, uses this approach. This system attempts to prove the theorem that it is given in finite time using a number of techniques. If it fails to do this automatically, it places itself under the control of the human user, who can then direct it towards a solution. Here the user supplies what PVS calls “tactics”. These are suggestions as to how the PVS system may be able to solve the problem. After being given one of these tactics, the PVS system will go away and attempt to solve the problem in the light of this “tip off”. The problem with this approach is that it requires a highly skilled human operator to assist the theorem prover whenever appropriate. It would clearly be far more desirable not to need such a person to be present whenever a formula needs to be proved.
In reality, this is a solution of limited applicability but it does lay the foundations for the next solution, the one which the Omnibus system is built upon. We have undersold the PVS system a bit in the preceding section. The crucial thing that we failed to mention was that the system can be told to record the sequence of tactics that it was given by the user. This sequence effectively constitutes the appropriate external guidance that the PVS system needed to prove the particular theorem it was operating on. Consider now that we wish to re-verify that the theorem is valid. We can pass the system the theorem together with the transcript of the tactics provided by the user in the previous run. When the system analyses the theorem this time, it can refer to the transcript to provide any required guidance. Thus, the theorem can be easily re-verified without the need for a human operator to be involved.

We can now describe a bit more concretely how such a system could be used within the setting of component re-use. In the example above it is not made explicitly clear who is responsible for what. In other words, whose responsibility is it to run the analyser and develop the transcript? If we think about this, it is fairly clear that this should be the responsibility of the implementer of the component. Before distribution of the component, the implementer should analyse the component and provide all the required guidance to the analyser, recording the details of this guidance. The distributor can then distribute a copy of the details of this guidance along with the component. When the prospective user of the component receives this package, they can pass the component



and this guidance information through the analyser and it should allow the user’s analyser to verify that the component is correct without the need for human intervention. If the implementer of the component has done what is described above then the component will pass this test and the user can safely trust it. If, however, the implementer distributes a component with insufficient guidance to do this verification then this will be exposed as soon as the user attempts to verify the component using the analyser. In this case, the analyser will be unable to verify the component using the information provided. When this occurs, it is not the responsibility of the user of the component to guide the theorem prover to a solution, even if one exists. Instead, the user of the component should simply discard the component as untrustworthy. Things would remain like this until the implementer justified to the analyser that the component was correct.

Do we finally have a framework for the safe re-use of software components? Yes. The system described above will allow software components to be re-used safely. As with DBC and ESC, the system allows contracts to be unambiguously defined. However, unlike DBC and ESC, it also provides the basis for a trust system. The user can be confident that a component which is verified by such an analyser is going to do what its contract says it should. It is crucial that such a system does not allow facilities, such as those encountered in ESC, which allow the logical reasoning of the system to be overridden without justification. The presence of such facilities would destroy the basis for any trust we could place in the verification process. It is also crucial that such a system uses total contracts. Partial contracts would not be enough in this sort of system. The reasoning behind this is that partial contracts, by definition, do not provide all the information about a component.
Thus, there are things which are true about a component which are not encoded within its partial contract. Without facilities to override the logical reasoning process of the analyser, there is no way for the client to use this information. The problem is that the user cannot justify the knowledge to the analyser because the justification of what they are saying was part of the partially specified contract that was omitted. For the system we have described to work, the interface of a component needs to provide a complete description of what the component does. In this way, such problems are avoided. Omnibus is built around this approach.
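The division of responsibility described in this section (the implementer finds the guidance once, the client merely replays it) can be sketched with a toy stand-in problem. This is a sketch under stated assumptions, not the Omnibus analyser: the "theorem" is simply "n is composite", the recorded guidance is a claimed factor, and the names `GuidedChecker`, `proveCompositeUnaided`, `proveCompositeWithGuidance` and `accept` are hypothetical.

```java
import java.util.OptionalLong;

// Toy stand-in: finding a factor may be expensive, checking a supplied one is trivial.
final class GuidedChecker {

    // Fully automatic attempt: trial division, abandoned after `budget` candidate divisors.
    static boolean proveCompositeUnaided(long n, long budget) {
        for (long d = 2; d < n && d - 2 < budget; d++) {
            if (n % d == 0) {
                return true;
            }
        }
        return false; // not proved, which is not the same as disproved
    }

    // Guided attempt: the recorded guidance is a claimed factor, checked with one division.
    static boolean proveCompositeWithGuidance(long n, long claimedFactor) {
        return claimedFactor > 1 && claimedFactor < n && n % claimedFactor == 0;
    }

    // Client-side policy: trust the component only if the shipped guidance checks out.
    static boolean accept(long n, OptionalLong guidance) {
        return guidance.isPresent() && proveCompositeWithGuidance(n, guidance.getAsLong());
    }

    public static void main(String[] args) {
        long n = 65_537L * 2_147_483_647L; // product of two primes
        System.out.println(proveCompositeUnaided(n, 1_000));       // the automatic search gives up
        System.out.println(accept(n, OptionalLong.of(65_537L)));   // genuine guidance is accepted
        System.out.println(accept(n, OptionalLong.of(65_539L)));   // bogus guidance is rejected
        System.out.println(accept(n, OptionalLong.empty()));       // missing guidance is rejected
    }
}
```

The point of the design is that the client never has to search and never has to take the guidance on trust: bad or missing guidance simply fails the check and the component is discarded.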

1.2 Omnibus

This section will briefly introduce the Omnibus language, which will be the focus of this dissertation. The principles of the language itself will be presented along with an explanation of the philosophies behind its design. The language will also be placed in context by considering its advantages and disadvantages relative to a number of other approaches.



1.2.1 What is Omnibus?

What exactly is Omnibus and why should you, as a busy person, care? Hopefully, this section will explain, at some high-level, what Omnibus is. By the end of the section you should also be able to see the potential benefits of using such a language. Before giving a more complete description of what Omnibus is, we can go so far as to say that it is something which will allow us to do what was described in the previous section. It will give us everything we need to be able to safely re-use components. In fact, the system described in the previous section was Omnibus although that general description would also have fitted a number of the other most closely related projects.

Firstly, it is important to explain that “Omnibus” is a term which encompasses a range of different things. “Java” has come to do this in a similar way. There is a “Java Programming Language”, a “Java Virtual Machine” and “Java libraries”. What exactly is Java? Is it the language? Is it the language and the libraries? Is it all three? The same questions apply to the “Visual Basic Language” and “Visual Basic IDE”. Omnibus is no different in that there is an “Omnibus Programming Language”, “Omnibus libraries” and an “Omnibus IDE”. The term “Omnibus” will be used within a range of different contexts to mean one of these in particular or all three collectively. The context should allow the reader to differentiate between these cases. The Omnibus Programming Language can be fairly well summed up by the following sentence.

Omnibus is a simple object-oriented and functional specification and programming language, with contracts and certificates, for developing robust real-world software applications.

We can break down this statement into the following points. Omnibus:

1. Is Simple
2. Is Object-Oriented
3. Is Functional
4. Is a Specification and Programming Language
5. Has Contracts
6. Has Certificates
7. Is for developing robust real-world software applications

The meanings of many of these terms may not be clear so we will now go on to explain what we mean by each of the sub-statements.

Omnibus is simple

This is probably the easiest sub-statement to misunderstand. The statement refers to the simplicity of the basic principles underlying the language. It could be best interpreted as a Mathematical simplicity. In a sense, Omnibus is “closer to Maths” than many modern



programming languages. This does not mean that only Mathematicians can use the language. Rather, it means that the language can be described more easily in Mathematics than these other languages.

Someone may argue that the Java language is simple and that they can work out what a piece of Java does without too much effort. This may even be true. They may also argue that Java is based upon a collection of simple principles. However, there are many complexities in the Java language which are far from simple. These hidden complexities can lead to obscure problems and, thus, are highly undesirable. One such complication is Java’s object reference semantics. In Java, objects are accessed by reference and each reference is either null or the address of some object of the associated class. Fledgling Java programmers are frequently introduced to Java objects without being told this. These programmers may naturally view objects as being defined by the values of their attributes. Thus, the first time they compare objects using the equality operator (“==”), they get confused because this operator does not work as they expect. This is because the operator is comparing the references and not the values held within the objects themselves. Other problems occur when they first encounter aliasing via these object references and when they are introduced to null values. These low-level details complicate the process of reasoning about programs written in Java. This would be acceptable if such complications were an essential requirement of the practical usage of the language. However, more often than not, you do not want reference equality, aliasing or null values. Thus, these features complicate the language without giving greater expressiveness. By comparison, Omnibus programmers encounter no such reasoning problems. As we shall see later, equality of objects involves checking whether the values of the attributes of the objects are equal, and is independent of reference addresses.
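The three Java complications just mentioned (reference equality, aliasing and null) can be seen in a few lines of plain Java. The `Account` class and the helper method names below are hypothetical, introduced only for this illustration.

```java
import java.util.Objects;

final class ReferenceSemanticsDemo {

    // A small mutable class with value-based equals, as a Java programmer would write it.
    static final class Account {
        int balance;

        Account(int balance) {
            this.balance = balance;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Account && ((Account) other).balance == balance;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(balance);
        }
    }

    // "==" compares the references, not the attribute values.
    static boolean sameReference(Account a, Account b) {
        return a == b;
    }

    // Value comparison has to be requested explicitly through equals.
    static boolean sameValue(Account a, Account b) {
        return Objects.equals(a, b);
    }

    // Aliasing: a change made through one name is visible through the other.
    static int balanceAfterAliasedUpdate() {
        Account original = new Account(100);
        Account alias = original; // no copy is made, both names refer to one object
        alias.balance = 0;
        return original.balance;
    }

    // Null: dereferencing a null reference fails at run time.
    static boolean dereferenceFails(Account maybeNull) {
        try {
            int ignored = maybeNull.balance;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Account a = new Account(100);
        Account b = new Account(100);
        System.out.println(sameReference(a, b));         // false: two distinct objects
        System.out.println(sameValue(a, b));             // true: same attribute values
        System.out.println(balanceAfterAliasedUpdate()); // 0: changed through the alias
        System.out.println(dereferenceFails(null));      // true
    }
}
```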
Aliasing is simply not possible in the language and different techniques are used to achieve things which are usually implemented using aliasing in Java. Also, object values cannot, by default, contain null values. Frequently, null values have no real meaning and allowing them forces the programmer to constantly check that a reference is not equal to null before dereferencing it safely. Omnibus objects cannot be null unless explicitly declared as optional via the “?” suffix or the “Optional” template (for which the “?” operator is a shorthand). Thus, handling of Omnibus objects is simpler than handling of Java objects. In fact, Omnibus objects may be accessed by reference but this fact is not exposed to the programmer and they can safely treat objects as values without encountering any flaw in their reasoning.

Simplicity has many advantages. Mathematical simplicity makes it easier to Mathematically model the language (and, hence, programs written in the language). This makes the resulting theorem proving problems easier to resolve. In a language such as Java, the Mathematical complexity of the language itself means that modelling even the simplest program is difficult. The problem is that apparently simple Java statements are mapped to complex Mathematical models in order to represent the complicated semantics completely.

Simplicity in a more informal sense also has many advantages. The elimination of aliasing is a huge simplification because every change to a variable must be made explicitly. Thus, just by looking at the program text, the effect of a piece of code can be



determined far more easily. This makes the act of writing and debugging applications far easier to manage intellectually. Even without the use of the analysis tools which are available with the Omnibus language, the simplicity of the language has huge advantages.

Omnibus is Object-Oriented

The central modularisation mechanism within the Omnibus language is that of the class. Omnibus also fully supports the OO concepts of Abstraction, Inheritance and Polymorphism. This is a very important attraction of the language. As we have already discussed, OOP has huge advantages. There are also a huge number of OO programmers who have grown up with the concepts of OOP. To make the language most attractive to these people, we allow them to re-use their accumulated skills in working with OO systems within our new framework. One of the hopes of the Omnibus language is that an existing OO programmer, familiar with a language such as Java, can become an efficient Omnibus programmer relatively quickly. To aid this process, the language also adopts a Java-flavoured syntax and borrows much from its inheritance framework.

In fact, Omnibus is in many ways more of an OOPL than many languages which claim to be so. By providing a formal meaning for Inheritance and Polymorphism, Omnibus gives a firm foundation for OOP which is not present in many languages such as Java and C#. The abstraction facilities of Omnibus allow a class to be completely defined by its contract. This is essential for OOP to work properly since without this, the principles of Encapsulation and Information Hiding are in direct opposition. Unlike many languages, Omnibus allows these two principles to co-exist harmoniously. Using the formal rules for OOP, the Omnibus analyser can, in fact, verify that the basic principles of OOP are respected by the classes of a system. Thus, a class which passes the Omnibus analyser is guaranteed to be Object-Oriented (i.e.
is guaranteed to obey the laws of OOP) in a way that other languages cannot guarantee.

Omnibus is functional

The meaning of functional here comes from the term functional programming language. The Omnibus Programming Language is Object-Oriented in as much as it supports the basic principles of OOP as outlined in the previous section. However, the language is fundamentally different from most Object-Oriented Programming Languages. Most OOPLs are built around the concept of modelling objects as mutable entities with some associated identity. Examples of languages which do this are Java, C++ and C# which, between them, account for the primary programming languages of most programmers. Omnibus is built on a fundamentally different principle. Objects are immutable and have no sense of identity independent of their attribute values. This philosophy is far closer to that used by the family of functional programming languages (FPLs).
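The idea of immutable objects identified only by their attribute values can be approximated in Java by hand. The sketch below is Java, not Omnibus, and the `Point` class is a hypothetical example: every "update" returns a new value and equality is defined purely by attribute values. Unlike in Omnibus as described above, Java still lets a programmer observe object identity through “==”.

```java
// A hand-written immutable value class: no setters, value-based equality.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() {
        return x;
    }

    int y() {
        return y;
    }

    // "Updating" produces a new value; the original is left untouched.
    Point withX(int newX) {
        return new Point(newX, y);
    }

    Point translate(int dx, int dy) {
        return new Point(x + dx, y + dy);
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Point)) {
            return false;
        }
        Point p = (Point) other;
        return p.x == x && p.y == y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }

    @Override
    public String toString() {
        return "Point(" + x + ", " + y + ")";
    }

    public static void main(String[] args) {
        Point origin = new Point(0, 0);
        Point moved = origin.translate(3, 4);
        System.out.println(origin);                         // Point(0, 0)
        System.out.println(moved);                          // Point(3, 4)
        System.out.println(moved.equals(new Point(3, 4)));  // true
    }
}
```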



Functional Programming Languages, as the name suggests, are built around the central concept of the function. Execution of functional programs consists of the evaluation of expressions as opposed to the manipulation of variables via assignments. This is a more abstract approach, independent of memory addresses and other such considerations. The main complexities of references, side-effects and aliasing are completely avoided in these languages just as they are in Omnibus. Omnibus is designed to be attractive to the masses of OO programmers in the way it uses OOP concepts and Java-style syntax. As such, Omnibus may superficially appear to be closer to existing OOPLs than to functional programming languages such as LISP. However, “underneath the hood” the basic principles of Omnibus, in fact, are far closer to those of a FPL like LISP. This is why Omnibus is best described as an Object-Oriented Functional Language. We felt it was important to support the principles of OOP in Omnibus but felt that FP was a better logical foundation for programming than that used in languages such as Java. The adoption of Java-style syntax and organisation was as much a fashion issue as it was anything else. In the still very immature field of Computing, fashion and marketing still play major roles and if we want our language to be used then we must follow the latest fashions to some extent. Thus, we have decided to play the game in a way that many academics choose not to. Omnibus is a Specification and Programming Language Omnibus is both a Specification language and a Programming Language. This is very important because many languages are either one or the other. Languages like Java, C++ and C# are Programming Languages and provide no facilities for specifying formal contracts. Languages like Z, Alloy and Aslan are Specification Languages and provide no facilities for defining concrete implementations. 
The lack of overlap between these two sets of languages is frequently a limiting factor in the application of formal specification, implementation and verification. A language like Z can be used to both formally define the problem and to design a solution. The produced model can then be analysed in order to allow the programmer to better understand the problem. However, if an executable system is required as the end product, such a model is not a sufficient deliverable in itself. To produce an executable system, the developer will then have to switch to a programming language such as Java. They may be able to use the information about the problem to design a solution which is structured in a better way and they may be able to generate test scenarios from the model but they cannot directly relate the Z specification to their Java application. Specifically, they cannot formally verify that the Java application is consistent with the Z specification. Using Omnibus, specifications and implementations can be written using the same language and the implementations can then be formally shown to be consistent with their specifications.



Omnibus has Contracts Contracts are one of the central concepts of the Omnibus language. A contract in Omnibus, much like its real-world counterpart, is responsible for describing the responsibilities of the different parties in an interaction. The contract is used to describe the requirements and behaviour of a component. Contracts form the basis for abstraction in Omnibus. A contract for a class consists of the contracts of the publicly accessible elements within the class. For example, methods within the class are defined in terms of pre- and post-condition assertions in the manner discussed earlier. Contracts in Omnibus are defined using the Omnibus Specification Language. These contracts can then be analysed by the Omnibus analyser. The contracts can be checked for internal inconsistencies and implementations can be checked against these contracts. Contracts give an unambiguous specification of the semantics of a component. They do not suffer from problems relating to the ambiguity of natural language. Omnibus has Certificates Certificates are another one of the main concepts behind Omnibus. Certificates are not a part of the language; they are a framework for ensuring components are trustable. A certified component is guaranteed to be consistent with its contract. Certificates go hand-in-hand with contracts since such a formal certification framework is not possible without formal contract specifications. The principle behind certificates is that part of the job of the implementer of a component is to provide some sort of justification that their component satisfies its contract. This is somewhat different to how things currently operate and requires a slightly different mindset. However, it makes a lot of sense. To gain an Omnibus certificate, the implementer of a component must justify that their implementation is consistent with their contract. This may involve providing guided proofs for selected lemmas which were unresolvable by the analyser. 
Certificates can then be verified using the client’s analyser. A false certificate can easily be detected because passing the component through the client’s analyser will yield some unresolvable lemma. As soon as such a lemma is found, the certificate is invalidated and the component cannot be trusted. A genuine certificate, with its associated lemma proofs, will pass through the client’s analyser without any unresolvable lemmas and hence the certificate will be validated. The contract and certification systems of Omnibus together provide an unambiguous description of what a component should do and a trustworthy guarantee that it does it. Between them, they provide all the facilities needed to safely re-use components.
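The pre- and post-condition style of contract mentioned in this section can be sketched in plain Java. This is not Omnibus Specification Language syntax, which is not shown here; the class `IntSqrtContract` and its methods are hypothetical. The checks below run at execution time in the DBC manner, so they only cover the inputs actually executed, whereas the certification framework described above aims to establish the post-condition for all inputs ahead of time.

```java
// A contract written as two predicates, checked at run time around the implementation.
final class IntSqrtContract {

    // Pre-condition: the argument is non-negative.
    static boolean pre(long n) {
        return n >= 0;
    }

    // Post-condition: r is the integer square root of n, i.e. r*r <= n < (r+1)*(r+1).
    // This is a total description of the result: exactly one r satisfies it.
    static boolean post(long n, long r) {
        return r >= 0 && r * r <= n && n < (r + 1) * (r + 1);
    }

    // A simple implementation, intended for small n (it counts upwards from zero).
    static long isqrt(long n) {
        if (!pre(n)) {
            throw new IllegalArgumentException("pre-condition violated: n >= 0");
        }
        long r = 0;
        while ((r + 1) * (r + 1) <= n) {
            r++;
        }
        if (!post(n, r)) {
            throw new IllegalStateException("post-condition violated");
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(isqrt(17)); // 4
        System.out.println(isqrt(16)); // 4
    }
}
```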



Omnibus is for developing robust real-world software applications

Omnibus is not a toy language for investigating purely theoretical ideas; it is a language designed for developing robust real-world software applications. The Omnibus-to-Java translator is one of the most important components of the Omnibus system. It takes Omnibus programs and converts them into code that can be executed via the Java Virtual Machine. Every facility which is included in the language is practically implementable in some efficient manner. There are theoretical aspects to the language and, indeed, to the dissertation as a whole but the ultimate goal of the work is to improve the development of real-world software applications.

Many languages have achieved some level of academic respectability without being adopted by real world programmers at all. Naturally, we would prefer Omnibus to be used in practice in the real world. If real world acceptance could be achieved through some reasonable compromise of some of the theoretical aspects of the language, then we would actively pursue such a compromise. This philosophy is inherent in the current version of the language. For example, Omnibus does not model primitive types as classes; it represents them outside the type hierarchy and accesses them by value, not by reference. In a pure OOPL, everything is a type and there are no such things as primitive types. However, in order to make the translation to Java most efficient, primitive types were included in Omnibus and are handled similarly to the way they are handled in Java.

Note: Primitive types can be automatically converted to objects by the compiler using the boxing approach popularised by the Microsoft .NET platform. This could potentially be used to translate a pure OO version of Omnibus into efficient Java code. However, this would need further investigation.
In order to be usable in the development of real-world software applications, Omnibus needs to provide the tools that real-world developers require. Real-world applications incorporate GUIs, databases, XML and many more distinct technologies. Omnibus needs to be able to allow developers to use the language to work with these technologies. Many theoretical languages fail to provide such facilities. Omnibus will, ultimately, attempt to provide efficient support for these technologies. Extending the system to allow these things to be formally reasoned about is an interesting challenge. It is certainly not straightforward. It will require the Omnibus design philosophy to be mapped onto these technologies.

1.2.3 Comparing Omnibus to other types of languages

In this section, we will consider how Omnibus compares to conventional approaches.

Omnibus versus Java

Java is one of the most widely used programming languages in the world today. It is extremely popular as a language for everything from teaching to implementing real world enterprise applications. So how does Omnibus compare to this popular language?



There are a number of key advantages that the Omnibus language has over Java. Key advantages of Omnibus over Java:

1. Omnibus programs are easier to reason about
2. Omnibus associates clear contracts with components
3. Omnibus has a verification and certification framework
4. Omnibus supports generic types properly

We will now briefly discuss these advantages.

Omnibus is a far easier language to reason about than Java. For example, Omnibus protects the user from the need to consider low-level details such as reference addresses. Also, Omnibus does not support aliasing and does not allow objects to take on null values unless the type is explicitly declared as optional. These and other things make the Omnibus language simpler to reason about and hence it should be easier to write and maintain Omnibus programs. Even without the use of any of the other facilities of the language, this simplicity is, in itself, a huge advantage.

We have briefly discussed the role of contracts within the Omnibus language. These allow the functionality of a component to be defined in an unambiguous fashion. Java provides method signatures and the JavaDoc commenting system to describe the contract of a component. However, such informal descriptions are frequently ambiguous and incomplete.

Following simplicity and contracts is the last of the big three advantages of Omnibus over Java: Omnibus’ verification and certification framework. The Omnibus analyser allows Omnibus programs to be formally reasoned about and their correctness verified. This is of great use for the detection of bugs. Java provides no such facilities above its basic type checking abilities. Instead, bugs are detected by testing the application with concrete values and informally verifying that the results are appropriate. This approach can never be practically applied to give the guarantees that the Omnibus verifier can.

Another big advantage of Omnibus over Java is that it supports generic types in a far better way than Java. In Java, generic types are supported by exploiting polymorphism. Collections, and other generic classes, are defined to hold instances of the Object class, the root class of the class hierarchy. Objects of any class can then be added into the collection and are implicitly cast to Objects.
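The Object-based collection approach just described can be sketched as follows. The collection is written as `List<Object>` only so that the example compiles without warnings on a current Java compiler; it models a collection declared to hold instances of Object. The class and method names are hypothetical. The explicit cast on retrieval is the step that can fail at run time.

```java
import java.util.ArrayList;
import java.util.List;

final class ObjectCollectionDemo {

    // A collection that is meant to hold names, but is only declared to hold Objects.
    static List<Object> mixedBag() {
        List<Object> items = new ArrayList<>();
        items.add("Ada");
        items.add(Integer.valueOf(42)); // accepted: nothing records "strings only"
        return items;
    }

    // The element type has to be recovered with an explicit cast on the way out.
    static String nameAt(List<Object> items, int index) {
        return (String) items.get(index);
    }

    public static void main(String[] args) {
        List<Object> items = mixedBag();
        System.out.println(nameAt(items, 0)); // Ada
        try {
            nameAt(items, 1);
        } catch (ClassCastException e) {
            System.out.println("cast failed: element 1 is not a String");
        }
    }
}
```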
The problem is that type information is lost when this cast occurs. The type information needs to be recovered when the object is retrieved from the collection and the only way to carry out such a recovery is to use an explicit cast. This cast is highly dangerous and will cause a failure unless the type is as expected. Almost always, we use homogeneous collections. We usually work with collections of strings or collections of integers or collections of some other specific type of object. However, in Java, we can never specify this homogeneous property and never ensure it is respected.

Omnibus uses template classes to provide support for generic types. Template classes are classes that are parameterised by some sequence of types. The template class is then described in terms of these type parameters, with the parameters being replaced with actual type values at compile-time. We can use such a system to define a Collection class which is parameterised by an Element type. We can then use instances of this template class by supplying a value for the Element



parameter. For example, we can define a collection of Strings using the notation Collection[String]. The corresponding add method of this template class takes a parameter of the type of the type parameter and the get method returns a value of this type. Thus, the type information is not lost as it is in the Java approach. Omnibus and Java versus other commercial programming languages There are many ways in which Omnibus and Java are different but there are also a number of ways in which they are similar. Omnibus code is translated to Java and then compiled and so it shares many of the advantages that helped bring Java to prominence. Key advantages of Omnibus and Java over other commercial programming languages:

1. Garbage collection
2. Access to the extensive Java libraries
3. Portability

Java and Omnibus both use garbage collection systems to de-allocate dynamically allocated memory storage. C# and many other newer languages also have garbage collection systems but most older languages such as C and C++ do not. Explicit de-allocation of memory is a tricky business and it can cause subtle bugs to creep into an application. Letting the run-time system automatically handle this lifts a great burden from the programmer and eliminates one of the most common sources of bugs in languages without this facility.

Java comes with an extensive array of standard libraries to allow the programmer to interact with a wide range of other technologies. Such facilities are essential to make a language usable for the development of real world applications. Omnibus piggybacks on the Java libraries. The Java libraries are not directly compatible with the Omnibus language; however, they can be used indirectly via appropriate Omnibus wrapper classes. All commercial programming languages need some form of library facilities to provide such support for additional technologies but the Java libraries are considerably more comprehensive than many.

One of the initially most important advantages of Java was its portability. The compiled form of Java code is not platform dependent and can be executed on any platform for which a Java run-time system is available. Since Omnibus code is translated to Java, it can also be executed on any platform for which a Java run-time system is available.

Omnibus versus Z

The Omnibus language contains a specification sub-language which can be used to give a formal specification for a system. Z is a formal language that is able to achieve this same goal. Key advantages of specification using Omnibus over specification using Z:

1. Language support for commercially required facilities
2. Designed to be understandable to Java programmers
3. Integrated specification and implementation languages



Omnibus is fully compatible with the facilities required for the production of commercial software applications. Z has no such facilities and provides little over and above general Mathematics.

Z specifications can be very inaccessible to people without Mathematical training. The notation uses Mathematical symbols and, as a result, Z specifications frequently appear intimidating to programmers who are unfamiliar with them. The Omnibus notation is designed to be familiar to those with programming training but not necessarily extensive Mathematical training. As such, Omnibus specifications appear less intimidating to most people.

Probably the most crucial advantage of Omnibus over Z is that Omnibus is more than just a language for describing problems. It is a language for describing problems and solutions. A programmer can specify, implement and verify their application all within the Omnibus language. This is not possible with Z because Z cannot be used to describe implementations. As such, a problem can be described using Z but an implementation language, such as Java, must still be used to provide an implementation. With this combination, the problem and the solution are described in different languages and it is hard to relate the two. As a result, it is not possible to give the guarantees of component correctness that Omnibus can give.

Omnibus Verification versus Testing

The Omnibus verification process is used to ensure the correctness of Omnibus programs. In many other languages, such as Java, correctness is checked using testing. Key advantages of Omnibus verification over Testing:

1. Omnibus verification handles all cases
2. Requirements formally encoded as part of the language
3. Dedicated tool support for verification

Testing involves executing a program in a handful of test scenarios and then checking that the behaviour of the program was acceptable in each case. A separate test should be performed for each of a collection of appropriate test inputs. For non-trivial programs, it is not feasible to test the behaviour of the program for all possible inputs and so a subset of the possible inputs is selected. However, the behaviour of the program over this subset is not enough to conclude that the program behaves correctly in all cases and hence is correct. To be correct, a program should behave appropriately for every possible input. Non-exhaustive testing cannot achieve this. On the other hand, Omnibus verification can. The Omnibus verification process is built around the symbolic execution approach, which is used to reason about all possible inputs to a program. Thus, Omnibus verification can be used to show that the program behaves correctly for all inputs.

Another advantage of the Omnibus approach to verifying the correctness of programs is that the requirements which form a definition of what it means for a program to be correct are expressed within the Omnibus language itself. This is beneficial because it gives a formal framework for the expression of the requirements. There are no such facilities within the Java language. Instead, in Java, programmers can use third-party frameworks like JUnit to achieve some formal expression of requirements.

The final listed advantage is closely related to the preceding one. As well as providing support for the expression of requirements within the Omnibus language, the Omnibus framework also provides tools to assist in the verification of these requirements. Languages such as Java provide no direct support for the execution of test scenarios. Again, third-party frameworks like JUnit can be used to achieve this.

Omnibus versus closely related research projects

The reader is referred to Chapter 9: Related Work for comparisons between Omnibus and some existing projects with similar aims.


Chapter 2

The Basics of Programming in Omnibus

Overview:

This chapter will introduce the basic principles of the Omnibus language. It will start by looking at the basics of expressions, classes and objects in the language. These are the basic concepts which underpin the whole language. It will then go on to look at writing code using the language and will take the reader through the process of writing their first Omnibus class. The chapter will conclude by looking at how applications are structured in Omnibus. This final section will look at the use of packages to group related classes together, at how Application classes are used to start Omnibus applications and will present a first simple Omnibus application.

Contents:

2.1 The basics of expressions, classes and objects
    2.1.1 Primitive expressions and types
    2.1.2 Classes
    2.1.3 Objects
    2.1.4 Core language classes

2.2 Writing code
    2.2.1 Basic statements
    2.2.2 Branching statements
    2.2.3 Repetition statements
    2.2.4 Writing a first class

2.3 Structuring applications
    2.3.1 Managing packages
    2.3.2 Starting applications
    2.3.3 Writing a first Omnibus application



In this chapter we will present everything that is needed in order to write simple applications in Omnibus.

2.1 The basics of expressions, classes and objects

This section will introduce the basic principles of expressions, classes and objects as they are used in the Omnibus language.

2.1.1 Primitive expressions and types

We start by looking at the first of Omnibus’ two type systems, the primitive types. We will look at the types that make up this type system and the values and operators used to build up expressions of these types.

Primitive types and values

The Omnibus type system is very similar to that employed by the Java/C family. There are two separate type systems, the first of which is the primitive types. These types are built into the system and the user cannot define their own primitive types. Values of these types are accessed by value and not by reference. The primitive types supported by the Omnibus language are integer and boolean.

Integers are used to store numeric values ranging between -2,147,483,648 and 2,147,483,647. This type is equivalent to Java’s int primitive type. Booleans are used to store one of two possible values: true or false. This type is equivalent to Java’s boolean primitive type.

Integer values are defined by sequences of numeric characters. For example, the value twenty-two is simply written in Omnibus as “22”. Negative integers cannot be expressed directly as literals but can be expressed indirectly by the use of the unary negation operator. The tight binding of this operator means that expressions such as “-22” are correctly interpreted as negative twenty-two by the system. Boolean expressions can have one of two values: true or false. The language defines the constants true and false, which can be used as they are in Java.

Operators

Just as in Java and virtually every other programming language, Omnibus has a collection of built-in operators for building expressions. There are operators for each of the primitive types; we describe them briefly here.


Boolean operators

The standard operators of Boolean logic are provided by the language.

Operator          Symbol   Example
Negation          !        !x
Or                |        x | y
And               &        x & y
Implication       =>       x => y
If and only if    <=>      x <=> y

Integer operators

The standard operators for manipulating integer values are also supported. Each of the operators shown below takes integer arguments and returns an integer.

Operator          Symbol   Example
Minus             -        -x
Addition          +        x + y
Subtraction       -        x - y
Multiplication    *        x * y
Division          /        x / y
Modulus           %        x % y

Note: the division system used in Omnibus expressions is integer division. Hence the Mathematical description of division as z = (z/y)*y does not hold. Instead, we have the description z = (z/y)*y + z%y.

The following are the standard relational operators supported by Omnibus. They are defined only over integers.

Operator                    Symbol   Example
Less than                   <        x < y
Less than or equal to       <=       x <= y
Greater than                >        x > y
Greater than or equal to    >=       x >= y

Generic operators

Omnibus also defines a number of operators which can operate over any of the primitive types. The first group of such operators that we will consider are the equality operators. These compare the values of two primitive expressions of the same primitive type. For example, they can be used to compare two integers, two characters or two booleans but not, say, an integer and a character.



Operator      Symbol   Example
Equality      =        x = y
Inequality    !=       x != y

Other operators supported by Omnibus are the if expression (corresponding to the conditional operator in Java) and the let expression.

Operator   Syntax                                      Example
If         “if” boolean “then” any “else” any “fi”     if y != 0 then x/y else 0 fi
Let        “let” declarations “in” “(“ any “)”         let x := 3, y := 2 in (x+y)
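For readers coming from Java, the operators above can be approximated as in the following sketch. It is illustrative only and is not produced by the Omnibus translator. The class and method names are hypothetical. It also assumes that Omnibus division and modulus round in the same way as Java's int operators, which the text above does not state explicitly for negative operands.

```java
// Sketch only: Java analogues of the Omnibus operators described above.
// All names here are hypothetical and chosen for illustration.
class OperatorSketch {

    // Omnibus "x => y": Java has no implication operator, so use !x | y.
    static boolean implies(boolean x, boolean y) {
        return !x | y;
    }

    // Omnibus "x <=> y": on booleans this is simply equality.
    static boolean iff(boolean x, boolean y) {
        return x == y;
    }

    // Omnibus "if y != 0 then x/y else 0 fi" as a Java conditional expression.
    static int safeDivide(int x, int y) {
        return (y != 0) ? x / y : 0;
    }

    // Omnibus "let x := 3, y := 2 in (x+y)": bind local names, then evaluate the body.
    static int letExample() {
        final int x = 3;
        final int y = 2;
        return x + y;
    }

    public static void main(String[] args) {
        int z = 17;
        int y = 5;
        System.out.println(z / y);                // integer division
        System.out.println((z / y) * y);          // differs from z when y does not divide z
        System.out.println((z / y) * y + z % y);  // recovers z
        System.out.println(implies(true, false));
        System.out.println(safeDivide(7, 0));
    }
}
```

The third printed line illustrates the identity z = (z/y)*y + z%y given in the note on integer division.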

Variables

Like Java, Omnibus is a strongly typed language. This means that each variable must be declared to hold a value of a specific type and can never hold a value of any other type. For example, a variable declared to hold a character could not be assigned the integer value 55. In addition to this, each expression in Omnibus has some well-defined type. For example, the type of x*y-z is integer and the type of if b then e1 else e2 fi is the type of e1 (which must be the same as the type of e2). The combination of these two facilities allows the Omnibus compiler to perform static type checking. This process checks that type violations cannot occur at run-time by analysing the source code of the program at compile-time. This system allows many potential sources of run-time errors to be detected and eliminated at compile-time.

There is a question of what values variables should be given before they are explicitly assigned one. Either some default value can be assigned to them or they can be given an undefined value. Java uses these two approaches in different contexts. For example, it leaves local variables undefined but initialises attributes to default values. In Omnibus, all variables are given an undefined value initially which they retain until they are assigned a proper (i.e. defined) value. A variable cannot be used in an expression until it has been assigned a proper value.
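The two Java behaviours mentioned above can be seen in a small sketch. The class and member names are hypothetical. The commented-out line shows the kind of use that the Java compiler rejects, which is the closest Java counterpart to the Omnibus rule that an undefined variable cannot appear in an expression.

```java
// Sketch only: how Java treats uninitialised attributes and local variables.
class InitialisationSketch {

    // An attribute (field) with no initialiser receives a default value (0 for int).
    int attributeValue;

    static int localExample(boolean condition) {
        int local; // declared but not yet assigned

        // int bad = local + 1; // would not compile: local might not have been initialised

        // Every path assigns local before it is read, so the read below is accepted.
        if (condition) {
            local = 1;
        } else {
            local = 2;
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(new InitialisationSketch().attributeValue);
        System.out.println(localExample(true));
    }
}
```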

2.1.2 Classes

Class definitions are the central mechanism for structuring Omnibus programs. Classes allow new types to be defined and used in addition to the primitive types. The concept of the class in Omnibus is similar to that present in every Object-Oriented language. We will now briefly introduce the things that make up a class definition.



Attributes

Attributes are also referred to as fields or instance variables. These hold the property values of the object. Each attribute has some name and type. It is declared as follows:

“attribute” name “:” type

The value of an attribute can be accessed by code outside the class (unless the attribute is declared as private); however, it can only be changed by code within the class. e.g.

attribute counter:integer

Constants

Constants are very similar to attributes except that they are given some initial value and this value cannot be changed, not even by the code within the class.

“constant” name “:” type “:=” value

The value must be compatible with the declared type for the declaration to be valid. e.g.

constant pi:Rational := Rational.of(314,1000)

Functions

Functions (or, more correctly, member functions) can be used to return some value calculated from the passed parameters and the current values of the attributes in the object. These functions can be called and used in expressions. Functions may appear (to a Java programmer) to be identical to Java value-returning methods. However, there is a crucial difference between the two. An Omnibus function is not permitted to alter the values of any of the attributes of the associated object. Nor is it permitted to alter the values of any of its parameters. Omnibus does not permit the use of global variables, the other main way of achieving side-effects in Java functions. A function can, however, declare new local variables and manipulate their values in order to calculate its result.

“function” name “(“ parameters ”)” “:” type

Each parameter has a name and a type and multiple parameters are separated by commas.



e.g. function gcd(x:integer, y:integer):integer
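As an aside for Java readers, a method obeying the restrictions placed on Omnibus functions might look like the sketch below. Only the gcd signature comes from the text. The body (Euclid's algorithm), the class name and the assumption of non-negative arguments are illustrative choices made here.

```java
// Sketch only: a side-effect-free Java method in the spirit of an Omnibus function.
final class GcdSketch {

    // The parameters are final and never reassigned, no fields are read or written,
    // and all working state lives in local variables.
    // Assumes x and y are non-negative.
    static int gcd(final int x, final int y) {
        int a = x;
        int b = y;
        while (b != 0) {
            int remainder = a % b;
            a = b;
            b = remainder;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(gcd(12, 18));
    }
}
```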

The function declaration given for gcd is equivalent to a Java method type signature. Java permits the details of the method to then be described by providing an implementing collection of statements. Omnibus also allows an implementing collection of statements to be defined. However, it allows the interface of the method to be described more precisely than just through a type signature. The facilities for this are detailed in the next chapter.

Constructors

Constructors are used to create a new instance of a class. A constructor accepts some set of parameters and initialises a new object instance of the class using these values. This involves initialising the values of each of the attributes of the object. Each of the attributes of the object must have a defined value after the constructor call executes. In Java, constructors are special methods which have the same name as the class and no return type. Multiple constructors are permitted through Java’s overloading system. In Omnibus, each constructor has its own name.

“constructor” name “(“ parameters “)”

e.g.

constructor withValue(val:integer)

In Omnibus, like in Java, constructors are not inherited. This is discussed more fully in Chapter 4.

Creators

Creators are identical to constructors except for the fact that they are inherited. Creators cannot be explained fully until we have looked at inheritance in Omnibus. We do this in Chapter 4.

“creator” name “(“ parameters “)”

e.g.

creator default()

Creators are used to implement the Application launcher system that we will see at the end of this chapter and also to support default values.



Operations

Operations provide the only means for altering the values of the attributes of an object. There are different variations of operations. We consider those of the simplest form in this section and look at the more complicated variations in Chapter 4.

“operation” name “(“ parameters “)”

The Java programmer could consider Omnibus operations roughly equivalent to void-returning Java methods. The crucial difference is that while a Java void-returning method changes the object to which it is applied, an Omnibus operation returns a new object. The object that results from a call to a Java void method and the object returned by the corresponding Omnibus operation will have the same value. However, in Java, any other variables which pointed to the original object will see the change whilst, in Omnibus, any other variables which pointed to the original object will still have the same value. By avoiding aliasing problems, Omnibus code is far easier to reason about.

Putting it all together in a class

We have looked at all the sub-components of a class. In this section, we will look at how to combine them to create a complete class definition. Each class has a name and a collection of attributes, constants, functions, constructors, creators and operations.

“class” name “{“ members “}”

e.g.

class Counter {
    constant defaultValue:integer := 0
    attribute value:integer
    constructor zero()
    constructor withValue(val:integer)
    function magnitude():integer
    operation inc()
    operation dec()
}
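To make the comparison with Java concrete, the Counter interface above could be approximated by an immutable Java class along the following lines. This is a sketch under stated assumptions and not the code generated by the Omnibus translator. The method bodies are inferred from the query examples given later in this chapter (inc and dec change the value by one, magnitude returns the absolute value), and the CounterDemo class is hypothetical.

```java
// Sketch only: a Java approximation of the Omnibus Counter class.
final class Counter {
    // constant defaultValue:integer := 0
    private static final int DEFAULT_VALUE = 0;

    // attribute value:integer (final, so an existing Counter can never change)
    private final int value;

    private Counter(int value) {
        this.value = value;
    }

    // Named constructors become static factory methods; no "new" is visible to callers.
    static Counter zero() {
        return new Counter(0);
    }

    static Counter withValue(int val) {
        return new Counter(val);
    }

    // Accessors carry the same names as the constant and the attribute.
    static int defaultValue() {
        return DEFAULT_VALUE;
    }

    int value() {
        return value;
    }

    // Function: computes a result without altering anything (assumed to be the absolute value).
    int magnitude() {
        return Math.abs(value);
    }

    // Operations: return a new Counter and leave the receiver untouched.
    Counter inc() {
        return new Counter(value + 1);
    }

    Counter dec() {
        return new Counter(value - 1);
    }
}

class CounterDemo {
    public static void main(String[] args) {
        Counter c = Counter.withValue(5);
        Counter alias = c; // a second variable referring to the same object
        c = c.inc();       // only c is rebound; the original object is unchanged
        System.out.println(c.value());
        System.out.println(alias.value());
    }
}
```

Because every operation builds a fresh object, the variable alias still observes the original value after c has been incremented, which is the aliasing-free behaviour described for Omnibus operations.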

Methods

Java uses the term method to refer to a call applied to some object. There are two types of method: instance methods and static methods. Instance methods must be applied to an object whereas static methods are applied to the class itself.

Omnibus has attributes, constants, functions, constructors, creators and operations. There is nothing explicitly called a method. In fact, functions, constructors, creators and operations are all methods. Constructors and creators are associated with the class itself much like static methods are in Java. We refer to these as class methods in Omnibus. Functions and operations are associated with objects much like instance methods are in Java. We refer to these as object methods in Omnibus. In addition, Omnibus automatically generates special accessor methods for each of the attributes and constants in an object. These are essentially functions with the same name as the attribute/constant and simply return the value of the associated attribute/constant. Attribute accessors are object methods, whereas constant accessors are class methods. The Omnibus method classification system is illustrated by the table below.

Methods
    Class methods:    Constructors, Creators, Constant accessors
    Object methods:   Attribute accessors, Functions, Operations

2.1.3 Objects

We have looked at how to define classes. In this section we will look at how to create, manipulate and query instances of these classes, which we refer to as objects. Objects in Omnibus are somewhat different from Java objects. Firstly, and most crucially, they are immutable. This is a completely different foundation from that of Java, where objects can be mutable. Immutability allows aliasing to be eliminated and so makes reasoning about the code far easier. Users of the Java programming language can refer to the String class as an example of a class with immutable objects. This immutability has been discussed previously and will be discussed in more detail later.

Using methods

Objects in Omnibus are created, manipulated and queried using the methods of the class to which the object belongs. Objects are created using constructor/creator calls, manipulated using operations and queried using attribute accessors and functions.

Objects are created by calling one of the constructor or creator methods of the class of which you wish to create an object. This is slightly different to how this is performed in Java, where the new operator would be used. Omnibus contains no new operator and represents constructors as just another form of method. The Omnibus approach was adopted to allow objects to be created using the existing logic of method calls and hence Omnibus does not need to introduce a new language construct with its own set of rules to model object creation.



For example, to create instances of the class we described in the previous section, we could use any of the following:

Counter.zero()
Counter.withValue(5)
Counter.withValue(x+5*z)
Counter.withValue(-7)

Each of these returns an object instance of class Counter.

Objects are manipulated by calling one of the operation methods of the object. An operation creates a new object from the previous one. For example, to manipulate the existing Counter variable c, we can use the following calls:

c.inc()
c.dec()

Each of these returns a new object instance of class Counter with a value of one greater than or one less than the value of c.value(), respectively. We can also apply operations to newly constructed objects or previously manipulated objects in the following way:

Counter.zero().inc()
Counter.withValue(23).dec().dec().inc()

Each of the above yields a Counter instance.

Objects are queried by calling an attribute accessor or function method of the object. For example, we could query the following Counter objects to yield the results shown.

Counter.zero().value()                             yields the value 0
Counter.withValue(5).value()                       yields the value 5
Counter.withValue(-7).value()                      yields the value -7
Counter.zero().inc().value()                       yields the value 1
Counter.withValue(23).dec().dec().inc().value()    yields the value 22

and

Counter.zero().magnitude()                         yields the value 0
Counter.withValue(5).magnitude()                   yields the value 5
Counter.withValue(-7).magnitude()                  yields the value 7

In the first set of queries, we use the accessor method for the value attribute. In the second set, we use the magnitude function method.

Constant accessor class methods are used just like any other class method. For example, we could create a Counter object with the default value as follows:

Counter.withValue(Counter.defaultValue())

This uses the defaultValue constant accessor class method as a parameter to the withValue constructor class method.

Object equality

Comparing objects in Java is one of the most common sources of error. Experience teaching Java to students shows time and again that Java’s handling of object equality is confusing. In Java, objects are accessed by reference. The value of a Java object is represented as a reference to a memory address on the heap where the object resides or a null value. When a method of a Java object is called, the Java run-time system uses this reference to locate the target object and invokes the method code. This background work by the run-time system allows the users of the language to conceptually think of the object as being directly stored in the variable. Such a conceptual model is far simpler and frees the programmer from the need to consider memory addresses and references. The programmer can think about the objects in the program in more abstract terms. Programmers new to programming frequently do this.

However, there is a deadly danger involved in this reasoning. In Java, the equality operator compares two values. When it is applied to two objects, the values of the objects are compared. These values are references to locations in memory and so this comparison compares these references and not the objects themselves. Thus, the newbie programmer’s abstract model breaks down.

The problem is best illustrated via probably its most common form which is the comparison of two Strings. Suppose the programmer wants to compare the value of the String variable s with the String literal “Hi”. They would frequently (but incorrectly) attempt to use the following for their comparison.

s == “Hi”



However, even when the String variable s contains “Hi”, this test will yield false (on most occasions). The equality operator is comparing the reference contained in the variable s with a reference to the String object that represents the literal “Hi”. Unless s happens to refer to that very object, the addresses are different even if the objects at those addresses have the same contents, and hence the test yields false. This makes no sense to the programmer unless they lower their abstract view of the language to incorporate the handling of references and memory addresses. This is subtle and hence undesirable as it leads to many bugs. What the programmer needs to do to perform such a comparison is to use the equals method as shown below.

s.equals(“Hi”)

This yields the desired result of true whenever (and only when) s contains “Hi”.

In Omnibus, the user need not think about objects in terms of references and memory addresses. In Omnibus, comparing two objects asks the question: are the values of these objects equal? The definition does not involve memory references in any way and the user can retain their abstract view of these objects without any loopholes. The behaviour of the Omnibus equality operator is equivalent to a call of the String equals method. In Java, the equals method must be overridden by each class to give this value equality functionality; in Omnibus, however, the equality operator is automatically defined with this functionality over all objects. The formal definition of equality of objects in Omnibus is that two objects are equal if and only if all of their corresponding attribute values are equal.

Note: In fact, object equality is a complicated issue to which we return later in the Future Work chapter.
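The difference between reference comparison and value comparison in Java can be demonstrated with a short sketch. The EqualitySketch class and its nested Counter are hypothetical. The equals method is written by hand to follow the attribute-by-attribute definition given above, which is what a Java class must do to obtain the behaviour that Omnibus provides automatically.

```java
// Sketch only: reference equality versus value equality in Java.
class EqualitySketch {

    // A value class whose equals method compares every attribute (here just one).
    static final class Counter {
        private final int value;

        Counter(int value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof Counter)) {
                return false;
            }
            return value == ((Counter) other).value;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(value);
        }
    }

    public static void main(String[] args) {
        // Built at run time, so s is a different object from the literal "Hi".
        String s = new String("Hi");
        System.out.println(s == "Hi");      // compares references
        System.out.println(s.equals("Hi")); // compares contents

        Counter a = new Counter(5);
        Counter b = new Counter(5);
        System.out.println(a == b);         // compares references
        System.out.println(a.equals(b));    // compares attribute values
    }
}
```

In each pair of printed lines, the == comparison reports false and the equals comparison reports true, since the two operands are distinct objects with equal contents.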

2.1.4 Core language classes

As well as the built-in primitive types and the simple classes we have seen so far, there exist other forms of types. In this section we look at a collection of classes that are built into the language. We start by explaining the handling of Strings. We then present three special operators which operate on types and yield classes. These classes can then be used exactly like the classes we have seen so far. For example, they can be used as the types of variables and methods can be applied to them.

String

Omnibus provides a String class. String literals can be given as quoted strings in the source text. The addition (+) operator is also overloaded to allow Strings to be concatenated. In Omnibus, Strings are atomic and so they cannot be treated as sequences of characters.



Collection

Much like in Java where a pair of empty square brackets can be appended to a type to represent an array of that type, in Omnibus an asterisk can be appended to a type to represent a collection of that type. Collections and not arrays form the basic aggregation technique in Omnibus, the difference being that collections are dynamically resizable and hence somewhat more abstract. The asterisk suffix operator can be thought of as a template for a special set of classes, one for each of the types it is applied to. For example, the expression Element* (where Element is some class or primitive type) can be thought of as representing the class given below.

class Element* {
    constructor empty()
    operation add(e:Element)
    operation union(c2:Element*)
    operation removeLastOf(e:Element)
    operation removeAllOf(e:Element)
    operation removeAll()
    function get(i:integer):Element
    function contains(e:Element):boolean
    function size():integer
    function countOf(e:Element):integer
}

Note: we will see later that the asterisk suffix is a notational short-hand for the Collection template. The type expression Element* is equivalent to Collection[Element]. Templates are discussed in Chapter 4. The reader can also refer to Appendix B for the full details of the Collection template and all other templates considered in this section.

For example, the following are expressions evaluating to classes representing a collection of integer, a collection of Counter and a collection of a collection of Counter, respectively.

integer*
Counter*
Counter**

We can create a new Collection containing the integers 1, 2 and 3 via the following expression.

integer*.empty().add(1).add(2).add(3)

Note how the empty() constructor is applied to the integer* class.
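A Java reader may find it helpful to see how such a chained expression behaves when every operation returns a new collection. The following is a minimal sketch covering only a few members of the class shown above. The name ImmutableCollection, the copy-on-add strategy and the zero-based index used by get are assumptions made here for illustration and do not describe how Omnibus collections are actually implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: an immutable collection whose operations return new collections.
final class ImmutableCollection<E> {
    private final List<E> items;

    private ImmutableCollection(List<E> items) {
        this.items = Collections.unmodifiableList(items);
    }

    // Corresponds to the empty() constructor, applied to the class itself.
    static <E> ImmutableCollection<E> empty() {
        return new ImmutableCollection<E>(new ArrayList<E>());
    }

    // Operation: copies the elements, appends e, and returns the new collection.
    ImmutableCollection<E> add(E e) {
        List<E> copy = new ArrayList<E>(items);
        copy.add(e);
        return new ImmutableCollection<E>(copy);
    }

    // Functions: query the collection without changing it.
    E get(int i) {
        return items.get(i); // zero-based here; the Omnibus index base is not stated above
    }

    boolean contains(E e) {
        return items.contains(e);
    }

    int size() {
        return items.size();
    }

    int countOf(E e) {
        return Collections.frequency(items, e);
    }
}

class CollectionDemo {
    public static void main(String[] args) {
        // Java counterpart of integer*.empty().add(1).add(2).add(3)
        ImmutableCollection<Integer> base = ImmutableCollection.<Integer>empty().add(1).add(2);
        ImmutableCollection<Integer> extended = base.add(3);
        System.out.println(base.size());     // base is unaffected by the later add
        System.out.println(extended.size());
    }
}
```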



Map

The Map type in Omnibus is equivalent to the Hashtable class in Java or to the finite partial function in Mathematics. Maps are defined using the -> operator. This takes two arguments, representing the Key and Value types. Again, the right-arrow infix operator can be thought of as a template for a special set of classes, one for each combination of types to which it is applied. For example, the expression Key -> Value (where Key and Value are some classes or primitive types) can be thought of as representing the class given below.

class Key -> Value {
    constructor empty()
    operation put(k:Key, v:Value)
    function contains(k:Key):boolean
    function get(k:Key):Value
    function keys():Key*
}

Note: we will see later that the right-arrow infix operator is a notational short-hand for the Map template. The type expression Key -> Value is equivalent to Map[Key, Value].

For example, the following expressions evaluate to classes representing a map from integer to integer and a map from integer to Counter.

integer->integer
integer->Counter

We can create a Map mapping each of the integers from 1 to 5 to a boolean representing whether it is even as follows.

(integer->boolean).empty().put(1, false).put(2, true)
    .put(3, false).put(4, true).put(5, false)

Note how the empty() constructor is applied to the (integer->boolean) class.

Optional

In Omnibus, unlike Java, by default objects cannot be given null values. A variable must be explicitly declared as optional before it can be given the equivalent of a null value. Omnibus uses the special ? suffix operator to indicate an optional type. It should be noted that, unlike in Java, there is no special programming logic for handling optional types and null values. Optional types are represented as just another class to which methods can be applied and null values are constructed by using the nil() constructor of the appropriate Optional class. Once again, the question mark suffix operator can be thought of as a template for a special set of classes, one for each type to which it is applied. For example, the expression Element? (where Element is some class or primitive type) can be thought of as representing the class given below.

class Element? {
    constructor nil()
    constructor of(value:Element)
    function isNil():boolean
    function value():Element
}

Note: we will see later that the question mark suffix operator is a notational short-hand for the Optional template. The type expression Element? is equivalent to Optional[Element]. For example, the following expressions evaluate to classes representing an optional integer and an optional Counter.

integer?
Counter?

We can create a nil optional integer value and a non-nil optional Counter value with the counter value of 5 as follows.

integer?.nil()
Counter?.of(Counter.withValue(5))

The value Counter.withValue(5) would not be acceptable in place of the second of these because Counter.withValue(5) is of type Counter. The type of Counter?.of(Counter.withValue(5)) is Counter? which is what is desired.
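Later versions of Java provide a broadly similar facility in the java.util.Optional class, which can serve as a rough analogue for Java readers. The sketch below is illustrative only: the OptionalSketch class and its nested Counter are hypothetical, and the mapping of nil(), of(), isNil() and value() onto Optional.empty(), Optional.of(), isPresent() and get() is an approximation. Unlike Omnibus, Java still allows an ordinary reference to be null.

```java
import java.util.Optional;

// Sketch only: java.util.Optional as a rough analogue of the Omnibus ? suffix.
class OptionalSketch {

    // Minimal stand-in for the Counter class used in the examples above.
    static final class Counter {
        private final int value;

        Counter(int value) {
            this.value = value;
        }

        int value() {
            return value;
        }
    }

    public static void main(String[] args) {
        // Roughly integer?.nil()
        Optional<Integer> noInteger = Optional.empty();

        // Roughly Counter?.of(Counter.withValue(5))
        Optional<Counter> someCounter = Optional.of(new Counter(5));

        // Optional<Counter> wrong = new Counter(5); // would not compile: a Counter is not an Optional<Counter>

        System.out.println(noInteger.isPresent());     // the opposite sense to isNil()
        System.out.println(someCounter.get().value()); // roughly value() on the optional
    }
}
```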

2.2 Writing code

In the preceding section we saw how Omnibus can be used to define classes. However, the classes we defined were just high-level interfaces that described the signatures of a collection of attributes and methods. There is no way that we could execute these classes as they are. In Java, the programmer first designs such interfaces and then goes on to provide an implementation for each of the methods of the class. We can carry out the same step next in Omnibus using the Omnibus implementation language. However, as we will see in the next chapter, in Omnibus we also have the option of describing the class more fully before making this step.

In this section, we will look at the statements that make up the Omnibus implementation language. These statements can then be used to provide implementations for the methods in our Omnibus classes.



2.2.1 Basic statements

In this section, we will consider the four most fundamental statements in the Omnibus implementation language.

Declaration statement

This statement is used to introduce a new variable. This new variable can then be manipulated in any way the programmer sees fit.

“var” variablename “:” type (“:=” expression)?

The newly declared variable can optionally be given some initial value by including the assignment section in the above syntax expression. If no initial value is given, the variable is given the special value undefined and it cannot be used in an expression until it is given a defined value. There must be no variable with the specified name currently in existence in the current scope level for the declaration to be valid. The variables in the system will contain all the attributes of the class, all the parameters and any previously declared variables. The scoping rules for Omnibus are the same as for Java. The declared variable can be used within the scope of the method body from the statement following the one in which it is declared to the end of the scope level in which its declaration appears.

Assignment statement

This statement is the cornerstone of imperative programming. The statement is used to re-assign the value of some named variable to the value yielded by the evaluation of the given expression.

“let” variablename “:=” expression

Note: we will see later that there are restrictions on what we can use as the target of an assignment statement. For example, the parameters we have seen thus far are not valid as the targets of assignment statements.

Operation call statement

This statement is equivalent to the method call statement which plays such a pivotal role within the Java programming language. There are two variations of it, the object operation call and the local operation call. The object operation call re-assigns the value of a variable containing an object to be the object which results from the application of the operation with the specified name, given the specified parameters, to the value of that variable before the statement was executed. The local operation call re-assigns the values of the attributes in the current object (the this object) by applying the operation with the specified name, given the specified parameters and the values of the attributes before the statement was executed.

“call” (objectvariablename “.”)? operationname “(“ parameters “)”

This statement bears more similarity to an assignment statement than it does to a Java method call statement. The Omnibus statement:

call obj.doSomething(withThis)

is equivalent to:

let obj := obj.doSomething(withThis)

The latter statement can be read as: call the doSomething operation of obj, passing the parameter value withThis, and store the resulting object back in obj.

Note: when we consider complex operations in Chapter 4 we will see that we can no longer represent operation call statements using a primitive assignment statement. The problem is that we will then need to change the value of multiple variables, something which we cannot do with a single assignment statement.

Construct statement

The construct statement can be used to call a constructor method of the current class. A construct statement constructs a new object of the current class by calling the constructor with the specified name, passing the specified parameters, and then assigns each of the attributes in the current object the value of the corresponding attribute of this new object.

“construct” constructorname “(“ parameters “)”

The construct statement is roughly equivalent to a constructor call in Java. The main difference is that the construct statement can be called from anywhere in any kind of Omnibus method. In Java, a constructor call can only appear as the first statement in the body of a constructor.

Note: In the presence of inheritance, the construct statement can also be used to invoke constructors of the super class. This will be discussed when we meet inheritance.
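Returning to the operation call statement described above, its reading as an assignment can be mimicked in Java with an immutable class whose methods return the updated object. The following is a minimal sketch for illustration only and is not the code which Omnibus generates. The names Tally, add and OperationCallDemo are hypothetical.

```java
// Hypothetical immutable class: each "operation" returns a new object
// instead of mutating the receiver.
final class Tally {
    private final int total;

    Tally(int total) {
        this.total = total;
    }

    // Plays the role of an Omnibus operation: yields the updated object.
    Tally add(int amount) {
        return new Tally(total + amount);
    }

    int total() {
        return total;
    }
}

final class OperationCallDemo {
    // Mirrors "call obj.add(5)", read as "let obj := obj.add(5)".
    static int run() {
        Tally obj = new Tally(0);
        obj = obj.add(5);   // the resulting object is stored back in obj
        obj = obj.add(7);
        return obj.total();
    }
}
```

Under this reading, forgetting to store the result back would leave obj unchanged, which is exactly the bookkeeping that the Omnibus call statement performs on the programmer's behalf.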

2.2.2 Branching statements

In this section we will look at the different branching statements provided by the Omnibus implementation language. These statements are used to execute different branches of code depending on some condition.



If statement

This is the best known and most widely used branching mechanism in programming. It has an illustrious history that goes back as far as structured programming. The branch of code to be executed is determined by the successive evaluation of the conditions defined in the if and elseif clauses.

“if” boolean “{“ statements “}” (“elseif” boolean “{“ statements “}”)* (“else” “{“ statements “}”)?



There are two crucial differences between the syntax of the Omnibus if statement and the syntax of the Java if statement. Firstly, the Java if statement executes the single statement immediately following it. If multiple statements are to be contained in the branch then a compound statement (“{…}”) can be used to specify that all the statements should be treated as if they were a single one and hence all executed together. Experience has shown that this frequently leads to confusion for beginners to the language and provides a common source of error even for more experienced users of the language. Omnibus addresses this situation by requiring all if statement blocks to be enclosed in curly brackets, whether they contain a single statement or many statements.

The second major difference is a consequence of the first. By always requiring curly brackets, we complicate the syntax of the if..else if..else if..else.. statement, as is illustrated in the following example:

if (b1) {
    S1;
} else if (b2) {
    S2;
} else if (b3) {
    S3;
} else {
    S4;
}

becomes

if (b1) {
    S1;
} else {
    if (b2) {
        S2;
    } else {
        if (b3) {
            S3;
        } else {
            S4;
        }
    }
}



This is rather horrible and provides much more scope for error, so we introduce the elseif (all one word) clause as a solution. Using this gives us:

if (b1) {
    S1;
} elseif (b2) {
    S2;
} elseif (b3) {
    S3;
} else {
    S4;
}

This gives us the best of both worlds.

Select statement

The standard select/switch/case statement is also supported. This uses the value of an expression to determine which of the branches to take. The programmer is encouraged to use this statement instead of an if statement wherever possible because it is easier to read.

“select” “case” “of” expression “{“ (“case” expression “{“ statements “}”)* (“default” “{“ statements “}”)? “}”

This statement is semantically equivalent to a corresponding if..elseif..elseif..else.. statement. Unlike the Java switch statement, the Omnibus select statement can use a variable of any type as its control variable (not just a primitive value).
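The equivalence between a select statement and an if..elseif..else chain can be sketched in Java for a control value which is an object and not a primitive. This is an illustration under assumptions: we assume that each case is matched by value equality, and the names SelectDemo, Colour and describe are hypothetical.

```java
import java.util.Objects;

final class SelectDemo {
    // Hypothetical value type used as the control value of the "select".
    static final class Colour {
        final String name;

        Colour(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Colour && ((Colour) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    static final Colour RED = new Colour("red");
    static final Colour GREEN = new Colour("green");

    // Plays the role of:
    //   select case of c { case RED {...} case GREEN {...} default {...} }
    // written as the corresponding chain of equality tests.
    static String describe(Colour c) {
        if (Objects.equals(c, RED)) {
            return "stop";
        } else if (Objects.equals(c, GREEN)) {
            return "go";
        } else {
            return "unknown";
        }
    }
}
```

The chain tests the cases in order and falls through to the default branch when none matches, which is the behaviour we would expect of the select statement under the equality assumption above.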

2.2.3 Repetition statements

In this section we will look at the various repetition statements provided by the Omnibus implementation language. These statements are used to execute some collection of statements repeatedly until some appropriate time to stop.

While statement

This statement is present in most programming languages. A collection of statements is executed repeatedly while the evaluation of the loop condition yields the value true.



The loop condition is evaluated before the first execution of the loop body and after each subsequent execution. When the loop condition is evaluated and yields false the statements following the while loop will be executed.

“while” boolean “{“ statements “}”

Again, curly brackets are a required part of the syntax of the while statement, as with Omnibus if statements.

Note: Once we start to formally analyse Omnibus classes, we will need to augment this statement to include a loop invariant assertion. This will be discussed where appropriate.

Repeat statement

This is a variation of the while statement. The only difference between the two statements is how they behave the first time they are executed. The while statement evaluates the loop condition before executing the statements in the loop body and hence the loop body of a while statement may not be executed at all. The repeat statement executes the statements in the loop body before it evaluates the loop condition. This means that the body of the repeat statement is always executed at least once.

“repeat” “until” boolean “{“ statements “}”
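The difference between the two loops on their first execution can be seen in a small Java sketch, using Java's while and do..while statements as stand-ins. We assume here that a repeat statement stops once its until condition yields true. The names LoopDemo, countWithWhile and countWithRepeat are hypothetical.

```java
final class LoopDemo {
    // while: the condition is tested first, so the body may run zero times.
    static int countWithWhile(int start, int limit) {
        int iterations = 0;
        int i = start;
        while (i < limit) {
            i++;
            iterations++;
        }
        return iterations;
    }

    // repeat ... until: the body runs first, then the exit condition is tested.
    static int countWithRepeat(int start, int limit) {
        int iterations = 0;
        int i = start;
        do {
            i++;
            iterations++;
        } while (!(i >= limit));   // read as "until i >= limit"
        return iterations;
    }
}
```

When start already equals limit, the while version performs no iterations whereas the repeat version performs exactly one. For any start below limit the two versions perform the same number of iterations.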

Note: As was the case with the while loop, we will later need to add a loop invariant clause to every repeat statement.

For statement

This is another of the classic statements that has been with programmers for a long time. This statement repeatedly executes the statements in the loop body a set number of times whilst incrementing a control variable with each iteration.

“for” variablename “:=” integer “to” integer “{“ statements “}”

Again, curly brackets are included within the syntax of the statement. A new variable with the specified variable name is declared and can be used within the statements in the loop body. The value of this control variable cannot be changed by the statements in the body of the loop. Also, the value of the lower bound (the value immediately to the left of the “to”) must be less than or equal to the value of the upper bound (the value immediately to the right of the “to”).



Omnibus uses a VB-style for statement rather than a Java-style one. The reason for this is that it was felt the Java-style for statement was a bit too general. In fact, it can be used to represent any loop. It was felt that the role of the for statement should be to represent a distinct type of repetition statement, just as the select statement represents a distinct type of branching statement. As soon as the programmer sees a select statement or for statement, it would be nice if they could be confident that they knew the basic principles behind how it works. The use of these statements can help the reader understand the flow of control far more easily than if the general if or while statements were used. The presence of a Java-style for loop gives no such clues, whereas the presence of a VB-style for loop does. Hence, a VB-style for loop was adopted.

A potential source of misunderstanding is how the lower and upper bound expressions are evaluated. They are evaluated only once, when the loop is first reached, and these values are then used throughout the lifetime of the loop. Note that this is different from Java, where the loop condition is dynamically evaluated before the first execution of the loop body and after each subsequent execution of it. Thus, in Omnibus, a for loop can only be executed a finite number of times.

Note: The for statement will also require an augmentation later when we come to formally reason about code.

ForEach statement

The last of the repetition statements is the foreach statement. This is a newer statement not yet present in every programming language. It allows the programmer to iterate over a collection without having to deal with the low-level details of the process. This is a great aid to abstraction and it is far better to use this than to use a for statement ranging over the index values of the collection or some sort of while loop. The foreach statement accepts a Collection object which is the collection to be iterated over. It executes the body of the foreach statement once for every element in the collection.

“foreach” variablename “:” type “in” Collection “{“ statements “}”

As with the for statement, a new variable with the specified variable name is declared and can be used within the statements in the loop body. The value of this control variable cannot be changed by the statements in the body of the loop. As with the for statement, the collection specified immediately after the “in” keyword is evaluated once, when the loop is first reached. The value which the collection evaluates to at this time is stored and used to control future iterations. Thus, problems caused by mutation of the collection which is being iterated over are avoided.
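The effect of evaluating the collection once can be imitated in Java by iterating over a copy taken when the loop is reached. The following is a sketch of the idea only and makes no claim about how Omnibus implements the foreach statement. The names ForEachSnapshotDemo and visitWhileGrowing are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ForEachSnapshotDemo {
    // Iterates over a copy taken when the loop is reached, so changes made
    // to the original list inside the body do not affect the iteration.
    static int visitWhileGrowing(List<String> items) {
        int visited = 0;
        List<String> snapshot = new ArrayList<>(items);   // evaluated once
        for (String item : snapshot) {
            items.add(item + "!");   // mutating the original is harmless here
            visited++;
        }
        return visited;
    }
}
```

Iterating directly over items while adding to it would normally fail in Java with a ConcurrentModificationException. With the snapshot, the body runs once for each element which was present when the loop started.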



Note: The foreach statement will also require an augmentation later when we come to formally reason about code.
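The single evaluation of the bounds of a for statement, discussed earlier in this section, can also be contrasted with Java in a short sketch. The second method imitates the Omnibus behaviour by capturing the bounds before the loop starts. The names ForBoundsDemo, javaStyle and omnibusStyle are hypothetical.

```java
final class ForBoundsDemo {
    // Java-style loop: the condition re-reads 'upper' on every iteration,
    // so changing it in the body changes how long the loop runs.
    static int javaStyle() {
        int upper = 3;
        int iterations = 0;
        for (int i = 1; i <= upper; i++) {
            if (upper < 6) {
                upper++;   // moves the upper bound while the loop is running
            }
            iterations++;
        }
        return iterations;
    }

    // Omnibus-style semantics: the bounds are captured once on entry.
    static int omnibusStyle() {
        int upper = 3;
        int iterations = 0;
        final int lo = 1;
        final int hi = upper;   // evaluated once, when the loop is reached
        for (int i = lo; i <= hi; i++) {
            if (upper < 6) {
                upper++;   // has no effect on the number of iterations
            }
            iterations++;
        }
        return iterations;
    }
}
```

The first loop runs six times because its bound grows underneath it, while the second runs exactly three times. Had the body of the first loop increased upper unconditionally, it would keep running until the integers overflowed, which is the kind of behaviour that the fixed bounds rule out.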

2.2.4 Writing a first class

We now have the facilities to write our first class which we can compile and execute. In this section we will take the Counter class which we used earlier and we will write code for the methods in it using the Omnibus implementation language. In the final section of this chapter, we will then look at how to build an application to use this class. An implementation of the Counter class could be the following:

class Counter {
    constant defaultValue:integer := 0
    attribute value:integer

    constructor zero() {
        let value := 0
    }

    constructor withValue(val:integer) {
        let value := val
    }

    function magnitude():integer {
        if (value >= 0) {
            let result := value
        } else {
            let result := -value
        }
    }

    operation inc() {
        let value := value + 1
    }

    operation dec() {
        let value := value - 1
    }
}

The above example should be fairly easy to understand after the previous section.
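For readers more familiar with Java, one possible rendering of the Counter class is sketched below. It is a hand-written illustration and not the output of the Omnibus code generator. In particular, representing the named constructors as static factory methods and the class name JavaCounter are our own assumptions.

```java
// A hand-written Java counterpart of the Omnibus Counter class (hypothetical).
final class JavaCounter {
    static final int DEFAULT_VALUE = 0;   // constant defaultValue
    private int value;                    // attribute value

    private JavaCounter(int value) {
        this.value = value;
    }

    // Named constructors are rendered as static factory methods.
    static JavaCounter zero() {
        return new JavaCounter(0);
    }

    static JavaCounter withValue(int val) {
        return new JavaCounter(val);
    }

    // function magnitude(): the value assigned to result becomes the return value.
    int magnitude() {
        if (value >= 0) {
            return value;
        } else {
            return -value;
        }
    }

    void inc() {
        value = value + 1;
    }

    void dec() {
        value = value - 1;
    }
}
```

Note that Java cannot give two constructors of the same class different names, which is why the factory methods are needed to keep zero and withValue apart.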

2.3 Structuring applications

In this section we will look at the structuring of applications in Omnibus. We will start by looking at how to organise classes into packages and then we will go on to look at how Omnibus handles application entry points. At the end of the section, we will use everything we have learned from this chapter to write a simple first Omnibus application.

2.3.1 Managing packages

Object-Oriented programs are divided up into classes. This breaking down of the program into small interacting classes makes the system far easier to reason about. However, for large applications classes can be too small. A large OO program could consist of thousands of classes. Without some way of managing these classes, the program will still be very hard to reason about. It would be useful if we could group related classes together. Then the programmer would not need to consider all the classes in an application at once but could, instead, consider only the classes in a particular group. This would be far easier to manage.

Packages provide such a structuring mechanism. Each package has a name and contains some collection of classes and sub-packages. Packages form a hierarchy. There is a root package which has no name. Packages are defined using a path notation from the root. Classes within the package hierarchy can also be referred to using this path notation. The package hierarchy should be represented physically in the associated file system. The package handling system in Omnibus is closely based upon similar facilities in the Java language. This was done to aid the generation of Java code.

Package directive

It is usual when you declare a new class to place it within the package hierarchy in some way. A special directive can be placed at the start of an Omnibus source file to indicate the package which the class file is to be a part of. This directive is given along with the path to the package which the class is to belong to. The class file should then be saved in a corresponding directory in the file system.

“package” path

The path should contain a list of the packages containing the class, e.g.

package usrs.twil.tests

The file which starts with this line should then be placed in the “usrs/twil/tests/” directory.

Uses clauses

The packaging facilities allow classes to be grouped into collections that are related in some way. These groups can then be considered relatively independently. If there were no packaging facilities then all classes would be directly accessible. However, because of this grouping, not all classes are directly accessible anymore. We have direct access to all the classes in the package which the class file is in and direct access to all the base classes in the default Omnibus package (omni.lang) which all classes can automatically refer to. However, those are the only classes we can see by default. If we want to work with any classes elsewhere in the package hierarchy then we must explicitly import them. This is achieved via uses clauses.

Immediately after the package directive, we can give a number of uses clauses. A uses clause is equivalent to a Java import statement. It allows the classes in the specified package to be directly referred to in the rest of the file. Typically the programmer will need uses clauses for a number of the packages in the standard library and uses clauses for packages within their own personal package hierarchy.

“uses” path

The path should identify a package or class within the package hierarchy. If the path identifies a package, all the classes within that package will be imported but not any of the contents of the sub-packages. If the path identifies a class, the class will be imported, e.g.

uses omni.adt
uses usrs.twil.uni.ai.IntelligentAgent

These clauses will import all the classes in the package omni.adt (i.e. all Omnibus class files in the directory “omni/adt/”) and will import the IntelligentAgent class from the usrs.twil.uni.ai package (i.e. the file “usrs/twil/uni/ai/IntelligentAgent”).
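The correspondence between package paths and directories used throughout this section amounts to replacing each dot in the path with a directory separator. A minimal Java sketch of this mapping follows. The helper name directoryFor is hypothetical and the sketch says nothing about how the Omnibus tools locate files in practice.

```java
final class PackagePaths {
    // Maps a dotted package path to the directory expected to hold its files.
    static String directoryFor(String packagePath) {
        return packagePath.replace('.', '/') + "/";
    }
}
```

For example, the package usrs.twil.tests corresponds to the directory “usrs/twil/tests/” and the package omni.adt to “omni/adt/”.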

2.3.2 Starting applications

In this section, we will look at the issue of how to start an application. We will start by looking at traditional approaches to the problem and then we will present the approach taken by Omnibus.

Traditional approaches to starting applications

This problem has existed as long as programming itself. The most widely taken approach is the one from the C language. It involves the declaration of a global function commonly called “main”. When the application is compiled and then executed, the run-time system starts the application by calling this “main” function. This approach works well and seems fairly natural in a procedural setting. However, it is a procedural approach and does not fit at all into OOP since in an OOPL there are no such things as global functions. Hence, we must find some other approach.



The approach taken by languages such as Java and C# is to define a static method called “main” in one of the classes of the application. The application is then invoked by specifying the path to the class file which contains this main method. The run-time system then calls this method and the application can happily start executing. This works. It essentially just adapts the procedural approach into the OO setting. However, it is not as nice as it could be. How an application starts is an important thing. One of the first things a new programmer will have to come to grips with is this system. Use of static methods should really be discouraged in OOP since using static methods is essentially procedural style programming. However, the first application a Java programmer will likely write will have to contain a static method. Surely this is starting them off on the wrong foot.

It would be far nicer if we could start applications in an Object Oriented way. It would, for example, be much better to define an Application class which programmers could subclass. The run-time system could then simply create an instance of the specified Application subclass and call some method to tell it to run. The problem with writing such a framework in Java is that the system has no way of creating an instance of the Application object. Constructors are not inherited and so there is no way within the language to define an interface which defines a constructor which all its subclasses will also have to support. A possible approach would be to require all classes to have a zero parameter constructor but this would compromise many classes which have no logical parameterless constructor. Even the static main method approach is ugly in that there is no way of specifying within the language that an Application class must have a static main method. There is no way of doing such a thing for a static method.
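The difficulty with the zero parameter constructor approach can be made concrete with a small Java sketch of such a launcher built on reflection. All of the names used (SketchApplication, GreeterApp, NeedsConfigApp, SketchLauncher) are hypothetical and the sketch is not a description of any existing framework.

```java
abstract class SketchApplication {
    abstract String execute(String[] args);
}

// Has a zero-parameter constructor, so the launcher can instantiate it.
class GreeterApp extends SketchApplication {
    GreeterApp() {
    }

    @Override
    String execute(String[] args) {
        return "started with " + args.length + " args";
    }
}

// Has no sensible zero-parameter constructor; nothing in the language demands one.
class NeedsConfigApp extends SketchApplication {
    private final String config;

    NeedsConfigApp(String config) {
        this.config = config;
    }

    @Override
    String execute(String[] args) {
        return "config " + config;
    }
}

final class SketchLauncher {
    // Returns the application's result, or null if it cannot be instantiated.
    static String launch(Class<? extends SketchApplication> type, String[] args) {
        try {
            SketchApplication app = type.getDeclaredConstructor().newInstance();
            return app.execute(args);
        } catch (ReflectiveOperationException e) {
            return null;   // the missing constructor is only discovered at run time
        }
    }
}
```

Both subclasses compile without complaint, yet only GreeterApp can be launched. The requirement for a zero parameter constructor lives in the launcher's documentation and not in the type system.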
Starting OO applications in a manner becoming of an OOPL

While there is no way to support an Application class framework in Java, there is in Omnibus. The key is that Omnibus supports constructor inheritance via creators. We can define an Application class containing an “initialise” creator and an “execute” operation. We can then define application entry classes by subclassing the Application class. Such a subclass will then either inherit or override both the execute operation and the initialise creator. The run-time system can then create an instance of the Application subclass by calling the initialise creator and then call the execute operation to run the application.

This approach is far more OO than that used in Java or C#. In Omnibus, programmers do not need to resort to ugly non-OO language hacks like static methods to write even a simple application. Instead, they can structure their Application class with the same care as any other class. Programmers can then write a simple application using only simple features of the language.

The other big advantage of this approach over the Java approach is that by making the interface between the class and the run-time system part of the language, type-checking can be performed at compile-time to verify that the application subclass is an acceptable application entry point. This is not possible in Java, where a mistake in the entry point may only show up as a run-time failure.



To illustrate this point, consider the following example:

public class JavaStarter {
    public static void main(String[] args) {
        System.out.println("Hello from Java!");
    }
}

This is how to write an application class correctly in Java. However, let’s suppose that the programmer accidentally entered the following code instead.

public class JavaStarter {
    public static void main() {
        System.out.println("Hello from Java!");
    }
}

Here, the programmer has forgotten that the main method needs to accept an array of Strings. The application compiles without complaint but when the user runs it, a run-time error will be reported and the application will fail to start. Now, consider a similar Omnibus application.

uses omni.app

class OmnibusStarter isa Application {
    operation execute(args: String*, env:Environment) {
        call env.println("Hello from Omnibus!")
    }
}

Let’s suppose the Omnibus programmer made a similar mistake giving the following code.

uses omni.app

class OmnibusStarter isa Application {
    operation execute(env:Environment) {
        call env.println("Hello from Omnibus!")
    }
}

If they now attempt to compile the application class, they will receive an error that the execute operation is incompatible with the one in the Application class. Thus, type-checking has allowed the error to be detected at compile-time.
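The run-time nature of the Java check can be imitated with reflection. The sketch below performs the kind of lookup for a public main method taking an array of Strings that the scenario above relies on. It is an approximation for illustration and not the launcher's actual implementation. The names GoodStarter, BadStarter and EntryPointCheck are hypothetical.

```java
final class GoodStarter {
    public static void main(String[] args) {
    }
}

final class BadStarter {
    public static void main() {   // compiles, but is not main(String[])
    }
}

final class EntryPointCheck {
    // True if the class declares or inherits a public main(String[]) method.
    static boolean hasStandardMain(Class<?> type) {
        try {
            type.getMethod("main", String[].class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

Both starter classes are accepted by the compiler. The difference between them only becomes visible when the lookup is performed, which is after compilation has already succeeded.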



2.3.3 Writing a first Omnibus application

We now have all the tools required to write our first Omnibus application. We will present a relatively pedagogical example using the Counter class which we developed earlier. Firstly, we will define a new version of the Counter class using the package directive appropriately. The following file should be called “Counter.obs” and located in the “usrs/twil/count/” directory.

package usrs.twil.count

class Counter {
    constant defaultValue:integer := 0
    attribute value:integer

    constructor zero() {
        let value := 0
    }

    constructor withValue(val:integer) {
        let value := val
    }

    function magnitude():integer {
        if (value >= 0) {
            let result := value
        } else {
            let result := -value
        }
    }

    operation inc() {
        let value := value + 1
    }

    operation dec() {
        let value := value - 1
    }
}



We can now define an application class which will be used to launch our application. The following file should be called “CountOfArgs.obs” and located in the “usrs/twil/tests/” directory.

package usrs.twil.tests

uses omni.app.Application
uses usrs.twil.count

class CountOfArgs isa Application {
    operation execute(args:String*, env:Environment) {
        var cntr:Counter := Counter.zero()
        foreach arg:String in args {
            call cntr.inc()
        }
        call env.println("There were " + cntr.value()
            + " arguments passed to" + " the application")
    }
}

Executing this application will then give the following results:

java omni.app.ApplicationLauncher usrs.twil.tests.CountOfArgs
There were 0 arguments passed to the application

java omni.app.ApplicationLauncher usrs.twil.tests.CountOfArgs Hello World!
There were 2 arguments passed to the application

java omni.app.ApplicationLauncher usrs.twil.tests.CountOfArgs Omnibus
There were 1 arguments passed to the application

This concludes the Chapter. In the next Chapter, we will introduce contracts.


Chapter 3

Contracts

Overview:

This chapter will present one of the central concepts of Omnibus, the contract. The chapter will start by looking at the basic principles of contracts in the real world and will encode a real world use of contracts in the Omnibus language that we have so far. This example will be used throughout the chapter to illustrate the use of the new features of the language as they are introduced. The second half of the chapter will look at the details of how contracts are specified in Omnibus. It will provide the reader with the basic skills to write Omnibus contracts and will consider issues such as verifying that the contract produced is correct.

Contents:

3.1 Basic principles of contracts
    3.1.1 Contracts in the real world
    3.1.2 Introducing a running example
3.2 Specifying contracts in Omnibus
    3.2.1 Requires and Ensures clauses
    3.2.2 Changes clause
    3.2.3 Producing the correct contract



In this Chapter we will look at the use of contracts in Omnibus. We will start by considering the basic principles and will then move on to look at how to apply the concept and write the contract for methods and classes. The chapter will conclude by looking at how requirements are encoded within Omnibus contracts.

3.1 Basic principles of contracts

The name contract was carefully selected for the language feature which this chapter will consider. This was done because of the many conceptual similarities between a real world contract and an Omnibus contract.

3.1.1 Contracts in the real world

When two parties (such as two businesses) want to formally interact in the real world, they will draw up a contract to govern the interaction. This contract will describe the responsibilities of each of the parties in some clear form. Signing the contract involves making a commitment to carry out your responsibilities as the contract lays them out. If one of the parties does not fulfil its responsibilities, the contract is invalidated and the other party does not have to continue to satisfy its obligations.

Contracts describe what each party should do, not how they should do it. The contract is not concerned with how each party might satisfy their obligations. This is of no relevance and so there is no need for such information. In fact, it is desirable for the contract not to specify how each party should satisfy their responsibilities. Each party should be able to change their approach to fulfilling their contractual obligations without the need to alter the contract. All that matters is that they fulfil these obligations somehow.

A contract describes the relationship in terms of the data sent between the two parties. It will define the inputs and outputs of each component. It will then describe the obligations of each party in terms of how they should manipulate the inputs they receive to produce appropriate outputs.

The contracts we have described thus far were peer-to-peer in as much as there was no specific contractor and no specific contractee. The roles were interchangeable. However, it is usually the case that contractual systems operate in a contractor-implementer type arrangement. In such a situation, the contractor uses the implementer in some way to perform some task for them. The contractor will send inputs to the implementer, which will then perform some processing on this data and produce some appropriate output. The role of the contract in such an arrangement is still as it was before. It must govern the requests that the contractor can make and the associated responses that the implementer can make. The contractor is not concerned with all the low-level details of how the implementer carries out their contractual obligations but they may wish to specify some sort of constraints which their approach should respect.


3.1.2 Introducing a running example

Let us consider an example of the use of a contract in the real world. Contracts are used heavily throughout the business world. Businesses can contract some other business to make a physical component for them, to mow their lawn, to raise their marketing profile and so on. Here we will look at a rather more interesting example of the use of contracts which contains the basic concepts of the contracting system without too many domain-specific problems.

Consider the situation (taken straight from an episode of Columbo) where the executive of a large company is being black-mailed and wishes to “take care of” the black-mailer. Suppose they hire a “Contract killer” to carry this out for them. The contractor (the executive) will specify in the contract that the implementer (the killer) should kill the black-mailer. They will not describe in excruciating detail in this contract how the killer should perform this. Indeed, they would likely not be qualified to form an adequate plan of action for the killer. Anyway, the contractor does not want to know how the killer is going to do their job.

The contract will describe the responsibilities of each party. For example, the contractor may have to pay some amount of the fee up-front before the killing and then pay the rest after the killing is complete. As was described previously, if one party does not fulfil their obligations then the contract is invalidated and the other party does not need to fulfil their obligations. For example, if the executive does not make the initial up-front payment then the killer is not obligated to kill anyone. Similarly, if the executive makes the up-front payment and then the killer does not perform the murder then the executive is in no way obligated to make the final payment.

Finally, while the contractor is not concerned with the full low-level details of the implementer, they may want to place some constraints on it. For example, the executive may want to specify that the black-mailer is told who is behind his death before being killed or that the black-mailer’s death should be grisly.

The use of the term contract within a programming setting has existed for some time now. Much credit must go to Meyer for this. Meyer was responsible for the development of the Design-By-Contract approach. In a programming setting, both the contractor and implementer will be classes. The contractor class will use an instance of an implementer class in some way to perform its task.

class ContractKiller {
    attribute feePaid:integer
    attribute killCount:integer

    constructor enlist() { … }
    function fee():integer { … }
    operation payFee(amount:integer) { … }
    operation kill() { … }
}

class Executive {
    …
    operation takeCareOfBlackMailer() {
        var killer:ContractKiller := ContractKiller.enlist()
        var fee:integer := killer.fee()
        let fee := fee - 50
        call killer.payFee(fee)
        call killer.kill()
    }
}

In the above representation of the Executive-Killer example, the Executive is trying to short-change the Killer. The value returned by the fee function is the amount that the executive is required to pay before the Killer is prepared to kill. However, the executive pays 50 less than this amount before asking the killer to kill. In this case, the contractor has not satisfied their part of the contract. However, there is no clear expression of what fee should be paid before the kill() operation can be called. This is where contracts come in: they allow the contractual obligations of each of the parties to be expressed formally. Here we would like to be able to express that the feePaid to a killer must be greater than or equal to the fee before the kill operation can be invoked.

3.2 Specifying contracts in Omnibus

The technique we used to informally describe the contract for the ContractKiller was to describe how the contractor should interact with the class via its methods. This is the most natural way to describe a contract. Thus, we divide the contract for a class into a collection of contracts, one for each of the (publicly accessible) methods of the class.

3.2.1 Requires and Ensures clauses

We then need to consider how to describe each of the methods in a class. The contractual obligations associated with a method can be broken down into two sections. There are requirements that relate to the contractor and requirements that relate to the implementer. Each of the parties must satisfy their obligations for the contract to be satisfied. The obligation of the contractor will be to ensure that some condition (characterising an acceptable circumstance to call the method) holds over the object before the method is called. The obligation of the implementer will be to ensure that, assuming the contractor met their part of the contract, some condition (characterising an acceptable behaviour of the method) holds over the object after the method call completes.

These two conditions can be characterised by special boolean expressions, called assertions, which evaluate to true in acceptable states. These two conditions are commonly referred to as pre-conditions and post-conditions. The pre-condition is an assertion which characterises the obligations of the contractor and the post-condition is an assertion which characterises the obligations of the implementer. Omnibus uses requires and ensures clauses to specify pre- and post-conditions, respectively.



The requires clause expresses what the class requires to hold before the method is invoked. The ensures clause expresses what the class will ensure holds after the method completes, assuming that the requires clause was satisfied before the call.

We can now consider what the contracts of some of the methods in the ContractKiller will be. First, consider the central kill operation. What are the contractual obligations of the contractor and the implementer? We can break these into two parts as was just discussed. Consider first the obligations of the contractor (i.e. the caller of the method). We informally expressed this as:

The feePaid to a killer must be greater than or equal to the fee before the kill operation can be invoked.

We can now convert this into an assertion and then use it in a requires clause. The assertion characterising this property is:

    feePaid >= fee()

We can now consider the contractual obligations of the implementer (i.e. the body of the method). Informally, the value of the killCount should increase by one. More completely, the value of the killCount should be equal to one more than the value of the killCount before the method call and the value of the feePaid should not change. We require the last part because in assertions we are required to say what doesn’t change as well as what does. In order to express the assertion formally, we need some way of referring to the value that a variable had before the method call. We use the primed form of the variable to achieve this. Thus we can express this assertion as follows:

    killCount = killCount’ + 1 & feePaid = feePaid’

We can combine these with the signature of the method to give the contract for the kill method.

    operation kill()
        requires feePaid >= fee()
        ensures killCount = killCount’ + 1 & feePaid = feePaid’

This contract describes the kill operation completely.



We can do the same for the other methods in the class to give the following contract for the class:

    class ContractKiller {
        attribute feePaid:integer
        attribute killCount:integer

        constructor enlist()
            ensures feePaid = 0 & killCount = 0

        function fee():integer
            ensures result = 500

        operation payFee(amount:integer)
            ensures feePaid = feePaid’ + amount
                  & killCount = killCount’

        operation kill()
            requires feePaid >= fee()
            ensures killCount = killCount’ + 1 & feePaid = 0
    }

This class, with contract, formally defines the informal requirements we outlined earlier.
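To make the roles of the two clauses concrete, the following is a minimal sketch in Python (not Omnibus) that checks the requires and ensures clauses of the class contract above at run time. It is an illustration under stated assumptions rather than a description of how Omnibus itself processes contracts: the Python names (fee_paid, kill_count, pay_fee) are our own, objects are modelled as immutable values so that self plays the role of the primed (before) state and the returned object the unprimed (after) state, and the method bodies are one possible implementation chosen to satisfy the contract.

```python
from dataclasses import dataclass, replace

FEE = 500  # the value the contract gives for fee()


@dataclass(frozen=True)
class ContractKiller:
    """Hypothetical Python model of the ContractKiller contract."""
    fee_paid: int
    kill_count: int

    @staticmethod
    def enlist() -> "ContractKiller":
        new = ContractKiller(fee_paid=0, kill_count=0)
        # ensures feePaid = 0 & killCount = 0
        assert new.fee_paid == 0 and new.kill_count == 0
        return new

    def fee(self) -> int:
        # ensures result = 500
        return FEE

    def pay_fee(self, amount: int) -> "ContractKiller":
        new = replace(self, fee_paid=self.fee_paid + amount)
        # ensures feePaid = feePaid' + amount & killCount = killCount'
        assert new.fee_paid == self.fee_paid + amount
        assert new.kill_count == self.kill_count
        return new

    def kill(self) -> "ContractKiller":
        # requires feePaid >= fee(): the contractor's obligation
        if self.fee_paid < self.fee():
            raise ValueError("requires clause violated: feePaid < fee()")
        new = ContractKiller(fee_paid=0, kill_count=self.kill_count + 1)
        # ensures killCount = killCount' + 1 & feePaid = 0: the implementer's obligation
        assert new.kill_count == self.kill_count + 1 and new.fee_paid == 0
        return new
```

A failed requires check is reported as an error made by the caller (ValueError), whereas a failed ensures check is an assertion failure inside the method, mirroring the division of blame between contractor and implementer.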

3.2.2 Changes clause

One of the most annoying things in the writing of the ContractKiller contract was having to say what was not changed. Saying that something hasn’t changed is very easy to forget. It is very unnatural to imperative programmers who usually only state what changes. In an attempt to aid imperative programmers, prevent this mistake and make ensures assertions shorter, Omnibus introduces a changes clause. A changes clause provides a list of the things which a method can change. All variables not in this list are implicitly assumed not to have changed. The implementation of a method must respect this clause and change only the variables declared in this list.

Changes clauses are only permitted in operations. Constructors and creators implicitly have a changes clause containing all the attributes of the class since these have to give initial values to every attribute. The only other type of method, the function, cannot change the value of the attributes of a class and hence implicitly has an empty changes clause (expressed as changes nothing).

We can use changes clauses to re-write the operations in the ContractKiller class. We will consider the payFee() operation now. This has the effect of increasing the feePaid and does not change the killCount in any way. This is a perfect place to use a changes clause. In the changes clause, we will list the single attribute that is changed, the feePaid attribute, and all other attributes are assumed to be unchanged by the operation.



Thus, the “& killCount = killCount’” from the previous section is not needed. This gives us the following contract:

    operation payFee(amount:integer)
        changes feePaid
        ensures feePaid = feePaid’ + amount

This is far simpler than the previous version of the contract for this method and is more natural for programmers from an imperative background. We can now re-write the ContractKiller class as follows:

    class ContractKiller {
        attribute feePaid:integer
        attribute killCount:integer

        constructor enlist()
            ensures feePaid = 0 & killCount = 0

        function fee():integer
            ensures result = 500

        operation payFee(amount:integer)   // we refine this later
            changes feePaid
            ensures feePaid = feePaid’ + amount

        operation kill()
            requires feePaid >= fee()
            changes killCount, feePaid
            ensures killCount = killCount’ + 1 & feePaid = 0
    }

For convenience, there are two special values for changes clauses: “*” and “nothing”. The use of “*” means that the operation may change all of the attributes in the class. This is equivalent to listing all of the attributes of the class in the changes clause. The use of “nothing” means that the operation changes none of the attributes in the class. This is equivalent to listing none of the attributes of the class in the changes clause.
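The obligation that a changes clause places on an implementation can be approximated at run time by comparing attribute values before and after a call. The sketch below is in Python rather than Omnibus and is only an illustration: the changes decorator, the mutable ContractKiller model and the deliberately faulty bad_kill method are all hypothetical names introduced here, and nothing in it is meant to describe how the Omnibus tools actually enforce the clause.

```python
import functools

_MISSING = object()  # marks an attribute that does not exist in one of the snapshots


def changes(*allowed):
    """Hypothetical decorator: fail if the method alters an attribute not named in `allowed`."""
    def decorate(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            before = dict(vars(self))            # snapshot of the primed values
            result = method(self, *args, **kwargs)
            after = vars(self)
            for name in before.keys() | after.keys():
                changed = before.get(name, _MISSING) != after.get(name, _MISSING)
                if changed and name not in allowed:
                    raise AssertionError(
                        f"{method.__name__} changed {name!r}, "
                        "which is not in its changes clause")
            return result
        return wrapper
    return decorate


class ContractKiller:
    """Mutable Python model, used only to exercise the decorator."""

    def __init__(self):
        self.fee_paid = 0
        self.kill_count = 0

    @changes("fee_paid")
    def pay_fee(self, amount):
        self.fee_paid += amount

    @changes("fee_paid")  # deliberately too narrow, to show the check firing
    def bad_kill(self):
        self.kill_count += 1
        self.fee_paid = 0
```

In this sketch, calling changes() with no names corresponds to “changes nothing”, and listing every attribute corresponds to “*”.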

3.2.3 Producing the correct contract

We now have a contract for the ContractKiller but is it correct? In other words, does it correctly characterise the behaviour we require of a ContractKiller? This is a hard question to answer because we have no formal description of what the contract for a ContractKiller should look like. All we have is some informal conception of what a ContractKiller should do and this is what we used to create the formal contract. There is no way to formally verify that our formal contract is consistent with our informal conception.



Omnibus addresses this problem by allowing the user to express high-level, critical requirements of the class formally and then verify that these requirements are consistent with the contract of the class. These requirements are formal representations of the critical requirements of the informal conception and can be used to detect problems in the specification.

Invariants

There are two different kinds of requirement in Omnibus. The first is the invariant. An invariant is an assertion which should hold over all (publicly accessible) reachable states of an object of the class. Constructors and creators should establish the truth of this assertion and operations should retain its truth.

We can consider what invariants are appropriate within the ContractKiller class. In this class, the two variables feePaid and killCount are declared as integers. This means that they can hold any value within the range -2,147,483,648 to 2,147,483,647. However, does it make sense for either of these values to be negative? Certainly not. A negative value for the feePaid would indicate that the Executive had received money from the Killer. Clearly this should not be permitted by the contract. Thus, we could introduce a couple of invariants which state that the values of both of these variables should be greater than or equal to zero. Each invariant has its own name. The invariants are shown below.

    invariant feePaidIsValid: feePaid >= 0
    invariant killCountIsValid: killCount >= 0

We should now check that these invariants are respected, i.e. that each constructor/creator establishes their truth and that each operation retains their truth. The enlist constructor and the kill operation both seem to be fine. However, there is a problem with the payFee operation. The contract for this operation is shown below.

    operation payFee(amount:integer)
        changes feePaid
        ensures feePaid = feePaid’ + amount



The parameter of the operation is an integer and hence can have any value within the range -2,147,483,648 to 2,147,483,647. If the value of this parameter is a negative number of greater magnitude than feePaid’ then the new value of feePaid will be negative and the feePaidIsValid invariant will be violated. Thus, we have discovered a problem with the contract. If we consider the meaning of this operation, it is for the payment of money from the contractor to the killer. As such, the value being passed should be positive (or zero), representing the passing of money (possibly none) from the contractor to the killer. A negative value would represent the passing of money from the killer to the contractor, which is not something we wish to permit. We can rectify the problem by adding a requires clause to the payFee operation, giving the contract shown below.

    operation payFee(amount:integer)
        requires amount >= 0
        changes feePaid
        ensures feePaid = feePaid’ + amount

Using the new definition of the payFee operation and including the invariants, we can re-write the ContractKiller class as follows:

    class ContractKiller {
        attribute feePaid:integer
        attribute killCount:integer

        constructor enlist()
            ensures feePaid = 0 & killCount = 0

        function fee():integer
            ensures result = 500

        operation payFee(amount:integer)
            requires amount >= 0
            changes feePaid
            ensures feePaid = feePaid’ + amount

        operation kill()
            requires feePaid >= fee()
            changes killCount, feePaid
            ensures killCount = killCount’ + 1 & feePaid = 0

        invariant feePaidIsValid: feePaid >= 0
        invariant killCountIsValid: killCount >= 0
    }
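The problem we found with payFee by inspection can also be found mechanically, at least over a small range of values. The following Python sketch (not Omnibus) searches exhaustively for a starting state that satisfies both invariants, and an amount accepted by the requires clause, from which the ensures clause of payFee leads to a state that breaks an invariant. The function names and the bound of 3 are our own choices for illustration. A bounded search of this kind can only find counterexamples; an empty result is evidence for the bounded domain, not a proof.

```python
from itertools import product


def inv(fee_paid, kill_count):
    """Conjunction of feePaidIsValid and killCountIsValid."""
    return fee_paid >= 0 and kill_count >= 0


def pay_fee_post(fee_paid_old, amount):
    """New feePaid as fixed by: ensures feePaid = feePaid' + amount."""
    return fee_paid_old + amount


def counterexamples(requires, bound=3):
    """Return (feePaid', killCount', amount) triples where payFee breaks an invariant."""
    found = []
    values = range(-bound, bound + 1)
    for fee_paid_old, kill_count_old, amount in product(values, repeat=3):
        if not inv(fee_paid_old, kill_count_old):
            continue  # only start from states that satisfy the invariants
        if not requires(amount):
            continue  # the contractor must satisfy the requires clause
        fee_paid_new = pay_fee_post(fee_paid_old, amount)
        # killCount is not in the changes clause, so it keeps its old value
        if not inv(fee_paid_new, kill_count_old):
            found.append((fee_paid_old, kill_count_old, amount))
    return found


# Original contract: no requires clause, so every amount is acceptable.
without_requires = counterexamples(lambda amount: True)

# Corrected contract: requires amount >= 0.
with_requires = counterexamples(lambda amount: amount >= 0)
```

With no requires clause the search reports states such as feePaid’ = 0 with amount = -1, while with requires amount >= 0 it reports none within the bound.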

Formalising Invariant laws

Before moving on it is important to formalise exactly what it means for the invariants of a class to be respected. We informally stated that:

Constructors and creators should establish the truth of the invariants and the operations should retain their truth.



We will now try to give formal meanings for these. Let us consider constructors and creators first. At this level, these are equivalent. Thus, let us only present the handling of constructors here. Suppose we have a constructor defined as follows:

    constructor con(params)
        requires pre
        ensures post

The truth of the requires clause before the constructor call and the truth of the ensures clause after the constructor call should imply the truth of the invariant after the constructor call. We take the assertions pre and post to represent the values of the pre-condition (requires clause) and post-condition (ensures clause) as evaluated after the execution of the method. We take their primed variations, pre’ and post’, to represent the values of the pre- and post-conditions as evaluated before the execution of the method. Finally, we introduce an assertion inv to represent the invariants of the class. This is constructed by conjoining all of the invariants defined in the class.

We can think of the assertions used in the law as being applied to two arbitrary sets of attribute and parameter values, where the requires clause holds over the first set and the ensures clause holds over the second (with primed values taken from the first). We should then be able to logically conclude the truth of the invariant over the second set (with primed values taken from the first). We can express this as a symbolic assertion, giving us a formal definition for the invariant establishment law.

Constructor and Creator Invariant establishment law:

    pre' & post => inv

where primed symbols represent the values of assertions evaluated before the constructor call and unprimed symbols represent the values of assertions evaluated after the constructor call.

We can perform the same process to give us a formal law for the retainment of invariants by operations. Suppose we have an operation defined as follows.

    operation op(params)
        requires pre
        changes ch
        ensures post

The truth of the requires clause and the invariants before the operation call, together with the truth of the ensures clause after the operation call, should imply the truth of the invariants after the operation call. Following the same process, we can also express this as a symbolic theorem, giving us a formal definition for the invariant retainment law.

Operation Invariant retainment law:

    pre' & inv' & post => inv

where primed symbols represent the values of assertions evaluated before the operation call and unprimed symbols represent the values of assertions evaluated after the operation call.

Concrete Constraints

Constraints provide another way of describing critical requirements in Omnibus. These can be used to express more general requirements of a class. They effectively allow test cases to be formally expressed and included within the contract of a class. Constraints take the form of expressions which manipulate instances of the class of interest. A requirement of a valid class is that each of its constraint expressions should evaluate to true.

An advantage of constraints is that they are easily readable to programmers. Constraints can be used to give examples of how the class will handle different situations. This can be particularly useful to explicitly illustrate how boundary cases are handled. Thus constraints help to spell out the behaviour of a class in simple and readable terms. This is something which many formal methods fail to do, and this failure helps make many programmers wary of these methods. Such programmers use testing because it is the only way they can verify for themselves the exact semantics of the class.

Omnibus couples the formal verification and testing approaches. Formal verification is supported through invariants (and the symbolic constraints that we will consider later in this chapter) and testing is supported through constraints, which play the role of test cases in such a system. It is only with facilities such as constraints that the need for manual testing can be completely eliminated.

We can formulate a number of constraints for the ContractKiller class. These should be carefully selected to illustrate key concepts of the class. We could formulate the following constraints:

    constraint killIncreasesKillCount:
        ContractKiller.enlist().payFee(500).kill().killCount() = 1
    constraint killTakesAllMoney:
        ContractKiller.enlist().payFee(600).kill().feePaid() = 0
    constraint payFeeIncreasesFeePaid:
        ContractKiller.enlist().payFee(50).payFee(170).feePaid() = 220

These constraints illustrate key properties of the ContractKiller class in a manner which is accessible to any programmer. The killIncreasesKillCount constraint shows how the kill operation increases the killCount. This constraint illustrates the main functionality of the class. The killTakesAllMoney constraint illustrates a subtle aspect of the kill operation. The kill operation takes all of the fee currently paid to the killer when it performs the kill, not just their fee (what do you expect them to do? They are, after all, criminals!). Constraints of this nature are very useful in showing subtle aspects of the functionality of a class. The payFeeIncreasesFeePaid constraint shows that the effect of payFee is cumulative. Another way of handling payFee would be to re-assign the feePaid to be the value passed into the last payFee method call. Thus, this constraint shows which of a collection of possible approaches is taken to a specific problem. Including the constraints, we can re-write the ContractKiller class as follows.

    class ContractKiller {
        attribute feePaid:integer
        attribute killCount:integer

        constructor enlist()
            ensures feePaid = 0 & killCount = 0

        function fee():integer
            ensures result = 500

        operation payFee(amount:integer)
            requires amount >= 0
            changes feePaid
            ensures feePaid = feePaid’ + amount

        operation kill()
            requires feePaid >= fee()
            changes killCount, feePaid
            ensures killCount = killCount’ + 1 & feePaid = 0

        invariant feePaidIsValid: feePaid >= 0
        invariant killCountIsValid: killCount >= 0

        constraint killIncreasesKillCount:
            ContractKiller.enlist().payFee(500).kill().killCount() = 1
        constraint killTakesAllMoney:
            ContractKiller.enlist().payFee(600).kill().feePaid() = 0
        constraint payFeeIncreasesFeePaid:
            ContractKiller.enlist().payFee(50)
                .payFee(170).feePaid() = 220
    }

Constraints such as the ones above can easily be converted to run-time test cases similar to those from the JUnit framework. Thus, constraints can be used to formally encode test cases. The analyser can also generate and run test scenarios directly from them, saving much programmer effort. Constraints are part of the contract of the class. They can be viewed by the contractor and can be checked by the analyser to ensure they are respected by the class definition.
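As an illustration of how the three concrete constraints could be read as run-time test cases, the sketch below models the class in Python and evaluates each constraint as a boolean expression. It is not the output of the Omnibus analyser. The Python class, its method bodies, the snake_case method names and the run_constraints helper are assumptions made for this example. Objects are modelled as immutable values so that calls can be chained exactly as in the constraint expressions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractKiller:
    """Hypothetical Python model; each operation returns the resulting object."""
    paid: int = 0
    kills: int = 0

    @classmethod
    def enlist(cls):
        return cls()

    def fee(self):
        return 500

    def fee_paid(self):
        return self.paid

    def kill_count(self):
        return self.kills

    def pay_fee(self, amount):
        assert amount >= 0, "requires amount >= 0"
        return ContractKiller(self.paid + amount, self.kills)

    def kill(self):
        assert self.paid >= self.fee(), "requires feePaid >= fee()"
        return ContractKiller(0, self.kills + 1)


# Each constraint becomes a named, zero-argument check returning a boolean.
CONSTRAINTS = {
    "killIncreasesKillCount":
        lambda: ContractKiller.enlist().pay_fee(500).kill().kill_count() == 1,
    "killTakesAllMoney":
        lambda: ContractKiller.enlist().pay_fee(600).kill().fee_paid() == 0,
    "payFeeIncreasesFeePaid":
        lambda: ContractKiller.enlist().pay_fee(50).pay_fee(170).fee_paid() == 220,
}


def run_constraints(constraints):
    """Evaluate every constraint and report which ones hold."""
    return {name: check() for name, check in constraints.items()}


results = run_constraints(CONSTRAINTS)
```

A class is acceptable with respect to its concrete constraints only if every entry in results is true.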



Symbolic constraints and quantifiers

A shortcoming of the constraints we have looked at so far is that they can only reason about a particular sequence of constructor/creator and operation calls with specific, concrete values. They do not give us the foundation to make any general statements about the component. For example, the payFeeIncreasesFeePaid constraint tells us that enlisting, then paying 50, then paying 170 means we have paid 220. It does not say anything about whether enlisting, then paying 60, then paying 170 will mean we have paid 230. We may assume this to be the case but the constraint itself says nothing general enough to justify the assumption. This is the same problem that testing suffers from.

We would like to be able to say that if we enlist, then pay any value x, then pay any value y then we will have paid the value x+y (assuming x and y are appropriate non-negative numbers). To express such a statement, we need to make use of the universal quantifier. We will now describe the addition of quantifiers to the Omnibus language with this motivation.

Omnibus supports the two basic quantifiers, forall and exists. These are the universal and existential quantifiers from Mathematics and are commonly used in formal analysis. The syntax for the forall expression is:

    “forall” declarations (“where” boolean)? “(” boolean “)”

and it means:

For all possible assignments of values to the variables in the declarations section (such that the where clause is satisfied, if given) does the quantified boolean expression evaluate to true?

Forall expressions containing a where clause are converted to forall expressions without a where clause. The where clause is simply a notational convenience. The forall expression:

    forall x:integer where x > 0 (magnitude(x) = x)

can be translated to the equivalent forall expression without the where clause.

    forall x:integer (x > 0 => magnitude(x) = x)

The exists quantifier is the existential quantifier from Mathematics. The syntax for the exists expression is:

    “exists” declarations (“where” boolean)? “(” boolean “)”

and it means:



Does there exist any assignment of values to the variables in the declarations section such that the quantified boolean expression evaluates to true?
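The meaning of the two quantifiers, and the translation of a where clause into an implication, can be illustrated over a finite range of values. The Python sketch below is an approximation only. Omnibus quantifies over every value of the declared type, whereas the sketch enumerates a small range chosen here. The helper names forall and exists mirror the Omnibus keywords but are ordinary Python functions written for this example, and magnitude is assumed to be the absolute value function. The treatment of where in exists as a filter (equivalently, a conjunction) is the usual logical convention and is our assumption; the text above does not state it.

```python
from itertools import product


def forall(domains, body, where=lambda *values: True):
    """True if body holds for every assignment that satisfies the where clause."""
    return all(body(*values) for values in product(*domains) if where(*values))


def exists(domains, body, where=lambda *values: True):
    """True if body holds for at least one assignment that satisfies the where clause."""
    return any(body(*values) for values in product(*domains) if where(*values))


def magnitude(x):
    # Assumed meaning of magnitude in the example: the absolute value.
    return abs(x)


ints = range(-50, 51)  # small stand-in for the integer type

# forall x:integer where x > 0 (magnitude(x) = x)
with_where = forall([ints], lambda x: magnitude(x) == x, where=lambda x: x > 0)

# forall x:integer (x > 0 => magnitude(x) = x), writing p => q as (not p) or q
as_implication = forall([ints], lambda x: (not x > 0) or magnitude(x) == x)

# Without the restriction the property fails, for example at x = -1.
without_restriction = forall([ints], lambda x: magnitude(x) == x)

# exists x:integer (magnitude(x) = x)
some_fixed_point = exists([ints], lambda x: magnitude(x) == x)
```

The first two expressions agree, which is the equivalence the where-clause translation relies on.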

Using these quantifiers, we can now express more general constraints which should hold over the objects of a class. In particular, the forall quantifier gives us the facilities to describe the constraint we mentioned informally earlier. The statement we are referring to is:

If we enlist, then pay any value x, then pay any value y then we will have paid the value x+y (assuming x and y are appropriate non-negative numbers).

This can be expressed as follows:

    forall x:integer, y:integer where x >= 0 & y >= 0 (
        ContractKiller.enlist().payFee(x).payFee(y).feePaid() = x+y
    )

Note the use of the where clause to restrict the domain of the quantified variables so that they will satisfy the requires clause of the payFee operation. This assertion is still not as general as it could be because it only refers to what happens when you pay two fee values in immediate succession, immediately after an enlist call. We can use quantifiers to express even more general statements by quantifying over objects. For example, consider the following assertion:

    forall k:ContractKiller, amount:integer where amount >= 0 (
        k.payFee(amount).feePaid() = k.feePaid() + amount
    )

This assertion says that, for all possible ContractKiller objects and all possible amounts to pay, the feePaid after applying payFee with the amount parameter is equal to the previous value of feePaid plus the value of the amount paid. These quantified expressions can be expressed as constraints just like the ones we considered earlier and hence included in the class definition. Constraints such as those considered earlier are referred to as concrete constraints as they are defined in terms of concrete values. Constraints containing quantified expressions are referred to as symbolic constraints as they are defined in terms of symbolic values (i.e. in terms of quantified variables).
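Although a quantified assertion cannot be evaluated directly over all integers or all objects, it can be spot-checked over a small sample. The Python sketch below does this for the two quantified assertions above. It is a bounded check written for illustration, not a verification and not part of Omnibus: the Python model of the class, the chosen ranges and the variable names are all assumptions. The sample of ContractKiller objects is built directly from attribute values that satisfy both invariants.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ContractKiller:
    """Hypothetical Python model; each operation returns the resulting object."""
    paid: int = 0
    kills: int = 0

    @classmethod
    def enlist(cls):
        return cls()

    def fee_paid(self):
        return self.paid

    def pay_fee(self, amount):
        assert amount >= 0, "requires amount >= 0"
        return ContractKiller(self.paid + amount, self.kills)


# Bounded stand-ins for the quantified domains. The where clauses
# (x >= 0, y >= 0, amount >= 0) are respected by construction.
amounts = range(0, 21)
killers = [ContractKiller(paid, kills)
           for paid, kills in product(range(0, 6), range(0, 4))]

# forall x, y where x >= 0 & y >= 0
#   (ContractKiller.enlist().payFee(x).payFee(y).feePaid() = x+y)
two_payments_add_up = all(
    ContractKiller.enlist().pay_fee(x).pay_fee(y).fee_paid() == x + y
    for x, y in product(amounts, repeat=2)
)

# forall k, amount where amount >= 0
#   (k.payFee(amount).feePaid() = k.feePaid() + amount)
payment_adds_to_any_killer = all(
    k.pay_fee(amount).fee_paid() == k.fee_paid() + amount
    for k, amount in product(killers, amounts)
)
```

Both checks succeed for this model, but only formal analysis can establish the assertions for every value outside the sampled ranges.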



We can re-write the ContractKiller class including the new constraints to give the following:

    class ContractKiller {
        attribute feePaid:integer
        attribute killCount:integer

        constructor enlist()
            ensures feePaid = 0 & killCount = 0

        function fee():integer
            ensures result = 500

        operation payFee(amount:integer)
            requires amount >= 0
            changes feePaid
            ensures feePaid = feePaid’ + amount

        operation kill()
            requires feePaid >= fee()
            changes killCount, feePaid
            ensures killCount = killCount’ + 1 & feePaid = 0

        invariant feePaidIsValid: feePaid >= 0
        invariant killCountIsValid: killCount >= 0

        constraint killIncreasesKillCount:
            ContractKiller.enlist().payFee(500).kill().killCount() = 1
        constraint killTakesAllMoney:
            ContractKiller.enlist().payFee(600).kill().feePaid() = 0
        constraint payFeeIncreasesFeePaid:
            ContractKiller.enlist().payFee(50)
                .payFee(170).feePaid() = 220

        constraint symbolicPayFeeIncreasesFeePaid:
            forall x:integer, y:integer where x >= 0 & y >= 0 (
                ContractKiller.enlist().payFee(x).payFee(y)
                    .feePaid() = x+y
            )
        constraint symbolicPayFeeAddsAmount:
            forall k:ContractKiller, amount:integer where amount >= 0 (
                k.payFee(amount).feePaid() = k.feePaid() + amount
            )
    }



Symbolic constraints allow more general statements to be made about the class. Thus, they allow more general properties of the class to be described. However, as is clear from the latest version of the ContractKiller class, they are far less readable. The payFeeIncreasesFeePaid concrete constraint is easy to understand and gets the basic point across in a manner accessible to every programmer in the language. The symbolic version, symbolicPayFeeIncreasesFeePaid, while more general, is less readable, and the use of quantifiers may lose some programmers. It is also not possible to use quantifiers in run-time expressions, so run-time test cases cannot be automatically generated from symbolic constraints in the same way that they can be for concrete constraints. Because symbolic and concrete constraints have complementary advantages and disadvantages, a combination of the two should be used to describe the critical requirements of a class.
