Extreme Grammaring
Development of an
industrial strength
ISO VDM-SL Grammar
Introduction
Need of building a parser for VDM from a VDM-SL grammar (VooDooM project)
Although parsing is a subset of a well studied area like compilers, grammars were always looked upon as the “ugly duckling”.
Extreme Programming + Engineering of Grammars =
Extreme Grammaring
Solution:
Background
VDM: Vienna Development Method (VDM) is one of the
most mature formal methods Primarily intended for formal specification and
development of functional aspects of software systems.
The importance of a VDM-SL Grammar: Documentation Build a parser (metric generators, language
generators, ...)
Starting point
Previous work VDM-SL grammar in Happy + Alex Some problems
State of Art (Hacking v.s. Engineering) Grammar was encoded directly Difficult to maintain/change (300 rules) Lack of tool support ...
Principles of Grammar Engineering
Introduced by Lämmel in “Towards an engineering discipline for grammaware”
1. Start from specifications - base-line grammar2. Implement by customization - technology,
implementation3. Separate concerns - modularization4. Enable evolution - minimize impact of changes5. Ensure quality - metrics, correctness6. Automate - traceability and scalability
Extreme Programming
1. The Planning Game - scope, priorities, technical estimates 2. Small Releases - very short release cycle3. Metaphor - shared story4. Simple Design - remove complexity when found5. Testing - continuous unit testing, test-first design6. Refactoring - restructure without functionality changes7. Pair Programming - two programmers one machine8. Collective Ownership - change each others code (anytime)9. Continuous Integration - build and test10. 40-Hour Week - work more = produce less11. On-Site Customer - user in team12. Coding Standards - no irrelevant personal preferences
Extreme Grammaring
1. The Planning Game - scope, priorities, technical estimates2. Small Releases - very short release cycle3. Metaphor - shared story4. Simple Design - remove complexity when found5. Testing - continuous unit testing, test-first design6. Refactoring - restructure without functionality changes7. Pair Programming - two programmers one machine8. Collective Ownership - change each others code (anytime)9. Continuous Integration - build and test10. 40-Hour Week - work more = produce less11. On-Site Customer - user in team12. Coding Standards - no irrelevant personal preferences
The Planning Game
Scope Follow strictly the ISO VDM-SL grammar spec
Priorities1. Disambiguate types
2. Disambiguate full grammar
3. Tree construction Technical estimates
Not defined...
Small Releases
Programmed releases (completed): 0.0.1 - Grammar typed from standard 0.0.2 - Disambiguated grammar 0.0.3 - AST construction
Future Releases 1.0 - Haskell front-end (finished - 29-12-2004) 2.0 - Java front-end
Testing
White box Structural testing Full visibility into how system works
Black box Functional or behavioral testing Only the interface with exterior is available
Grammar Unit Testing
Unit test Test a single method, function, etc...
Different types of unit tests: Parsing (succeeds, fails) Well-formness of the tree
Test suite Combination of all unit test
Test Coverage
Rule coverage Introduced by Purdon (1971) Explores all rules of a grammar Simple measure but doesn’t cover all cases
Context-dependent rule coverage Introduced by Lämmel in “Grammar Testing” Generalization of the above in which the context
is taken in account No known implementations
Test Coverage Metrics
KP - Kernel Productions KPr - Kernel Priorities S - States RSa - Rule Size average RSm - Rule Size maximum RC - Rule Coverage Percentage
version KP KPr S RSa RSm RC
0.0.1 1005 815 2278 2 17 52%
0.0.2 946 2753 3396 2 17 54%
0.0.3 900 2753 3518 2 17 55%
Test Coverage Metrics (2)
Although the “Generics” test-suite does not change the coverage gets lower (Injections, total nr rules)
The “expressions” and “functiontypes” were only added in 0.0.2 version.
version Generics expressions functiontypes All
0.0.1 52% 0 % 0% 52%
0.0.2 51% 24% 6% 54%
0.0.3 50% 25% 5% 56%
Refactoring
Semantic preserving transformations Study made by Lämmel in “Grammar Adaptation”
Operators: preserve - replace a phrase by an equivalent fold - replace for its definition unfold - extract definition introduce - introduction of non-terminals eliminate - elimination of non-terminals rename - renaming non-terminals
Continuous Integration
The integration test suite is a set of generic real world examples
Only 52% coverage Examples are difficult to find Most of the examples use language extensions
Examples: Found on internet Used a pre-processor for extracting code from literal
programs.
Code Standards
Nothing found about the subject The following can be applied:
Limiting the number of children in a rule Limiting the number of alternatives in a rule Prefer some sort of constructs than other Convention for the non-terminal names Convention for syntax specification Limit module size ...
Supporting the Methodology
SDF - Syntax Definition Formalism Purely declarative Very expressive with natural and concise syntax Modular structure Supported by Scannerless Generalized LR Parsing
Supports compositional grammars Allows parsing with ambiguities (allows earlier testing) Disambiguation is separated from the grammar using
priority and associative rules
SDF - Technology
Parsing sdf2table, sglr
Testing test-unit, ambtracker, SdfCoverage
Tree visualization tree2graph, graph2dot
Transformation trm2baf, implodePT
Haskell Generation Sdf2Haskell (AST, Pretty Printer)
Setting up the bases
Hard copy of the ISO VDM-SL standard (ISO/IEC 13817-1)
Initial test suite Real world examples (loc: 1970) Exercises from Formal Methods course (loc: 507)
Software: CVS to keep track of all changes parse-unit (sdf unit testing tool) Sdf2 software bundle (sdf2table, sglr) SdfCoverage Starndard unix tools (text editor, make, ...)
Development cycle
1. Initial grammar2. Correction
1. Correct grammar rules2. Correct test suite
3. Disambiguation1. Add filters2. Change grammar shape
Steps 2 and 3 should make heavy use of testing
Grammar correction
1. Isolate problem Source location Grammar rules involved
2. Correct grammar Change syntax (test suite) Run to verify test succeeds Run entire test battery
3. Commit Change Document change in message
Grammar Disambiguation
1. Isolate problem Source location Grammar rules involved
2. Create unit test Captures error Run to guarantee this
3. Correct grammar Add disambiguation filter (change syntax) Run to verify unit test succeeds Run entire test battery
4. Commit Change Document change in message
Grammar Metrics
Simple metrics Total Number of Terminals (AVG per rule) Total Number of Non-terminals (AVG per rule)
Complex metrics Introduced by Malloy in “A metrics suite for
grammar-base software” McCabe Cyclomatic complexity Halstead Effort ...
Problems found
ISO Document has ambiguities in its specification
Syntax Expressions:
Apply v.s. RecordConstructor Apply v.s. IsDefTypeExpr
EqualsDefinition CallStatement v.s. Expression
Lexical Quotes are allowed in strings and in characters
Future plans
Short-term (VDM parser “clients”): VooDooM Formal methods projects MCZ Objectifier Camila revival?
Long-term Topic open for discussion...
What’s next?
Test set completion (fill the rest 44%) Test generation Add examples manually
Analyze the rules that were not covered Try to find pathologies
Compute Grammar Metrics Test the methodology developing other
grammars.
Conclusion
Work was completed in only 3 weeks A complete grammar of the ISO VDM-SL is
for the first time public available (parser) A strong methodology for grammar
developing was defined Grammar testing were put to practice
Different types of tests Test coverage
Thank you!
Questions / Discussion
Top Related