Landau's “Grundlagen der Analysis” from Automath to lambda-delta

28
Landau’s “Grundlagen der Analysis” from Automath to lambda-delta Ferruccio Guidi Technical Report UBLCS-2009-16 September 2009 Department of Computer Science University of Bologna Mura Anteo Zamboni 7 40127 Bologna (Italy)

Transcript of Landau's “Grundlagen der Analysis” from Automath to lambda-delta

Landau’s “Grundlagen der Analysis”from Automath to lambda-delta

Ferruccio Guidi

Technical Report UBLCS-2009-16

September 2009

Department of Computer ScienceUniversity of BolognaMura Anteo Zamboni 740127 Bologna (Italy)

The University of Bologna Department of Computer Science Research Technical Reports are available in PDF andgzipped PostScript formats via anonymous FTP from the area ftp.cs.unibo.it:/pub/TR/UBLCS or via WWW atURL http://www.cs.unibo.it/ . Plain-text abstracts organized by year are available in the directory ABSTRACTS.

Recent Titles from the UBLCS Technical Report Series

2008-13 A Theory of Contracts for Strong Service Compliance, Bravetti, M., Zavattaro, G., June 2008.

2008-14 A Uniform Approach for Expressing and Axiomatizing Maximal Progress and Different Kinds of Time in Process Algebra,Bravetti, M., Gorrieri, R., June 2008.

2008-15 On the Expressive Power of Process Interruption and Compensation, Bravetti, M., Zavattaro, G., June 2008.

2008-16 Stochastic Semantics in the Presence of Structural Congruence: Reduction Semantics for Stochastic Pi-Calculus, Bravetti,M., July 2008.

2008-17 Measures of conflict and power in strategic settings, Rossi, G., October 2008.

2008-18 Lebesgue’s Dominated Convergence Theorem in Bishop’s Style, Sacerdoti Coen, C., Zoli, E., November 2008.

2009-01 A Note on Basic Implication, Guidi, F., January 2009.

2009-02 Algorithms for network design and routing problems (Ph.D. Thesis), Bartolini, E., February 2009.

2009-03 Design and Performance Evaluation of Network on-Chip Communication Protocols and Architectures (Ph.D. Thesis),Concer, N., February 2009.

2009-04 Kernel Methods for Tree Structured Data (Ph.D. Thesis), Da San Martino, G., February 2009.

2009-05 Expressiveness of Concurrent Languages (Ph.D. Thesis), di Giusto, C., February 2009.

2009-06 EXAM-S: an Analysis tool for Multi-Domain Policy Sets (Ph.D. Thesis), Ferrini, R., February 2009.

2009-07 Self-Organizing Mechanisms for Task Allocation in a Knowledge-Based Economy (Ph.D. Thesis), Marcozzi, A., February2009.

2009-08 3-Dimensional Protein Reconstruction from Contact Maps: Complexity and Experimental Results (Ph.D. Thesis), Medri,F., February 2009.

2009-09 A core calculus for the analysis and implementation of biologically inspired languages (Ph.D. Thesis), Versari, C., Febru-ary 2009.

2009-10 Probabilistic Data Integration, Magnani, M., Montesi, D., March 2009.

2009-11 Equilibrium Selection via Strategy Restriction in Multi-Stage Congestion Games for Real-time Streaming, Rossi, G.,Ferretti, S., D’Angelo, G., April 2009.

2009-12 Natural deduction environment for Matita, C. Sacerdoti Coen, E. Tassi, June 2009.

2009-13 Hints in Unification, Asperti, A., Ricciotti, W., Sacerdoti Coen, C., Tassi, E., June 2009.

2009-14 A New Type for Tactics, Asperti, A., Ricciotti, W., Sacerdoti Coen, C., Tassi, E., June 2009.

2009-15 The k-Lattice: Decidability Boundaries for Qualitative Analysis in Biological Languages, Delzanno, G., Di Giusto, C.,Gabbrielli, M., Laneve, C., Zavattaro, G., June 2009.

Landau’s “Grundlagen der Analysis”

from Automath to lambda-delta

Ferruccio Guidi

Technical Report UBLCS-2009-16

September 2009

Abstract

L.S. van Benthem Jutting’s specification of E.G.H.Y. Landau’s “Grundlagen der Analysis” in the formallanguage Aut−QE represents an early milestone in computer-checked mathematics and is the only non-trivial development finalized in the languages of the Automath family.

Here we report on our first attempt to encode this specification into the formal language λδ, a calculusinspired by the Automath language Λ∞, and to validate the resulting data set against λδ’s type system.

Our main achievement is the implementation of a software processor taking care of all the translation-related issues, including the validation and the long-term persistence of the translated data.

In this respect, the present paper fits in a research thread that aims at improving λδ by testing itscapability to encode real-world formal mathematics a reasonable manner.

1

CONTENTS

Contents

1 Introduction 22 A SuitableMathematical Theory 32.1 Sources of Information 32.2 Incompatibilities 32.3 Parsing the Specification 42.4 Numerical Summary of the Specification 53 The Abstract Layer 53.1 The Abstraction Step 63.2 Disambiguation 63.3 Retrieving the Formal Parameters 73.4 Expanding the Abbreviated Applications 83.5 Numerical Summary of the Specification in the Abstract Layer 84 The Kernel Layer 84.1 The Implemented Variant of λδ 94.2 The Instantiation Step 114.3 Persistence of the Translated Data 114.4 Numerical Summary of the Specification in the Kernel Layer 125 Validation 135.1 Reduction and Conversion 145.2 Canonical Typing 175.3 Numerical Summary of the Specification Check 186 Conclusions and Future Work 186.1 Acknowledgements 19A Some Points on the Implementation 19A.1 Interfaces of Some Compilation Units 19A.2 Performance Issues 23B Related Topics 23B.1 The “Purity” Rule in the Validation of the “Grundlagen” 23B.2 A Consequence of Unrestricted Sort Inclusion 24B.3 The XML Document Type Definition for λδ 24

1 Introduction

The formal system λδ [12] is a λ-calculus developed in the context of the HELM project [2] at theUniversity of Bologna. The calculus is inspired by the formal system Λ∞ [25] but remarkablyfeatures non-recursive definitions and explicit type annotations inside terms and environments.

In actual fact, the current version of λδ can be improved in many ways and in particular weplan to increase its expressive power, which is now comparable to the one of λ→, until we canuse the calculus “to encode a non-trivial fragment of Mathematics in a realistic manner” [12].

In this perspective it is interesting to test λδ’s capability to encode formalized mathematicaltheories on real-world examples. Given that such theories must be non-trivial, i.e. they mustcontain hundreds of proved theorems, we are naturally led to consider computer-assisted tests.

In particular, performing such tests requires the following: a real-world mathematical theoryof which a suitable formal specification is available in digital format, an implemented procedurethat encodes the items of this theory (definitions, theorems, etc.) into λδ, and an implementedvalidator that checks the correctness of the encoded items with respect to λδ’s type system.

In this paper we describe how we met the above requirements and what results we obtained.The chosen theory is Landau’s “Grundlagen der Analysis” [17], formalized by Jutting [23] in

the Automath language Aut−QE [27] and recovered by Wiedijk [28] (see Section 2).The procedurewe use to encodeAut−QE into λδ consists of two stages: abstraction (Section 3)

and instantiation (Section 4). Since explaining this procedure requires a detailed knowledge ofthe concrete Automath syntax and of the facilities it provides for coping with the verbosity of

UBLCS-2009-16 2

2 A Suitable Mathematical Theory

formal specifications, we take the opportunity of writing this paper to present this informationin a single text (see Subsection 2.3, Subsection 3.2, Subsection 3.3, Subsection 3.4).

We also want to review and update some concepts about λδ (see Subsection 4.1). In particularthis paper is not self-contained and some knowledge of Automath and λδ is assumed.

We describe the structure and the algorithms of our validator in Section 5, including someimplementation details in Appendix A. Our concluding remarks are in Section 6.

2 A Suitable Mathematical Theory

The best mathematical theory to use as a test case, is Landau’s “Grundlagen der Analysis” [17],containing 301 main theorems on the arithmetic of rational, irrational and complex numbers.

This theory was translated into the language Aut−QE [27] by Jutting [23] and represents theonly non-trivial development finalized in the languages of the Automath family [19, 9], whichare the closest to λδ. So our interest in testing λδ on Jutting’s specification is fully motivated.

The issues we have with the management of the specification are related on one hand to theavailability of practical information on the specification itself, and on the other hand to the com-patibility of λδ and Aut−QE. We will discuss these aspects in Subsection 2.1 and Subsection 2.2.

We recall the general structure and grammar of an Automath text in Subsection 2.3, while inSubsection 2.4 we present some numerical data about the specification’s contents.

2.1 Sources of Information

The digital version of the specification was recovered from Jutting’s original files [28] and isincluded in the latest distribution of Wiedijk’s validator for Aut−68 [24] and Aut−QE.

The specification, whose concrete syntax is found in [28], exploits a “paragraph system” brieflymentioned in [30] and fully explained in [23] Appendix 2. Jutting also exploits global referenceswith implicit arguments as explained in [27]. Both these facilities are recalled in Section 3.

We are still missing Jutting’s detailed explanation of his specification (five volumes titled “Atranslation of Landau’s Grundlagen in AUTOMATH” of which only the cover pages are available).

In any case some aspects of the language facilities used by Jutting seem undocumented andthus remain ambiguous to us. As a reasonable way out, we checked how these ambiguities aresolved in the “Grundlagen” knowing that the specification must be correct as it stands.

2.2 Incompatibilities

The systems Aut−QE and λδ differ in three substantial aspects as Aut−QE supports η-reduction,”purity” and “sort inclusion” whereas λδ does not support these features in its present version.

Happily, the impact of η-reduction is very low in the case of the “Grundlagen” as only two η-reduction steps are needed to validate Jutting’s specification [23]. The cause of these reductionswas studied by van Daalen, who proposed a couple of corrections to the specification with whichη-reduction is avoided during validation. Such corrections, reported by Jutting in [23], amountto η-expanding a function symbol, i.e. replacing f with λx.f(x), in two places of the specification.We easily applied these corrections to the original specification source and we checked, using thereduction reporting facility of Wiedijk’s validator, that they indeed behave as expected.

Our implemented validator supports “sort inclusion” (i.e. a form of subtyping) and ”purity”(i.e. a feature by which the type inference operation commutes with the application constructor)by implementing a slightly extended version of λδ’s original type checking algorithm.

We stress that these extensions are not meant just for validating the “Grundlagen”, but repre-sent programmed improvements of λδ: in particular ”purity” will be supported in the forthcom-ing version of the calculus and the support for “sort inclusion” is currently under design.

Remarkably, the impact of ”purity” is really low in the “Grundlagen” as this feature is neededonly once for the validation of the global constant denoted by the URI <ld:/l/r/ande2.ld >according to our URI assignment system (see Subsection 3.1 and Subsection 3.2).

UBLCS-2009-16 3

2 A Suitable Mathematical Theory

(contents)<book> ::= [ <line> ] * <EOF><line> ::= <section> | <context> | <opener> | <decl> | <def><section> ::= "+" [ " * " ]? <id> | "-" <id> | "--"<context> ::= <STAR> | <qid> <STAR><opener> ::= <id> <DEF> <EB> <E> <term>

| <id> <E> <term> <DEF> <EB>| "[" <id> <OF> <term> "]"

<decl> ::= <id> <DEF> <PN> <E> <term>| <id> <E> <term> <DEF> <PN>

<def> ::= <id> <DEF> [ "˜" ]? <term> <E> <term>| <id> <E> <term> <DEF> [ "˜" ]? <term>

<term> ::= <TYPE> | <PROP>| <qid> [ "(" [ <term> [ "," <term> ] * ]? ")" ]?| "[" <id> <OF> <term> "]" <term>| "<" <term> ">" <term>

<qid> ::= <id> [ ‘"‘ [ <id> ]? [ <PATH> <id> ] * ‘"‘ ]?<id> ::= [ "0"-"9" | "A"-"Z" | "a"-"z" | "_" | "’" | "‘" ]+

(presentational variants)<STAR> ::= " * " | "@"<DEF> ::= ":=" | "="<EB> ::= "---" | "’eb’" | "EB"<PN> ::= "???" | "’pn’" | "PN" | "’prim’" | "PRIM"<E> ::= "_E" | "’_E’" | ";" | ":"<OF> ::= ":", ","<TYPE> ::= "’type’" | "TYPE"<PROP> ::= "’prop’" | "PROP"<PATH> ::= "-", "."<EOF> ::= ";" | eof

(spaces and comments)<space> ::= [ space | tab | newline ]+<comment> ::= [ "#" | "%" ] [ . ] * [ newline | eof ]

| "" [ . ] * ""

Figure 1. Automath concrete syntax

2.3 Parsing the Specification

The grammar recognized by our Automath parser is presented in Figure 1 (our notational con-ventions for displaying grammars are in Figure 2). Properly nested comments are accepted.

It should be noted that the Automath grammar evolved trough time and has many variants[19], which we try to capture. However our parser recognizes “_E” in place of the original “un-derscored E” but this is not a problem since this notation does not appear in the “Grundlagen”.

It is important to recall the structure of a formal specification in the language Aut−QE.A text written in an Automath language (also known as an “Automath book”) is structured as

a sequence of lines, each asserting a statement. The following kinds of statement are available:

• Sectioning-related statement. This statement opens or closes a “paragraph” (a better trans-lationwould be “section” as pointed out in [28]). Automath “paragraphs” are possibly nestednamed scopes in which the global constants are declared or defined. It should be noted thata previously closed scope can be reopened. Moreover in a well-formed Automath book,

" " the enclosed characters | choice‘"‘ the character " [ ]? optionalspace tab newline eof special characters [ ] * zero or more- any character in the specified range [ ] + one or more. any character [ ] bracketing

Figure 2. Conventions for displaying the concrete syntax

UBLCS-2009-16 4

3 The Abstract Layer

Lines

Sectioning-related 1486Block openers 4297 +Global declarations 32 +Global definitions 6878 +Total 12693

Parameters

Actual parameters 136760 +Terms

Sorts 374 +References 162918 +Applications 3634 +Abstractions 4813 +

Int. ComplexityMeasure

Nodes count 319706

Figure 3. Numerical summary of the “Grundlagen” in Automath

sections are properly nested so only the last opened section can be closed.

• Block opener. This statement introduces a local declaration, i.e. it assigns a type to an iden-tifier in a given context. The semantics of context formation is recalled in Subsection 3.3.

• Global declaration. This is like the block opener but the declaration is global.

• Global definition. This statement defines an identifier as an abbreviation of a explicitlytyped term. The definition occurs in a given context and there is a way, never used in the“Grundlagen”, to inhibit the δ-expansion of the defined identifier to speed reduction.

An identifier declared or defined by a statement which is not sectioning-related, is termed a“notion”. Moreover the “notions” that are not “block openers” will be termed “global constants”.

Automath terms and types are λ-terms of a single syntactic category containing two sorts, ref-erences, typed abstractions and binary applications. References to global constants may have ac-tual parameters and a section indicator used in the disambiguation process (see Subsection 3.2).

2.4 Numerical Summary of the Specification

From the numerical standpoint, the formalized “Grundlagen” consists of the elements presentedin Figure 3. As an intrinsic complexity measure of the specification, we take the “nodes count”obtained by adding the quantities in the rows of Figure 3 marked with “+”.

This measure complies with the “nodes count” we defined in [13] and refers to the size of thetree representation of the whole specification thought as a single λ-term in the style of [10].

3 The Abstract Layer

The concrete languages of the Automath family provide for some facilities meant to decreasethe verbosity of Automath “books”. On the other hand λδ, being an abstract language, does notsupport this shorthand in the first place. Moreover λδ is an evolving framework, whose devel-opment process will eventually give rise to a family of languages. In this perspective we see thetranslation of Aut−QE into λδ not as a single-step encoding, but as a two-steps transformation.

The first step takesAut−QE to an abstract layer in which the shorthand facilities are removed,and that is easily projectable in any extension of λδ we are planning to develop so far.

The second step goes from the abstract layer to the desired version of the calculus which, inthe present case, is a slight extension of [12] to be outlined in Subsection 4.1.

UBLCS-2009-16 5

3 The Abstract Layer

As a concrete Automath language, Aut−QE includes the following facilities: the “paragraphsystem” allows to reuse identifiers avoiding name collisions (see Subsection 3.2); the “block sys-tem” allows to share the formal parameters that several global constants have in common1 (seeSubsection 3.3); the “abbreviation system” (known as the “shorthand facility” [27]) allows to omitsome actual parameters in a reference to a global constant (see Subsection 3.4).

Subsection 3.1 illustrates the actions taken to translate an Automath “book” into a “book” of ourabstract layer, and in Subsection 3.5 we compare these “books” in the case of the “Grundlagen”.

3.1 The Abstraction Step

A “book” of the abstract layer is organized in lines as an Automath “book” but some informationis added or deleted during the translation process. We informally describe this process below.

• The sectioning-related lines are removed and each global constant is disambiguated byassigning a unique URI [20] to it. This methodology complies with guidelines of the Hy-pertextual Electronic Library of Mathematics (HELM) [2], that pursues the view of a formalmathematical theory as set of distributed resources ordered by the dependence relation.

• The block-opening lines are also removed and the local declarations are discharged wherenecessary. The support for block openers in λδ will not be the one of Automath.

• The remaining lines are numbered consecutively (starting at 1) to ease the computation ofthe relative position of a global constant with respect to its references (see below).

• The information about a global definition being expandable or not, is stored because λδmight support unexpandable (i.e. “opaque”) definitions in the future.

• Every local reference is replaced by the corresponding position (i.e. de Bruijn’s) index [11].Note that in λδ a reference to the local environment occurs by position starting at 0 [12].

• Every global reference is replaced by the URI of the referred constant and the length of thereference’s local environment is stored for convenience (see below).

• The actual parameters of every global reference are completed by adding the omitted oneswhere necessary. λδ will support omitted parameters but not in the way of Automath.

The current version of λδ adopts a “single line” policy in the sense of [10], so all referencesoccur by position as there is no difference between the local environment and the global environ-ment. Since this solution seems unrealistic in the real world, where the global environment canbe very huge and distributed, global references by name will be supported eventually.

As it stands, the abstract layer allows both representations: the referred name is stored as aURI in the reference’s node, while the corresponding position index is given by:

i ≡ #r − #c + |Γ| − 1 being #r > #c (1)

where #c is the number of the line in which the referred constant is defined or declared, #r isthe number of the line where the reference occurs and |Γ| is the length of the reference’s localenvironment (this information is stored in the reference’s node as well).

An abstract layer “book” can be processed with a streaming policy (i.e. a single line at a time)and the performance of our implemented processor really improves by using such a policy.

3.2 Disambiguation

The proper lines of an Automath text exploiting the “paragraph system” facility, i.e. the lines thatare not sectioning-related, are grouped into possibly nested named sections.

Furthermore a “complete index” is assigned to each line and this is the list of the sections’names containing that line, sorted according to the outermost-to-innermost order.

1. Actually, the “block system” is more than a mere facility: see the discussion on “instantiation” in [9].

UBLCS-2009-16 6

3 The Abstract Layer

The “paragraph system” specification requires a “cover” section (named “l ” in the “Grundla-gen”) enclosing the entire “book”, so a complete index is never empty.

We can easily compute these indexes while we are scanning a “book” by maintaining a “cur-rent index”, initialized to the empty list, to be used as a stack. When we encounter a section-opening statement, we push the section’s name and when we encounter a section-closing state-ment, we pop the top section’s name. In doing so, we are assuming that the section openers andclosers are properly nested as the “paragraph system” specification requires.

A section can be reopened at the same nesting level but two sections having the same namecan not be nested (our implemented processor assumes that the “book” is well-formed because sois the “Grundlagen” specification, nevertheless we could easily add all the checks).

Once the index of a line is computed, that line receives a URI. Namely our URI scheme is “ld ”(i.e. λδ in Latin letters) and the URI’s hierarchy part is the rootless path whose segments are theindex components to which we add the name of the constant declared or defined in the line.2

For instance the URI for the first global constant of Landau’s specification, i.e. the implicationconnective, is<ld:/l/imp.ld >while the URI for the last global constant, i.e. a part of Landau’sproposition 301, is <ld:/l/e/st/eq/landau/n/rt/rp/r/c/satz301f.ld >.

Note that the “rule for constants” stated in the specification of the “paragraph system” impliesthat different proper lines of a well-formed “book” always receive different URIs.

A reference r in a line l can have a “complete index” or an “incomplete index” or no index atall. Such a reference is resolved by computing either the position index of the referred localdeclaration or the URI of the referred “notion”. The original disambiguation rules follow.

• If r has a “complete index” j, which is the concatenation of the the component s before thelist jt, and if the line l has the “complete index” i, which is the concatenation of the list ihbefore the component s and before the list it, then a “notion” with r’s name is looked up inthe section whose index is the concatenation of ih before j. Such a “notion” must exist andr is disambiguated by receiving its URI.

• If r has the “incomplete index” j and if the line l has the “complete index” i, then a “notion”with r’s name is looked up in the section whose index is the concatenation of i before j.Such a “notion” must exist and r is disambiguated by receiving its URI.

• If r has no index, a declaration with r’s name is looked up in the local environment of r. Ifsuch a declaration exists, r is disambiguated by receiving its position index.

• On the other hand, a “notion” with r’s name is looked up in the sections containing the linel, sorted according to the innermost-to-outermost order. Such a “notion” must exist and r isdisambiguated by receiving its URI.

• If r is a “context marker” (see Subsection 3.3), then r must refer to a “block opener” otherwiser must refer to a “global constant” or to a declaration in r’s local environment.

Our implemented processor generalizes the first two rules by extending the search for a notionthat should be in a section, say k, to the sections containing k, sorted according to the innermost-to-outermost order. This mechanism agrees with the forth rule and allows to regard a referencewithout an index as having the “complete index” of the line in which it occurs.

We remark that a referencewithout an index is resolved first in its local environment and thenin the global environment. This seems to be the originally intended order of precedence becausethe “Grundlagen” fails to validate if we reverse this precedence [28].

3.3 Retrieving the Formal Parameters

The “block system” is a peculiar feature of every concrete Automath language [19].In principle a global constant may have a list of formal parameters that are retrieved by fol-

lowing a chain of “block openers”, each representing a parameter declaration.

2. The extension “.ld ” is added to each URI to comply with HELM requirements on the URI structure [2].

UBLCS-2009-16 7

4 The Kernel Layer

To this aim every “notion” has a “context marker” indicating where its chain (also termed its“context”) starts, and the rules for constructing the chain of a “notion” are given below.

• If the “notion” has an empty marker, then its chain is empty.

• If the “notion” has a reference marker pointing to a “block opener”, then its chain containsthe chain of the “block opener” plus the “block opener” itself.

• If the “notion” has no marker and the preceding statement is a “block opener”, then theintended marker is a reference to it.

• If the “notion” has no marker and the preceding statement is a global declaration or defini-tion, then the intended marker is the one of that statement (recursively).

The intended meaning of “preceding” in the last two rules becomes unclear when the “para-graph system” is in effect. Given that the “Grundlagen” becomes incorrect if “preceding” is under-stood literally, we argue that the “block system”must be aware of the “paragraph system” somehow.In particular “preceding” reasonably means “preceding in the same section or in its parent”, butmay also mean “preceding in the same section fragment or in its parent” (recall that sections canbe closed and reopened, thus a section might be divided in many fragments).

The “Grundlagen” does not help to solve this ambiguity because Jutting always reopens asection with a statement having an explicit context marker, so further investigation is required.

3.4 Expanding the Abbreviated Applications

The “abbreviation system” [27] is a facility of some concrete Automath languages including theextension of Aut−QE that Jutting used for the formal specification of the “Grundlagen”.

This facility works as follows: suppose that a “notion” c is defined or declared in a context Γof formal parameters, say x1, . . . , xn. Then a reference to c in a subsequent line, say l, generallyneeds to be applied to n actual parameters and thus appears like c(t1, . . . , tn).

Nevertheless, if the context Γ is an initial segment of the context of the notion defined ordeclared in the line l, where the reference to c appears, this reference is allowed to take less thann actual parameters and c(tm+1, . . . , tn) must be interpreted as c(x1, . . . , xm, tm+1, . . . , tn).

Here we are assuming m ≤ n, thus all actual parameters may be omitted in some cases.

3.5 Numerical Summary of the Specification in the Abstract Layer

From the numerical standpoint, the “Grundlagen” formalized in the abstract layer consists of theelements presented in Figure 4. Here we take the intrinsic complexity measure of Subsection 2.4and comparing the two, we note an increment factor of 2.36. This value confirms that the short-hand facilities provided by the concrete Automath languages are very effective in reducing theintrinsic verbosity of the source text, which decreases of the 57.6% in the present case.

4 The Kernel Layer

In principle, our implemented processor features a multi-kernel architecture that allows to testand compare several variants of λδ. Presently we are experimenting with three variants, each ofwhich has its own type checker (also termed: kernel), but some form of code factorization will beconsidered depending on how many kernels will be tested during the processor’s life cycle.

Anyway, our multi-kernel approach is motivated in that every variant of λδ employs a differ-ent abstract syntax, which is reflected in a different implementation of the kernel structures.

Here we focus our attention on the variant we named: “Basic Relative Global” (or “brg” forshort), which is the nearest to the official λδ specification [12]. This variant will be discussed inSubsection 4.1, while considering the other variants goes beyond the scope of the paper.

The translation of Aut−QE into λδ is completed in Subsection 4.2 where the transformationtaking a “book” of the abstract layer to a “book” of λδ “brg” is outlined. Moreover Subsection 4.4contains some numerical data resulting from applying this transformation to the “Grundlagen”.

UBLCS-2009-16 8

4 The Kernel Layer

Lines

Global declarations 32 +Global definitions 6878 +Total 6910

Parameters

Formal parameters 36837 +Actual parameters 314337 +

Terms

Sorts 1637 +Local references 232821 +Global references 146270 +Applications 5488 +Abstractions 10278 +

Int. ComplexityMeasure

Nodes count 754578

Figure 4. Numerical summary of the “Grundlagen” in the abstract layer

The translated data can be stored in a persistent format as we explain in Subsection 4.3.The validation issues are addressed in Section 5 where the “brg” kernel is presented.

4.1 The Implemented Variant of λδ

The main variant of λδ implemented in our processor is a modification of the calculus χλδ intro-duced in [12] (the binder χ is not needed for the “Grundlagen” but was implemented easily).

The implemented calculus, whose abstract syntax is shown in Figure 5, features terms madeof typed abstractions, untyped abbreviations, binary applications and type annotations. A se-quence of sorts is also provided. References to the local environment occur by position throughde Bruijn’s indexes, while references to the global environment occur by name. In principle onecan use any set of names for this purpose, and here we use natural numbers for simplicity (ourimplementation really exploits natural numbers to this aim, i.e. the hash values of the URIs).

Our conventions for displaying the abstract syntax of Figure 5 follow [1]. Namely the meta-variables of the syntactic categories are listed on the left-hand side of the productions, while anyother symbol on the right-hand side of the productions belongs to the syntax itself.

We recall here some notational conventions of [12]. The use of capital letters for terms comesfrom the untyped λ-calculus tradition, whereas the symbol ∗ for the sorts comes from the PureType Systems tradition [5]. The symbols # and $, with which we distinguish numeric referencesfrom literal ones, come from the Computer Science tradition. The prefixed notation for the func-tion application comes from the Automath tradition [19] and is justified in [15].

The same sort symbol ∗ and the same binding notation is used for both terms and environ-ments because λδ pursues the unification between terms and local environments. The readershould also note that no distinction is made between terms and types (in the sense on λ→) as λδpursues the unification of the two as well. On the contrary, global environments are not meant tobe unified with the previous categories so their meta-variable G appears in script font.

With respect to the original χλδ, we removed the level indication from the environment’sbottom sort. We also removed the application and the type annotation from the environment’sconstructors (these features belong to χλδ for completeness but the calculus, as it stands in [12],does not use them at all). Moreover we split the environment into a local part and a global one,providing a differentiatedway for accessing the binders in each part, i.e. by position and by name.

In particular, the reduction schemes remain essentially the same and are shown in Figure 6.These schemes are substitution-free in the spirit of the forthcoming version of λδ, and exploit

just the “relocation function” ↑i, which always appears when de Bruijn’s indexes are used.Moreover |E2| is the “length” of E2 i.e. the number of binders in E2, while x /∈ G2 means that

UBLCS-2009-16 9

4 The Kernel Layer

i,l,x ::= natural numbers starting at 0T ,U ,V ,W ::= terms

| ∗l sort of level l| #i reference to the i-th local binder| $x reference to the global binder x| 〈U〉.T annotation of T with its type U| (V ).T application of T to the argument V| λW.T local abstraction over the type W in T| δV.T local abbreviation of V in T| χ.T local binder exclusion in T

E ::= local environment| ∗ environment bottom| E.λW declaration of type W| E.δV abbreviation of V| E.χ binder exclusion

G ::= global environment| ∗ environment bottom| G.λxW declaration of type W| G.δxV abbreviation of V| G.χx binder exclusion

Figure 5. Abstract syntax of λδ “brg”

scheme environment redex reduct

β-contraction G, E ⊢ (V ).λW.T → δV.Tlocal δ-expansion G, E1.δV.E2 ⊢ #i → ↑i+1V if i = |E2|global δ-expansion G1.δxV.G2, E ⊢ $x → V if x /∈ G2

ζ-contraction for δ G, E ⊢ δV.↑1T → Tζ-contraction for χ G, E ⊢ χ.↑1T → Tυ-swap for δ G, E ⊢ (V1).δV2.T → δV2.(↑

1V1).Tυ-swap for χ G, E ⊢ (V1).χ.T → χ.(↑1V1).Tτ -contraction G, E ⊢ 〈U〉.T → T

Figure 6. Reduction steps of λδ “brg”

G1, ∗ ⊢h V : W x /∈ G2

G1.δxV.G2, E ⊢h $x : Wg−def

G1.∗ ⊢h W : V x /∈ G2

G1.λxW.E2, E ⊢h $x : Wg−decl

G, E1 ⊢h V : W i = |E2|G, E1.δV.E2 ⊢h #i : ↑i+1W

l−defG, E1 ⊢h W : V i = |E2|

G, E1.λW.E2 ⊢h #i : ↑i+1Wl−decl

G, E ⊢h ∗l : ∗h(l)sort

G, E ⊢h T : U G, E ⊢h U : VG, E ⊢h 〈U〉.T : 〈V 〉.U

castG, E.χ ⊢h T : UG, E ⊢h χ.T : χ.U

void

G, E ⊢h V : W G, E.δV ⊢h T : UG, E ⊢h δV.T : δV.U

abbrG, E ⊢h W : V G, E.λW ⊢h T : U

G, E ⊢h λW.T : λW.Uabst

G, E ⊢h V : W G, E ⊢h T : λW.UG, E ⊢h (V ).T : (V ).λW.U

applG, E ⊢h T : U G, E ⊢h (V ).U : W

G, E ⊢h (V ).T : (V ).Upure

G, E ⊢h U2 : V G, E ⊢h T : U1 G, E ⊢ U1 ↔∗ U2

G, E ⊢h T : U2conv

Figure 7. Type assignment rules of λδ “brg”

UBLCS-2009-16 10

4 The Kernel Layer

there is no binder named x in G2. The infix full stop can concatenate two environments.Here are not showing the “start rules” and the “compatibility rules” of reduction because the

whole λδ reduction system is undergoing a reaxiomatization at the moment. In any case the“compatibility rules” are based on the “shift principle” [12], which consists in shifting the bindersof a term to its local environment (this attitude makes it clear why all binders available for termsare also available for environments and why we use the same notation for them in Figure 5).

The type assignment rules of λδ “brg” are shown in Figure 7. These include the originaltype assignment rules of χλδ [12], adapted to support global references, and the “pure” typeassignment rule for function application, mentioned in [12] as Rule (13). The type assignmentjudgement G, E ⊢h T : U depends on the “sort hierarchy parameter” h, which is a function fromthe natural numbers to themselves that can be chosen at will as long as l < h(l) for every l.

The predicate G, E ⊢ U1 ↔∗ U2 states that the terms U1 and U2 are convertible in the environ-ments G and E. We recall that the conversion relation is the symmetric (hence the symbol ↔),reflexive and transitive (hence the superscript ∗) closure of the single-step reduction relation.

The reader should note that the rules for typing the binders are based on the “shift principle”.It should also be noted that a notion of “well-formed environment” is not necessary for typing.We end this section by explaining the meaning of “Basic Relative Global”. In particular “Basic”

refers to λδ’s basic version as opposed to its extensions proposed in [12] Appendix B. “Relative”refers to the use of relative local references (i.e. by position) as opposed to using absolute ones(i.e. by name) as in [12]. Finally “Global” refers to the support for the global environment.

4.2 The Instantiation Step

In the case under consideration, the second part of the translation of Aut−QE into λδ converts a“book” of the abstract layer into a “global environment” of λδ “brg”. We outline this process below.

• The list of the formal parameters on which a global constant, say x, depends are mappedto a local environment, say Ex, containing the parameter declarations as λ-items.

• A global declaration x : W having its formal parameters mapped to Ex, is mapped to theentry λx[Ex.W ] of the global environment (the concatenation is bracketed for clarity).

• A global definition x ≡ V : W having its formal parameters mapped to Ex, is mapped tothe entry δx〈[Ex.W ]〉.[Ex.V ] of the global environment (brackets added for clarity).

• The global references are connected to their actual parameters through applications.

• The two sorts Type and Prop coming from Aut−QE are mapped to ∗0 and ∗1 respectively.

Aswe see, the instantiation step is a straightforward translation not depending on the “degree”[27] of the translated expressions (i.e. the information about them being terms, types or sorts).

In the above description, x is the URI of a global constant (see Subsection 3.2), that we areidentifying for simplicity with the corresponding natural number (see Section 5).

Note that the global definitions coming from Aut−QE are typed and that we maintain thisinformation in λδ by inserting type annotations. On the other hand we lose the informationabout these definitions not being δ-expandable, because λδ “brg” can not encode it.

The reader should also note the replication of Ex in the encoding of the global definition x.

4.3 Persistence of the Translated Data

At the end of the instantiation step (see Subsection 4.2), our validator can generate a persistentXML [29] representation of the translated data in the spirit of HELM approach to long-term stor-age of mathematical contents [2]. In particular the hierarchical structure of URIs is replicated onthe file system and the information on each global constant is stored in a separate document.

We stress that one of the intended reasons for generating these documents is to allow the reuseof the stored information by third party applications managing formal mathematical knowledge.

Figure 8 shows the document generated for the notion <ld:/l/not.ld > (the logical nega-tion operator), which is the shortest one containing the widest set of λδ “brg” constructions.

UBLCS-2009-16 11

4 The Kernel Layer

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE KERNEL SYSTEM "http://helm.cs.unibo.it/lambd a-delta/xml/ld.dtd">

<ENTRY hierarchy="Z2" options="si"><ABBR uri="ld:/grundlagen/l/not.ld" name="not">

<Cast><Abst name="a">

<Sort position="1" name="Prop"/></Abst><Sort position="1" name="Prop"/>

</Cast><Abst name="a">

<Sort position="1" name="Prop"/></Abst><Appl>

<GRef uri="ld:/grundlagen/l/con.ld" name="con"/></Appl><Appl>

<LRef position="0" name="a"/></Appl><GRef uri="ld:/grundlagen/l/imp.ld" name="imp"/>

</ABBR></ENTRY>

Figure 8. The document “grundlagen/l/not.ld.xml”

Item Tag Item Tag Item Tag Item Tag Item Tag Item Tag

∗l Sort #i LRef $x GRef 〈U〉 Cast (V ) ApplλW Abst δV Abbr χ Void λxW ABST δxV ABBR χx VOID

Figure 9. λδ’s syntactic items and the corresponding XML elements

The document contains an entry of a λδ global environment (see Figure 5) with additionalinformation for rendering and validation. On the other hand, the intended version of λδ is notspecified since this persistence format is meant to be the same for all versions of the calculus.

The root element, named “ENTRY”, specifies the validation requirements. Namely what “sorthierarchy parameter” to use (see Subsection 5.2) and whether sort inclusion must be enabled (seeSubsection 5.1). In particular the indication “si ” stands for “sort inclusion enabled”.

The representation of the logical data complies with the use of the “item notation” [15] pursuedby λδ, in that each XML element represents an “item” of the abstract syntax as shown in Figure 9.

A “name” attribute is included in most elements to improve the presentation of the logicaldata. The names of the local binders are α-converted when necessary to avoid conflicts.

All documents can be validated against the provided “document type definition” (DTD) [29].The data set of all XML documents resulting from the translation of the “Grundlagen” is avail-

able for download at λδ’s Web site <http://helm.cs.unibo.it/lambda-delta/ >.

4.4 Numerical Summary of the Specification in the Kernel Layer

Once translated in λδ “brg”, the “Grundlagen” consists of the elements presented in Figure 10.As before, the nodes count is computed adding the numbers in the rows marked with “+”.The increment of term items with respect to Figure 4 is due to the replication of the parameter

declarations occurring when the parameters of the global definitions are discharged (see Subsec-tion 4.2). Numerically, we quantify this increment with the factor 1.32 obtained by comparing thenodes counts in the two cases. In order to obtain a more precise factor, we may want to subtractthe global entries (6910) from both counts. This refinement gives the factor 1.33.

Comparing the measures in Figure 10 and Figure 3, we find an increment factor of 3.12 thatrepresents the “intrinsic loss” associated to our translation of the “Grundlagen” seen as a whole.

UBLCS-2009-16 12

5 Validation

Global Environment Entries

Declarations 32 +Definitions 6878 +Total 6910

Terms

Sorts 2917 +Local references 295202 +Global references 204436 +Type annotations 6878Applications 399147 +Abstractions 89620 +

Int. ComplexityMeasure

Nodes count 998232

Figure 10. Numerical summary of the “Grundlagen” in λδ “brg”

wfh(∗)sort

wfh(G) G, ∗ ⊢h W : Vwfh(G.λW )

abstwfh(G) G, ∗ ⊢h V : W

wfh(G.δV )abbr

wfh(G)wfh(G.χ)

void

Figure 11. Correctness rules of λδ “brg”

5 Validation

In this section we describe the structure of the “brg” kernel, whose purpose is to check the correct-ness of a global environment representing a “book” with respect to the type system of λδ “brg”.

From the formal standpoint, a global environment G is correct (or well-formed) with respectto a hierarchy parameter h if it satisfies the predicate wfh(G) defined by the rules of Figure 11.

In this respect, the main task of the kernel is to decide the three predicates: G, E ⊢ U1 ↔∗ U2

(conversion), G, E ⊢h T : U (type assignment) and wfh(G) (correctness).To this aim, the kernel is divided in seven compilation units (i.e.modules) outlined below.

1. Brg. This module defines the kernel data structures, which include: terms, local environ-ments and global environment entries. Some helper functions for constructing and inspect-ing these structures are also provided (see Appendix A.1 for some source code). Remark-ably each term can be annotatedwith some extra-logical information which includes bindernames and auxiliary data used by the reduction machinery (see Subsection 5.1).

2. BrgOutput. This is a pretty-printer for terms and local environments to display warningsand error messages. A facility for displaying the statistics of Figure 10 is provided as well.

3. BrgEnvironment. This module implements the global environment as a hash table formURIs to entries. Each entry receives a consecutive “age” number (starting at 1) exploited bythe reduction machinery (see the comments below and in Subsection 5.1).

4. BrgSubstitution. This module implements the relocation (i.e. “lift”) facility to update theposition indexes during type checking (note that this function is not needed for reduction).

5. BrgReduction. This module deals with the predicate G, E ⊢ U1 ↔∗ U2 by implementingthe domain retrieval and the convertibility check, both to be discussed in Subsection 5.1.

6. BrgType. This module deals with the predicate G, E ⊢h T : U by implementing the λδ “brg”canonical type inference algorithm to be discussed in Subsection 5.2.

UBLCS-2009-16 13

5 Validation

7. BrgUntrusted. This module deals with the predicate wfh(G) by checking the correctness ofa given global constant and by adding it to the global environment afterwards.

Terms with extra-logical annotations already appear in some systems, for instance in MATITA

[4]: the formal proof management system developed within the HELM project. Neverthelessthese systems may use several levels of terms and annotations on the kernel-level terms are veryrare. In particular MATITA [3], COQ [7] and the Automath validators [30, 28] do not use them.

Our motivation for annotating kernel terms is threefold. Firstly a local binder node in theabstract syntax of terms must contain the binder’s name for presentational purposes, but thisinformation is extra-logical if local binders are referred by position as in λδ “brg”. In this respect,we find a name annotation more appropriate than a name field beside the binder’s subterms (thisis the solution of MATITA and COQ, which are based on the Calculus of Constructions with localbinders referred by position). Secondly we find it convenient to store some extra-logical infor-mation in the terms managed by the reduction machinery (see Subsection 5.1) in order to speedthe computations while keeping the data structures clean as expected. Thirdly our experience indeveloping the “procedural reconstruction procedure” [13] (a feature of the system MATITA), leadsus to conjecture that the part of this procedure operating on kernel-level terms (i.e. the so-called“preprocessor”) would be better implemented if such terms could be annotated.

At the moment, our global environment is not a cache as it happens in the system MATITA

so the stored entries are never removed from the environment during the process of validatinga “book”. This solution is suitable for the “Grundlagen” but is not desirable in the perspective ofvalidating very big theories, so we might implement a cached environment eventually.

Both the system MATITA and the Automath validators implement a mechanism to preventuseless δ-expansions during the convertibility check. Namely a number #u is assigned to everyglobal definition u such that #v < #u is ensured when u depends on the global definition v. Inthis respect, when u and v are checked for convertibility, it is safe to expand just u if #v < #u.

In particular the system MATITA interprets #u as the “height” of u [3] whereas the Automathvalidators and the “brg” kernel interpret #u as the “age” of u [30]. These two numbering systemsare equivalent but “ages” are computed faster and impose a total order on entries so u1 6= u2

implies #u1 < #u2 or #u2 < #u1, meaning that u1 and u2 are never expanded simultaneouslyduring their convertibility check. Anyway, we might consider using “heights” in our kernel.

We stress that numbering the entries consecutively as they are put into the environment af-ter validation indeed provides for an “age” system because the rules of Figure 11 ensure that achecked constant can not enter the environment before the constants it depends on.

The reader should note that λδ “brg” adopts a reduction discipline without explicit substitu-tion, so there is no need to implement a substitution function in the BrgSubstitution module.

The BrgReduction and BrgType modules deserve a detailed discussion that the readerwill find in Subsection 5.1 and Subsection 5.2 respectively. More information on the modules isfound in Appendix A.1, which presents some of their interfaces. In Subsection 5.3 we give somenumerical data on the reductions needed to validate the “Grundlagen” in the “brg” kernel.

5.1 Reduction and Conversion

The dynamic component of λδ “brg” is managed by the module BrgReduction that implementsthe domain retrieval and the convertibility check. Formally, these functionalities are related toFigure 7(conv) and to the applicability check (encharged of testing the so-called “applicability con-dition”, which an application must satisfy in order to be correct or well typed).

Both the domain retrieval and the convertibility check require to apply a chain of reductionsteps to some given terms and the literature suggests (see the citations in [3] Subsection 5.1) thatweak head normal forms provide for good points to break the reduction chains. Therefore thebasic task of the reduction machinery is to compute the weak head normal form of a term.

Moreover it is well known that an efficient reduction strategy limits the invocation of costlyoperations such as useless or duplicated reduction steps, substitutions and relocations on terms.

Our aim now is to discuss how we met the above requirements in the “brg” kernel and in thisrespect, the distinguishing feature of our reduction apparatus is that it works without performing

UBLCS-2009-16 14

5 Validation

any relocation or substitution (i.e. “lifting” or “delifting” adopting awell-established terminology).• Pseudo-reductions. Some parts of the Automath type checking algorithm can be implementedby supporting some pseudo-reductions in the validator’s reduction apparatus.

This is the case of “sort inclusion”: a feature that Aut−QE formally supports through subtyp-ing [27] and that Zandleven’s validator [30] supports by allowing the following reduction step:

λW.∗l → ∗l (sort inclusion) (2)

that we display with the notation of Figure 5. Using the terminology of [8], this reduction isenabled when testing an “inferred type” and the corresponding “expected type” for convertibility.Namely Rule (2) applies only to the “inferred type” and only when no other reduction is applicable.

The reader should note that proper conversion extended with sort inclusion can be related to“cumulativity”: an order relation appearing in the meta-theory of the system ECC [18].

Anyway. sort inclusion is not essential in Aut−QE as it can be axiomatized in the calculus [9].Another example is “purity”: a feature by which the type inference operation commutes with

the application constructor. Aut−QE formally supports “purity” through a specific type rule [27]that we can mimic by allowing the following pseudo-reduction steps in the validator:

G, E1.λW.E2 ⊢ #i → ↑i+1W if i = |E2| (local reference typing) (3)

G1.λxW.G2, E ⊢ $x → W if x /∈ G2 (global reference typing) (4)

that we display following the conventions of Figure 6. Generally speaking, the reduction steps(2), (3) and (4) break λδ’s meta-theory (esp. the Church-Rosser property and the subject reductionof the arity assignment) and are implemented in the kernel just for the benefit of the “brg” typechecker, which invokes them in a controlled manner.3 This is to say that these steps are not meantfor standard reduction procedures (as the one that computes the normal form of a term).• Computation of the weak head normal form. The weak head normal form (w.h.n.f.) of aterm is computed by a Krivine Abstract Machine (KAM) [16]4 extended to cope with λδ “brg”.

Therefore β-redexes are evaluated using the call-by-name strategy, which agrees with theshape of β-contraction used by λδ (Figure 6). The call-by-need evaluation usually provides formore efficient reductions, but the “brg” kernel does not implement this strategy at the moment.

The system MATITA uses a KAM parameterized over the evaluation strategy [3] whereas ourvalidator does not implement this feature yet, thus the strategy is built in the machine.

Our KAM computes the w.h.n.f. of a term T0 by applying β-contractions, local δ-expansions,υ-swaps and τ -contractions (see Figure 6). Moreover, global δ-expansions and reference typingare activated on demand. The features of our machine are discussed below.

In addition to the definition entries of the standard KAM, our machine environment E cancontain declaration entries and exclusion entries with the aim of performing deep weak headreductions during the convertibility check without unwinding the machine.

Since the initial term T0 is closed in a given local environment E0, the initial state of the ma-chine should contain E0 translated into a machine environment E0. To avoid the cost of thistranslation, we use machine environments in place of local environments throughout the valida-tor. This solution implies a minor change in the original structure of the local environment (seeFigure 5) that amounts to adding a pointer to an auxiliary environment in each entry. See Ap-pendix A.1 for details. It should be noted that the system MATITA faces this problem differently:namely its KAM incorporates a read-only local environment besides the machine environment[3]. The formal drawback of this solution is that the local environment does not support the addi-tional structures needed by the KAM (closures, call-by-need data, etc.) so the KAM runs at “fullspeed” only when the term to be reduced refers just to the machine environment.

The caller of our machine can push any entry in its environment before starting it and whensuch pushed entries are declarations, they are annotated with increasing consecutive numbers

3. In actual fact, typability does not imply (even weak) normalization if unrestricted sort inclusion is allowed in λδ

because there is no difference between abstraction an instantiation in this calculus as in Aut−QE [9] (see Appendix B.2).4. With respect to [16], our machine does not process multiple β-redexes in a single reduction step.

UBLCS-2009-16 15

5 Validation

(starting at 0 according to λδ conventions). Such numbers serve as “absolute position indexes” ofthe declaration entries in the machine environment and allow to perform the convertibility checkwithout computing the relocations due to the ζ-contractions (see below). The other relocations,coming from local δ-expansions and υ-swaps, are already avoided in any KAM.

The machine stops on sorts and abstractions. Moreover it stops on global references to defi-nitions if global δ-expansion is disabled. Finally it stops on references to declarations (both localand global) if reference typing is disabled. An error is produced if references to excluded entriesare encountered (both local and global). When stopping on a global reference, the KAM providesinformation on the referred constant including its age. When stopping on a local reference to adeclaration, the KAM provides its domain and its “absolute position index”.• Domain retrieval. In the λδ setting, the “applicability condition” is explained as follows.

An application (V ).T is typable in the environments G and E if the item (V ) matches anabstraction item, say λW1 found somewhere in T , E or G. Moreover W1 (i.e. the “domain” ofthe item) and the type, say W2, of V must be convertible. According to the application andconversion rules of Figure 7, λW1 must be the head item of the w.h.n.f. of the type, say U , of T .

If the head of the w.h.n.f. of U is not an abstraction item and if the “purity” rule is in effect, thew.h.n.f. of the iterated types of U (i.e. the type of U , the type of the type of U and so on) must becomputed and their head items must be considered (a finite number of these is enough).5

From the implementative standpoint, the w.h.n.f. of the iterated type of U that may containthe desired item λW1 is computed fairly efficiently by running our KAM on the term U withglobal δ-expansion and reference typing enabled (see Rule (3) and Rule (4)).

The reader should note that the KAM applies Rule (3) avoiding the mentioned relocation.• The convertibility check. The convertibility of two typed terms T1 and T2 is formally es-tablished by comparing their normal forms syntactically, but this naive algorithm may yield thecomputation of redundant reduction steps as, in many real cases, the reducts of T1 and T2 can becompared before reaching their normal forms. A more efficient algorithm sets some breakpointsin the normalization of T1 and T2 and compares the reducts when such breakpoints are reached.

In particular the terms T1 and T2 are reduced in parallel and their normalization processstops each time a deep w.h.n.f. is computed. At this point the reducts’ heads are compared andthe normalization of the reducts is restarted in some cases as we will explain.

Empirical statistics on real-world type checking [3] show that most terms tested for convert-ibility are actually identical. This suggests to attempt the α-convertibility check before the (moreexpensive) full convertibility check in every cycle of the algorithm. This optimization alreadyappears in the system MATITA but it is not enabled in our validator yet.

In every cycle of the convertibility check, the terms to be tested are presented as machinesand firstly their w.h.n.f.’s are computed by running such machines with global δ-expansion andreference typing disabled. Then the reducts are compared as follows.

Two sorts are compared by level. The machine stacks (that contain the application itemspreceding the reducts) are not considered because a sort after an application is always invalid.This is to say that the convertibility check is optimized for type checking.

Two references to global declarations are compared using the ages of the referred constants(see beginning of Section 5). Equal ages imply equal references and we only need to check thestacks (see below). The ages are provided by the machines besides the reducts.

Two references to global definitions are also compared by age. The case of equal ages ishandled as above whereas, when the ages differ, the reference with the greater age is δ-expandedbefore entering the next cycle. A reference to a global definition is matched against any otherreduct by δ-expanding the definition before entering the next cycle as well.

Two references to local declarations are compared using their “absolute position indexes”, whichare provided by the machines besides the reducts. Equal positions imply equal references andweonly need to check the stacks. This test deserves a justification. Given that the references may beclosed in different environments, the usual comparison by relative position (i.e. de Bruijn’s) indexpresupposes relocating these references by means of ζ-reductions, whereas our reduction ma-

5. Being a type, the term U and its iterated types are strongly normalizable, so this search eventually comes to an end.

UBLCS-2009-16 16

5 Validation

chinery strives to avoid such relocations because of their computational cost (see Appendix A.2for some performance information). To this aimwe observe that a KAM never pushes declarationentries in its environment during execution, so these entries are pushed in the KAM just by itscaller when deep w.h.n.f.’s must be computed. Hence the test works if the following invariantis maintained: when a declaration entry needs to be pushed in one machine, a correspondingdeclaration entry must be pushed in the other machine as well. In doing so, these declarationentries will receive the same “absolute position index” in the respective machines, and a referenceto a declaration will match just a reference to the corresponding declaration.

Two abstractions are matched by comparing their domains and then, in case of success, bycomparing their scopes after pushing the corresponding declarations in the respective machines.

The sort inclusion test is enabled on demand and is attempted as a last resort before assertingthat the compared terms T1 and T2 are not convertible. This test breaks the symmetry of theconvertibility check as it works under the assumption that T1 is the “expected type” of a term Twhile T2 is the “inferred type” of T , following the terminology of [8]. If T1 is a sort ∗l and T2 is anabstraction λW.U , then T2 is set to U and we enter the next cycle after pushing the declarationentry λW in both machines to preserve the previously mentioned invariant. This is equivalentto leaving T2 unchanged while setting T1 to λW.∗l before continuing the check, but in this case auseless comparison between two instances of W would be performed when matching the headsof T1 and T2. Sort inclusion must be used with caution: in particular it must be disabled whencomparing the machine stacks and the domains of the abstractions. Without this restriction, somenon-normalizing terms, like the term “Ω”, pass the validation process (see Appendix B.2).

Coming now to the convertibility check of the machine stacks, suffice to say that it is per-formed componentwise, provided that the two stacks have the same length. The given machinescan be reused for these componentwise checks but their “absolute position index” indicators (hold-ing the index for the next declaration entry pushed in the machine) must not be cleared.

5.2 Canonical Typing

As we reminded in Subsection 4.1, λδ’s type system depends on the “sort hierarchy parameter”,which our validator sets by default to Z2(l) ≡ l + 2 to ensure that the sorts Type and Prop aredisconnected in the “sort hierarchy graph” (see [12] Appendix A.2) as it was reasonably intendedby the Aut−QE designers. Nevertheless this choice is not hard-coded and every function of theform Zk(l) ≡ l + k can be selected, provided that k 6= 0 (note that k must be a natural number).

In λδ’s type system every valid term is typable so the possibility to compute the canonicaltype of a term can be taken as a condition ensuring its validity.6 Formally, the canonical type of atyped term in λδ is the τ -normal form of its “static type” defined in [12] Subsection 2.5.

This amounts to type a term allowing the conversion rule just to ensure the “applicabilitycondition” and to remove the explicit type annotations (see Figure 7 for reference).

The “brg” kernel infers the canonical type of a given term using a standard algorithm like theone adopted by the system MATITA [3]. In particular every abbreviation item δV is annotatedwith its type (i.e. it becomes δ〈W 〉.V where W is the inferred type of V ) before entering the localenvironment, so W is typed only once. Moreover this environment contains just trusted items.

We recall that validating an application, say (V ).T , requires performing the applicabilitycheck, which runs as follows: if the domain retrieval function applied to the type of T returnsan abstraction λW.U , then W (i.e. the “domain” of the abstraction, regarded as the “expected type”)and the type of V (i.e. the “inferred type”) are tested for convertibility, otherwise the check fails.

The distinguishing feature of our implementation is the use of reduction machines (see Sub-section 5.1) in place of standard local environments (see Figure 5) throughout the type checker.

This optimization allows to perform the whole applicability check without unwinding thereduction machine used for the domain retrieval, which is ready for the convertibility check.

Nevertheless the standard algorithm we implemented, does not avoid the relocations of Fig-ure 7(l− def) and Figure 7(l − decl) because the algorithm maintains the invariant that the in-ferred type of a term is closed in the environment of that term. In our opinion, dropping this

6. This is not the case in other type systems. For instance the term “” of the λ-Cube [5] is valid without being typable.

UBLCS-2009-16 17

6 Conclusions and Future Work

Proper Reductions

β-contractions 1034626local δ-expansions 494271global δ-expansions 17166virtual ζ-contractions for δ 3694769υ-swaps for δ 2040476τ -contractions 17166

Pseudo-reductions

local reference typing 1sort inclusions 904

Convertibility checks

total 298902solved by α-conversion at least 107021

Figure 12. Numerical summary of the reductions for the “Grundlagen” check

restriction (i.e. allowing the given term and its inferred type to be closed in different environ-ments) would allow to avoid the mentioned relocations with their computational cost.

In this respect, we are aiming at developing a relocation-free type checking algorithm for λδ.

5.3 Numerical Summary of the Specification Check

When validating the “Grundlagen”, the “brg” kernel performs the reduction steps summarized inFigure 12. Note that the χ-related reductions and the global reference typing do not occur. Wedo not perform ζ-reductions (see Subsection 5.1) and we estimated their amount by running acall-by-name abstract machine without closures because our KAM does not infer this data.

Note that the body of every global definition has an explicit type annotation (see Subsec-tion 4.2) that must be removed each time a reference to that definition id δ-expanded. This factexplains why the kernel performs as many τ -contractions as global δ-expansions.

Wiedijk’s validator [28], run on the same source, reports 8871 β-contractions and 19998 globalδ-expansions. Our system computes more β-contractions because the instantiation of a globalconstant generates a β-redex for each actual parameter (whereas instantiation is not connectedto β-reduction in Automath [9]), while we are not clear why the number of global δ-expansionsdiffers in the two systems. So further investigation is needed to understand the problem.

Jutting reports 8583 β-contractions and 20004 global δ-expansions for the original “Grundla-gen” text with η-contraction enabled [23] and Wiedijk’s validator confirms this data.

Our preliminary implementation of the α-convertibility check shows that at least one third ofthe convertibility checks performed by the kernel can by solved by α-conversion in this case.

6 Conclusions and Future Work

In the previous sections we presented an implemented procedure that allows to encode a specifi-cation in the Automath language Aut−QE [27] into a set of logical data within the formal systemλδ [12], to store this data set in a persistent format and to validate it against λδ’s type system.

As a test case, we successfully processed Jutting’s specification of Landau’s “Grundlagen” [23].The reason of this experiment is threefold: firstly we now have a data set on which to test λδ’s

capability to encode real-world formal mathematics; secondly we are interested in designing andimplementing a procedure mapping λδ to the λ-Cube [5] (or to its extensions, like [14]) and thisdata set will serve as a use case; thirdly we would like to favor the reuse of Jutting’s specificationby adding it to the library of a state-of-the-art proof assistant like MATITA [4], which for instancecan produce a procedural or a declarative representation of the logical data [13, 21].

UBLCS-2009-16 18

A Some Points on the Implementation

The present work allowed us to review and update some aspects of λδ, and to implement anefficient validator for this calculus. Moreover, we took the opportunity to collect in a single paperall the information available to our knowledge on the management of a real Automath text.

6.1 Acknowledgements

We would like to thank F. Wiedijk for having patiently recovered Jutting’s formal specificationof the “Grundlagen”, thus preventing its definitive loss. We are also grateful the whole HELMWorking Group for many valuable discussions on kernel design and implementation issues.

We would like to dedicate this paper in memory of our grandmother Enrica Ferrari, an excel-lent mathematician and physicist to whom we owe our predisposition to Mathematics.

A Some Points on the Implementation

Our validator, named HELENA7 (pronounced /he"lena/), is written in the CAML programminglanguage [6] to comply with the software developed in the context of the HELM project.

The source files are located in the directory /trunk/helm/software/lambda-delta/ ofthe HELM SVN repository <http://helm.cs.unibo.it/software/index.html >.

It should be noted that the implementation makes an extensive use of the Continuation-Passing Style (CPS) [22] strategy in order to improve the validator’s performance.

As a general pattern, when our functions take two continuations, the first one is invoked incase of error, while the other one is invoked in case of success.

Our presentation of HELENA (Appendix A.1) is not a line-by-line description in the style of[3], and concerns just the most stable part of the code: i.e. the interfaces of the main compilationunits. Some data on HELENA’s performance are discussed in Appendix A.2.

A.1 Interfaces of Some Compilation Units

In this Appendix we present the interfaces of the most relevant compilation units implementedin HELENA version 0.8.0 8, which include the data structures for the representation of the logicaldata and the signatures of the “brg” kernel functions. In this respect, the reader is required a basicknowledge of the CAML programming language in order to read the displayed code.

The file “aut.ml ” shown in Figure 13, specifies the data structure for Automath’s abstractsyntax, that we derived from the concrete syntax given in Figure 1. We stress that the formaldefinition of a “book” is not necessary because HELENA processes the Automath “books” with astreaming discipline, therefore an entire “book” is never stored in a single data structure.

The file “meta.ml ” shown in Figure 14, specifies the data structure for the correspondingcontents in the abstract layer (Subsection 3.1). The module “Entity ” is explained below.

The file “entity.ml ” shown in Figure 15, specifies the data structure for the informationunits, i.e. the declared or defined global constants, shared by all implemented kernels (see Sec-tion 3). The type of URIs is taken from the HELM uri manager “NUri ” presented in [3].

Themodule “hierarchy ”, whose interface is shown in Figure 16, is also shared by all kernelsand implements the “sort hierarchy” manager. Sorts may have names for presentational purposes,in particular the function “set_sorts ok [s_1;...;s_n] l ” assigns the names s1, . . . , sn

to the sorts ∗l, . . . , ∗(l + n − 1) respectively and returns the index (l + n). The “sort hierarchyparameter” (i.e. the function h of Subsection 4.1 encapsulated in the type “graph ”) is addressed byits name as explained in Subsection 5.2 and is applied to a sort index using the function “apply ”.

7. According to our requirements, our validator should have a woman’s name (following the first Automath validatornamed VERA [26]) symbolically related to the number 8, that denotes the female principle of reason and logic (Hod i.e.splendor) in Jewish mysticism. Our final choice fell on HELENA: the word denoting the eighth letter in the Czech phoneticalphabet. Surprisingly enough, the reduced isopsephic value of this name in its native spelling (Eλǫνη) is 8 in fact addingthe literal values we get: 5 (E) + 30 (λ) + 5 (ǫ) + 50 (ν) + 8 (η) = 98, whose theosophic reduction (i.e. the reminder modulo9) is 8. To our knowledge, “Helena” is the only woman’s name denoting the eight letter of a standard phonetic alphabet.8. As the version number suggests, the first satisfactory implementation of HELENA comes after several failed attempts.These false starts were mainly due to the poor knowledge on λδ’s meta-theory that we had before writing [12].

UBLCS-2009-16 19

A Some Points on the Implementation

type id = string ( * identifier * )type qid = id * bool * id list ( * qualified identifier: name, local?, path * )

type term = Sort of bool ( * sorts: true = TYPE, false = PROP * )| GRef of qid * term list ( * reference: name, arguments * )| Appl of term * term ( * application: argument, function * )| Abst of id * term * term ( * abstraction: name, domain, scope * )

type entity = Section of (bool * id) option ( * section: Some true = open, * )( * Some false = reopen, * )( * None = close last * )

| Context of qid option ( * context: Some = last node, * )( * None = root * )

| Block of id * term ( * block opener: name, domain * )| Decl of id * term ( * declaration: name, domain * )| Def of id * term * bool * term ( * definition: name, domain, * )

( * transparent?, body * )

Figure 13. The file “aut.ml”

type uri = Entity.uri ( * uri: alias for NUri.uri * )type id = Entity.id ( * identifier: alias for Aut.id * )

type term = Sort of bool ( * sorts: true = TYPE, false = PROP * )| LRef of int * int ( * local reference: * )

( * local environment length, * )( * de Bruijn’s index * )

| GRef of int * uri * term list ( * global reference: * )( * local environment length, * )( * name, arguments * )

| Appl of term * term ( * application: argument, function * )| Abst of id * term * term ( * abstraction: name, domain, scope * )

type pars = (id * term) list ( * parameter declarations: name, domain * )

( * entry: line number, parameters, name, domain, (transparen t?, body) * )type entry = int * pars * uri * term * (bool * term) option

type entity = entry option ( * None = section, context, block * )( * Some = declaration, definition * )

Figure 14. The file “meta.ml”

type uri = NUri.uri ( * uri * )type id = Aut.id ( * identifier: alias of string * )

type ‘bind entry = int * uri * ‘bind ( * age, uri, binder * )

type ‘bind entity = ‘bind entry option ( * None = section, context, block * )( * Some = declaration, definition * )

Figure 15. The file “Entity.ml”

type graph ( * sort hierarchy parameter * )

val set_sorts: (int -> ‘a) -> string list -> int -> ‘aval get_sort: (unit -> ‘a) -> (string -> ‘a) -> int -> ‘aval graph_of_string: (unit -> ‘a) -> (graph -> ‘a) -> string - > ‘aval string_of_graph: (string -> ‘a) -> graph -> ‘aval apply: (int -> ‘a) -> graph -> int -> ‘a

Figure 16. The file “hierarchy.mli”

UBLCS-2009-16 20

A Some Points on the Implementation

type uri = Entity.uri ( * uri: alias for NUri.uri * )type id = Entity.id ( * identifier: alias for Aut.id * )

type attr = Name of id ( * name * )| Apix of int ( * absolute position index * )

type attrs = attr list ( * attributes * )

type bind = Void of attrs ( * attrs * )| Abst of attrs * term ( * attrs, domain * )| Abbr of attrs * term ( * attrs, body * )

and term = Sort of attrs * int ( * attrs, hierarchy index * )| LRef of attrs * int ( * attrs, position index * )| GRef of attrs * uri ( * attrs, reference * )| Cast of attrs * term * term ( * attrs, domain, element * )| Appl of attrs * term * term ( * attrs, argument, function * )| Bind of bind * term ( * binder, scope * )

type entry = bind Entity.entry

type entity = bind Entity.entity

type lenv = Null ( * local environment * )( * Cons: tail, relative local environment, binder * )

| Cons of lenv * lenv option * bind

Figure 17. The file “brg.ml” (first part)

val get: (unit -> ‘a) -> (lenv -> bind -> ‘a) -> lenv -> int -> ‘aval rev_iter: (unit -> ‘a) -> ((unit -> ‘a) -> lenv -> bind -> ‘a ) -> lenv -> ‘aval fold_left: (‘b -> ‘a) -> ((‘b -> ‘a) -> ‘b -> bind -> ‘a) -> ‘b -> lenv -> ‘aval name: (unit -> ‘a) -> (id -> ‘a) -> attrs -> ‘aval apix: (unit -> ‘a) -> (int -> ‘a) -> attrs -> ‘a

Figure 18. The file “brg.ml” (interface of second part)

val set_entry: (Brg.entry -> ‘a) -> Brg.entry -> ‘aval get_entry: (unit -> ‘a) -> (Brg.entry -> ‘a) -> Brg.uri -> ‘a

Figure 19. The file “brgEnvironment.mli”

val lift: (Brg.term -> ‘a) -> int -> int -> Brg.term -> ‘a

Figure 20. The file “brgSubstitution.mli”

type kam ( * Krivine abstract machine * )

val empty_kam: kamval get: (unit -> ‘a) -> (Brg.bind -> ‘a) -> kam -> int -> ‘aval push: (kam -> ‘a) -> kam -> Brg.bind -> ‘aval xwhd: (kam -> Brg.term -> ‘a) -> kam -> Brg.term -> ‘aval are_convertible: (unit -> ‘a) -> (unit -> ‘a) ->

?si:bool -> kam -> Brg.term -> kam -> Brg.term -> ‘a

Figure 21. The file “brgReduction.mli”

UBLCS-2009-16 21

A Some Points on the Implementation

type message = (BrgReduction.kam, Brg.term) Log.message

val type_of: (message -> ‘a) -> (Brg.term -> ‘a) ->?si:bool -> Hierarchy.graph -> BrgReduction.kam -> Brg.te rm -> ‘a

Figure 22. The file “brgType.mli”

val type_check: (BrgType.message -> ‘a) -> (Brg.entity -> ‘ a) ->?si:bool -> Hierarchy.graph -> Brg.entity -> ‘a

Figure 23. The file “brgUntrusted.mli”

The first part of the file “brg.ml ” shown in Figure 17, specifies the data structures for λδ“brg” abstract syntax. Each syntactical item (see Figure 9) contains a list of attributes (“attrs ”)that include a name for the item (“Name”) and an absolute position index (“Apix ”) discussed inSubsection 5.1. The local environment (“lenv ”) is an extended list where each entry has an extralocal environment discussed in Subsection 5.1 as well. Note that environment entries (“entry ”,“Cons’) and terms exploit the same binder structure (“bind ”) as explained in Subsection 4.1.

The second part of the file “brg.ml ” contains some helper functions that allow the partialevaluation of the “brg” CAML constructors and the facilities listed in Figure 18. The function“get ” returns a local entry referred by position (the returned local environment is the extra one ifit exists). The functions “name” and “apix ” return the first such attribute in a list. The functions“rev_iter ” and “fold_left ” are CPS iterators on “lenv ”. Note that “rev_iter ” providesa local environment to its map function (the extra one if it exists), but “fold_left ” does not.

The module “BrgEnvironment ”, whose interface is shown in Figure 19, implements theglobal environment manager. We stress that this environment must be maintained in a trustedstatus. The function “set_entry ” returns the given entry with the “age” updated as necessary.

The module “BrgSubstitution ”, whose interface is shown in Figure 20, implements therelocation function used only by the module “BrgType ”. The integer parameters of the function“lift ” are the relocation’s hight and depth respectively. The 0-high relocations are optimized.

The module “BrgReduction ”, whose interface is shown in Figure 21, implements the re-duction machinery presented in Subsection 5.1. The functions “get ” and “push ” allow to ac-cess a reduction machine’s environment in the way a local environment is usually accessed.This interface allows an efficient implementation of the module “BrgType ” based on reduc-tion machines rather than on local environments (see Subsection 5.2). The functions “xwhd” and“are_convertible ” act on machine-term pairs. Namely “xwhd” (i.e. “extendedweak head”) isthe “domain retrieval” function, while “are_convertible ” implements the convertibility check.This check applies only to types: the arguments must be an “expected type” and the corresponding“inferred type” respectively. The flag “si ” (cleared by default) enables sort inclusion. We stressthat “xwhd” and “are_convertible ” expect typable terms and trusted machines.

The module “BrgType ”, whose interface is shown in Figure 22, implements the canonicaltype inference algorithm outlined in Subsection 5.2. The type “Log.message ” is an abstract rep-resentation of an error message (not documented in this paper). The only function is “type_of ”whose flag “si ” (cleared by default) enables sort inclusion during the convertibility checks. Westress that “type_of ” expects a trusted machine (no check is done to ensure this condition).

The module “BrgUntrusted ”, whose interface is shown in Figure 23, implements the val-idation of an untrusted information unit. Global declarations and definitions are type checked,while the other units are always valid. The validated units having a URI are placed in the globalenvironment and are returned with their “age” updated. The only function is “type_check ”whose flag “si ” (cleared by default) enables sort inclusion during the convertibility checks.

UBLCS-2009-16 22

B Related Topics

Processing phase Run time fraction Run time9

Parsing (Subsection 2.3) 8% 0.7sAbstraction (Subsection 3.1) 17% 1.6sInstantiation (Subsection 4.2) 4% 0.4sValidation (Section 5) 71% 6.5s

Figure 24. HELENA’s overall performance processing the “Grundlagen”

+l[a:PROP][b:PROP]# omitted lines[a1:and(a,b)]# omitted linesande2:=et(b,th6’’-imp’’(a,not(b),a1)):b # <ld:/l/ande 2.ld>+ra@[b:[x,a]PROP][a1:and(a,b)]ande2:=<ande1(a,b,a1)>ande2(a,b,a1):<ande1(a,b,a1)> b # <ld:/l/r/ande2.ld>-r-l

Figure 25. Some “Grundlagen” lines related to <ld:/l/r/ande2.ld>

A.2 Performance Issues

The average run time fractions spend by HELENA for parsing, translating and validating the“Grundlagen” with the “brg” kernel, are shown in Figure 24.9 Note that the parsing phase, beingdelegated to standard CAML parsing tools (i.e. OCAMLLEX and OCAMLYACC), could be improved.

Wiedijk’s validator spends the following run time fractions processing the same Automathtext [28]: 28% for the parsing, 18% for the translation and 54% for the validation.

It is important to note that when the “brg” kernel was equipped a call-by-name reductionmachine without closures, which performed the relocations occurring in the local δ-expansionsand in the υ-swaps as they appear in Figure 6, the run time spent on validation was 1.5 timeslonger. This empirical observation confirms that relocating a term is indeed a costly operation.

We also note that our preliminary version of the α-convertibility check does not speed the val-idation process in this case. Detecting an improvement might require a better implementation.

B Related Topics

This Appendix includes additional information related to HELENA and the “Grundlagen”. In par-ticular we discuss “purity” in Appendix B.1, a consequence of unrestricted sort inclusion in Ap-pendix B.2 andwe present the “document type definition” of our XML documents in Appendix B.3.

B.1 The “Purity” Rule in the Validation of the “Grundlagen”

We said in Subsection 2.2 that the “Purity” type assignment rule (formally presented as Fig-ure 7(pure) and implemented by the “brg” kernel with the pseudo-reduction (3)), is needed justonce in the case of the “Grundlagen” to validate the global constant n. 366 that we denote herewith the URI <ld:/l/r/ande2.ld >. Wiedijk’s validator provides the excerpt in Figure 25showing the relevant “Grundlagen” lines involved in the validation we are discussing.

In particular, we see that the instantiated <ld:/l/ande2.ld > appearing in the body of<ld:/l/r/ande2.ld > is typed by the parameter b, which is typed by λxa.∗Prop.

9. The average run time on the HELM server (2 × AMD Athlon MP 1800+, 1.53 GHz, 256 KB L2 cache) is: 9.2s.

UBLCS-2009-16 23

REFERENCES

# This book is not accepted in AUT-QE because [y:’type’] is no t allowed# This book is accepted in lambda-delta with sort inclusion b ut Omega is not# valid if sort inclusion is allowed on the term backbone only# This book is valid in lambda-delta with unrestricted sort i nclusion

+l@ Delta := [x:[y:’type’]’type’]<x>x : [x:[y:’type’]’type ’]’type’

Omega := <Delta>Delta : ’type’-l

Figure 26. The “Ω” example

This is to say that in this situation, λxa.∗Prop denotes a collection of function spaces, b is aspace in this collection and the instantiated <ld:/l/ande2.ld > is a function in this space.10

Here the item holding the domain of the function, i.e. the item λxa, is located two levels upthe type hierarchy instead of just one level up, so “purity” must be invoked to retrieve it.

For convenience, we are now referring sorts and local variables by name rather than by index.

B.2 A Consequence of Unrestricted Sort Inclusion

If sort inclusion is allowed when testing two abstraction items for convertibility, λδ loses thestrong normalization property of typed terms. Here we consider the case of the well-knownterm “Ω” presented in Figure 26. In particular, the “applicability condition” of Omega’s outer β-redex requires matching the type of Delta against the domain of its head abstraction item (i.e.[x:[y:’type’]’type’]’type’ against [y:’type’]’type’ ). Now these terms are abstrac-tions and their scopes are convertible (being both ’type’ ). The point here is that their domains(i.e.[y:’type’]’type’ and ’type’ ) are convertible as well if sort inclusion is in effect.

Note that here the inferred types are presented first, and the expected types follow.

B.3 The XML Document Type Definition for λδ

The persistent XML representation of λδ’s logical data (see Subsection 4.3) comes with a “docu-ment type definition” (DTD) whose current version is presented in Figure 27. The DTD is availablefrom λδ’s Web site at <http://helm.cs.unibo.it/lambda-delta/xml/ld.dtd >.

References

[1] M. Abadi and L. Cardelli. A Theory of Objects, volume 14 of Monographs in Computer Science.Springer, Heidelberg, Germany, March 1990.

[2] A. Asperti, L. Padovani, C. Sacerdoti Coen, F. Guidi, and I. Schena. Mathematical Knowl-edge Management in HELM. Annals of Mathematics and Artificial Intelligence, 38(1):27–46,May 2003.

[3] A. Asperti, W. Ricciotti, C. Sacerdoti Coen, and E. Tassi. A compact kernel for the calculusof inductive constructions. SADHANA Academy Proceedings in Engineering Sciences, 34(1):71–144, February 2009. Special Issue on Interactive Theorem Proving and Verification.

[4] A. Asperti, C. Sacerdoti Coen, E. Tassi, and S. Zacchiroli. User Interaction with the MatitaProof Assistant. To appear in Journal of Automated Reasoning, Special Issue on User Inter-faces for Theorem Proving., 2006.

[5] H.P. Barendregt. Lambda Calculi with Types. Osborne Handbooks of Logic in Computer Science,2:117–309, 1993.

10. Speaking in Automath terms, an abstraction denotes a function, a function space or a collection of function spacesdepending on its degree being 3, 2 or 1 respectively. Expressing this hierarchy is a distinguishing feature of Automath.

UBLCS-2009-16 24

REFERENCES

<?xml version="1.0" encoding="UTF-8"?>

<!ENTITY % leaf ’(Sort|LRef|GRef)’><!ENTITY % node ’(Cast|Appl|Abst|Abbr|Void)’><!ENTITY % term ’(%node; * ,%leaf;)’><!ENTITY % entry ’(ABST|ABBR|VOID)’>

<!ELEMENT Sort EMPTY><!ATTLIST Sort position NMTOKEN #REQUIRED name CDATA #IMPL IED><!ELEMENT LRef EMPTY><!ATTLIST LRef position NMTOKEN #REQUIRED name CDATA #IMPL IED><!ELEMENT GRef EMPTY><!ATTLIST GRef uri CDATA #REQUIRED name CDATA #IMPLIED><!ELEMENT Cast %term;><!ATTLIST Cast name CDATA #IMPLIED><!ELEMENT Appl %term;><!ATTLIST Appl name CDATA #IMPLIED><!ELEMENT Abst %term;><!ATTLIST Abst name CDATA #IMPLIED><!ELEMENT Abbr %term;><!ATTLIST Abbr name CDATA #IMPLIED><!ELEMENT Void EMPTY><!ATTLIST Void name CDATA #IMPLIED><!ELEMENT ABST %term;><!ATTLIST ABST uri CDATA #REQUIRED name CDATA #IMPLIED><!ELEMENT ABBR %term;><!ATTLIST ABBR uri CDATA #REQUIRED name CDATA #IMPLIED><!ELEMENT VOID EMPTY><!ATTLIST VOID uri CDATA #REQUIRED name CDATA #IMPLIED><!ELEMENT ENTRY %entry;><!ATTLIST ENTRY hierarchy NMTOKEN #REQUIRED options NMTOK ENS #IMPLIED>

Figure 27. The file “ld.dtd”

[6] Caml development team. The Objective Caml system: release 3.11. Documentation and user’smanual. INRIA, Orsay, France, November 2008.

[7] Coq development team. The Coq Proof Assistant Reference Manual: release 8.1pl4. INRIA,Orsay, France, October 2008.

[8] Y. Coscoy. ANatural Language Explanation for Formal Proofs. In C. Retore, editor, Int. Conf.on Logical Aspects of Computational Linguistics (LACL), volume 1328 of LNAI, pages 149–167,Heidelberg, Germany, Sept 1996. Springer.

[9] N.G. de Bruijn. A plea for weaker frameworks. In Logical Frameworks, pages 40–67. Cam-bridge University Press, Cambridge, UK, 1991.

[10] N.G. de Bruijn. AUT-SL, a single line version of AUTOMATH. In Selected Papers on Automath,pages 275–281. North-Holland Pub. Co., Amsterdam, The Netherlands, 1994.

[11] N.G. de Bruijn. Lambda calculus notation with nameless dummies, a tool for automaticformula manipulation, with application to the Church-Rosser theorem. In Selected Papers onAutomath, pages 375–388. North-Holland Pub. Co., Amsterdam, The Netherlands, 1994.

[12] F. Guidi. The Formal System λδ. To appear in ACM Transactions on Computational Logic,July 2008.

[13] F. Guidi. Procedural Representation of CIC Proof Terms. To appear in Journal of Auto-mated Reasoning, Special issue on Programming Languages and Mechanized MathematicsSystems, February 2009. Published online: June 2009.

UBLCS-2009-16 25

REFERENCES

[14] F. Kamareddine, R. Bloo, and R.P. Nederpelt. On π-conversion in the λ-cube and the combi-nation with abbreviations. APAL, 97(1-3):27–45, 1999.

[15] F. Kamareddine and R.P. Nederpelt. A useful λ-notation. Theoretical Computer Science,155(1):85–109, 1996.

[16] J.-L. Krivine. A call-by-name lambda-calculus machine. Higher-Order and Symbolic Compu-tation, 20(3):199–207, September 2007.

[17] E.G.H.Y. Landau. Grundlagen der Analysis. Chelsea Pub. Co., New York, USA, 1965.

[18] Z. Luo. An Extended Calculus of Constructions. Ph.D. thesis, University of Edinburgh,1990.

[19] R.P. Nederpelt, J.H. Geuvers, and R.C. de Vrijer, editors. Selected Papers on Automath, volume133 of Studies in Logic and the Foundations of Mathematics, Amsterdam, The Netherlands, 1994.North-Holland Pub. Co.

[20] Network Working Group. Uniform Resource Identifiers (URI): Generic Syntax. RCF 2396,Aug 1998.

[21] C. Sacerdoti Coen. Declarative Representation of Proof Terms. To appear Journal of Auto-mated Reasoning, Special issue on Programming Languages and Mechanized MathematicsSystems, January 2009. Published online: June 2009.

[22] C. Strachey and C.P. Wadsworth. Continuations: A Mathematical Semantics for HandlingFull Jumps. Higher-Order and Symbolic Computation, 13(1):135–152, April 2000.

[23] L.S. van Benthem Jutting. Checking Landau’s “Grundlagen” in the AUTOMATH System.Ph.D. thesis, Eindhoven University of Technology, 1977.

[24] L.S. van Benthem Jutting. Description of AUT-68. In Selected Papers on Automath, pages251–273. North-Holland Pub. Co., Amsterdam, The Netherlands, 1994.

[25] L.S. van Benthem Jutting. The language theory of λ∞, a typed λ-calculus where terms aretypes. In Selected Papers on Automath, pages 655–683. North-Holland Pub. Co., Amsterdam,The Netherlands, 1994.

[26] L.S. van Benthem Jutting and R. Wieringa. Representatie van expressies in het verificatiepro-gramma VERA 79. Memorandum 1979-15, Department of Mathematics, Eindhoven Univer-sity of Technology, November 1979.

[27] D.T. van Daalen. A Description of Automath and Some Aspects of its Language Theory.In Selected Papers on Automath, pages 101–126. North-Holland Pub. Co., Amsterdam, TheNetherlands, 1994.

[28] F. Wiedijk. A Nice and Accurate Checker for the Mathematical Language Automath. Docu-mentation of the AUT checker, version 4.1, 1999.

[29] XML Core Working Group. Extensible Markup Language (XML) 1.0 (Fifth Edition). WorldWide Web Consortium, Cambridge, USA, November 2008. W3C Recommendation.

[30] I. Zandleven. A Verifying Program for Automath. In Selected Papers on Automath, pages783–804. North-Holland Pub. Co., Amsterdam, The Netherlands, 1994.

UBLCS-2009-16 26