An Axiomatic Theory for Partial Functions · An Axiomatic Theory for Partial Functions Jan Kuper...

An Axiomatic Theory for Partial Functions

Jan Kuper

University of Twente, Department of Computer ScienceP.O.Box 217, 7500 AE Enschede, The Netherlands

e-mail: [email protected]

Abstract

We describe an axiomatic theory for the concept of one-place, par-tial function, where function is taken in its extensional sense. Thetheory is rather general, i.e., concepts like natural numbers and setsare definable, and topics as non-strictness and self application can behandled. It contains a model of the (extensional) lambda calculus,and commonly applied mechanisms (like currying, inductive definabil-ity) are possible. Furthermore, the theory is equi-consistent with andequally powerful as ZF-Set Theory.

The theory (called Axiomatic Function Theory, AFT) is describedin the language of classical first order predicate logic with equality andone non-logical predicate symbol for function application. By meansof some notational conventions, we describe a method within this logicto handle undefinedness in a natural way.

1 Introduction

Recently, the interest in the concept of partial function has grown, especiallywithin theoretical computer science and the theory of computing (cf. Moggi(1988) for references). An important reason for that is, that partial functionsarise naturally when we are modeling computer programs. Parallel to thisgrowing interest in the concept of partial function, interest in a logic ofpartial terms or a logic of (un)definedness has also increased.

In this paper we want to give the basic outline of an axiomatic theoryabout partial functions, which can be considered as a general and preciseframework for reasoning about partial functions. We also sketch a possibil-ity to handle undefinedness within classical first order logic, without beingforced to change the logical rules.

1

The concept of function seems to be equally fundamental as the conceptof set, e.g., notions like natural number, set, ordered n-tuple, can easily bedefined, using only the concepts of partial function and function application.Furthermore, a naive way of defining functions and operating on them leadsto the Russell Paradox. To avoid the Russell Paradox, we develop a Zermelo-Fraenkel-like system of axioms, which gives us an intuitively very reasonableaxiomatization of the concept of partial function. We want the theory to berather general, so we allow self application and related topics (though notfor all functions), we indicate a natural way to handle non-strict functions,and we add a functional form of the Axiom of Choice.

Functions are taken in their extensional sense, i.e., they are completelydetermined by what they do. In other words, they are completely determinedby the set of ordered pairs of their arguments and corresponding results.However, functions are not identified with these sets of ordered pairs, butare considered as objects in their own right. Within the theory presentedhere, the concept of function is the basic concept, and sets can be definedas specific functions. So, the theory gives us a possibility to reason formallyabout functions in a more direct way than by means of sets of ordered pairs.The penalty of course is, that we can only reason indirectly about sets.

We describe a universe IF, in which all objects are one-place functions.All functions have functions as arguments, and functions as results, i.e.,all functions are partial functions from IF to IF. So from the beginning, allfunctions are higher order functions. There is one basic ternary relationA, called the relation of application. The elementary formula A(x, y, z) isinterpreted as: the function x applied to the argument y yields the result z.

The theory which we present here is expressed in the language of firstorder predicate logic with equality, and A as the only non-logical symbol.There is no need for a “bottom element” to represent undefinedness. Whena function f is undefined for an argument x, this can simply be expressedas ¬∃y(A(f, x, y)). That is, (un)definedness is a binary relation betweena function f and a (possible) argument x. Treating (un)definedness as aproperty of the result of applying f to x gives us a problem in case thisresult does not exist. In such a case there is nothing to have a property, noteven the property of being undefined (this is a problem of a formal language,not of intuition). That is to say, if we want to consider (un)definedness asa property of the result of applying f to x, we are forced to introduce oneor more elements (“bottom elements”) to represent this result when f isnot defined for x. We do not want to do that, so within the given languagewe define (un)definedness as a binary predicate. In fact, this convenientexpressibility of undefinedness in a first order language is the reason to prefer

2

this ternary relation of application A over a binary operation of application.However, there is one major drawback of this ternary relation: there

are no terms. Formulas must be constructed using only variables, =, andA, and the ordinary logical connectives and quantifiers. As a consequence,formulas are hard to read, especially, when we want to express some morecomplicated situations, like iterated applications which we would like towrite as, e.g., fx(gy) (application is understood to associate to the left).For easy readability of the ternary relation of application, we will use infixnotation. Thus A(f, x, y) will be notated as fx ' y, or as f •x ' y. Thenegation ¬A(f, x, y) will be notated as fx 6' y. In Section 3 we will elaboratethis infix notation further.

In Section 2 we will mention some other theories on functions, and in Sec-tion 4 we describe the Russell Paradox in a functional setting. In Section 5we formulate the axioms of the theory, and some concepts and consequencesthat a general theory of functions should include at the very least. A com-plete list of the axioms can be found at the end of Section 5. In Section 6we prove the (relative) consistency of the theory by interpreting it in ZF-SetTheory without the Axiom of Foundation, but including Boffa’s Axiom ofUniversality and the Axiom of Choice, whereas in Section 7 we interpretthis variant of set theory within the theory of functions. Together, Sec-tions 6 and 7 prove that ZF-Set Theory and the theory presented here areequi-consistent and equally powerful.

2 Other Theories on Functions

There are different theories in which we can speak formally about functions.The most well known of these theories is set theory, in which a function isa set of ordered pairs answering the property of uniqueness. As we alreadyremarked, this is an indirect way to handle functions, whereas in this paperwe describe a direct way to handle them. There is a wealth of literature onset theory; we only mention Fraenkel et al (1973), Van Dalen et al(1978).

A second important theory, in which the concept of function takes a cen-tral place, is the (untyped) lambda calculus. However, the lambda calculusis not a theory about the general concept of function, but about expres-sions to denote functions. Illustrative to this point is, that in the lambdacalculus function application is identified with substitution. We show thatthe theory presented here contains a model for the untyped lambda cal-culus (Theorem 5.8.4). Some important references on lambda calculus are

3

Barendregt (1984), Hindley and Seldin (1986).Closely related to the untyped lambda calculus are several variants of

typed lambda calculi, constructive type theory and generalized type systems.Here the same remark applies as made with respect to the untyped lambdacalculus. We hope that the theory presented in this paper can be a generalframework for the semantics of different typed lambda calculi. For referenceson these subjects, cf. Barendregt and Hemerik (1990), Barendregt(199?).

As a third theory about functions, we mention category theory. Re-cently, category theory has become more and more important in the theoryof computing. However, the way category theory handles the concept offunction is very abstract, and rather remote from our daily intuition aboutfunctions. We feel that our theory is more concrete in this sense. Standardreferences for category theory are MacLane (1971), Arbib and Manes(1975), Barr and Wells (1990).

There is a fourth theory about functions, which resembles our way oflooking at functions rather strongly. This theory is presented by John vonNeumann in two articles, in which he formulates an axiomatic theory forsets based on the concept of function (cf. Von Neumann (1925, 1928),Robinson (1937)). Von Neumann’s work has been very influential on thefoundations of set theory, mainly because he introduced the concept of class,and because he gave a finite axiomatization of set theory. However, thefact that he took the concept of function as the basic concept, remainedwithout almost any influence. Historically, his choice of taking the conceptof function as the basic concept, was judged as “reasonable [. . . but . . . ]rather clumsy” (cf. Fraenkel et al (1973), page 135), and as not very“manageable” (cf. Van Dalen and Monna (1972), pages 46–47). Bernaysand Godel reformulated Von Neumann’s theory directly in terms of sets andclasses (cf. Bernays (1937, 1941), Godel (1940)).

As important reasons for this historical judgement we mention that VonNeumann himself intended his work as a foundation for sets, and so hisfoundation was an indirect one from the very outset. In the second place,his axiomatic system is rather complicated, containing approximately twentyaxioms, some of which are very hard to read. Furthermore, Von Neumann’swork was interpreted from the point of view of set theory, and not fromthe point of view of computation theory. It is exactly this point of viewfrom which the concept of (partial) function has become more and moreimportant.

4

3 Pseudo Terms

In this section we will describe some fairly simple notational conventions,in order to get a more common notation for function application, such thatformulas can be read in a natural way. If necessary, the resulting formulascan always be translated into the underlying traditional formulas, whichcontain only A as non-logical symbol. These underlying formulas are ratherclumsy and highly unreadable, especially when the situation expressed ina formula is a little bit complicated. The intention is, that when we areworking within the theory about partial functions presented in Section 5,we can forget about these underlying formulas. However, in the presentsection we only describe some conventions and examples, but we do not giveany proofs.

The contents of this section belong to the field of logic of undefinedness,logic of partial terms, etc. (for references, see e.g. Beeson (1985), Moggi(1988), Troelstra and Van Dalen (1988)). The approach we describehere resembles Feferman’s approach (cf. Feferman (1975)).

In Section 1 we agreed to write A(f, x, y) as fx ' y, or as f •x ' y.This infix notation looks like the familiar notation for function application,but there are some important differences. “•” (mostly not written) is not abinary operation, and “'” is not a binary relation, but together they forma ternary relation. We will call “•” a pseudo operation (of application), “'”a pseudo relation (of equivalence), and “fx” (or “f •x”) a pseudo term.

In order to handle the more complicated situations of iterated application(as in fx(gy)), we generalize the concept of pseudo terms inductively asfollows:

– variables x, y, z, . . . are pseudo terms,

– if σ, τ are pseudo terms, then σ•τ is a pseudo term.

Variables are also terms in the more usual sense. To avoid misunderstand-ings, we allow for brackets in the standard way. Furthermore, we will assumethat “•” associates to the left.

Infix formulas, i.e., formulas expressed in infix notation, are simply de-fined as follows:

– when σ, τ are pseudo terms, then σ ' τ and σ 6' τ are infix formulas,

– we may construct more complicated infix formulas by the usual logicalconnectives and quantifiers.

5

The only well-formed formula, that can be constructed by means of “=”, isthe elementary formula x = y, where x and y are variables. So, σ = τ is notwell-formed, when σ or τ is not a variable.

We adopt the following notational conventions for infix formulas (x, y, zstand for variables; σ, τ stand for pseudo terms; x, y do not occur in σ, τ):

x ' y means x = y, (1)στ ' z means ∃x, y(σ ' x ∧ τ ' y ∧ xy ' z), (2)σ ' τ means ∀x(σ ' x↔ τ ' x), (3)σ 6' τ means ¬(σ ' τ). (4)

It is obvious, that by means of these conventions we can always translatean infix formula into a traditional first order formula, which contains “=”and “A”, but which does not contain “'” or “•”. We will call such a formulaa normal form.

In order to avoid ambiguities in translating infix formulas, we agree thatconventions 1 and 2 have a higher priority in the translation process thanconvention 3.

As can be seen from the conventions formulated above, infix formulascan be understood in a very natural way:

– στ ' z is true, when both σ and τ are defined, and the application ofσ to τ gives the result z. Otherwise it is false.

– σ ' τ is true, when both σ and τ are undefined, and also, when bothare defined and have the same value. Otherwise it is false.

We define existence of a pseudo term σ, also called: σ is defined , as

E∗(σ)↔ ∃x(σ ' x),

where x may not occur in σ. If σ exists, then x is called the value of σ iffσ ' x.

We mention without proof, that if σ exists, then its value is unique.Pseudo terms were introduced to avoid a “bottom element” (as an ob-

ject), representing undefinedness. However, we may consider “bottom” as ashorthand notation for a non-existing pseudo term. Let e denote the emptyfunction (such a function exists, cf. Section 5.2). We agree to abbreviatee•e as ⊥ (“bottom”), and for all σ we have

¬E∗(σ)↔ σ ' ⊥.

6

It follows from the conventions above, that all functions are strict , i.e., whenf is a function and σ is an undefined pseudo term, then fσ is also undefined.In general we have that στ ' ⊥, whenever σ ' ⊥ or τ ' ⊥.

Note, that existence is not a predicate, since according to the conventionsdescribed above the right hand side of its definition may well translate todifferent normal forms when it is applied to different pseudo terms. Forexample, E∗(fx) evaluates to ∃z(fx ' z), i.e., to

∃z(A(f, x, z)),

whereas E∗(fxy) evaluates to ∃z(fxy ' z), which is (by convention 2)equivalent to ∃z, u(fx ' u ∧ uy ' z), i.e., to

∃z, u(A(f, x, u) ∧ A(u, y, z)).

So, E∗(fx) and E∗(fxy) are totally different formulas, that is, we have a se-ries of existence predicates, which we could denote as E1, E2, etc.. However,we prefer to denote them schematically by E∗ and we call E∗ a predicatescheme. We may write E, when σ is a variable. Notice, that we have∀x(E(x)).

All predicates P occurring in the remaining part of this paper, are definedusing variables, but they generalize in the same way to predicate schemesP ∗ (though in practice we will often omit the “∗”). That is, we may “substi-tute” pseudo terms for variables. However, this is not ordinary substitution,since it may change a specific predicate into another predicate, though theresulting predicate keeps more or less the intuitive meaning of the originalpredicate. We will call it substitution∗.

For example, in Section 5.1 we define the binary predicate D(f, x) as∃y(fx ' y), i.e., D(f, x) means that f is defined for x. When we substitute∗

gu for x, we get

D∗(f, gu),

and the defining formula becomes

∃y(f(gu) ' y).

By means of convention 2 this translates to

∃y, v(gu ' v ∧ fv ' y),

which is equivalent to

∃y, v(A(g, u, v) ∧ A(f, v, y)).

7

However, this formula contains three free variables (f, g and u), so D∗(f, gu)is a ternary predicate, whereas D(f, x) is a binary predicate. The intuitivemeaning of D∗(f, gu) remains in a sense the same: f is defined for gu. Inaddition, the existence of gu is also required.

The same holds for formulas: a formula ϕ changes under substitution∗

to a different formula ϕ∗ (here too we will mostly omit the “∗”). In fact,this already is present with the basic predicate A. We can look at στ ' υas a substitution∗-instance of xy ' z, i.e., of A(x, y, z). So, στ ' υ can beconsidered as infix notation for A∗(σ, τ, υ).

We have to be careful with substitution∗ of non-existing pseudo termsfor variables, since we can not say anything in advance about the truth orfalsehood of ϕ∗(σ0, . . . , σn−1), when one or more of σ0, . . . , σn−1 do not exist.For example, D∗(f, σ) is false when σ ' ⊥, whereas Inj ∗(σ) is true whenσ ' ⊥ (Inj (f) means that f is injective, cf. Definition 5.2.3). It is a matterof choice, if we accept this or not. If we don’t, then in order to show thatsome pseudo term σ “really” is injective, we separately must prove that σexists. In general, in ϕ∗(σ0, . . . , σn−1), we have several choices with respectto the existence of the σi’s occurring in it. An example of this point is givenafter Definitions 5.3.2 and 5.3.3.

Once more we want to remark, that “•” and “'” are not a real binaryoperation and relation respectively, they only behave as if they were (exceptfor the case x ' y, where “'” is equivalent to “=”). That is, intuitively“•” can be understood as a partial operation of application, and “'” as arelation of equivalence, which indeed is reflexive, transitive and symmetric(we do not prove that here).

From now on we will often omit the prefix “pseudo” and just speak aboutterms.

4 The Russell Paradox

At first sight, when starting the development of an axiomatic theory aboutfunctions, we would like it to be combinatorially complete, i.e., every func-tion that is definable by logical means exists and may be used. In everydaypractice of mathematics functions are defined by describing what they aresupposed to do with their arguments. In other words, functions are under-stood to exist, when their behaviour is described.

8

In the framework of a general first order theory about functions combi-natorial completeness thus means, that if we have a first order descriptionof this behaviour, then there exists a function which shows this behaviour.From this point of view an important axiom is a “naive comprehensionscheme”:

when ϕ(x, y) is a (first order) functional formula, i.e., a formulathat satisfies the requirement of uniqueness:

ϕ(x, y1) ∧ ϕ(x, y2)→ y1 = y2,

then there exists a function f , which “does as ϕ says”, i.e.,

fx ' y ↔ ϕ(x, y)

(f may not be free in ϕ(x, y)).

However, such a naive theory cannot exist, since it leads to a functional formof the Russell Paradox: suppose a is a function such that aa 6' a (accordingto the naive comprehension scheme, there surely exists such a function, e.g.,the empty function). Now define the “Russell function” r as follows:

rx ' y ↔ ((xx ' x→ y = a) ∧ (xx 6' x→ y = x)).

In this expression, the defining formula of r indeed is a functional formula.It can easily be seen that

∀x(rx ' x↔ xx 6' x),

i.e., x is a fixed point of r, if and only if x is not a fixed point of itself. Ifwe take r for x, then we have the Russell Paradox for functions:

rr ' r ↔ rr 6' r.

A solution to the Russell Paradox can be found in the “limitation of size”principle. The result of that is, that a function can only exist, if it is not too“powerful”, i.e., if it is not more powerful than an already existing function.

To express this restriction, suppose we want to prove the existence ofa function g which “does as ϕ says”, where ϕ is some functional formula.Then we must restrict the domain of g, such that it is not bigger than thedomain of some already existing function f . We can do that by formulatinga second functional formula ψ(u, x), and then require that for every x in the

9

domain of g there is a u in the domain of f , such that ψ(u, x) holds. Moreformally, this can be expressed as

∀f∃g∀x, y(gx ' y ↔ ϕ(x, y) ∧ ∃u(∃v(fu ' v) ∧ ψ(u, x))).

In Section 5.1 Axiom 3 expresses an alternative and more simple scheme ofrestricted comprehension. As a consequence of this axiom, all functions areso to say “essentially partial” with respect to the universe IF, so within IFthere exists no universal identity function, there exist no universal projectionfunctions, etc.

That does not mean, however, that we cannot speak in an intuitive wayof such “higher functions” which are defined on the total universe IF. Forexample, in Section 5.3 we will define an intuitive “higher function” δ fromIF to IF, such that δf is the domain of an arbitrary function f (sets will bedefined as specific functions). A second example is composition of functions,which is well-defined for every f and g in IF (though its result may be theempty function). Such “higher functions”, which exist on an intuitive level,and not within IF, might be called operations. Operations may be partialwith respect to IF, and only those operations that are “weak” enough, canexist within IF, or, more precisely, can be represented within IF. Notice thatoperations are analogous to classes in ZF Set Theory.

5 Axioms

In this Section the basic outline of the theory about partial functions isformulated. We introduce and describe the eight axioms of the theory,and give some consequences of them. The axioms are described in Sec-tions 5.1, 5.2, 5.4, 5.8 and 5.9, and a complete list of them is given inSection 5.10. In Section 5.3 we define natural numbers and sets as specificone-place functions, and in Section 5.6 we do the same with ordered n-tuplesand n-ary functions. In Section 5.5 we prove that functions may be definedinductively, and in Section 5.7 we prove that the operation of currying onn-ary functions is well-defined.

5.1 The Axioms of Uniqueness, Extensionality andRestricted Comprehension

The first two axioms are simple. They formulate uniqueness and extension-ality of functions.

Axiom 1 (Uniqueness) ∀f, x, y1, y2(fx ' y1 ∧ fx ' y2 → y1 = y2). �

10

The Axiom of Uniqueness says, that a function f has at most one value forevery argument.

Axiom 2 (Extensionality) ∀f, g(∀x(fx ' gx)→ f = g). �

According to the Axiom of Extensionality two functions f and g are equal,if they do the same to every argument. Note, that if fx and gx both do notexist, then indeed f and g do the same to x.

The next axiom describes a scheme of restricted comprehension, as an-nounced in Section 4. Before it can be formulated, we need a few definitions.

Definition 5.1.1 A function f is defined for an argument x (or x is in thedomain of f), denoted as D(f, x), iff

∃y(fx ' y). �

Note, that D(f, x)↔ E∗(fx).

Definition 5.1.2 x is a result of f (or x is in the range of f), denoted asR(f, x), iff

∃y(fy ' x). �

Definition 5.1.3 x belongs to the field of f , denoted as Fld(f, x), iff

D(f, x) ∨R(f, x). �

The next axiom (restricted comprehension) actually is an axiom scheme, inwhich a first order formula ϕ appears. ϕ(x, y) is supposed to describe the“behaviour” of the function g (restricted to the field of f), whose existenceis guaranteed by the Axiom of Restricted Comprehension, so ϕ(x, y) mustbe a functional formula, i.e., for all x, y we have

ϕ(x, y) ∧ ϕ(x, y′)→ y = y′

(cf. Section 4). Parameters are allowed in ϕ(x, y). In order to preventcircular definitions, g may not be free in ϕ.

Axiom 3 (Restricted Comprehension)

∀f∃g∀x, y(gx ' y ↔ Fld(f, x) ∧ ϕ(x, y)). �

11

Actually, this Axiom of Restricted Comprehension is a little strongerthan the one described in Section 4, but given Axioms 4 and 5 their equiv-alence can easily be proved.

The function g mentioned in Axiom 3 is uniquely determined, whichfollows immediately from the Axiom of Extensionality. It is denoted by aspecific lambda term as

λ(x7→y | Fld(f, x) ∧ ϕ(x, y)).

The general form of lambda terms is

λ(x7→y | ψ),

where ψ is a formula.As usual, x and y are called bound variables. Variables that are not

bound, are called free.In the literature, lambda terms are usually written as λx.σ, where σ is a

term. However, this form is a special case of the lambda terms we introducedhere, and we have

λx.σ ' λ(x7→y | σ ' y),

where y may not occur free in σ.Lambda terms may be considered as pseudo terms (cf. Section 3), but

we will not elaborate this possibility here. We just remark that

λ(x7→y | ψ) ' ⊥

when ψ is not a functional formula, or when ψ is too “strong”, i.e., for toomany x’s there is a y such that ψ(x, y) holds.

As an immediate consequence of Axiom 3 we prove that function com-position is allowed, i.e., we prove that for any two functions there exists aunique composition of f and g.

Theorem 5.1.4 ∀f, g∃!h∀x(hx ' g(fx)).

As usual, the composition h of f and g is denoted as g ◦ f .Proof. By the Axiom of Restricted Comprehension we can define h as:

hx ' y ↔ Fld(f, x) ∧ g(fx) ' y.

The uniqueness of h follows from the Axiom of Extensionality. �

Clearly, the part Fld(f, x) in the definition of h in this proof is superflu-ous. We mentioned it here, to answer the precise pattern of the Axiom ofRestricted Comprehension, but in the sequel we will often omit it.

12

5.2 The Axiom of Infinity

By the Axiom of Restricted Comprehension we can only prove the existenceof a function g, given the existence of a function f , so we must make astart somewhere. In the next axiom this start is made with a function withan infinite domain. The axiom simply states that there exists a functionf which can be applied iteratively to an argument a, i.e., fa exists, f(fa)exists, etc. The only thing to be aware of is, that at least for one a in thedomain of f , all a, fa, f(fa), . . . are distinct.

We start with some definitions.

Definition 5.2.1 a is a startpoint of f , denoted as Start(f, a), iff

D(f, a) ∧ ¬R(f, a). �

Definition 5.2.2 a is an endpoint of f , denoted as End(f, a), iff

R(f, a) ∧ ¬D(f, a). �

Definition 5.2.3 A function f is injective, denoted as Inj(f), iff

∀x1, x2(∃y(fx1 ' y ∧ fx2 ' y)→ x1 = x2). �

Note, that in Definition 5.2.3 it is not enough to state

fx1 ' fx2 → x1 = x2,

since fx1 ' fx2 holds whenever f is undefined for x1 and x2, and then aninjective function would have at most one element outside its domain.

Definition 5.2.4 A function f is a successor function, denoted as SF(f),iff

– Inj(f),

– ∃a(Start(f, a)),

– ¬∃a(End(f, a)). �

The Axiom of Infinity now simply says, that there exists a successor function.

Axiom 4 (Infinity) ∃f(SF (f)). �

By means of the Axioms 3 and 4 we can prove the existence of some well-known functions.

13

Theorem 5.2.5 There exists an empty function: ∃f∀x, y(fx 6' y).

The empty function is denoted as λ().Proof. Let s be a successor function. Then by Axiom 3 there is a functionf such that fx ' y ↔ Fld(s, x) ∧ x 6= x. This f is the empty function. �

Theorem 5.2.6 Let a0, a1, . . . , an−1 be pairwise distinct functions. Thenfor all b0, b1, . . . , bn−1 there exists a function f such that for all x, y

fx ' y ↔ ((x = a0 ∧ y = b0) ∨ · · · ∨ (x = an−1 ∧ y = bn−1)).

This function f is denoted as λ(a0 7→b0, . . . , an−1 7→bn−1).Proof. Let s be a successor function. Define E0(u), E1(u), . . . , En−1(u),saying respectively that u is a startpoint of s, u is a first “intermediateresult” of s, . . . , u is a (n− 1)th intermediate result of s, as follows:

E0(u)↔ Start(s, u),

Ei+1(u)↔ ∃v(Ei(v) ∧ sv ' u).

By Axiom 3 there exists a function g such that

∀u, x(gu ' x↔ D(s, u) ∧ ((E0(u) ∧ x = a0) ∨ · · ·

· · · ∨ (En−1(u) ∧ x = an−1)).

By Axiom 3 again, there exists a function f such that

∀x, y(fx ' y ↔ R(g, x) ∧ ((x = a0 ∧ y = b0) ∨ · · ·

· · · ∨ (x = an−1 ∧ y = bn−1)).

This function f is the required function. �

5.3 Natural Numbers and Sets

Using Theorems 5.2.5 and 5.2.6 we can define the natural numbers in manyways. We will define natural numbers in a Von Neumann-like style, namely,a natural number is defined as the identity function on all its predecessors.

Definition 5.3.1 (Natural Numbers)

0 ' λ()1 ' λ(07→0)2 ' λ(07→0,17→1)

...n′ ' λ(07→0,17→1, . . . , n7→n)

... �

14

In the definition of natural numbers, n′ denotes the successor of n.Notice, that this definition only defines every natural number separately,

by some informal inductive process. However, it does not give us a formaldefinition of a predicate Nat , such that Nat(x) is true iff x is a naturalnumber. We will give such a formal definition in Section 5.5.

As usual, 0 and 1 will model the truth values false and true. Using that,sets are often modeled as some characteristic function, which yields true forobjects inside the set, and false for objects outside the set. However, whenwe do that here, sets would become functions which are defined for the wholeuniverse of functions, and we would have the Russell paradox again. So wedefine sets as functions which yield true for elements of the set, and whichare undefined otherwise.

Definition 5.3.2 (Set) Set(a)↔ ∀x(D(a, x)→ ax ' 1). �

Definition 5.3.3 (Membership) x ∈ a↔ Set(a) ∧D(a, x). �

Notice that x ∈ a is meaningful when a is not a set. In such a case x ∈ a isfalse.

Notice also, that according to these definitions Set(⊥) is true, but x ∈ ⊥is false for every x. We also have that Set(λ()) is true and x ∈ λ() is false.Since λ() 6' ⊥ we are faced here with the situation of having two differentinterpretations for the empty set. This is not a real problem, but we maywish to choose (cf. Section 3) that only existing pseudo terms can be “real”sets, i.e.,

Set∗(σ)↔ ∃z(σ ' z ∧ Set(z))

(where z may not occur in σ). Then still x ∈ ⊥ is false for every x, but ⊥ isnot a set any more, and thus only the empty function represents the emptyset.

We will also use the ordinary set notation with curly braces, defined asspecific lambda terms:

{a, b, c, . . .} means λ(a7→1, b7→1, c7→1, . . .),{x | ϕ} means λ(x7→1 | ϕ).

Clearly, the concept of subset can also easily be defined.

Definition 5.3.4 (Subset) a ⊆ b↔ Set(a) ∧ Set(b) ∧ ∀x(x ∈ a→ x ∈ b).�

15

Up to now, we only could express that an object was in the domain of afunction, by means of the binary predicate D. We could not yet talk formallyabout the domain itself . The same holds for the range and the field of afunction. Using Definitions 5.3.2 and 5.3.3, we can prove that the domain,range and field of a function are sets.

Theorem 5.3.5 For every function f there exist sets a, b and c such thatfor all x

x ∈ a ↔ D(f, x),x ∈ b ↔ R(f, x),x ∈ c ↔ Fld(f, x).

The sets a, b and c are called the domain (denoted as δf), range (ρf) andfield (∆f) of f respectively. Just like lambda terms, these expressions maybe considered as pseudo terms. More generally, when σ is a pseudo term,then δσ, ρσ, ∆σ are also pseudo terms.

Notice, that when σ ' ⊥ we have to choose between δ⊥ ' λ() andδ⊥ ' ⊥. For technical reasons, we take δ⊥ ' ⊥ (see e.g. Section 5.7). Thesame remarks can be made with respect to ρ and ∆.

When in the sequel some operation F (like δ, ρ or ∆) is introduced, wewill silently assume that F can be applied to a pseudo term σ, such that Fσis again a pseudo term. A more systematic treatment of pseudo terms fallsoutside the scope of this paper.

Proof of Theorem 5.3.5. By means of Axiom 3, the domain a can bedefined as

ax ' y ↔ D(f, x) ∧ y ' 1.

Then clearly, a is a set, and for all x

x ∈ a↔ D(f, x).

The proofs of the existence of the range and field of f go the same. �

Note, that when a is a set, δa ' a.Another easy consequence of the definition of set is: if ϕ is a functional

formula, then for every set a there exists a function f which is partial (fora formal definition of partiality, cf. Definition 5.4.9) with respect to a, andwhich “does as ϕ says” (restricted to a). The next corollary formulates thisproperty.

16

Corollary 5.3.6 When ϕ is a functional formula, then for every set a thereexists a function f such that for all x, y

fx ' y ↔ x ∈ a ∧ ϕ(x, y).

Proof. Immediate. �

When a is a set, and ϕ is such that for all x, y

ϕ(x, y)→ y ' 1,

then Corollary 5.3.6 gives us the Subset Axiom (Axiom of Separation, Aus-sonderung) from ZF-Set Theory. As an immediate consequence of that, theintersection a ∩ b and the difference a \ b of two sets a and b also exist.

Clearly, often the formula ϕ occurring in Corollary 5.3.6 is such, thatδf ' a, e.g., for every set a and every u there exists a constant functionwith a as its domain and u as its constant value. Also, for every set a thereexists an identity function ia with a as its domain. Notice, that by theAxiom of Extensionality for any two sets a and b we have ia = ib ↔ a = b.

Informally, the function f from Corollary 5.3.6 can be denoted as ϕ � a,i.e., the restriction of ϕ to a. More formally, we can define the restriction ofa function to a set.

Definition 5.3.7 A function g is the restriction of a function f to a set a,denoted as f � a, iff gx ' y ↔ x ∈ a ∧ fx ' y. �

Theorem 5.3.8 ∀f, a(Set(a)→ ∃g(g ' f � a))


Note, that for two sets a and b, a � b is the intersection of a and b, i.e.,a ∩ b ' a � b, and thus in such a case � is commutative.

5.4 The Axioms of Lower and Higher Order

In this section we will describe the Axioms of Lower and Higher Order, anddiscuss some consequences of them.

The Axiom of Lower Order asserts the existence of a function g, whichcan be applied to all arguments of arguments of a given function f . TheAxiom of Higher Order guarantees the existence of a function g, that canbe applied to all functions with the same domain and the same range as agiven function f .

17

In both cases it suffices to state the existence of one such lower or higherfunction, since by virtue of the Axiom of Restricted Comprehension, we canalways prove the existence of other lower and higher functions.

Using sets they can be easily formulated.

Axiom 5 (Lower Order) ∀f∃g∀x(x ∈ δg ↔ ∃u(u ∈ δf ∧ x ∈ δu)). �

Axiom 6 (Higher Order) ∀f∃g∀x(x ∈ δg ↔ (δx ' δf ∧ ρx ' ρf)). �

Note, that these two axioms do not specify the function g exactly. It ispossible to add such an exact specification of g to them, e.g., by letting g bethe identity function on the indicated domain. However, given the Axiomof Restricted Comprehension, this second form of the axioms is equivalentto the first form. We prefer the above given form of the axioms, because ofthe brevity of the formulation.

In the remaining part of this section, we will discuss some consequencesof Axioms 5 and 6. The most important consequence of Axiom 5 is, thatwe can combine different functions into one single function, and the mostimportant consequence of Axiom 6 is, that when a and b are two sets offunctions, then there exist functions which are defined for all functions froma to b. Furthermore, we will define some concepts, and prove the well-definedness of some common operations on sets and functions.

Theorem 5.4.1 Let a be a set of sets. Then there exists a set b, such thatb '

⋃a, where

⋃a ' {x | ∃y(x ∈ y ∧ y ∈ a)}.

As is well known,⋃a is called the sum set of a. As usual, we denote

⋃{a, b}

as a ∪ b.Proof. By Axiom 5 and Axiom 3. �

Suppose, we have a set a of functions, and we want to combine thesefunctions into one single function g, such that the domain of g is the unionof the domains of the functions in a. Then the problem arises, that differentfunctions in a can be defined for some argument u, but their results foru differ. In such a case we have to decide which of these functions willdetermine the value of g applied to u. In order to make that decision, weintroduce an election function e, which elects for every u a specific f ∈ a.Then gu ' euu.

We have the following definition and theorem.

18

Definition 5.4.2 Let a be a set of functions. A function e is an electionfunction for a, iff

– δe '⋃{δf | f ∈ a},

– ρe ⊆ a. �

Theorem 5.4.3 For every set a and election function e for a there existsa combination g of all f ∈ a, such that ∀u(gu ' euu).

Proof. By Axiom 3. �

Note, that the domain of g need not be equal to the union of the domainsof all f ∈ a, since e can choose a “wrong” function f ′ for a certain u, i.e.,u 6∈ δf ′.

As an easy corollary of this theorem we have a special case of it.

Definition 5.4.4 A function h is the result of f updated with g, denoted asf [g], iff

x ∈ δg → hx ' gx,x 6∈ δg → hx ' fx. �

When g is given in the form λ(x7→y | ϕ), we may formulate f [g] withoutthe λ and without the parentheses, thus as f [x7→y | ϕ]. In the same way wehave f [σ0 7→τ0, . . . , σn−1 7→τn−1], with the obvious interpretation.

Note, that the successor n′ of a natural number n is n[n7→n]. Note also,that if a and b are two sets, then a[b] ' a ∪ b.

Corollary 5.4.5 ∀f, g∃h(h ' f [g]).

Proof. Define the election function e for the set {f, g} as follows:

ex ' g ↔ x ∈ δg,ex ' f ↔ x ∈ δf ∧ x 6∈ δg,

and then apply Theorem 5.4.3. �

Now we come to an important consequence of Axiom 6: if a and b aretwo sets of functions, then there exist functions which are defined for allfunctions from a to b. By Corollary 5.3.6 it is enough to show that theset of all functions from a to b exists, and that is what we will do in thenext few pages. Along the way we will define some concepts, and prove thewell-definedness of some common operations on sets and functions.

19

Theorem 5.4.6 Let a be a set. Then there exists a set b such that b ' Pa,where Pa ' {x | x ⊆ a}.

As usual, Pa is called the power set of a.Proof. Define for all b ⊆ a the function b′ such that

b′x ' 1 ↔ x ∈ b,b′x ' 0 ↔ x ∈ a ∧ x 6∈ b,b′x ' ⊥ ↔ x 6∈ a ∧ x 6∈ b.

Then for all b ⊆ a, with b 6' ∅ and b 6= a, we have δb′ ' a and ρb′ ' {0,1}.By Axiom 6 there exists a function f such that

δf ' {b′ | b ⊆ a ∧ b 6' ∅ ∧ b 6= a}.

By Axiom 3 we can further specify f such that fb′ ' b.If there exists no subset b of a such that b 6' ∅ and b 6= a, then δf ' ∅,

so f ' λ().Now

Pa ' ρf ∪ {∅, a},

and by Theorem 5.4.1 this set exists. �

A generalization of the notion of subset is the concept of subfunction,which we will also need further on.

Definition 5.4.7 (Subfunction) A function g is a subfunction of a func-tion f , denoted as g ⊆• f , iff

∀x, y(gx ' y → fx ' y). �

Note, that when a and b are sets, a ⊆ b↔ a⊆• b.Theorem 5.4.6 can also be generalized. It appears, that this generaliza-

tion is an easy consequence of Theorem 5.4.6 itself.

Corollary 5.4.8 For every function f the set of all subfunctions of f exists.

Proof. A subfunction of f is uniquely determined by its domain. ByTheorem 5.4.6 the power set of δf exists, so by Axiom 3 the set of allsubfunctions of f also exists. �

In general totality and partiality of functions are relative notions, thatis, a function f is total or partial with respect to a given set a. The definitionof these notions is obvious.

20

Definition 5.4.9

Total(f, a) ↔ δf ' a,Partial(f, a) ↔ δf ⊆ a, �

According to this definition, totality is a special case of partiality. Insteadof saying that f is partial with respect to a, we may also say that a is thesource set of f , denoted as Source(f, a).

Likewise, we can define that a function is surjective with respect to agiven set, and that a set is the target set of a function.

Definition 5.4.10

Surj (f, a) ↔ ρf ' a,Target(f, a) ↔ ρf ⊆ a. �

Clearly, every function is total with respect to its domain, and surjectivewith respect to its range. Functions with source set a and target set b arecalled functions from a to b.

The next theorem, preceded by two lemmas, asserts that the set of allfunctions from a to b exists.

Lemma 5.4.11 For any two sets a and b there exists a set p of all totaland surjective functions from a to b, i.e., for all f

f ∈ p↔ Total(f, a) ∧ Surj (f, b).

The set p is denoted as a⇒b.Proof. If there does not exist a function that is total with respect to a andsurjective with respect to b, we have that p ' ∅. Otherwise, we may applyAxiom 6 and Theorem 5.3.5. �

Note, that when a ' ∅ and b ' ∅, then a⇒ b ' {λ()}, and when a 6' ∅and b ' ∅, then a⇒b ' ∅. Also, when b is “bigger” than a, i.e., when thereis no total and injective function from b to a, then a⇒b ' ∅.

Lemma 5.4.12 For every two sets a and b there exists a set q of all totalfunctions from a to b, i.e., for all f

f ∈ q ↔ Total(f, a) ∧ Target(f, b).

21

The set q is denoted as a→b.Proof. By Lemma 5.4.11 for every subset u of b the set a⇒u exists. Definethe function g with source set Pb, such that gu ' a⇒u. Then

⋃(ρg) is the

required set q. �

Note, that when a ' ∅, then a→ b ' {λ()}, and when a 6' ∅ and b ' ∅,then a→b ' ∅.

Theorem 5.4.13 For every two sets a and b there exists a set r of allfunctions from a to b, i.e., for all f

f ∈ r ↔ Source(f, a) ∧ Target(f, b).

The set r is denoted as a⇀b.Proof. This proof is a variant of the proof of Lemma 5.4.12. By that lemmafor every u ⊆ a the set u→ b exists. Define the function h with source setPa such that hu ' u→b. Then

⋃(ρh) is the required set r. �

Note, that for no a and b, a⇀b ' ∅.

5.5 Induction

In this section we will show that, within the theory presented here, functionsmay be defined inductively.

Let ϕ(x, y) be a functional formula (cf. Section 4) in which parametersmay occur. Furthermore, the functionality (in a broader sense) of ϕ(x, y)may not be impaired when an arbitrary term τ is substituted∗ (cf. Section 3)for x in ϕ. That is to say, either ϕ∗(τ,⊥) holds, or there exists a unique usuch that ϕ∗(τ, u) holds. Such a formula may be written using infix notationby introducing a symbol F and writing Fx ' y for ϕ(x, y)1.

We will restrict ourselves to the following form of inductive definitions:let s be some successor function (not every successor function will be al-lowed) with startpoint a, and let σ be a term. Then we may define afunction f , with source set δs, as follows:

– fa ' σ,

– f(sx) ' F (fx).

We start with a definition.1Actually, if this infix notation is used in a more tricky way, then the requirements for

ϕ may be weakened a little. However, we do not employ this technique here.

22

Definition 5.5.1 (Minimal Successor Function) A functionm is calleda minimal successor function, denoted as MSF (m), iff

– SF (m),

– ∃!a(Start(m, a)),

– ∀f(SF (f) ∧ f ⊆•m ∧ ∀a(Start(m, a)→ Start(f, a)) → m⊆• f). �

Theorem 5.5.2 There exists a minimal successor function.

Proof. Let s be a successor function with startpoint a. Then by Corollar-ies 5.3.6 and 5.4.8 the set

t ' {s′ | SF (s′) ∧ s′ ⊆• s ∧ Start(s′, a)}

exists. Also, t is non-empty, since s ∈ t. Now define the function m, suchthat for all x, y

mx ' y ↔ ∀s′ ∈ t(s′x ' y).

Then m is a successor function with startpoint a (immediate).Furthermore, m has at most one startpoint. For suppose a′ is a second

startpoint of m, then by Axiom 3 we can define a successor function m′⊆•m,such that a and ma′ are startpoints of m′, i.e., a′ 6∈ δm′. But m′ ∈ t, and soa′ 6∈ δm, i.e., a′ is not a startpoint of m.

Finally, for every successor function f with startpoint a, such that f⊆•m,we have that f ⊆• s, so f ∈ t. Thus m⊆• f .

This proves, that m is a minimal successor function. �

The importance of the concept of minimal successor function lies in thefact that we may apply mathematical induction on (the domain of) a mini-mal successor function. More precisely, we have the following theorem.

Theorem 5.5.3 (Mathematical Induction) Let ϕ(x) be a formula, andm a minimal successor function with startpoint a. Then

ϕ(a) ∧ ∀x ∈ δm(ϕ(x)→ ϕ(mx))→ ∀x ∈ δm(ϕ(x)).

Remark. In accordance with the conventions in Section 3 we write ϕ(mx)instead of ϕ∗(mx).Proof. Suppose ϕ(a) and ∀x ∈ δm(ϕ(x) → ϕ(mx)). Define a function f ,such that for all x, y

fx ' y ↔ mx ' y ∧ ϕ(x),

23

then f ⊆•m, and ∀x ∈ δf(ϕ(x)). Furthermore, f is injective (immediate), ais a startpoint of f (immediate), and f has no endpoint (suppose y ∈ ρf ,i.e., there is an x such that fx ' y. Then mx ' y and ϕ(x). But then ϕ(y),i.e., y ∈ δf), so f is a successor function. But m is a minimal successorfunction, so f = m, i.e., ∀x ∈ δm(ϕ(x)). �

Notice, that for a successor function s we have that δs ' ∆s, so Theo-rem 5.5.3 also holds for the field of a minimal successor function.

Clearly, every minimal successor function m with startpoint a gives us amodel for the axioms of Peano Arithmetic, where a plays the role of 0, and∆m models the set of all natural numbers.

Theorem 5.5.4 (Peano) Let m be a minimal successor function with start-point a. Then

– a ∈ ∆m,

– ∀x(x ∈ ∆m→ mx ∈ ∆m),

– ∀x(mx 6' a),

– ∀x, y(mx ' my → x = y),

– ϕ(a) ∧ ∀x ∈ ∆m(ϕ(x)→ ϕ(mx))→ ∀x ∈ ∆m(ϕ(x)).


We continue with a definition.

Definition 5.5.5 (Restricted Successor Function) A function r is a re-stricted successor function, iff there exists a successor function s such that

r ⊆• s ∧ ∃a(Start(r, a)) ∧ ∃b(End(r, b)). �

Theorem 5.5.6 Let m be a minimal successor function with startpoint a.Then for every x ∈ ρm there exists a unique restricted successor functionmx ⊆•m with a as its only startpoint and x as its only endpoint.

Such anmx is called a minimal restricted successor function. As a degeneratecase, take ma ' λ().Proof. Define the function m′ such that for all y, z

m′y ' z ↔ my ' z ∧ y 6= a,

so m′ and m are equal, except for the startpoint of m. Clearly, m′ is aminimal successor function with startpoint a′ ' ma, and δm′ ' ρm. thuswe may proceed by induction on δm′:

24

– ma′ ' λ(a7→a′) is a restricted successor function with a as its onlystartpoint and a′ as its only endpoint such that ma′ ⊆•m.

– Now suppose mx is a restricted successor function with a as its onlystartpoint and x as its only endpoint, such that mx ⊆•m. Define

mmx ' mx[x7→mx],

then clearly mmx is a restricted successor function with a as its onlystartpoint and mx as its only endpoint, and mmx ⊆•m.

With respect to the uniqueness of mx, suppose there is a second minimalrestricted successor function mx with a as its only startpoint and x as itsonly endpoint such that mx 6= mx and mx ⊆•m. Define the set d as

d ' (∆mx ∪∆mx) \ (∆mx ∩∆mx),

then clearly d 6' ∅, and a, x 6∈ d.Next, define the function m as

m ' m � (∆m \ d).

We will show that m is a successor function with startpoint a, m⊆•m, andm 6= m, thus contradicting the minimality of m.

It is immediately clear that m⊆•m, m 6= m, a is a startpoint of m, andm is injective. So all we have to prove is that m has no endpoint.

Therefore, suppose z ∈ ρm, and my ' z, thus y ∈ ∆m \ d. Then y 6∈ d,i.e., either

y ∈ ∆mx ∧ y ∈ ∆mx, (1)

or

y 6∈ ∆mx ∧ y 6∈ ∆mx. (2)

In case (1), if y = x then my 6∈ ∆mx and my 6∈ ∆mx, thus my 6∈ d, i.e.,my ∈ δm.

If y 6= x then my ∈ ∆mx and my ∈ ∆mx, thus here too we have my 6∈ d.In case (2) it follows that my 6∈ ∆mx and my 6∈ ∆mx, since otherwise

my would be a startpoint of mx or mx, and my 6' a. So again we havemy 6∈ d, i.e., my ∈ δm.

Summarizing, if z ∈ ρm then z ' my ∈ δm, and thus m is a successorfunction. �

Theorem 5.5.6 justifies the following definition.

25

Definition 5.5.7 If m is a minimal successor function, then

x <m y ↔ x ∈ δmy,

x ≤m y ↔ x ∈ ∆my ∨ Start(m,x). �

In this definition, the condition Start(m,x) is necessary for the case that xand y are both equal to the startpoint of m. Then we want x ≤m y, but∆my = ∅.

Now we come to the main theorem of this section. In this theorem σis a term and ϕ a formula. ϕ answers the requirements of functionality (ina broader sense) mentioned at the beginning of this section, and ϕ(x, y) isassumed to be written as Fx ' y.

Theorem 5.5.8 (Inductive Definability) Let m be a minimal successorfunction with startpoint a. Then there exists a unique function f , such thatfor all x ∈ δm

– fa ' σ,

– f(mx) ' F (fx).

Proof. Let mx be the minimal restricted successor function with startpointa and endpoint x, such that mx ⊆•m. Define for every x ∈ δm the functionfx with δfx ' {a} if x = a and δfx ' ∆mx otherwise, such that

fxa ' σ, (1)

∀u <m x(fx(mu) ' F (fxu)). (2)

By induction it is easy to prove that for every x ∈ δm there exists a uniquefx such that (1) and (2) hold:

– for the basic case, define

fa ' λ(a7→σ).

Since δma ' ∅, fa trivially fulfills (1) and (2), and fa is unique.

– for the induction step, define

fmx ' fx[mx7→F (fxx)],

then fmx fulfills (1) and (2), and fmx is unique (straightforward).

26

Now define the function f such that for all x ∈ δm

fx ' fxx,

then f exists and is unique, and we have

fa ' faa

' σ,

and

f(mx) ' fmx(mx)' F (fxx)' F (fx),

which was to be proved. �

An application of Theorem 5.5.8 is, that for a minimal successor functionthere exists a counting function c. We have the following definition andcorollary.

Definition 5.5.9 Let m be a minimal successor function with startpoint a.A function c is a counting function of m, denoted as CF (c,m), iff

– ca ' 0,

– ∀x ∈ ∆m(c(mx) ' (cx)[cx7→cx]), �

Corollary 5.5.10 For every minimal successor function there exists a uniquecounting function c.

Proof. By Theorem 5.5.8. �

It is clear that cx is a natural number for all x ∈ ∆m (cf. Defini-tion 5.3.1), and (cx)[cx7→cx] denotes the successor of cx. So c counts howmany applications of m, starting at the startpoint of m, are needed to reachx.

As a consequence, the natural numbers are the elements of the range ofsuch a counting function c. So now we can formally define a predicate whichindicates that some object is a natural number (cf. Section 5.3).

Definition 5.5.11 An object x is a natural number , denoted as Nat(x), iff

∃m, c(MSF (m) ∧ CF (c,m) ∧ x ∈ ρc). �

27

Theorem 5.5.12 The set N of the natural numbers and the ordinary suc-cessor function sN for the natural numbers both exist and are unique.

Proof. Simply take ρc for N, and define sN by

sNn ' n′ ↔ n ∈ N ∧ n′ ' n[n7→n]. �

Without proof we mention that sN is a minimal successor function so wemay apply mathematical induction on the natural numbers. Furthermore,the natural numbers, together with their successor function, form a modelof Peano Arithmetic. We will take this model as the canonical one.

5.6 n-ary functions

Up to now we restricted ourselves to arbitrary one-place functions. However,it is easy to model many-place functions as specific one-place functions. Toreach this goal, we will first define n-tuples.

Definition 5.6.1 (Ordered n-tuple) an ordered n-tuple 〈x0, x1, . . . , xn−1〉is the function

λ(07→x0,17→x1, . . . , (n−1)7→xn−1). �

Note, that that the empty tuple 〈〉 ' λ(). Note also, that a natural numbern is the ordered n-tuple of all its predecessors in increasing order, and thatδn (and also ρn) is the set of all predecessors of n.

Theorem 5.6.2 For all sets a0, a1, . . . , an−1 the set c of all ordered n-tuples〈x0, x1, . . . , xn−1〉, with xi ∈ ai (i = 0, 1, . . . , n−1), exists.

As usual, the set c is denoted as a0×a1×· · ·×an−1, and called the cartesianproduct of a0, a1, . . . , an−1. Note, that there is a difference between e.g.(a0 × a1) × a2 and a0 × a1 × a2, which are understood to contain elementsof the form 〈〈x0, x1〉, x2〉 and 〈x0, x1, x2〉 respectively.Proof. By Lemma 5.4.12 there exists the set

{0,1, . . . , n−1} →⋃{a0, a1, . . . , an−1},

containing all n-tuples whose elements are in a0 or in a1 or in · · · or in an−1.By Corollary 5.3.6 we can separate from this set the subset containing onlythose ordered n-tuples f , such that

f0 ∈ a0, f1 ∈ a1, . . . , f(n−1) ∈ an−1.

28

This set is the required set c. �

Definition 5.6.3 (n-ary Function) A function f is an n-ary function, iffall arguments of f are of the form

〈x0, x1, . . . , xn−1〉. �

As usual, we will speak of nullary, unary, binary, etc, when n = 0, 1, 2respectively. Note, that the empty function is nullary, unary, binary, etc.

According to this definition, a nullary function is a function whose ar-guments are all of the form 〈〉. However, there is only one empty tuple, soa nullary function is a function with the empty tuple as its only argument.

All arguments of unary functions are of the form 〈x〉, so unary functionsare different from our basic concept of one-place functions. All functions inthe universe are one-place, not all functions are unary. On the other hand,for every one-place function f there exists a unary function f ′, such that forall x

f ′〈x〉 ' fx,

as can be proved easily. Also, the collection of all unary functions, togetherwith a suitably adapted relation of application, forms a model of the theorypresented here.

By Definition 5.6.3 an n-ary function f is a one-place function whosearguments are ordered n-tuples, and so the elements of such arguments off are themselves not arguments of f (at least, in general they are not);they are values of the arguments of f . We will call an element xi of anargument of an n-ary function f a quasi argument of f , and say that f isquasi applicable to xi, or xi belongs to the quasi domain of f .

Definition 5.6.4 x is a quasi argument of an n-ary function f , denoted asQD(f, x), iff there exist x0, x1, . . . , xn−1 such that

〈x0, x1, . . . , xn−1〉 ∈ δf ∧ (x = x0 ∨ x = x1 ∨ · · · ∨ x = xn−1). �

In the same way as we defined the field of a function f , we will define thequasi field of an n-ary function f .

Definition 5.6.5 x belongs to the quasi field of an n-ary function f , de-noted as QFld(f, x), iff

QD(f, x) ∨R(f, x). �

29

Theorem 5.6.6 For every n-ary function f there exist sets a and b, calledthe quasi domain and quasi field of f , such that for all x

x ∈ a ↔ QD(f, x),x ∈ b ↔ QFld(f, x).

The quasi domain of an n-ary function f is denoted as δqf , the quasi fieldas ∆qf .Proof. The set

⋃{ρx | x ∈ δf} is the required quasi domain. The quasi

field is simply the union of the quasi domain of f and the range of f . �

In the remaining part of this section, we will indicate how to simulatenon-strict functions (cf. Section 4). To do so, we need a generalization ofthe notion of ordered n-tuple.

Definition 5.6.7 (Partial n-tuple) A partial n-tuple is a function p, suchthat Source(p, δn). �

That is, a partial n-tuple is a function which is partial on {0,1, . . . , n−1}.So a partial n-tuple can be viewed as an n-tuple that may contain holes init. For example, we can denote a partial 6-tuple λ(17→x,37→y) as 〈, x, , y, , 〉.Since

λ(17→x,37→y) ' λ(07→⊥,17→x,27→⊥,37→y,47→⊥,57→⊥),

we may also write this partial 6-tuple as 〈⊥, x,⊥, y,⊥,⊥〉. Note, that anordered n-tuple is a partial n-tuple. Note also, that a partial n-tuple is apartial m-tuple, if m ≥ n.

Now we can define non-strict n-ary functions.

Definition 5.6.8 (Non-Strict n-ary Function) A function f is a non-strict n-ary function, iff all arguments of f are partial n-tuples. �

Note, that ordinary n-ary functions are also non-strict n-ary. In order toavoid misunderstanding, we should require that a non-strict function hasin its domain at least one tuple containing holes. Then ordinary n-aryfunctions can be called strict . As we already remarked, taken as one-place,all functions are strict (cf. Section 3).

As an example, consider a non-strict unary function f , such that 〈〉 ∈ δf ,and suppose σ is a non-existing term. Then 〈σ〉 ' 〈〉, and thus f〈σ〉 exists,which is in accordance with the common meaning of non-strict functions.

Within this paper we will not elaborate non-strict functions any further.

30

5.7 Currying

A well-known operation on n-ary functions is the operation of currying.Usually, it is introduced in a more or less intuitive way, and only for to-tal functions (cf. e.g. Curry and Feys (1958), Stoy (1977), Schmidt(1986)). Here we work in a general framework of partial functions, and wewill define currying in a precise way. Furthermore, we will show that thereexists a unique curried version of every n-ary function.

We start with the basic definition.

Definition 5.7.1 (Currying) A function g is a curried version of an n-aryfunction f , iff for every x0, . . . , xn−1

gx0 · · ·xn−1 ' f〈x0, . . . , xn−1〉. �

However, this definition gives us a problem with respect to the unicity of g.This problem is due to the fact that we are working in a general frameworkof partial functions. We will illustrate it with an example.

Suppose f is a ternary function, and

δf ' {〈a0, b0, c0〉, 〈a0, b1, c1〉, 〈a1, b1, c1〉}.

The curried version g of f must be such, that for all a, b, c we have

gabc ' f〈a, b, c〉,

i.e., g gives “ultimately” the same result as f . Notice however, that for onlythree triples of values of a, b and c this result will be defined.

The problem now is that, e.g., ga1b0 can be undefined, but it can alsobe the empty function, since in both cases ga1b0c0 and ga1b0c1 will be un-defined. In fact, the same problem already occurs with g itself: g can bedefined for some a ( 6= a0, a1), as long as gabc is undefined for every b and c.So, if we want to guarantee the unicity of g, we have to strengthen curryingwith some “minimality property”.

Definition 5.7.2 A curried version g of an n-ary function f is minimal, ifffor all x0, . . . , xi−1 (i = 0, 1, . . . , n−1), and for all x

x ∈ δ(gx0 · · ·xi−1)↔↔ ∃xi+1, . . . , xn−1(〈x0, . . . , xi−1, x, xi+1, . . . , xn−1〉 ∈ δf). �

In this definition it is assumed that δ is strict (as agreed in Section 5.3),i.e., δ(gx0 . . . xn−1) ' ⊥ whenever gx0 . . . xn−1 ' ⊥. Remember also, thatδ(gx0 . . . xn−1) ' ⊥ implies that x ∈ δ(gx0 . . . xn−1) is false for every x.

31

We will speak of the curried version g of an n-ary function f , denoted asΓf , when g has the minimality property. We have the following theorem.

Theorem 5.7.3 (Currying) For every n-ary function f a unique functiong exists, such that g is the curried version of f .

Remark. The definition of currying is in fact a definition scheme, i.e., forevery n we have a different definition of currying. The same holds for thistheorem, and in fact it already holds for the definition of n-ary functions.So we have a sequence of predicates 0-ary, 1-ary, 2-ary, etc. To prove thetheorem, we will prove it for 0-ary, 1-ary and 2-ary functions, and thenproceed by induction on the arity of f .

In order to see that this form of induction fits with Theorem 5.5.3, we firstdefine the binary predicate Ary(f, n), meaning that f is an n-ary function,as follows:

Ary(f, n)↔ Nat(n) ∧ ∀x ∈ δf∀i(i ∈ δx↔ i ∈ δn).

Then we must prove (by mathematical induction) the formula

∀f(Ary(f, n)→ ∃!g(g ' Γf)),

which is precisely what we described above.

Proof of Theorem 5.7.3. Let f be a nullary function, i.e., the only argumentof f is the empty tuple 〈〉. Suppose f〈〉 ' g. It is immediately clear, thatg is uniquely determined, and that g ultimately (after zero applications)gives the same result as f , when f is applied to the tuple of zero arguments.Furthermore, the minimality property becomes void, so it is trivially fulfilled.So, Γf ' g when f is a nullary function2.

Next, let f be a unary function. Define g, such that for all x

gx ' f〈x〉.

It is routine to check that g is the unique curried version of f .

Now let f be a binary function. Define the set a as follows:

a ' {x | ∃x1(〈x, x1〉 ∈ δf)}.2Often nullary functions are considered as constants. If we want to do that in the

framework of the theory presented here, it is more correct to take the curried version ofa nullary function.

32

Define for every x a function gx such that gxx1 ' f〈x, x1〉. Trivially, gxexists and is uniquely determined. Note, that gx ' λ() iff x 6∈ a.

Finally, define the function g such that gx ' gx for every x ∈ a, and g isundefined for every other x. Here too, it is easy to prove that g exists, andthat g is uniquely determined.

To conclude that g is the curried version of f , we remark that g ultimatelygives the same result as f , i.e., for every x0, x1

gx0x1 ' f〈x0, x1〉,

and, furthermore, g has the minimality property, i.e., for every x0, x

– x ∈ δg ↔ ∃x1(〈x, x1〉 ∈ δf),

– x ∈ δ(gx0)↔ 〈x0, x〉 ∈ δf .

So, g is the unique curried version of f , i.e., Γf ' g.

Now suppose that the curried version of every n-ary function exists, andlet f be an (n+1)-ary function. Define the binary function f ′, such that allfirst quasi arguments of f ′ are n−tuples, and such that

f ′〈〈x0, . . . , xn−1〉, xn〉 ' f〈x0, · · · , xn−1, xn〉.

It is easy to prove that f ′ exists and is uniquely determined. Since f ′ isbinary, we have

Γf ′〈x0, . . . , xn−1〉xn ' f ′〈〈x0, . . . , xn−1〉, xn〉,

where Γf ′ is an n-ary function. So, by the induction hypothesis Γ(Γf ′)exists and is unique. Again, to conclude that Γ(Γf ′) is the curried versionof f , we must show that Γ(Γf ′) ultimately gives the same result as f , andsecondly, that Γ(Γf ′) has the minimality property. With respect to the firstrequirement, notice that for all x0, . . . , xn−1, xn we have

Γ(Γf ′)x0 · · ·xn−1xn ' Γf ′〈x0, . . . , xn−1〉xn' f ′〈〈x0, . . . , xn−1〉, xn〉' f〈x0, . . . , xn−1, xn〉.

With respect to the second requirement, notice that for all x, and for allx0, . . . , xi−1 (i = 0, 1, . . . , n−1) we have

x ∈ δ(Γ(Γf ′)x0 · · ·xi−1)↔↔ ∃xi+1, . . . , xn−1(〈x0, . . . , xi−1, x, xi+1, . . . , xn−1〉 ∈ δ(Γf ′))↔ ∃xi+1, . . . , xn−1∃xn(〈〈x0, . . . , xi−1, x, xi+1, . . . , xn−1〉, xn〉 ∈ δf ′)↔ ∃xi+1, . . . , xn−1, xn(〈x0, . . . , xi−1, x, xi+1, . . . , xn−1, xn〉 ∈ δf).

33

Finally, for all x0, . . . , xn−1 (i.e., the case that i = n), and for all x we have

x ∈ δ(Γ(Γf ′)x0 · · ·xn−1) ↔ x ∈ δ(Γf ′〈x0, . . . , xn−1〉)↔ 〈〈x0, . . . , xn−1〉, x〉 ∈ δf ′

↔ 〈x0, . . . , xn−1, x〉 ∈ δf

Thus Γ(Γf ′) fulfills the requirements of Definitions 5.7.1 and 5.7.2 and soΓf ' Γ(Γf ′). �

Note, that the empty function has any arity whatsoever, so it is not apriori clear what it means to curry λ(). However, whatever arity we considerλ() to have, we always have Γ(λ()) ' λ().

On the other hand, this suggests that currying depends on our way oflooking at a function, that is to say, that we might consider an arbitraryfunction as being partly 0-ary, 1-ary, etc. The operation of currying canbe generalized towards arbitrary functions, by indicating which 0-ary, 1-ary,etc. part of a function must be curried.

Often, also the inverse operation of uncurrying a function f is intro-duced, but if we want to do that in the present general framework of partialfunctions, we have to indicate the arity of the uncurried version of f . Fi-nally, when we want to curry non-strict n-ary functions, we have to changethe definition of currying a little.

We don’t examine these aspects of currying here.

5.8 The Axiom of Universality

Up to now, all axioms were rather simple. The next axiom is called theAxiom of Universality , and is a more complicated one. It is concerned withfunctions that are often felt to be counter intuitive in some sense, e.g., likefunctions that can be applied to themselves (i.e., ff exists), or like functionswhich can be mutually applied to each other (i.e., fx and xf both exist). Asanother example, there can exist functions a, b and c such that ab ' c, bc ' aand ca ' b.

From a point of view of generality, we choose to allow for the existenceof “funny” functions like these. The Axiom of Universality is a generalformulation of our choice. In order to introduce it, we start with an example.Suppose we want to prove the existence of the following functions:

a ' λ(a7→a, b7→b, c7→c)b ' λ(a7→b, b7→a)c ' λ(c7→c)

34

Note, that these functions need not be unique, there can be different func-tions answering the same pattern. They are, so to speak, defined up toisomorphism.

Instead of proving the existence of a, b and c, we may equally well provethe existence of a binary function g, which reflects the same pattern, andwhich must be an apply function, i.e., g〈x, y〉 ' xy for all x ∈ ∆qg, and forall y. In other words, when we formulate g using infix notation, we get theordinary quasi operation of application restricted to ∆qg.

In our example we have

g ' λ(〈a, a〉7→a, 〈a, b〉7→b, 〈a, c〉7→c,〈b, a〉7→b, 〈b, b〉7→a,〈c, c〉7→c).

Apart from the strangeness of a, b and c there is nothing strange about gitself. So if we take three ordinary (i.e., non-funny) functions a′, b′ and c′

instead of a, b and c respectively, we have a perfectly non-strange functionf , where

f ' λ(〈a′, a′〉7→a′, 〈a′, b′〉7→b′, 〈a′, c′〉7→c′,〈b′, a′〉7→b′, 〈b′, b′〉7→a′,〈c′, c′〉7→c′).

We say that f is isomorphic to g. Using infix notation, we can denote f as•f , so we get a′•fa′ ' a′, a′•fb′ ' b′, etc.

The Axiom of Universality now proceeds in the opposite direction. Intu-itively it says: given f , there exists an apply function g, which is isomorphicto f . As a consequence, all funny functions which are elements of the quasifield of g (a, b, c in the example) also exist.

There is still one remark to make: f has to meet an additional require-ment (called internal extensionality), in order to answer the Axiom of Ex-tensionality. To show why, suppose that we extend the definition of f withan extra line, such that we get

f ′ ' f [ 〈d′, a′〉7→b′, 〈d′, b′〉7→a′ ]

' λ ( 〈a′, a′〉7→a′, 〈a′, b′〉7→b′, 〈a′, c′〉7→c′,〈b′, a′〉7→b′, 〈b′, b′〉7→a′,〈c′, c′〉7→c′,〈d′, a′〉7→b′, 〈d′, b′〉7→a′ ).

35

Then we have

f ′〈d′, a′〉 ' f ′〈b′, a′〉,f ′〈d′, b′〉 ' f ′〈b′, b′〉.

Furthermore, for all x 6= a′, b′ we have

f ′〈d′, x〉 ' f ′〈b′, x〉 (' ⊥).

The consequence of this would be that, because of g, there would exist afunction d, such that da ' ba, db ' bb and both d and b would be undefinedfor all other x. Thus, by the Axiom of Extensionality, d = b. In order tomaintain isomorphism between f and g this extension of f is not allowed.

More precisely, we have the following definitions.

Definition 5.8.1 A binary function f is internally extensional , denoted asIExt(f), iff for all x1, x2 ∈ ∆qf

∀y(f〈x1, y〉 ' f〈x2, y〉)→ x1 = x2. �

Definition 5.8.2 Two n-ary functions f and g are isomorphic, denoted asf ∼ g, iff there is a function h such that

– Inj (h),

– δh ' ∆qf,

– ρh ' ∆qg,

– ∀x0, . . . , xn−1(h(f〈x0, . . . , xn−1〉) ' g〈hx0, . . . , hxn−1〉). �

Definition 5.8.3 A binary function f is called an apply function, denotedas AF (f), iff

∀a ∈ ∆qf∀x(f〈a, x〉 ' ax). �

The Axiom of Universality can now be formulated as follows.

Axiom 7 (Universality) ∀f(IExt(f)→ ∃g(g ∼ f ∧AF (g))). �

In other words, the property of internal extensionality says that f can beinterpreted as some operation of application, restricted to a certain set (thequasi field of f). Then by the Axiom of Universality, given such an f , thereis a set (the quasi domain of g), such that the “real” (pseudo) operation

36

of application (restricted to this set) is isomorphic to f . But since g isan apply function, this restriction of the pseudo operation of applicationcoincides with g itself.

From Axiom 7 and the introducing example it is immediately clear, thatthere exist various sorts of funny functions, like functions that can be appliedto themselves, functions that can be applied to each other, etc. For example,there exist functions like f ' λ(f 7→f), or f1 ' λ(f2 7→f1) and f2 ' λ(f1 7→f2),etc.. As a more interesting example we mention that there exists a modelof the untyped, extensional lambda calculus. Such a model is natural in thesense that all objects of it are functions, and application in the model is justordinary function application.

Theorem 5.8.4 There exists a set d of functions which, together with ordi-nary function application, forms a model for the untyped, extensional lambdacalculus.

Remarks. We only give a sketch of the proof, since a detailed proof liesoutside the scope of this paper. When we use the noun “term”, terms in thesense of the lambda calculus are meant.

For a proof of the existence of a lambda calculus model in non-well-founded set theory, cf. Von Rimscha (1980).Sketch of the proof. The terms from the untyped lambda calculus can becoded as natural numbers by means of some standard technique of Godelnumbering. Let G be the set of all Godel numbers dXe where X is a term.We can define an equivalence relation ∼= on G as follows:

dXe ∼= dY e iff X =βη Y,

where =βη means βη-convertibility in the lambda calculus. Then a modelfor the lambda calculus can be constructed along the same lines as a termmodel (cf. Hindley and Seldin (1986), pages 116–117), using equivalenceclasses of Godel numbers instead of equivalence classes of terms. Withinthis model a binary operation of application •λ exists (modeling applicationfrom the lambda calculus), so there exists a binary function f such that

f〈x, y〉 ' x•λy,

where x and y are equivalence classes under ∼= of elements of G. Since weconsider the extensional lambda calculus, f is internally extensional. Thus,according to Axiom 7 there exists an apply function g which is isomorphicto f . That is, the quasi field of g with ordinary application is a model forthe untyped, extensional lambda calculus. So d ' ∆qg. �

Notice that d ⊆ d→ d.

37

5.9 The Axiom of Choice

Now we come to the final axiom of the theory, which is a simple functionalform of the Axiom of Choice. Since in the literature (cf. Rubin and Rubin(1963)) the Axiom of Choice is examined extensively, we don’t give anyapplications of it, but merely mention it.

Axiom 8 (Axiom of Choice) ∀f∃g(f ◦ g ◦ f ' f). �

Notice the concision and elegancy of this formulation of the Axiom of Choicein comparison with many other formulations of the Axiom of Choice. It wasmentioned in Arbib and Manes (1975), page 9.

5.10 List of Axioms

We conclude the description of the theory of partial functions by giving acomplete list of all the axioms.

Axiom 1 (Uniqueness) (cf. Section 5.1)

∀f, x, y1, y2(fx ' y1 ∧ fx ' y2 → y1 = y2).

Axiom 2 (Extensionality) (cf. Section 5.1)

∀f, g(∀x(fx ' gx)→ f = g).

Axiom 3 (Restricted Comprehension) (cf. Section 5.1)

∀f∃g∀x, y(gx ' y ↔ Fld(f, x) ∧ ϕ(x, y)),

where ϕ(x, y) is a functional formula, not containing g free.

Axiom 4 (Infinity) (cf. Section 5.2)

∃f(SF (f)).

Axiom 5 (Lower Order) (cf. Section 5.4)

∀f∃g∀x(x ∈ δg ↔ ∃u(u ∈ δf ∧ x ∈ δu)).

Axiom 6 (Higher Order) (cf. Section 5.4)

∀f∃g∀x(x ∈ δg ↔ (δx ' δf ∧ ρx ' ρf)).

Axiom 7 (Universality) (cf. Section 5.8)

∀f(IExt(f)→ ∃g(g ∼ f ∧AF (g))).

38

Axiom 8 (Axiom of Choice) (cf. Section 5.9)

∀f∃g(f ◦ g ◦ f ' f).

From now on, we will call the theory consisting of Axioms 1 to 8: AxiomaticFunction Theory, abbreviated as AFT.

6 Consistency

In order to prove the (relative) consistency of AFT, we will construct aninner model in ZF-Set Theory including the Axiom of Choice. For simplicitywe use a variant of ZF-Set Theory without the Axiom of Foundation, butincluding an axiom which asserts the existence of non-well-founded sets, forwhich we choose the so called Axiom of Universality. The resulting variant ofZF-Set Theory is denoted as ZFC◦U, and known to be relatively consistentwith respect to ZFC, i.e., ZF-Set Theory including the Axiom of Foundationand the Axiom of Choice (cf. Boffa (1969), Von Rimscha (1980, 1981),Aczel (1988)).

The framework of this section is set theoretical, and all concepts (likenatural number, ordered n-tuple, set, function) are to be understood in thesense of set theory, and must not be confused with the same concepts fromthe other sections of this paper, where they are meant in the sense of AFT.

As usual, an ordered pair 〈x, y〉 is defined as the set {{x}, {x, y}}. Afunction from a to b is a relation between a and b answering the propertyof uniqueness, i.e., a function is a specific set of ordered pairs. If a set a is afunction, this is denoted as Fnc(a). Note, that if a and b are two functions,then a ⊆• b ↔ a ⊆ b. The domain and range of a function f are denoted asδf and ρf respectively. Clearly, the ternary relation of application (in infixnotation, cf. Section 3) is defined as

ax ' y ↔ Fnc(a) ∧ 〈x, y〉 ∈ a.

By Tr(a) we denote, that a set a is transitive; TC (a) denotes the transitiveclosure of a.

We continue with some basic definitions, which are not completely stan-dard.

Definition 6.1 (Field) The field ∆a of a set a is the set

{x | ∃y(〈x, y〉 ∈ a ∨ 〈y, x〉 ∈ a)}. �

39

Definition 6.2 (Structure) A structure is an ordered pair 〈a, r〉 where ais a set and r a relation on a, i.e., r ⊆ a× a. �

Definition 6.3 Let s be a structure 〈a, r〉. Then s is extensional , denotedas Es(s), iff for all x, y, z ∈ a

(〈x, y〉 ∈ r ∧ 〈x, z〉 ∈ r)→ y = z. �

Definition 6.4 The internal ε-structure ε(t) of a set t is the structure 〈t, e〉,where e denotes the set

{〈x, y〉 | x, y ∈ t ∧ x ∈ y}. �

Isomorphism of two structures s and t is defined as obvious, and denoted ass ∼ t.

In set theory the problem of undefinedness also arises naturally, e.g. asin {x | ∃y(x ∈ y)}. It is common practice in set theory to handle thisproblem in an intuitive way, by agreeing that non-existing terms are notused. For example, {x | x = x} is not used, thus avoiding meaninglessformulas like {x | x = x} = a. To handle undefinedness within the settheoretical language, we can apply the same technique of pseudo terms andpredicate schemes as described in Section 3, where {x | ϕ} is considered tobe a pseudo term. The four notational conventions (cf. Section 3) remainthe same, except for convention 2 which has to be replaced by convention 2′

(σ, τ stand for pseudo terms; x, y stand for variables not occurring in σ, τ):

σ ∈ τ means ∃x, y(σ ' x ∧ τ ' y ∧ x ∈ y). (2′)

Notice, that because of these conventions, ∈ becomes a predicate scheme(though we will just write ∈, instead of ∈∗, cf. Section 3). Though we willnot need it, ⊥ can be considered as a shorthand notation for e.g. {x | x = x}.We will not further elaborate the technique of pseudo terms for set theoryhere.

As already remarked, there are set theoretical and function theoreticaldefinitions of many notions. In the consistency proof we will need bothdefinitions of several of these notions. Since the framework of this section isset theoretical, we will denote a notion by its standard formulation, as forexample 0, 1, x ∈ a, 〈x, y〉, when the set theoretical definition is meant.When a notion is meant in the sense of AFT, we will place a dot above it,as in

.0,

.1, x

.∈ a,

.〈x, y

.〉 . However, also the function theoretic definitions

40

must be understood in set theoretical terms, where a function is a specificset of ordered pairs. For example, we have

0 ' ∅ , 1 ' {∅},.0' ∅ ,

.1' {〈∅, ∅〉},

and

〈a, b〉 ' {{a}, {a, b}},.〈a, b

.〉 ' {〈x, y〉 | (x '

.0 ∧ y = a) ∨ (x '

.1 ∧ y = b)}.

Note, that the following equivalences hold (f and g denote functions in theset theoretical sense):

x.∈

.δf ↔ x ∈ δf,

x.∈ .ρf ↔ x ∈ ρf,

x.∈

.∆f ↔ x ∈ ∆f,

f.∼ g ↔ 〈∆f, f〉 ∼ 〈∆g, g〉.

In the last equivalence, f and g are supposed to be binary functions.

We continue with formulating the axioms of ZFC◦U.

Axiom ZF-1 (Extensionality) ∀a, b(∀x(x ∈ a↔ x ∈ b)→ a = b).

Axiom ZF-2 (Replacement) ∀a∃b∀y(y ∈ b↔ ∃x(x ∈ a ∧ ϕ(x, y))),where ϕ(x, y) is a functional formula.

Axiom ZF-3 (Infinity) ∃a 6' ∅∀x ∈ a∃y ∈ a(x ⊆ y ∧ x 6= y).

Axiom ZF-4 (Sum Set) ∀a∃b∀x(x ∈ b↔ ∃y(x ∈ y ∧ y ∈ a)).

Axiom ZF-5 (Power Set) ∀a∃b∀x(x ∈ b↔ x ⊆ a).

Axiom ZF-6 (Universality) ∀s(Es(s)→ ∃t(Tr(t) ∧ ε(t) ∼ s)).

Axiom ZF-7 (Choice) ∀a(∀x, y ∈ a(x = y ∨ x ∩ y ' ∅)→→ ∃b∀x ∈ a(x 6' ∅ → ∃y(b ∩ x ' {y}))).

41

When we compare these set theoretical axioms with the axioms of AFT,which were formulated in Section 5, there appears to be a striking similaritybetween AFT and ZFC◦U. All axioms but one correspond rather directlyto each other, e.g., the AFT-Axiom Scheme of Restricted Comprehensioncorresponds to the ZF-Axiom Scheme of Replacement, the AFT-Axioms ofLower and Higher Order correspond to the Sum Set and Power Set Axiomfrom ZFC◦U respectively. The only axiom that is present in AFT, but notin ZFC◦U, is the Axiom of Uniqueness.

The elegancy of some of the corresponding axioms differs a little, andit is remarkable that some of the less elegant axioms of AFT (to our tastethe Axioms of Lower and Higher Order) became a bit more elegant by usingsets. The same holds for ZFC◦U, where the Axioms of Infinity and Choicecan be made more elegant by using functions.

Of course, as an explanation of the similarities between AFT and ZFC◦U,we must say that ZF-Set Theory was an important source of inspiration dur-ing the development of AFT. Nevertheless, it is surprising that the conceptof function can be axiomatized in almost the same way as the concept ofset, such that the resulting theory about functions is equi-consistent withand equally powerful as the corresponding theory about sets. In the remain-ing part of this section and in the next section we will prove that fact byinterpreting AFT within ZFC◦U and vice versa.

We continue with some definitions, in order to construct within ZFC◦Ua model for AFT.

Definition 6.5 (Applicatively Closed) A set a is applicatively closed iff

∀x(x ∈ a→ ∆x ⊆ a). �

Definition 6.6 (Applicative Closure) The applicative closure of a set ais the smallest applicatively closed set b such that ∆a ⊆ b. �

The applicative closure of a is denoted as ∆ca. Note, that ∆ca exists forevery set a, since ∆ca ⊆ TC (a).

Definition 6.7 (Pure Function) A function f is a pure function, denotedas PF (f), iff ∀x(x ∈ ∆cf → Fnc(x)). �

Note, that ∅ is a pure function, the empty function. If f is a pure function,then all arguments and values of f are also pure functions.

The next theorem states the consistency of AFT.

Theorem 6.8 (Consistency of AFT) The collection of pure functions,together with the ternary relation of application, is a model of AFT.

42

The proof of this theorem consists of eight lemmas, one for every axiomof AFT. Clearly, all formulas are to be relativized to the collection of purefunctions, together with the relation of application, i.e., all quantifiers rangeover pure functions.

Lemma 6.9 (Axiom AFT-1) ∀f, x, y1, y2(fx ' y1 ∧ fx ' y2 → y1 = y2).

Proof. Uniqueness of pure functions follows immediately from the definitionof functions in set theory. �

Lemma 6.10 (Axiom AFT-2) ∀f, g(∀x(fx ' gx)→ f = g).

Proof. Similarly, extensionality of pure functions follows immediately fromextensionality of sets. �

Lemma 6.11 (Axiom AFT-3) ∀f∃g∀x, y(gx ' y ↔ x ∈ ∆f ∧ ϕ(x, y)).

Proof. Let f be a pure function, and ϕ(x, y) a functional formula which canbe expressed using the ternary relation of application as its only non-logicalpredicate. Clearly, the set g ' {〈x, y〉 | x ∈ ∆f ∧ PF (y) ∧ ϕ(x, y)} exists.This is the pure function, whose existence is required. �

Lemma 6.12 (Axiom AFT-4) ∃f(SF (f)).

Proof. Define the function g, with domain the set ω of all natural numbers,recursively as follows:

g0 ' ∅,g(sNα) ' {〈gα, gα〉},

where sN is the usual set theoretical successor function for ordinal numbers,i.e., sNα ' α ∪ {α}.

Then gα is a pure function for every natural number α. Now define thefunction f , with the range of g as its domain, such that fx ' {〈x, x〉}. Thenf is a pure function, and, as can be checked easily, f is a successor functionin the sense of AFT. �

Lemma 6.13 (Axiom AFT-5) ∀f∃g∀x(x ∈ δg ↔ ∃u(u ∈ δf ∧ x ∈ δu)).

Proof. Let f be a pure function. Define

g ' {〈x, x〉 | ∃u(u ∈ δf ∧ x ∈ δu)}.

Then g is a pure function, defined for all arguments of all arguments of f .The fact that g is the identity function on

⋃{δu | u ∈ δf} does not matter,

it just makes the definition of g easy. �

43

Lemma 6.14 (Axiom AFT-6) ∀f∃g∀x(x ∈ δg ↔ (δx ' δf ∧ ρx ' ρf)).

Proof. Let f be a pure function. Define the set

t ' {x ∈ P(δf × ρf) | Fnc(x) ∧ δx ' δf ∧ ρx ' ρf}.

All elements of t are pure functions. Now the existence of a pure function g,such that δg ' t, follows immediately. �

Lemma 6.15 (Axiom AFT-7) ∀f(.

IExt(f)→ ∃g(g .∼ f ∧.

AF(g))).

Proof. Suppose f is a pure function. Suppose also, that f is a binaryfunction in the sense of AFT, i.e., all arguments of f are of the form

.〈x, y

.〉 .

Let

Dqf ' {x | ∃y, z(〈.〈x, y

.〉 , z〉 ∈ f ∨ 〈

.〈y, x

.〉 , z〉 ∈ f ∨ 〈

.〈y, z

.〉 , x〉 ∈ f)}.

Now suppose, that f is internally extensional in the sense of AFT, i.e., forall x1, x2 ∈ Dqf

∀y, z(〈.〈x1, y

.〉 , z〉 ∈ f ↔ 〈

.〈x2, y

.〉 , z〉 ∈ f)→ x1 = x2.

Define for all x ∈ Dqf the set x′ ' {x,2}, where 2 ' {∅, {∅}}, and definethe set

a ' {x′ | x ∈ Dqf}⋃{{y′}, {y′, z′}, 〈y′, z′〉 | ∃u(〈

.〈u, y

.〉 , z〉 ∈ f)}.

Then for all x ∈ Dqf and for all u ∈ a we have u 6∈ x′.Next, define

r ' {〈u, v〉 | u, v ∈ a ∧ u ∈ v}⋃{〈〈y′, z′〉, x′〉 | 〈

.〈x, y

.〉 , z〉 ∈ f},

then the structure 〈a, r〉 is an extensional structure. Thus, by Axiom ZF-6there exists an isomorphism h with domain a, such that

– {hx | x ∈ a} is a transitive set,

– hu ∈ hv ↔ 〈u, v〉 ∈ r.

Now define the binary function (in the sense of AFT)

g ' {〈.〈hx′, hy′

.〉 , hz′〉 | 〈

.〈x, y

.〉 , z〉 ∈ f},

then g is a pure function such that g .∼ f and furthermore, g is an applyfunction in the sense of AFT (straightforward). �

44

Lemma 6.16 (Axiom AFT-8) ∀f∃g(f ◦ g ◦ f ' f).

Proof. We start with a well-known equivalent of the Axiom of Choice (cf.Rubin and Rubin (1963), Van Dalen et al (1978)):

∀f∃h(h⊆• f ∧ ρf ' ρh ∧ Inj (h)).

Given that f is a pure function it is immediately clear that h is a purefunction too. Now let h−1 be the inverse function of h, i.e., h−1y ' x iffhx ' y, as usual. By Axiom 3 h−1 exists when h is injective. Now it isstraightforward that f ◦ h−1 ◦ f ' f. �

7 A Model for Set Theory

In this section we will construct within AFT a model for ZFC◦U. The frame-work of this section is function theoretic again, i.e., all concepts must beunderstood in the sense of AFT. If necessary, we will place a dot above thenotation of a concept, if that concept is meant in the set theoretical sense.So the situation in this section is precisely the opposite of the situation inSection 6.

We start with some definitions and theorems.

Definition 7.1 (Weakly Transitive) A function f is weakly transitive iff

∀x ∈ δf(δx ⊆ δf). �

As a motivation for the prefix “weakly” in Definition 7.1, we give the fol-lowing definition.

Definition 7.2 (Transitive) A function f is transitive, denoted as Tr(f),iff

∀x ∈ δf(x⊆• f). �

Note, that a transitive function is also weakly transitive. Note also, thatwhen f is a set, transitivity of functions coincides with transitivity of sets.

Definition 7.3 (Domain Closure) A set a is the domain closure of afunction f , denoted as δcf , iff a is the smallest weakly transitive set, suchthat δf ⊆ a. �

Theorem 7.4 For every function f , δcf exists.

45

Proof. Define inductively the function g, with the set of natural numbersas its domain, such that

– g0 ' δf,

– g(sNn) '⋃{δx | x ∈ gn},

where sN denotes the successor function for the natural numbers (cf. Sec-tion 5.5). Then

⋃(ρg) is the required set δcf. �

Now we come to the definition, that is central to the construction of amodel for ZFC◦U.

Definition 7.5 (Pure Set) A set a is a pure set, denoted as PS (a), iff

∀x ∈ δca(Set(x)). �

The next theorem asserts, that ZFC◦U can be embedded in AFT.

Theorem 7.6 (Embedding ZFC◦U) The collection of all pure sets, to-gether with the membership relation, is a model of ZFC◦U.

Within ZFC◦U well-founded sets can be defined in a well-known way, soZFC can also be embedded in AFT (cf. e.g. Fraenkel et al (1973),pages 93–94).

The proof of Theorem 7.6 consists of seven lemmas, one for each axiomof ZFC◦U. Here too, all formulas must be relativized to the collection ofpure sets.

Lemma 7.7 (Axiom ZF-1) ∀a, b(∀x(x ∈ a↔ x ∈ b)→ a = b).

Proof. Extensionality of pure sets follows immediately from extensionalityof functions. �

Lemma 7.8 (Axiom ZF-2) ∀a∃b∀y(y ∈ b↔ ∃x(x ∈ a ∧ ϕ(x, y))),where ϕ(x, y) is a functional formula which can be expressed using member-ship as its only non-logical predicate.

Proof. Let a be a pure set. By Axiom 3 there exists a function f such thatfor all x, y

fx ' y ↔ x ∈ a ∧ ϕ(x, y) ∧ PS (y).

Then ρf is the required pure set b. �

Lemma 7.9 (Axiom ZF-3) ∃a 6' ∅∀x ∈ a∃y ∈ a(x ⊆ y ∧ x 6= y).

46

Proof. Define inductively the function f , with domain the set of naturalnumbers (in the sense of AFT), such that

– f0 ' ∅,

– f(sNn) ' {fn}.

Then ρf is a non-empty pure set, such that for all x ∈ ρf there exists ay ∈ ρf with x ⊆ y and x 6= y. �

Lemma 7.10 (Axiom ZF-4) ∀a∃b∀x(x ∈ b↔ ∃y(x ∈ y ∧ y ∈ a)).

Proof. By Theorem 5.4.1 and the observation that⋃a is a pure set, when

a is a pure set. �

Lemma 7.11 (Axiom ZF-5) ∀a∃b∀x(x ∈ b↔ x ⊆ a).

Proof. By Theorem 5.4.6 and the observation that Pa is a pure set, whena is a pure set. �

Lemma 7.12 (Axiom ZF-6) ∀s(.

Es (s)→ ∃t(Tr(t)∧ .ε(t) .∼ s)).

Proof. Suppose s is an extensional structure in the sense of set theory, i.e.,s '

.〈 a, r

.〉 , where a and r are pure sets and r ⊆ a

.× a, such that for all

x, y ∈ a

∀u(.〈u, x

.〉 ∈ r ↔

.〈u, y

.〉 ∈ r)→ x = y.

Since s is extensional, there are two possibilities with respect to an “emptyset according to r”, i.e., either there is exactly one ∅r ∈ a such that

¬∃x(.〈x, ∅r

.〉 ∈ r),

or there is no such ∅r ∈ a. If there is not, then let ∅r denote some function,which is not a pure set, such that ∅r 6∈ a (e.g., ∅r ' 2).

Now let 1r 6' ∅r also be a function which is not a pure set, and definethe binary function

f ' λ(〈x, y〉7→1r |.〈y, x

.〉 ∈ r)[ 〈1r, ∅r〉7→∅r ],

then f is internally extensional in the sense of AFT. By Axiom 7 there existsan apply function g such that g ∼ f , i.e., there is an isomorphism h withdomain ∆qf and range ∆qg such that

h∅r ' λ() and h1r ' 1.

47

Furthermore, for all x, y ∈ a

hx ' λ(hy 7→1 |.〈y, x

.〉 ∈ r),

i.e., hx is a pure set and

hx ' {hy |.〈y, x

.〉 ∈ r}.

Now define the set

t ' {hx | x ∈ a},

then t is a transitive pure set and the internal ε-structure (in the sense ofset theory) of t is isomorphic (in the sense of set theory) with s. �

Lemma 7.13 (Axiom ZF-7) ∀a(∀x, y ∈ a(x = y ∨ x ∩ y ' ∅)→→ ∃b∀x ∈ a(x 6' ∅ → ∃y(b ∩ x ' {y}))).

Proof. Let the pure set f be a function in the sense of set theory. Definethe function (f ′) in the sense of AFT as follows:

f ′ ' λ(x7→y |.〈x, y

.〉 ∈ f).

By Axiom 8 there exists a function h, such that f ◦ h ◦ f ' f.Let g′ ' (h � ρf)−1, i.e., g′ is the inverse of h � ρf , then clearly

g′ ⊆• f ′ ∧ ρf ′ ' ρg′ ∧ Inj (g′).

Now define

g ' {.〈x, y

.〉 | g′x ' y},

then g is a pure set. Furthermore, g is a function in the sense of set theory,and we have

g ⊆ f ∧ .ρf ' .

ρg ∧ Inj (g).

This is a well-known equivalent of the Axiom of Choice, and we refer thereader to the literature (e.g. Rubin and Rubin (1963), Van Dalen et al(1978)). �

Acknowledgements. I wish to thank Henk Barendregt, Maarten Fokkingaand Gerrit van der Hoeven for their valuable criticism and advice. I alsothank the referee of Information and Computation for his (her) correctionsand suggestions.

References

48

Aczel, P. (1988), Non-Well-Founded Sets, CSLI, Stanford.

Arbib, M.A. and E.G. Manes (1975), Arrows, Structures, and Functors– The Categorical Imperative, Academic Press, New York.

Barendregt, H.P. (1984), The Lambda Calculus – Its Syntax and Seman-tics (revised edition), North-Holland, Amsterdam.

Barendregt, H.P. and K. Hemerik (1990), Types in lambda calculi andprogramming languages, in: “ESOP ’90” (N. Jones, Ed.), LNCS 432,Springer, Berlin, 1 – 35.

Barendregt, H.P. (199?), Lambda calculi with types, to appear in: “Hand-book of Logic in Computer Science” (S. Abramsky, D.M. Gabbay andT.S.E. Maibaum, Eds.), Oxford University Press, Oxford.

Barr, M. and C. Wells (1990), Category Theory for Computing Science,Prentice-Hall, New York.

Beeson, M.J. (1985), Foundations of Constructive Mathematics, Springer,Berlin.

Bernays, P. (1937), A System of Axiomatic Set Theory, Part I, J. Sym-bolic Logic 2, 65 – 77.

Bernays, P. (1941), A System of Axiomatic Set Theory, Part II, J. Sym-bolic Logic 6, 1 – 17.

Boffa, M. (1969), Sur la Theorie des Ensembles sans Axiome de Fonde-ment, Bull. Soc. Math. Belg. 21, 16 – 56.

Curry, H.B. and R. Feys (1958), Combinatory Logic, Vol. I, North-Holland, Amsterdam.

Dalen, D. van, H.C. Doets and H.C.M de Swart (1978), Sets: Naive,Axiomatic and Applied, Pergamon Press, Oxford.

Van Dalen, D. and A.F. Monna (1972), Sets and Integration – An Out-line of the Development, Wolters-Noordhoff, Groningen.

Feferman, S. (1975), A language and axioms for explicit mathematics, in:“Algebra and Logic” (J.N. Crossley, Ed.), Springer, Berlin, 87 – 139.

Fraenkel, A.A., Y. Bar-Hillel, A. Levy and D. van Dalen (1973),Foundations of Set Theory, North-Holland, Amsterdam.

49

Godel, K. (1940), The Consistency of the Axiom of Choice and of the Gen-eralized Continuum Hypothesis, Princeton University Press, Princeton.

Hindley, J.R. and J.P. Seldin (1986), Introduction to Combinators andλ-Calculus, Cambridge University Press, Cambridge.

MacLane, S. (1971), Categories for the Working Mathematician, Springer,Berlin.

Moggi, E. (1988), The Partial Lambda Calculus, University of Edinburgh,Edinburgh.

Von Neumann, J. (1925), Eine Axiomatisierung der Mengenlehre, J. ReineAngew. Math. 154, 219 – 240.

Von Neumann, J. (1928), Die Axiomatisierung der Mengenlehre, Math.Z. 27, 669 – 752.

Von Rimscha, M. (1980), Mengentheoretische Modelle des λK-Kalkuls,Arch. Math. Logik Grundlag. 20, 65 – 73.

Von Rimscha, M. (1981), Universality and Strong Extensionality, Arch.Math. Logik Grundlag. 21, 179 – 193.

Robinson, R.M. (1937), The Theory of Classes – A Modification of VonNeumann’s System, J. Symbolic Logic 2, 29 – 36.

Rubin, H. and J.E. Rubin (1985), Equivalents of the Axiom of Choice(revised edition), North-Holland, Amsterdam.

Schmidt, D.A. (1986), Denotational Semantics – A Methodology for Lan-guage Development, Allyn and Bacon, Boston,

Stoy, J.E. (1977), Denotational Semantics – The Scott-Strachey Approachto Programming Language Theory, MIT Press, Cambridge, Mass.,

Troelstra, A.S. and D. van Dalen (1988), Constructivism in Mathe-matics – An Introduction, Vol. I, North-Holland, Amsterdam,

50

An Axiomatic Theory for Partial Functions · An Axiomatic Theory for Partial Functions Jan Kuper...

Documents

Transcript of An Axiomatic Theory for Partial Functions · An Axiomatic Theory for Partial Functions Jan Kuper...