A FIRST COURSE IN MATHEMATICS

110
A FIRST COURSE IN MATHEMATICS ALEXANDRU BUIUM Contents 1. Introduction 2 2. Languages 8 3. Metalanguage 13 4. Syntax 18 5. Semantics 22 6. Definitions 24 7. Witnesses 25 8. Tautologies 26 9. Theories 31 10. Proofs 36 11. Argot 40 12. Strategies 42 13. Fallacies 46 14. Examples 46 15. ZFC 53 16. Sets 56 17. Maps 60 18. Relations 63 19. Operations 68 20. Integers 72 21. Induction 75 22. Rationals 78 23. Combinatorics 79 24. Sequences 81 25. Reals 82 26. Complexes 83 27. Residues 84 28. p-adics 86 29. Algebra 87 30. Geometry 91 31. Analysis 97 32. Metamodels 99 33. Categories 101 34. Models 105 1

Transcript of A FIRST COURSE IN MATHEMATICS

A FIRST COURSE IN MATHEMATICS

ALEXANDRU BUIUM

Contents

1. Introduction 22. Languages 83. Metalanguage 134. Syntax 185. Semantics 226. Definitions 247. Witnesses 258. Tautologies 269. Theories 3110. Proofs 3611. Argot 4012. Strategies 4213. Fallacies 4614. Examples 4615. ZFC 5316. Sets 5617. Maps 6018. Relations 6319. Operations 6820. Integers 7221. Induction 7522. Rationals 7823. Combinatorics 7924. Sequences 8125. Reals 8226. Complexes 8327. Residues 8428. p-adics 8629. Algebra 8730. Geometry 9131. Analysis 9732. Metamodels 9933. Categories 10134. Models 105

1

2 ALEXANDRU BUIUM

1. Introduction

Mathematics has a dual status. On the one hand, along with natural sciences,mathematics helps describe and predict phenomena in the natural world; this isthe flavor of the mathematics of Archimedes and Galileo, for instance, or of con-temporary applied mathematics. On the other hand mathematics is a deductivediscipline, proceeding by thought alone, according to the rules of deduction, andguided by aesthetic criteria; this is the flavor of mathematics advocated by Platoand Euclid, for instance, and the flavor of contemporary pure mathematics. Ofcourse, there is no barrier between these two aspects of mathematics: historically ithas constantly been the case that the pure mathematics of one generation becamethe applied mathematics of the next generations. The fact that this dynamics iseven possible is an interesting philosophical puzzle: indeed it is not easy to explainhow pure thought, guided by aesthetics, can somehow reproduce external realityon a parallel level and have, through its applications, an impact on external reality.(Cf. Wigner’s celebrated essay “The unreasonable effectiveness of mathematics inthe natural sciences”.)

The present course is an introduction to mathematics as a deductive discipline.In particular we will not address the relation between mathematics and externalreality. As a deductive discipline mathematics can be viewed as a chapter of logic.Other disciplines that have deductive aspects (such as theoretic physics or law)are partially grounded in logic. Indeed logic organizes thought into theories. Thenmathematics can be viewed as one among many theories; as a theory mathematics ISset theory. Its main objects are certain special types of set theoretic constructionssuch as numbers, figures, and limits, leading to algebra, geometry, and analysisrespectively. This viewpoint on mathematics emerged from work of Frege, Cantor,Russell, and Hilbert at the turn of the 20th century; it evolved from a logicist to aformalist approach; this evolution roughly corresponds to the gradual eliminationof semantics from set theory (see below). The viewpoint that mathematics is,essentially, set theory was adopted by Bourbaki who, in the middle of the 20thcentury, provided such a presentation of mathematics in a series of now classicalmonographs.

Logic is then a prerequisite for mathematics and must be distinguished from thesubject called “mathematical logic” which, in its turn, is a chapter of mathematicsitself. Mathematical logic models (mirrors) logic within mathematics. This view-point was initiated by Hilbert who was one of champions of formalism. Hilbert’sprogram (proposed in 1920) was to prove, within mathematics, that mathematicsdoes not lead to contradictions (it is consistent) and can prove or disprove any ofits statements (it is complete). This dream was shattered by the work of Godel(in the 1930s) who proved in particular two mathematical theorems whose transla-tions into English are as follows: 1) consistency of mathematics cannot be provedwithin mathematics; and 2) if mathematics is consistent then it is incomplete. Onthe other hand work by many people (including Godel) showed that, in spite ofthe failure of Hilbert’s original program, mathematical logic can be successfullydeveloped provided goals less ambitious than Hilbert’s are being put forward. Ourapproach in this course is, essentially, the extreme formalistic approach of Hilbert.

Note that mathematical logic has logic as a prerequisite: in order to even talkabout mathematical logic one needs mathematics (i.e. set theory) which is in its

A FIRST COURSE IN MATHEMATICS 3

turn grounded in logic. If we use the symbol ⊂ to indicate containment thenschematically one can represent the above discussion as follows:

{mathematical logic} ⊂ {mathematics} ⊂ {logic}

Corresponding to the 3 levels above the concept of set (which is the key concept inall 3) needs to be developed 3 times, once at each level; one has therefore 3 theories:

Level 1 theory: Naive set theory;Level 2 theory: Axiomatic set theory;Level 3 theory: Formalized set theory.

Level 1 theory is necessary in order to express the main concepts of logic. Thenaive sets of Level 1 theory (also referred to here as metasets) are physical objectsthat we use to communicate, typically single signs on a piece of paper, or strings ofsigns, or collections of strings of signs. Level 2 theory (which is set theory proper)is a theory in the sense of logic and, from the viewpoint of logic, is identical tomathematics itself. Each set in this theory is a concrete single sign. Level 3 theoryis a theory in the sense of (and hence is part of) mathematical logic. In this theoryone uses Level 2 sets as “models” but there are no objects that are called sets perse. The 3 levels are mirrors of each other but they are completely distinct theories.

Historically sets were already implicitly used by Aristotle who considered theproblem of classification i.e. the problem of organizing objects into collectionsaccording to their sharing common “predicates”. This kind of collections/sets canbe essentially identified with the naive sets of the Level 1 theory (the metasets).On the other hand the modern theory of sets (especially the hierarchy of infinitesets) was invented by Cantor. But Cantor’s original sets do not fit in any of the 3Levels above. Indeed, according to Cantor’s original (admittedly vague) definition,a set is a “collection of distinct objects [...] of our thought” that have a meaning(semantics) and can proliferate indefinitely according to certain (meaningful) rules,until they reach all kinds of “levels of infinity”. These Cantor sets do not fit intoLevel 1 theory because the Level 1 metasets can be kept within the “metacountableinfinity”. Cantor sets do not fit into Level 2 theory because the sets of Level 2 theoryare single (syntactical) signs (constants in a theory) and again their multiplicity canbe restricted to the “metacountable infinity”. Finally, in Level 3 theory, there areactually no individual objects called sets so there is no room for Cantor sets. Insome sense naive sets, and more generally, Cantor sets assume a rather elaborateontology (cf. especially Badiou’s book “Being and event” for a philosophical angleon this). On the contrary, put briefly, our ontology here is minimalistic: the onlythings that we acknowledge as existing are:

a) a subject (called I);b) signs on a piece of paper (languages);c) the ability/willingness of the subject to operate with signs (e.g. to con-

struct/check syntactically correct sequences, to check/construct proofs, to performtranslations, etc.). This latter ability is postulated and apparently resides in theunconscious (cf. N. Chomsky, J. Jackendoff, etc.).

The plan of the course is as follows. The first sections are about logic (andhence about the Level 1 theory). This discussion of logic applies to any deductivediscipline, not only to mathematics. It involves essentially two steps: the analysis

4 ALEXANDRU BUIUM

of language (which is essentially part of linguistics) and the analysis of inference(which is, in some sense, logic proper). The analysis of inference will culminatewith the introduction of the notion of proof (cf. the section called Proofs); therewill be no proofs in this course before that point, although we will need, even beforethe section called Proofs, a certain weaker form of arguments that can be viewed asa sort of “metaproofs”. Our discussion of mathematics (beginning with the sectioncalled ZFC) will start by a discussion of the Level 2 theory; then we construct thebasic types of numbers; then we introduce the basic objects of algebra, geometry,and analysis. We end with a section on mathematical logic; here we will touch uponthe subject of Level 3 theory.

A word of warning is necessary: most textbooks on logic/set theory offer apresentation that limits itself to 2 rather than 3 levels of set theory. Mathematicallyoriented books on logic (like standard books in model theory) usually neglect Level1, assume Level 2 as “known”, and concentrate exclusively on developing Level 3.Philosophically oriented books usually present Levels 1 and 2 fused together (“logicand set theory are identical”) and do not discuss the place of Level 3 in this picture.

We end this Introduction by making some general comments first about logicand then about mathematics.

We start with comments on logic. The idea that rational thought obeys definitelaws goes back to the classics of logic: Aristotle (4th century BC), Leibniz (17/18thcentury), and Boole (19th century). Now rational thought does not operate withexternal reality but rather with names/signs given/attached to various parts ofexternal reality and to their relations. Signs are concrete objects: they are spoken,written, or shown. They are organized in systems of signs by certain organizationalprinciples. The collection of all signs in a system of signs is referred to as a language.So the first step in logic is the analysis of languages.

Languages are in various relations to each other. Some relations attach elementsof one language to elements with the same function in other languages; such rela-tions can be referred to as horizontal; translations are an example of such relations,e.g.

English → French

Other relations attach to elements of a language elements of a different languagethat play different functions; such relations can be referred to as vertical; exam-ples of such relations are descriptive relations, prescriptive relations, generativerelations, etc. Cf. some examples below.

Signs within a language are assembled into sentences (judgements) according tospecific rules which are common to large groups of languages. The main aspectsof languages are: syntax, semantics, reference, truth. Syntax is the set of rulesgoverning the way sentences are assembled from signs; in other words sentences arethe syntactically correct strings of symbols. Semantics (also referred to as meaning)is about what sentences “say” in relations to other languages. Reference is “whatthe signs refer to” outside the language. Truth is a property of sentences that havemeaning; a theory of truth requires that any sentence that has a meaning is eithertrue or false; a theory of truth does not require in principle that truth be capableof being verified. For instance consider the following utterances:

I through passes not electron slit

A FIRST COURSE IN MATHEMATICS 5

II. the enthusiasm of the crowds is greenIII. it is going to be hereIV. all men are immortalV. Napoleon diedVI. there is extraterrestrial intelligenceIn the above I is syntactically incorrect so it is not a sentence. II is a sentence;

it has no meaning in any reasonable translation so no truth value; its words have aprecise reference. III is a sentence; it has an imprecise reference; once a referenceis given to “it” it has a meaning; hence it becomes susceptible of being true orfalse. IV and V are sentences with meaning and reference; IV is false and V is true;the falsehood of IV and the truth of V are trivially verified. VI is a sentence, ithas meaning and reference, so it is susceptible of being true or false; its truth orfalsehood can be though of as being independent of our ability to verify it (althoughnot all theories of truth agree on this). The above are just examples; one task ofthe philosophy of language is to define the concepts of syntax, semantics, reference,truth, and build an explanatory theory of language and understanding based onthe corresponding definitions.

Now in order to “talk about” the syntax, semantics, and reference of a fixedlanguage L (or group of languages) we need to fix yet another language M . Oncewe fixed L and M we may call L the object language and M the metalanguage. Thisdesignation is relative i.e. a language may be an object language in one designationand a metalanguage in another designation. But once such a designation has beenmade we will tend to treat object languages and metalanguages according to slightlydifferent standards, as we shall explain momentarily. Between metalanguage andobject language there is a vertical relation which has a descriptive/prescriptivecharacter:

metalanguage

language

The necessity of introducing metalanguages (and indeed of introducing metamet-alanguage, metametametalanuguage, in the obvious sense) was recognized by Tarski;but our metalanguage will be slightly different from Tarski’s (ours will be poorerthan Tarski’s, as we shall explain later.)

Sentences in metalanguage will be called metasentences; meaning in metalan-guage will be called metameaning, etc. Assuming we fixed an object language anda metalanguage we will usually:

1) carefully distinguish between the signs of object language and the same signsviewed as belonging to metalanguage;

2) ignore the meaning and reference of sentences in the object language;3) take into consideration the meaning and reference of metasentences in meta-

language;4) control syntax of sentences in the object languages (as well as translations

between various object languages) through metasentences in metalanguage;

6 ALEXANDRU BUIUM

5) adopt a self-referential approach to controlling the syntax of metalanguages;this is in order to avoid the introduction of a metametalanguage, etc.

6) ban truth predicates from both the object language and the metalanguage;so the words true and false will be banned from both object languages and meta-language; or if used they will be declared meaningless.

7) replace truth by a verification process (to be explained in detail later) calledproof (for sentences) and metaproof (for metasentences).

By 1 and 6 above we dispose of a class of classical paradoxes. Indeed consider the“liar’s paradox” which is the following sentence in English: “The sentence that youreading now is false”. If one assumes the sentence is true it looks like its contentsays it is false. And if one assumes the sentence is false it looks like its content saysit is true. There are two sources of the above paradox. One is the use of the wordstrue and false equipped with their meaning. Another one is the identification ofEnglish as object language with English as metalanguage. In this course we willadopt, as already mentioned, a viewpoint which eliminates both sources of theabove paradox: this is done through 1 and 6 above.

Another reason why we want to operate at two levels (an object language leveland a metalanguage level) is that object languages (like English, theoretical physics,mathematics) usually have a complicated structure and their meaning is acquiredthrough translation into other complicated object languages (like history/literature,experimental physics, philosophy); whereas the metalanguage that controls theirsyntax has a relatively simple structure (it refers to these object languages alone,hence, essentially, to marks on a piece of paper) and its meaning is acquired throughtranslation into other simple languages (like the ones governing simple childrengames or elementary computer programs). Think of the meaning (and ways ofverification) of the sentence

1) “Radiation is quantized”

in English compared to the metasentence in “MetaEnglish”:

2) The word “radiation” occurs in the sentence “radiation is quantized”

The meaning (and so verification) of the sentence 1 is problematic: it depends onthe translation of the sentence into the language of theoretical physics; there may beseveral translations into one such language and there are several physical theories tochoose from; deciding which theory to use depends on translations between varioustheoretical physics languages and the language of data coming from experimentalphysics; the decision may depend also upon translations into usual English andinteractions of these with sentences expressing issues in the politics of science, etc.On the other hand the metameaning (and verification) of the metasentence 2 aboveis arguably less problematic: it depends on the language describing the search of aword in a sentence and as such 2 has an obvious meaning and is trivial to verify.This being the case one can defer the analysis of meaning and verification froma complicated/problematic level to an arguably simple/unproblematic level. (Themeaning and verification of 2 is not as unproblematic as it seems: see Wittgenstein’sanalyses of understanding or following orders/instructions, etc.)

So far we discussed the analysis of languages. The next step in logic is theanalysis of inference(=deduction) which allows one to verify sentences by infer-ring(=deducing) them from other sentences. If truth were allowed as a predicateinference would be required to be such that sentences inferred from true sentences

A FIRST COURSE IN MATHEMATICS 7

must themselves be true. However, since truth of sentences is not defined, weshould view inference as a substitute for the guarantee of truth. Let us fix in ourdiscussion an object language and a metalanguage. A body of sentences in theobject language obtained via inference from a given list of sentences called specificaxioms will be called a theory. The sentences of a theory will be called theorems.Inference is defined in terms of sentences referred to as background axioms. Theprocess of checking/verifying inference will be called proof. A key syntactic objectbehind inference will be a certain class of constants called witnesses; this way ofproceeding goes back to Hilbert (cf. also the exposition in Weyl’s book “Philosophyof Mathematics and Natural Sciences”). The word witness used here is borrowedfrom model theory but our use of witnesses is not model theoretic (it precedes, asin Hilbert, set theory).

All the definitions and rules of inference in the object language are spelled outin metalanguage. Now metalanguage itself needs a process of checking/verifying itsmetasentences; such a process can be referred to as metaproof. And indeed devel-oping some metaproofs is a prerequisite for even introducing the concept of proof.Metaproofs of metasentences in metalanguage will be held to lesser standards thanproofs of sentences in the object language; this is, again, unavoidable if one wants toavoid the introduction of a metametalanguage. Also one should note that Hilbert’soriginal program (in which the object language is the language of mathematics)required metaproofs to be “finitistic” (i.e., essentially, “quantifier free”). Godel’swork showed that this needs to be regarded as not feasible. Since introducing non-finitistic methods (in particular the integers) in metaproofs would be circular wewill have to proceed as follows: 1) we will only allow finitistic metaproofs (based,generally, on case by case inspection of the physical shape of language); 2) whengiving non-finitistic arguments in metaproofs we will not consider them completelysound and we shall be ready to disregard these metaproofs; 3) we will then developmathematics (with its non-finitistic proof theory); 4) we will revisit the problemsof logic within mathematics in the form of mathematical logic. Godel’s work ac-tually followed this path too: it is work in the object language of sets and not inmetalanguage.

We end by noting that there are other types of vertical relations among languagessuch as Chomsky’s relation

surface structure

deep structure

in which the vertical arrow consists of successive compositions of syntactical opera-tions such as ellipses, permutations, etc. This vertical relation is neither descriptivenor prescriptive but, rather, generative; this type of relation between languages willnot concern us in this course (although it may be the key to the semantics of meta-language which IS of concern to us). The mechanisms behind this scheme arereferred to by Chomsky as universal grammar. A “standard” example of this givenby Chomsky is the surface structure sentence

“the invisible God created the visible world”

8 ALEXANDRU BUIUM

The deep structure sentence which generates the above is:

“God is invisible, God created the world, and the world is visible”

Here are some comments on mathematics. The basic concept of mathematics isthat of number. The main types of numbers are: integer numbers, rational num-bers, real numbers, complex numbers, numbers mod p, and the p-adic numbers.(Here p is any prime number.) The collections of such numbers are denoted byZ,Q,R,C,Fp,Zp respectively. Denoting by arrows what we will call ring homo-morphisms we have the following connections between various types of numbers:

Z → Q → R → C

Zp

Fp

As the picture suggests the integers Z are the most basic numbers; we willintroduce integers axiomatically and then we will construct the rest of the numbersvia standard constructions in set theory. The part of mathematics that abstracts thebehavior of numbers is algebra. Geometry then deals with figures. Finally analysisdeals with limits, more generally with the infinite (both large and small). The coursewill provide a quick introduction to some of the basic objects of algebra, geometry,and analysis. These 3 branches of mathematics are closely interconnected; and ineach of these branches one is lead to consider all the types of numbers referred toabove. A somewhat special chapter of mathematics is mathematical logic (whichwill be mentioned in the last section of the course); it can be viewed as a specialchapter of algebra but its flavor is different from that of main stream algebra.

2. Languages

Logic starts with the analysis of language. We start with this analysis now. Thenext step, to be undertaken later, is the analysis of inference.

Example 2.1. (Syntactic analysis) Let us start with a discussion of natural lan-guages such as the English language. The English language is the collection LEngof all English words (plus separators such as parentheses, commas, etc.) We treatwords as individual symbols (and ignore the fact that they are made out of letters.)Sometimes we admit as symbols certain groups of words. One can use words tocreate strings of words such as

0)“for all not Socrates man if”

The above string is considered “syntactically incorrect”. The sentences in theEnglish language are the strings of symbols that are “syntactically correct” (ina sense to be made precise later). Here are some examples of sentences in thislanguage:

A FIRST COURSE IN MATHEMATICS 9

1) “if Socrates is a wise man then Socrates is a man.”2) “Socrates is not a mason and Socrates’ father is a mason”3) “for all things either the thing is not a man or the thing is mortal.”4) “there exists somebody who is Plato’s teacher”5) “for all things the thing is Socrates if and only if the thing is Plato’s teacher”

In order to separate sentences from a surrounding text we put them betweenquotation marks (and sometimes we write them in italics). So quotation marksdo not belong to the language but rather they lie outside the language. Checkingsyntax presupposes a partitioning of LEng into various classes of words; no wordshould appear in principle in two different classes, but this requirement is oftenviolated in practice (which may lead to different readings of the same text). Hereare the classes:

• variables: “thing, somebody,...”• constants: “Socrates, Plato, the Wise, the Mason,...”,• functional predicates: “the father of, the teacher of,...”• relational predicates: “is (belongs to), is a man, is mortal,...”• connectives: “and, or, not, if...then, if and only if”• quantifiers: “for all, there exists”• equality: “is (equals)”• separators: parantheses “(,)” and comma “,”

In general objects are named by constants or variables. Constants are namesfor specific objects while variables are names for non-specific (generic) objects.Relational predicates say/affirm something about one or several objects; if theysay/affirm something about one, two, three objects etc., they are unary, binary,ternary, etc. Functional predicates objects as arguments but do not say/affirmanything about them; all they do is refer to (or name, or specify, or point towards)something that could itself be an object. Again they can be unary, binary, ternary,etc., depending on the number of arguments. Connectives connect/combine sen-tences into longer sentences; they can be unary (if they are added to one sentencechanging it into another sentence, binary if they combine two sentences into onelonger sentence, ternary, etc.) Quantifiers specify quantity and are always followedby variables. Separators separate various parts of the text from various other parts.

In order to analyze a sentence one first looks for the connectives and one splits thesentence into simpler sentences; alternatively sentences may start with quantifiersfollowed by variables followed by simpler sentences. In any case, once one identifiessimpler sentences, one proceeds by identifying, in each of them, the constants,variables, and functional predicates applied to them (these are the objects that oneis talking about), and finally one identifies the functional predicates (which saysomething about the objects).

In our examples of sentences 1-5 we have the following.In 1 “if...then” are connectives connecting the simpler sentences “Socrates is a

wise man” and “Socrates is a man”. Let us look at the sentence “Socrates is a wiseman”. The word “Socrates” here is viewed as a constant; the group of words “wiseman” is viewed as a constant; “is” is a binary relational predicate (it says/affirmssomething about 2 objects: “Socrates” and “the wise men”; it says that the firstobject belongs to the second object). Let us look now at the sentence “Socrates isa man”. We could read this second sentence the way we read the first one but let

10 ALEXANDRU BUIUM

us read it as follows: we may still view “Socrates” as a constant but we will view“is a man” as a unary relational predicate (that says/affirms something about onlyone object, Socrates).

A concise way of understanding an analysis of English sentences as above isto create another language LFor (let us call it Formal) consisting of the followingsymbols:

• variables: “x, y, ...”• constants: “s, p, w,m”• functional predicates: “f,�”• relational predicates: “∈, ρ, †”• connectives: “∧,∨,¬,→,↔”• quantifiers: “∀,∃”• equality: “=”• separators: parantheses “(, )” and comma “,”

Furthermore let us introduce a rule (called translation) that attaches to each symbolin Formal a symbol in English as follows:

“x, y” are translated as “thing, somebody”“s, p, w,m” are translated as “Socrates, Plato, the Wise, the Mason”,“f,�” are translated as “the father of, the teacher of”“∈, ρ, †” are translated as “belongs to, is a man, is mortal”“∧,∨,¬,→,↔” are translated as “and, or, not, if...then, if and only if”“∀,∃” are translated as “for all, there exists”“=” is translated as “is”

Then the English sentence 1 is the translation of the following Formal sentence:

1’) “(s ∈ w)→ (ρ(s))”

Conversely we say that 1’ is a formalization of 1.Let us continue, in the spirit above, the analysis of the sentences 2,...,5 above.In 2 “and” and “not” are connectives (binary and unary respectively): they are

used to assemble our sentence 2 from two simpler sentences: “Socrates is a mason”and “Socrates’ father is a mason”. In both these simpler sentences “Socrates” and“mason” are constants; “the father of” is a unary functional predicate (it refers toSocrates and points towards/names his father but it does not say anything aboutSocrates); “is” is a binary relational predicate (it says something about 2 objects).Here is a formalization of 2:

2’) “(¬(s ∈ m)) ∧ (f(s) ∈ m)”

Sentence 3 starts with a quantifier “for all” followed by a variable “things” fol-lowed by a simpler sentence. That simpler sentence is made out of 2 even simplersentences “the thing is a man” and “the thing is mortal” assembled via connectives“not” and “or”. Finally “is a man, is mortal” are unary relational predicates. Hereis a formalization of 3:

3) “∀x((¬ρ(x)) ∨ †(x))”

A FIRST COURSE IN MATHEMATICS 11

Sentence 4 starts with a quantifier “there exists” followed by a variable “some-body” followed by a simpler sentence that needs to be read as “that somebody isPlato’s teacher”. In the latter “teacher of” is a unary functional predicate while“is” is equality. Here is a formalization of 4:

4’) “∃y(y = �(p))”

Sentence 5 starts with a quantifier “for all” followed by a variable “things” fol-lowed by a simpler sentence. The simpler sentence is assembled from two evensimpler sentences: “the thing is Socrates” and “the thing is Plato’s teacher” con-nected by the connective “if and only if”. Finally in these latter sentences “Socrates,Plato” are constants while “teacher of” is a unary functional predicate, and “is” isequality. Here is a formalization of 5:

5’) “∀x((x = s)↔ (x = �(p)))”

We note that “or” in English is disjunctive: “this or that” is used in place of“this or that or both”.

Also remark the use of “is” in 3 instances: as a binary relational predicateindicating belonging, as part of a unary relational predicate, and as equality.

Remark also that we view the variables “thing” and “somebody” on an equalfooting, so we ignore the fact that the first suggests an inanimate entity whereasthe second suggests a living entity.

Also note that all verbs in 1-5 are in the present tense. English allows othertenses, of course. But later in mathematics all predicates need to be viewed astense indifferent: mathematics is atemporal. This is an instance of the fact thatnatural languages like English have more expressive power than mathematics.

Finally remark that the word “exists” which could be viewed as a relationalpredicate is treated instead as part of a quantifier. Sentences like “philosophers ex-ist” and “philosophers are human” have a totally different logical structure. Indeed“philosophers exist” should be read as “there exists somebody who is a philosopher”while “philosophers are human” should be read as “for all things if the thing is aphilosopher then the thing is a human”. The fact that “exist” should not be viewedas a predicate was recognized already by Kant in his criticism of the “ontologicalargument”.

Remark 2.2. (Semantics/reference) At this stage we do not know what “meaning”is so the sentences 1,...,5 above should be viewed as meaningless; and they AREmeaningless for a person who does not know the English language. For such a per-son establishing meaning requires relating these sentences to sentences in anotherlanguage (e.g. relating them “horizontally”, i.e. translating them into French,German, a “picture language”, sign language, Braille, etc., or using a “vertical”relation to “deeper” languages); the more translations available the more definitethe meaning. On the other hand the meaning of 1’,...,5’ is taken to be given by thesentences 1,...,5 (where one assumes one knows English).

Sentences in English may refer to various aspects of reality (physical or intellec-tual reality). Sentences in Formal can be attached their own reference.

We could also ask if the sentences 1,...,5 are “true” or “false”. As explained inthe Introduction we will not define truth/falsehood for sentences in our languages.The same applies to 1’,...,5’.

12 ALEXANDRU BUIUM

Remark 2.3. (Declarative/imperative/interrogative) All sentences considered sofar were declarative (they declare their content). Natural languages have othertypes of sentences: imperative (giving a command like: “Lift this weight!”) andinterrogative (asking a question such as: “Is the electron in this portion of space-time?”). In principle, from now on, we will only consider declarative sentences inour languages. An exception to this will later be the use of imperative forms ina language called Argot; Argot will be a language in which we will write most ofour proofs and imperative forms to be used in it will be, for instance: “consider”,“assume”, “let...be”, “let us...”, etc.)

Remark 2.4. (Naming) It is useful to give names to sentences. For instance if wewant to give the name P to the English sentence “Socrates is a man” we write

P equals “Socrates is a man”

The latter is not a sentence in our language (English): neither P nor the quotationmarks belong to what we declared to be English; and the word is outside the quo-tation marks should be viewed as different from the word “is” inside the quotationmarks. (Later we will see that these words outside the quotation marks belong towhat we shall call metalanguage.) One can give various different names to the samesentence. In a similar way one can give names to sentences in Formal:

Q equals “ρ(s)”

In the above discussion we encountered 2 examples of languages: English andFormal. Here is a general definition.

Definition 2.5. A language is a collection L of symbols. The symbols in L areof 8 types called: variables, constants, functional predicates, relational predicates,logical connectives, quantifiers, equality, and separators. Some of these may bemissing. (Functional predicates and equality can always be “eliminated” in anobvious sense.) Also we assume that the list of the variables never ends (is “infinite”in the intuitive sense). The list of constants may or may not end. Finally we assumethat the only allowed separators are parentheses (, ) and commas; we especially banquotation marks “...” from the separators allowed in a language (because we wantto use them in what we shall call later metalanguage). Given a language L onecan consider the collection L∗ of all strings of symbols. We will later define whatone means by a string in L∗ to be a sentence. The collection of sentences in L∗ isdenoted by Ls. (We sometimes say “sentence in L” instead of sentence in L∗.) Asin the examples above we can give names P, ... to the sentences in L; these namesP, ... do NOT belong to the language.

Definition 2.6. A translation of a language L into another language L′ is a rulethat attaches to any symbol in L a symbol in L′; we assume constants are attachedto constants, variables to variables, etc. Due to syntactical correctness (to bediscussed later) such a translation attaches to sentences P in L sentences P ′ in L′.The analysis of translations is part of semantics and will be discussed later in moredetail.

Example 2.7. English and Formal are examples of languages. Incidentally in theselanguages the list of constants ends (there are only “finitely many” constants; herewe use “finitely many” in an intuitive sense: their list “ends”.) But it is importantto not impose that restriction for languages. If instead of English we consider a

A FIRST COURSE IN MATHEMATICS 13

variant of English in which we have words involving arbitrary many letters (e.g.words like “man, superman, supersuperman,...” etc.) then we have an example ofa language with an “infinite” number of constants.

Example 2.8. We already gave an example of translation of Formal into English;cf. Example 2.1. The translation given there for connectives, quantifiers, andequality is called the standard translation. But there are alternative translationsas follows.

Alternative translations of→ into English are “implies”, or “by...it follows that”,or “since...we get”, etc..

An alternative translation of ↔ into English is “is equivalent to”.Alternative translations of ∀ into English are “for any”, or “for every”.Alternative translations of ∃ into English are “for some” or “there is an/a”.English has many other connectives (such as “before”, “after”, “but”, “in spite

of the fact that”, etc.). Some of these we will ignore; others will be viewed asinterchangeable with others; e.g. “but” will be viewed as interchangeable with“and” (although the resulting meaning is definitely altered). Also English hasother quantifiers (such as “many”, “most”, “quite a few”, “half”, “for at leastthree”, etc.); we will ignore these other quantifiers.

Remark 2.9. Let us consider the following types of objects:1) symbols (e.g. x, y, a, b, f,�, ...,∈, ρ, ...,∧,∨,¬,→,↔,∀,∃,=, (, ));1′) collections of symbols (e.g. the collection of symbols above, denoted by L);2) strings of symbols (e.g. ∃x∀y(x ∈ y));2′) collections of strings of symbols (e.g. L∗, Ls encountered above or theories T

to be encountered later).The above types of objects will be referred to as metasets; they are precursors

of (or rather concrete models for) what later will be referred to as sets. Metasetsshould be though of as concrete (physical) objects, like signs written on a piece ofpaper or papyrus, words that can be uttered, images in a book or in our minds,etc. We assume we know what we mean by saying that a symbol belongs to (oris in) a given collection of symbols; or that a string of symbols belongs to (oris in) a given collection of strings of symbols. We will not need to iterate theseconcepts. (Later, with sets, we will assume we can indefinitely iterate concepts likethese.) We will also assume we know what we mean by performing some simpleoperations on such objects like: concatenation of strings, deleting symbols fromstrings, substituting symbols in strings with other symbols, “pairing” strings withother strings, etc. These will be encountered and explained later. Metasets willbe crucial in introducing our concepts of logic. Note that it might look like weare already assuming some kind of logic when we are dealing with metasets; soour introduction to logic might seem circular. But actually the “logic” of metasetsthat we are assuming is much more elementary than the logic we want to introducelater; so what we are doing is not circular.

3. Metalanguage

Let us fix a language L (or a group of languages). A text referring to L is itselfwritten using yet another language M ; once we fixed L and M we shall call L theobject language and M the metalanguage. (The term metalanguage was introducedby Tarski in his theory of truth; but our metalanguage differs from his in certainrespects, cf. Remark 3.5 below.)

14 ALEXANDRU BUIUM

Metalanguages and object languages are similar structures but we shall keepthem separate and we shall hold them to different standards. (Treating them equallywould require the introduction of an infinite “vertical” hierarchy of languages goingin both directions: up and down; this we want to avoid.) In particular meaningand reference will be ignored in the object language but will NOT be ignoredin metalanguage; the reference of metalanguage is actually the object languageitself. Truth will be banned from both the object language and the metalanguage.Finally syntax of languages will be described by metalanguage; and metalanguagewill explain its own syntax.

Sentences in metalanguage are called metasentences. The previous section is, ofcourse, already full of metasentences. And so will be rest of the course; this courseactually consists entirely of metasentences. Let’s have a closer look at this concept.First some examples.

Example 3.1. Here are examples of metasentences of the type we already encoun-tered (where the object language L is either English or Formal):

1) x is a variable in the sentence “∀x(x ∈ a)”.2) P equals “Socrates is a man”.

Later we will encounter other examples of metasentences such as:

3) P (b) is obtained from P (x) by replacing x with b4) Under the translation of L into L′ the translation of P is P ′

5) By the table

P ¬P P ∨ ¬PT F TF T T

the sentence P ∨ ¬P is a tautology.

6) cP is an existential witness for P7) The sequence of sentences P,Q,R, ..., U is a proof of U8) U is a theorem in T9) The theory T is consistent10) A term is a string of symbols u for which there is a term formation which

ends with u.

The metasentences 1, 3, 6 are explanations of sentence syntax (see later); 2 is anotation (it gives a name to a sentence); 4 is an explanation of semantics (see later);5 is part of a metaproof; and 7, 8, 9 are claims about inference (see later). 10 is adefinition.

Definition 3.2. Metalanguage consists of symbols of the following type:

• variables: symbol, sequence, language, term, sentence, theorem, P,Q,R, ...,cP , c

P , ...• constants: “Socrates”, “Socrates is a man”, “∧”, “=”,..., T, F , L,L∗, Ls,T,F,...,

P,Q,R, ..., cP , cP , ...

• functional predicates: the variables in, the translation of, ∧,∨,¬,→,↔,∃x,∀x;• relational predicates: is translated as, occurs in, is obtained from...by replac-

ing...with..., is a tautology, is a proof, is consistent,..., follows from, by ... it followsthat..., by ... one gets that...• connectives: and, or, not, if...then, if and only if, because,...• quantifiers: for all, there exists

A FIRST COURSE IN MATHEMATICS 15

• equality: is, equals

• separators: parantheses (,), comma ,, frames of tables ,...

Remark 3.3. Note that names of sentences in the object language become variables(although sometimes also constants) in metalanguage. The sentences of the objectlanguage become constants in metalanguage. And the connectives of the objectlanguage become functional predicates in metalanguage. Also the symbols “∧, ...”used as constants, the symbols ∧, ... used as functional predicates, and the symbolsand,... viewed as connectives should be viewed as different symbols (normally oneshould use different notation for them). Etc.

Remark 3.4. The above metalanguage can be viewed as a MetaEnlglish becauseit is based on English. One can formalize it, i.e. construct a MetaFormal metalan-guage by replacing the English words with symbols including:• connectives: & (for and), ⇒ (for if...then), ⇔ (for if and only if).• equality: ≡ (for is, equals)We will not precede this way, i.e. we will always use MetaEnglish as out meta-

language.

Remark 3.5. What Tarski called metalanguage is close to what we call metalan-guage but not quite identical. The difference is that Tarski allows metalanguage tocontain the symbols of original object language written without quotation marks.So for him (but not for us), if the language is Formal, then the following is ametasentence:

“∀x∃ys(x, y)” if and only if ∀x∃ys(x, y)

Allowing the above to be a metasentence helped Tarski define truth in a language(the famous Tarski T scheme); we will not do this here.

Remark 3.6. (Syntax/semantics/reference/metaproof/definitions) Metalanguagehas a syntax of its own which we keep less precise than that of languages so thatwe avoid the necessity of introducing a metametalanguage which regulates it; thatwould prompt introducing a metametametalanguage that regulates the metameta-language, etc. (Such a hierarchy would be reminiscent of Russell’s theory of types,and its further developments; we would like to avoid these technical issues.) Thehope is that metalanguage, kept sufficiently loose from a syntactic viewpoint, cansufficiently well explain its own syntax without leading to contradictions. The verytext you are reading now is, in effect, metalanguage explaining its own syntac-tic problems. The syntactically correct texts in metalanguage are referred to asmetasentences. It is crucial to distinguish between words in sentences and words inmetasentences which are outside the quotation marks; even if they look the samethey should be regarded as different words. In order to sustain the distinctionsbetween sentences and metasentences that we have put forward, metasentences arenever viewed themselves as sentences; one syntactical way to guarantee this is toban quotation marks (or letters like P,Q, ...) from sentences. Also we will nevername metasentences by letters like P,Q, ...; the latter letters will be reserved fornaming sentences.

In terms of semantics metasentences have, in their turn, a metameaning derivedfrom their own “horizontal” relations with, e.g. translation into, other metalan-guages or from “vertical” relations with other metalanguages; but we shall ignore

16 ALEXANDRU BUIUM

this issue and assume we understand their metameaning (as it is rather simplerthan the meaning of sentences). Also metasentences have a reference: they alwaysrefer to sentences. Further, metasentences are assumed to have no truth value (itdoes not make sense to say they are true or false).

For instance the metasentences

a. The word “elephants” occurs in the sentence “elephants are blue”b. The word “crocodiles” occurs in the sentence “elephants are grey”

can be translated in the “metalanguage of letter searches” (describing how to searcha word in a sentence, say). Both metasentences have a meaning. Intuitively we aretempted to say that (a) is true and (b) is false. As already mentioned we do notwant to introduce the concepts of true and false in this course. Instead we will ad-mit/reject sentences based on proofs and we will admit/reject metasentences basedon metaproofs. The rules regulating proofs and metaproofs will be explained as wego. For now we note that (a) can be metaproved; also the negation of (b) can bemetaproved. Metaproof is (or should be) a “finitistic” process (meaning essentially“without quantifiers”) and is based on a translation into a “computer language”:for instance to metaprove (a) take the words of the sentence “elephants are blue”one by one starting from the right (say) and ask if the word is “elephants”; thethird time you ask the answer is yes, which ends the metaproof of (a). Metaproofscan also be called verifications or showing. A similar discussion applies to someother types of metasentences; e.g. to the metasentences (1)-(7) in Example 3.1.The metaproof of (5) in Example 3.1 involves, for instance, “showing tables” whosecorrectness can be checked by inspection by a machine. (This will be expelainedlater.) The situation with the metasentences (8) and (9) in Example 3.1 is quitedifferent: there is no “finitistic” method (program) that can decide if there is ametaproof for (8); neither is there a “finitistic” method that can find a metaprooffor (8) in case there is one; same for (9). Finally (10) is a definition in metalan-guage (one should call this probably a metadefinition); definitions will be acceptedwithout metaproofs; in fact most metaproofs consist in checking that a definitionapplies to a given metasentence. The rules governing the latter would be spelledout in metametalanguage; we will not do this here.

Remark 3.7. Since P,Q are variables (sometimes constants) in metalanguage and∧,∨, ...,∃x are functional predicates in metalanguage one can form syntacticallycorrect strings P ∧Q, ...,∃xP in metalanguage, etc. If

P equals “p...”Q equals “q...”

where p, ..., q, ... are symbols then:

P ∧Q equals “(p...) ∧ (q...)”

The above should be viewed as a metaaxiom for ∧ (a precursor of axioms to beencountered later). Note that the parentheses are many times necessary; indeedwithout them the string P ∨Q∧R would be ambiguous. We will drop parentheseshowever each time there is no ambiguity. For instance we will never write (P∨Q)∧Ras P ∨Q∧R. Note that according to these conventions (P ∨Q)∨R and P ∨ (Q∨R)are still considered distinct.

Remark 3.8. If P equals “p...” then P is a constant in metalanguage which wesay stands for (or is a name for) the sentence “p...”. This creates what one can

A FIRST COURSE IN MATHEMATICS 17

call a correspondence between part of the metalanguage and part of the language.This correspondence is not like a translation between languages because constantsin metalanguage do not correspond to constants in the object language but tosentences in the object language. In some sense this correspondence is a “vertical”link between “higher” and “lower” languages whereas translation is a “horizontal”link between languages with “equal” scope.

Remark 3.9. (Disquotation) There is an operation (called disquotation or deletingquotation marks) that attaches to certain metasentences in metalanguage a sentencein the object language. Consider for instance the metasentence in MetaEnglish

1. From “Socrates is a man” and “all men are mortal” it follows that “Socratesis mortal”

Its disquotation is the following sentence in (object) English:

2. “Since Socrates is a man and all men are mortal it follows that Socrates ismortal”

Note that 1 refers to some sentences in English whereas 2 refers to something(somebody) called Socrates. So the references of 1 and 2 are different; and so aretheir meaning (if we choose to care about the meaning of 2 which we usually don’t).

If P equals “Socrates is a man”, Q equals “all men are mortal”, and R equals“Socrates is mortal” then 2 above is also viewed as the disquotation of:

1’. From P and Q it follows that R.

Disquotation is not always performable: if one tries to apply disquotation to themetasentence

1. x is a free variable in “for all x, x is an elephant”

one obtains

2. “x is a free variable in for all x, x is an elephant”

which is not syntactically correct.Disquotation is a concept involved in some of the classical theories of truth, e.g.

in Tarski’s famous:

“Snow is white” if and only if snow is white

Since we are not concerned with truth in this course we will not discuss thisconnection further.

Remark 3.10. (Declarative/imperative/interrogative) All metasentences consid-ered so far were declarative (they declare their content). There are other types ofmetasentences: imperative (giving a command like: “Prove this theorem!”, “Re-place x by b in P”, “Search for x in P”, etc.) and interrogative (asking a questionsuch as: “What is the value of the function f at 5?”, “Does x occur in P?”, etc.).The syntax of metasentences discussed above only applies to declarative metasen-tences. We will only use imperative/interrogative metasenteces in the exercises orsome metaproofs (the latter sharing a lot with computer programs); these othertypes of metasentences require additionnal syntactic rules which are clear and wewill not make explicit here.

18 ALEXANDRU BUIUM

Exercise 3.11. Consider the following utterances and explain how they can beviewed as metasentences; explain the syntactic structure and semantics of thosemetasenteces.

1) To be or not to be, that is the question.2) I do not make hypotheses.3) The sentence labelled 3 is false.1 is, of course, from Shakespeare. 2 is from Newton. 3 is, of course, the liar’s

“paradox”.

FROM NOW ON WE MAKE THE FOLLOWING CONVENTIONS:1) WHEN WE SAY “LANGUAGE” WE WILL ALWAYS REFER TO AN OB-

JECT LANGUAGE;2) WHEN WE SAY “METALANGUAGE” WE WILL ALWAYS REFER TO

A METALANGUAGE AS DESCRIBED IN THIS SECTION; THIS METALAN-GUAGE WILL REFER TO THE LANGUAGES IN 1).

4. Syntax

We already superficially mentioned syntax. In this section we discuss, in somedetail, the syntax of languages. (The syntax of metalanguages will be not explicitlyaddressed but should be viewed as similar.) All the explanations below are, ofcourse, written in metalanguage.

As we saw a language is a collection L of symbols. Given L we considered thecollection L∗ of all strings of symbols in L. In this section we explain the definitionof sentences (which will be certain special strings in L∗). Being a sentence willinvolve, in particular, a certain concept of “syntactic correctness”. The kind ofsyntactic correctness discussed below makes L a first order language. There areother types of languages whose syntax is different (e.g. second order languages,in which one is allowed to say, for instance, “for any relational predicate etc....”).First order languages are the most natural (and are entirely sufficient) for developingmathematics.

In what follows we let L be a collection of symbols consisting of variables x, y, ...,constants, functional predicates, relational predicates, connectives ∧,∨,¬,→,↔(where ¬ is unary and the rest are binary), quantifiers ∀,∃, equality =, and, as sep-arators, parentheses (, ), and commas. (For simplicity we considered 5 “standard”connectives, 2 “standard” quantifiers, and a “standard” symbol for equality; thisis because most examples will be like that. However any number of connectivesand quantifiers, and any symbols for them would do. In particular some of thesecategories of symbols may be missing.)

Example 4.1. If L has constants a, b, ..., a functional predicate f , and a relationalpredicate � then here are examples of strings in L∗:

1) )(x∀�∃aby(→2) f(f(a))3) ∃y((a�x)→ (a�y))4) ∀x(∃y((a�x)→ (a�y)))

In what follows we will define terms, formulas and sentences; 1 above will notbe a term, a formula, or a sentence; 2 will be a term; 3 will be a formula but not asentence; 4 will be a sentence.

A FIRST COURSE IN MATHEMATICS 19

Definition 4.2. A term formation is a sequence

t...s...u

of strings in L∗ such that for any string s in the sequence (including t and u) oneof the following holds:

1) s is a constant or a variable;2) s is preceded by s′, s′′, ... such that s is of the form f(s′, s′′, ...) where f is a

functional predicate.

Definition 4.3. A term is a string u for which there is a term formation endingwith u.

Remark 4.4. Functional predicates may be unary f(t), binary f(t, s), ternaryf(t, s, u), etc. When we write f(t, s) we simply mean a string of 5 symbols:

“f”, “(”, “t”, “,”, “s”, “)”

Example 4.5. If a, b, ... are constants, x, y, ... are variables, f is a unary functionalpredicate and g is a a binary functional predicate, all of them in L, then

f(g(f(b), g(x, g(x, y))))

is a term; a term formation for it is

bxyf(b)g(x, y)g(x, g(x, y))g(f(b), g(x, g(x, y)))f(g(f(b), g(x, g(x, y))))

The latter should be simply viewed, again, as a string of symbols; there is no“substitution” involved here. Substitution will play a role later, though; cf. Defi-nition 4.16.

Remark 4.6. If t, s, ... are terms and f is a functional predicate then f(t, s, ...) isa term. We cannot “prove” this metasentence (we do not know what a proof is)but we can give a justification (argument) that we can refer to a s a “metaproof”.Metaproofs are usually by “showing”. Here is our metaproof. If t, s, ... are termsthen we have term formations

t′

t′′

...t

and

s′

s′′

20 ALEXANDRU BUIUM

...s

Hence we can write a term formation

t′

t′′

...ts′

s′′

...sf(t, s, ...)

Hence f(t, s, ...) is a term.

Definition 4.7. A formula formation is a sequence

P...Q...R

of strings in L∗ such that for any string of symbols Q in the sequence (including Pand R) one of the following holds:

1) Q is of the form t = s where t, s are terms;2) Q is of the form ρ(t, s, ...) where t, s, ... are terms and ρ is a relational predicate.3) Q is preceded by Q′, Q′′ and is of the form Q′ ∧Q′′, Q′ ∨Q′′, ¬Q′, Q′ → Q′′,

Q′ ↔ Q′′.4) Q is of the form ∀xQ′ or ∃xQ′ where Q′ precedes Q.

Recall our convention that if we have a different number of symbols (writtendifferently) we make similar definitions for them; in particular some of the symbolsmay be missing altogether. E.g. if quantifiers are missing from L then we ignore 4.If equality is missing from L we ignore 1.

Definition 4.8. A string R in L∗ is called a formula if there is a formula formationthat ends with R. We denote by Lf the collection of all formulas in L∗.

Remark 4.9. Relational predicates can be unary ρ(t), binary ρ(t, s), ternaryρ(t, s, u), etc. Again, ρ(t, s) simply means a sequence of 5 symbols ρ, (, t, s, ) andnothing else. Sometimes one uses another syntax for relational predicates: insteadof ρ(t, s) one writes tρs or ρts; instead of ρ(t, s, u) one may write ρtsu, etc. All ofthis is in the language L. On the other hand if some variables x, y, ... appear in aformula P we sometimes write in metalanguage P (x, y, ...) instead of P . In particu-lar if x appears in P (there may be other variables in P as well) we sometimes writeP (x) instead of P . Formulas of the form ∀xP (equivalently ∃xP (x)) are referredto as universal formulas. Formulas of the form ∃xP (equivalently ∃xP (x)) are re-ferred to as existential formulas. Formulas of the form P → Q are referred to asconditional formulas. Formulas of the form P ↔ Q are referred to as biconditionalformulas.

A FIRST COURSE IN MATHEMATICS 21

Exercise 4.10. Give a metaproof of the following:1) if P and Q are formulas then P ∧Q, P ∨Q, ¬P , P → Q, P ↔ Q are formulas.2) if P is a formula then ∀xP and ∃xP are formulas.

Example 4.11. Assume L contains a constant c, a unary relational predicate ρ,and a unary functional predicate f . Then the following is a formula:

(∀x(f(x) = c))→ (ρ(f(x)))A formula formation for it is:

f(x) = cρ(f(x))∀x(f(x) = c)(∀x(f(x) = c))→ (ρ(f(x)))

Definition 4.12. We define the free occurrences of a variable x in a formula bythe following conditions:

1) The free occurrences of x in a formula of the form t = s are all the occurrencesof x in t together with all the occurrences of x in s;

2) The free occurrences of x in a formula ρ(t, s, ...) are all the occurrences of xin t, together with all the occurrences of x in s, etc.

3) The free occurrences of x in P ∧ Q, P ∨ Q, P → Q, P ↔ Q are the freeoccurrences of x in P together with the free occurrences of x in Q. The freeoccurrences of x in ¬P are the free occurrences of x in P .

4) No occurrence of x in ∀xP or ∃xP is free.

Definition 4.13. A variable x is free in a formula P if it has at least one freeoccurrence in P .

Example 4.14.1) x is not free in ∀y∃x(ρ(x, y)).2) x is free in (∃x(β(x))) ∨ ρ(x, a). (x has a “free occurrence” in ρ(x, a); the

“occurrence” of x in ∃x(β(x)) is not “free”.)3) x is not free in ∀x((∃x(β(x))) ∨ ρ(x, a))4) The free variables in (∀x∃y(α(x, y, z))) ∧ ∀u(β(u, y)) are z, y.

Definition 4.15. A string in L∗ is called a sentence if it is a formula (i.e. is inLf ) and has no free variables. Note that

1) Ls is contained in Lf ;2) Lf is contained in L∗;3) all terms are in L∗; no term is in Lf .

Definition 4.16. If x is a free variable in a formula P one can replace all itsfree occurrences with a term t to get a formula which can be denoted by P t

x .More generally if x, y, ... are variables and t, s, ... are terms we may replace all freeoccurrences of these variables by t, s, ... to get a formula P ts...

xy... . A more suggestive

(but less precise) notation is as follows. We write P (x) instead of P and then wewrite P (t) instead of P t

x . Similarly we write P (t, s, ...) instead of P ts...xy... We will

constantly use this P (t), P (t, s, ...), etc. notation from now on.Similarly if u is a term containing x and t is another term then one may replace

all occurrences of x in u by t to get a term which we may denote by u tx ; if we

write u(x) instead of u then we can write u(t) instead of u tx . And similarly we may

replace two variables x, y in a term u by two terms t, s to get a term u tsxy , etc. We

22 ALEXANDRU BUIUM

will not make use of this latter type of substitution in what follows (although itwill appear implicitly when we discuss tables in the section on computations).

Example 4.17. If P equals “x is a man” then x is a free variable in P . If a equals“Socrates” then P (a) equals “Socrates is a man”.

Example 4.18. If P equals “x is a man and for all x, x is mortal” then x is afree variable in P . If a equals “Socrates” then P (a) equals “Socrates is a man andfor all x, x is mortal”

Exercise 4.19. Is x a free variable in the following formulas?1) “(∀y∃x(x2 = y3)) ∧ (x is a man)”2) “∀y(x2 = y3)”Here the upper indexes 2 and 3 are unary functional predicates.

Exercise 4.20. Compute P (t) if:1) P (x) equals “∃y(y2 = x)” and “t” equals “x4”.2) P (x) equals “∃y(y poisoned x)” and “t” equals “Plato’s teacher”.

5. Semantics

We already superficially mentioned semantics. In this section we discuss, insome detail, the semantics of languages. Semantics is the analysis of meaning andmeaning comes, for us, mainly from translation. Recall:

Definition 5.1. Let L and L′ be two languages. By a translation of L in L′ weunderstand a rule that attaches to any symbol X in L a symbol X ′ in L′. Wesay that in our translation X is translated as X ′ or that X is mapped into X ′.We assume the rule “respects” the 8 types of symbols i.e. attaches variables tovariables, constants to constants, functional predicates to functional predicates,etc.

Remark 5.2. By syntactic correctness it is intuitively clear that for any formulaP in L if one replaces the symbols X in P by the symbols X ′ one obtains a formulaP ′ in L′. And the same with sentences in place of formulas. We cannot prove this(because we would need induction to prove it but again we do not even know whatproofs are at this point). We say that P is translated (or mapped) into P ′ or thatP ′ is the translation of P .

One sometimes needs a more flexible notion of translation of formulas which wenow explain. (This concept will not play a role until near the end of the course.)

Definition 5.3. Assume D(x) is a formula in L′ with one free variable x (which werefer as a “domain”). Assume we are given a translation of L into L′. Replacing allconstants and functional predicates X in L by the corresponding symbols X ′ in L′

we get a way to attach to any term t in L a term t′ in L′. We define D-translationof formulas in L into formulas in L′ by the following rules:

1) The translation (t = s)′ of t = s is t′ = s′;2) The translation (ρ(t, s, ...))′ of ρ(t, s, ...) is ρ′(t′, s′, ...).3) If P ′, Q′ are translations of formulae P,Q in L then the translation of P ∧Q,

P ∨Q, ¬P , P → Q, P ↔ Q are P ′ ∧Q′, P ′ ∨Q′, ¬P ′, P ′ → Q′, P ′ ↔ Q′

4) If P (x) is a formula in L with one free variable x with translation P ′(x)then the translation (∀xP (x))′ of ∀xP (x) is ∀x(D(x)→ P ′(x)) and the translation(∃xP (x))′ of ∃xP (x) is ∃x(D(x) ∧ P ′(x)).

A FIRST COURSE IN MATHEMATICS 23

This D-translation intuitively asks that the translated sentences “always refer”to things that have “property” (or “domain”) D.

Remark 5.4. When we later discuss proofs we will encounter another generaliza-tion of translations that will involve a conversion of text from a language L intoa text in metalanguage followed by a conversion (called disquotation) of the textin metalanguage back into a different language called argotic L; we don’t need toexplain this at this point.

Remark 5.5. Translations as above are precursors of the maps to be encounteredlater in set theory. But we do not view them as maps. Translations as above canbe called metamaps; they should be thought of as concrete objects like dictionaries,tables, or picture books designed to teach foreign languages. Or as (infinite!) tablessuch as

P P ′

Q Q′

... ...

Remark 5.6. If L is any language with connectives ∧,∨,¬,→,↔, quantifiers ∀,∃,and equality = then the “standard translation” of L into the English language isthe one specified in Example 2.1; e.g. ∀,∃ is translated as “for all, there exists” etc.Sometimes, if we translate a language L in English and we want the translation tosay that all variables in L refer to something specific in English (say to crocodiles)we use a translation where

∀ is translated as “for all crocodiles”,∃ is translated as “there exists a crocodile”.

Later we will use “sets, rings”, etc. instead of “crocodiles”.

Example 5.7. If L′ and L are English and Formal (cf. Example 2.1) then, asexplained in that Example, there is a translation of L into L′ such that the sentences1, ..., 5 in L are mapped into the sentences 1, ..., 5 in L′ respectively.

Exercise 5.8. AssumeP equals “∀x((x > 0)→ (∃y∃z∃u∃v(x = y2 + z2 + u2 + v2))”P ′ equals “any positive integer is a sum of four squares”Find languages L and L′ for the sentences P and P ′ above and find a translation

of L into L′ such that P is mapped into P ′. Hint: We do not need to know at thispoint what integers are so we need a constant to denote the collection of integersand a predicate that stands for “belongs”. By the way this sentence turns out tobe a theorem of Lagrange.

Exercise 5.9. AssumeP equals “(K(O)) ∧ (∀x(K(x)→ (x = O)))”P ′ equals “Oswald shot Kennedy and was the only person who shot Kennedy”Find languages L and L′ for the sentences P and P ′ above and find a translation

of L into L′ such that P is sent into P ′.

Definition 5.10. Let L′ be the English language and P ′ be a sentence in English.By a formalization of P ′ we mean a sentence P in a language L and a translationof L into L′ such that P is mapped into P ′, the variables, constants, and predicatesof L are denoted by single marks on paper, and the connectives, quantifiers, andequality of L are ∧,∨,¬,→,↔,∀,∃,=.

24 ALEXANDRU BUIUM

Example 5.11. In Example 2.1 the sentences 1’,...,5’ are formalizations of thesentences 1,...,5.

Exercise 5.12. Find formalizations of the following English sentences:1. “I saw a man.2. “There is no hope for those who enter this realm.”3. “There is nobody there.”4. “There were exactly two people in that garden.”

Exercise 5.13. Find formalizations of the following English sentences:1) “The movement of celestial bodies is not produced by angels pushing the bodies

in the direction of the movement but by angels pushing the bodies in a directionperpendicular to the movement”

2) “I think therefore I am”3) “Since existence is a quality and since a perfect being would not be perfect if

it lacked one quality it follows that a perfect being must exists.”4) “Since some things move and everything that moves is moved by a mover

and an infinite regress of movers is impossible it follows that there is an unmovedmover”

Hints: “but” should be thought of as “and”; “therefore” should be thought of“implies” and hence as “if...then”; “since...it follows” should be thought of, again,as “implies”.

Remark 5.14. The sentence 1 above paraphrases a statement in one of Feynman’slectures on gravity. The sentence 2 is, of course Decartes’ “cogito ergo sum”.The sentence 3 is a version of the “ontological argument” (cf. Anselm, Descartes,Leibniz, Godel for versions of this; also cf. Aquinas and Kant for refutations ofthis). The sentence 4 is a version of the “cosmological argument” (cf. Aquinas).

6. Definitions

Definition 6.1. A definition in L is a sentence of one of the forms:1) “c = t” where c is a constant and t is a term without variables.2) “∀x(ε(x) ↔ E(x))” where ε is a unary relational predicate and E is a for-

mula with one free variable. More generally for several variables, “∀x∀y(ε(x, y)↔E(x, y))” is a definition, etc.

3) “∀x∀y((y = f(x))↔ F (x, y))” where f is a unary functional predicate and Fis a formula with 2 free variables; more generally one allows several variables.

If any type of symbols is missing from the language we disallow, of course, thecorresponding definitions.

A language together with a collection of definitions is referred to as a languagewith definitions.

A definition as in 1 should be viewed as either a definition of c or a definition oft. A definition as in 2 should be viewed as a definition of ε. A definition as in 3should be viewed as a definition of f .

Remark 6.2. Given a language L with definitions and a term t without variablesone can add to the language a new constant c and one can add to the definitionsthe definition c = t. We will say that c is (a new constant) defined by c = t.

Similarly given a language L with definitions and a formula E(x) in L with onefree variable x one can add to the relational predicates of L a new unary relational

A FIRST COURSE IN MATHEMATICS 25

predicate ε and one can add to the definitions the definition ∀x(ε(x)↔ E(x)). Wewill say that ε is (a new relational predicate) defined by ∀x(ε(x)↔ E(x)).

One may introduce new functional predicates f by adding a symbol f and defi-nitions of type 3 or of type 1 (such as c = f(a, b, ...), for various constants a, b, c, ...).

Example 6.3. As an example for the introduction of a new relational predicate, saywe have a language containing a binary relational predicate ∈ and two constantsc and s (translated in English as “belongs to”, “chicken”, “space ships”). Thenone could introduce a new relational predicate ε (translated in English as “is anastrochicken”) by adding the following to the definitions:

Definition 6.4. ∀x(ε(x)↔ ((x ∈ c) ∧ (x ∈ s)))

In English the definition above would be translated as:

Definition 6.5. Something is an astrochicken if and only if it is a chicken and alsoa space ship.

The astrochicken is a concept introduced by Freeman Dyson in his lecture on“Science and science fiction”.

7. Witnesses

Let L be a language possessing quantifiers ∀,∃ (but not necessarily connectivesor equality).

Definition 7.1. A witness assignment on L is a rule that attaches to any formulaP with exactly one free variable x two constants cP and cP . The constant cP iscalled the witness for the sentence ∀xP (x); cP is called the witness for the sentence∃xP (x). If a witness assignment was fixed in L we also say L is a language withwitnesses.

Example 7.2. The symbols cP , cP are, of course, constants (or variables) in meta-

language. Symbols such as

cx=a, cx�a, c∀y((y=b)∨(y�x))

are constants in a language if the language contains x, y,=,�, a, b.

Remark 7.3. The witness terminology (and usefulness) will become clear later.But it is worth keeping in mind the following illustration of this concept. Assumethat ∀,∃ are translated in English as “for all elephants” and “there exists an ele-phant” and assume that P (x) is translated as “x is blue”. Then cP should betranslated as “the least blue elephant”; cP should be translated as “the most blueelephant”. We will see later how this works in proof theory.

Definition 7.4. Given a language L0 there is a canonical way to construct alanguage with witnesses, L, starting from L0. Indeed one first adds to L0 two

new constants cP and cP for each formula P in Lf0 with exactly one free variable.(Each cP is viewed as a new symbol; and the same with cP .) Call L1 the enlargedcollection of symbols. (Note that in case L0 is a language with definitions we DONOT add to the definitions of L0 any new definitions for the new constants!) Thenrepeat this construction with L1 in place of L0; we get a new set of symbols L2.We continue this “indefinitely”. Then we set L the collection of all symbols inL0, L1, L2, .... Then clearly L has a natural witness assignment. We call L thewitness closure of L0.

26 ALEXANDRU BUIUM

Example 7.5. Let L0 be the language with variables x, y, z, ..., no constants, nofunctional predicates, one binary relational predicate ∈, connectives ∧,∨,¬,→,↔,quantifiers ∀,∃, equality =, and typographical signs (, ), ,. Let L be the witnessclosure of L0. (This will play a role later when we introduce set theory.) Here areexamples of constants in L:

c∃y(x∈y), c∃y(x∈y), ...

cx∈c∃y(x∈y) , cx∈c∃y(x∈y), ....

The first row belongs to L1, the second to L2, etc.

Exercise 7.6. Write examples of sets belonging to L3 and L4.

Definition 7.7. Let P (x, y) be a formula with free variables x, y and let Q(x) =∀yP (x, y). If c′ is the witness for ∃xQ(x) and c′′ is the witness for ∀yP (c′, y) thenwe say that c′, c′′ are the witnesses for ∃x∀yP (x, y). One gives a similar definitionfor witnesses c′, c′′, c′′′, ... for sentences of the form ∃x∃y∀z...(...), etc.

Example 7.8. Explicitly if P (x, y) is a formula with free variables x, y the witnessesc′, c′′ for ∃x∀yP (x, y) are given by:

c∀yP (x,y), cP (c∀yP (x,y),y)

Remark 7.9. If we deal with languages with witnesses we will always tacitlyassume that all translations are compatible (in the obvious sense) with the witnessassignments. Compatibility can be typically achieved as follows: if one is givena translation of a language L0 into a language L′0 then this translation can beextended uniquely to a translation, compatible with witness assignments, of thewitness closure L of L0 into the witness closure L′ of L′0.

Remark 7.10. There is a chapter of philosophy that deals with existence; it iscalled ontology. The philosophical analysis of existence is a highly difficult task;when we say in philosophy that something exists one immediately asks: wheredoes this exists? in the physical world? in our imagination? in the realm ofpossibilities? how can one ascertain existence? etc. The concept of witness willlater give a sweeping trivial answer to existence that is completely syntactical: wehave created (or postulated) in the discussion above a constant cP for each possibleformula P (x) with exactly one free variable x; this constant will be viewed later asthe “answer to the existence problem for objects with property P”. For instancethe various sets of mathematics (such as unions, products, power sets, etc.) willbe later defined as witnesses for various corresponding sentences. We will not haveto ask (with Cantor’s critics) where do these sets exist. These sets will simplybe constants indexed by formulas, hence they will exist as signs written on paper.Ontology will be entirely replaced by syntax.

8. Tautologies

Up until now we developed an analysis of language; this is indeed the first stepto be taken in logic. The next step in logic is the analysis of inference (which isalso referred to as deduction or proof). We begin this analysis now. In order tointroduce the general notion of proof we need to first introduce tautologies; in theirturn tautologies are introduced via certain planar symbols in metalanguage calledtables.

A FIRST COURSE IN MATHEMATICS 27

Definition 8.1. Recall that T and F are two symbols in metalanguage. Recallalso that we have separators in metalanguage that are frames of tables. Usingthe above plus arbitrary constants (or variables) P and Q in metalanguage weintroduce the following strings of symbols in metalanguage (which are actuallyplanar configurations rather than linear configurations but which can obviously berearranged in the form of strings). They are referred to as the truth tables of the5 standard connectives.

P Q P ∧QT T TT F FF T FF F F

P Q P ∨QT T TT F TF T TF F F

P Q P → QT T TT F FF T TF F T

P Q P ↔ QT T TT F FF T FF F T

P ¬PT FF T

Remark 8.2. If in the tables above P is the sentence “p....” and Q is the sentence“q....” we allow ourselves, as usual, to identify the signs P,Q, P ∧ Q, etc. withthe corresponding sentences “p...”, “q...”, “(p...)∧(q...)”, etc. Also: the letters Tand F evoke “truth” and “falsehood”; but they should be viewed as devoid of anymeaning.

Fix in what follows a language L that has the 5 standard connectives ∧,∨,¬,→,↔ (but does not necessarily have quantifiers or equality).

Definition 8.3. Let P,Q, ..., R be sentences in L. By a Boolean sequence generatedby P,Q, ..., R we mean a sequence of sentences

PQ...RU......W

such that for any sentence V among U, ...,W we have that V is preceded by V ′, V ′′

(with V ′, V ′′ among P, ...,W ) and V is of one of the forms

V ′ ∧ V ′′, V ′ ∨ V ′′,¬V ′, V ′ → V ′′, V ′ ↔ V ′′

Example 8.4. The following is a Boolean sequence generated by P,Q,R:PQR¬RQ ∨ ¬RP → (Q ∨ ¬R)

28 ALEXANDRU BUIUM

P ∧R(P ∧R)↔ (P → (Q ∨ ¬R))

Example 8.5. The following is a Boolean sequence generated by P → (Q ∨ ¬R)and P ∧R:

P → (Q ∨ ¬R)P ∧R(P ∧R)↔ (P → (Q ∨ ¬R))

Remark 8.6. The same sentence may appear as the last sentence in two differentBoolean sequences; cf. the last 2 examples.

Definition 8.7. Assume we are given a Boolean sequence generated by P,Q, ..., R.For simplicity assume it is generated by P,Q,R. (When more or less than 3 gener-ators the definition is similar.) The truth table attached to this Boolean sequenceand to the fixed system of generators P,Q,R is the following string of symbols (orrather plane configuration of symbols thought of as reduced to a string of symbols):

P Q R U ... WT T T ... ... ...T T F ... ... ...F T T ... ... ...F T F ... ... ...T F T ... ... ...T F F ... ... ...F F T ... ... ...F F F ... ... ...

Note that the 3 columns of the generators consist of all 8 possible combinations of Tand F . The dotted columns correspond to the sentences other than the generatorsand are computed by the following rule. Assume V is not one of the generatorsP,Q,R and assume that all columns to the left of the column of V were computed;also assume that V is obtained from V ′ and V ′′ via some connective ∧,∨, .... Thenthe column of V is obtained from the columns of V ′ and V ′′ using the tables of thecorresponding connective ∧,∨, ... respectively.

The above rule should be viewed as a syntactic rule for metalanguage; we did notintroduce the syntax of metalanguage systematically but the above is one instancewhen we are quite precise about it.

Example 8.8. Consider the following Boolean sequence generated by P and Q:

PQ¬P¬P ∧Q

Its truth table is:P Q ¬P ¬P ∧QT T F FT F F FF T T TF F T F

A FIRST COURSE IN MATHEMATICS 29

Note that the generators P and Q are morally considered “independent” (in thesense that all 4 possible combinations of T and F are being considered for them);this is in spite of the fact that actually P and Q maybe, for instance a = b and¬(a = b) respectively.

Definition 8.9. A sentence S is a tautology if one can find a Boolean sequencegenerated by some sentences P,Q, ..., R such that

1) The last sentence in the sequence is S2) The truth table attached to the sequence and the generators P,Q, ..., R has

only T s in the S column.

Remark 8.10. We do not ask, in the definition above, that for any Boolean se-quence ending in S and for any generators the last column of the truth table haveonly T s; we only ask this to hold for one Boolean sequence and one system ofgenerators.

Example 8.11. P ∨ ¬P is a tautology. To metaprove this consider the Booleansequence generated by P ,

P¬PP ∨ ¬P

Its truth table is (check!):

P ¬P P ∨ ¬PT F TF T T

This ends out metaproof of the metasentece P ∨ ¬P is a tautology.Remark that if we view the same Boolean sequence

P¬PP ∨ ¬P

as a Boolean sequence generated by P and ¬P the corresponding truth table is

P ¬P P ∨ ¬PT T TT F TF T TF F F

and the last column in the latter table does not consist of T s only. This does notchange the fact that P ∨¬P is a tautology. Morally, in this latter computation wehad to treat P and ¬P as “independent”; this is not a mistake but rather a poorchoice in trying to prove that P ∨ ¬P is a tautology.

Example 8.12. (P ∧ (P → Q))→ Q is a tautology; it is called modus ponens. Tometaprove this consider the following Boolean sequence generated by P,Q,R:

PQP → QP ∧ (P → Q)(P ∧ (P → Q))→ Q

30 ALEXANDRU BUIUM

Its truth table is:

P Q P → Q P ∧ (P → Q) ST T T T TT F F F TF T T F TF F T F T

Exercise 8.13. Explain how the table above was computed.

Exercise 8.14. Give a metaproof of the fact that each of the sentences below is atautology:

1) (P → Q)↔ (¬P ∨Q)2) (P ↔ Q)↔ ((P → Q) ∧ (Q→ P ))

Exercise 8.15. Give a metaproof of the fact that each of the sentences below is atautology:

1) (P ∧Q)→ P2) P → (P ∨Q)3) ((P ∧Q) ∧R)↔ (P ∧ (Q ∧R))4) (P ∧Q)↔ (Q ∧ P )5) (P ∧ (Q ∨R))↔ ((P ∧Q) ∨ (P ∧R))6) (P ∨ (Q ∧R))↔ ((P ∨Q) ∧ (P ∨R))

Definition 8.16.1) Q→ P is called the converse of P → Q2) ¬Q→ ¬P is called the contrapositive of P → Q

Exercise 8.17. Give a metaproof of the fact that each of the sentences below is atautology:

1) ((P ∨Q) ∧ (¬P ))→ Q (moduls ponens, variant)2) (P → Q)↔ (¬Q→ ¬P ) (contrapositive argument)3) (¬(P ∧Q))↔ (¬P ∨ ¬Q) (de Morgan law)4) (¬(P ∨Q))↔ (¬P ∧ ¬Q) (de Morgan law)5) ((P → R) ∧ (Q→ R))→ ((P ∨Q)→ R) (case by case argument).6) (¬(P → Q))↔ (P ∧ ¬Q) (negation of an implication)7) (¬(P ↔ Q))↔ ((P ∧ ¬Q) ∨ (Q ∧ ¬P )) (negation of an equivalence)

Remark 8.18. 2) in Exercise 8.17 says that the contrapositive of an implicationis equivalent to the original implication.

Exercise 8.19. (VERY IMPORTANT) Give a metaproof of the fact that thesentence

(P → Q)↔ (Q→ P )

is not a tautology. In other words the converse of an implication is not equivalentto the original implication. “If c a human then c mortal” is not equivalent to “If cis mortal then c is a human”.

Definition 8.20. A sentence P is a contradiction if ¬P is a tautology.

Exercise 8.21. Give a metaproof of the fact that for any P the sentence P ∨ ¬Pis a tautology and the sentence P ∧ ¬P is a contradiction.

A FIRST COURSE IN MATHEMATICS 31

Exercise 8.22. Formalize the following English sentences, negate their formaliza-tion, and give the English translation of these formalized negations:

1) If Plato eats this nut then Plato is a bird.2) Plato is a bird if and only if he eats this nut.

9. Theories

Recall that for a language L equipped with the 5 standard connectives we defineda collection of sentences in L called tautologies; cf. Definition 8.9. In what follows(essentially throughout the course) we assume that L is a language equipped withthe 5 standard connectives ∧,∨,¬,→,↔, quantifiers ∀,∃, and equality = and weassume L has definitions and witnesses.

Definition 9.1. A quantifier axiom is a sentence of one of the following forms. Inwhat follows P runs through all the formulas P (x) with a free variable x and t runsthrough the terms without variables.

P (cP )→ (∀xP (x))(∀xP (x))→ P (t)P (t)→ (∃xP (x))(∃xP (x))→ P (cP )

One can write the above sentences in the form of a sequence:

P (cP )→ (∀xP (x))→ P (t)→ (∃xP (x))→ P (cP )

Remark 9.2. To understand the intuitive content of the quantifier axioms involv-ing witnesses let us look again at an example considered earlier. So assume that∀,∃ are translated in English as “for all elephants” and “there exists an elephant”and assume that P (x) is translated as “x is blue”. Let cP be translated as “theleast blue elephant”; and let cP be translated as “the most blue elephant”. Then theaxiom P (cP ) → (∀xP (x)) is translated as “if the least blue elephant is blue thenall elephants are blue”. Also the axiom ∃xP (x) → P (cP ) is translated as “if thereexists a blue elephant then the most blue elephant is blue”. These translations arereasonable and could be kept in the back of our minds when dealing with witnesses;but the correct way to think about witnesses is to completely forget about theirtranslation into natural languages.

Definition 9.3. An equality axiom is a sentence of one of the following forms. Inwhat follows t, s, u run through all (n-tuples of) without variables, f runs throughall n-ary functional predicates, and P (x) runs through all formulas with free (n-tuples of) variables x.

t = t(t = s)↔ (s = t)((t = s) ∧ (s = u))→ (t = u)(t = s)→ (f(t) = f(s))(t = s)→ (P (t)↔ P (s))

It is convenient to collect some of the definitions above into one definition:

Definition 9.4. Let L be a language with definitions and witnesses. By a back-ground axiom we understand a sentence which is either a tautology, or a definition,or a quantifier axiom, or an equality axiom.

32 ALEXANDRU BUIUM

Definition 9.5. A collection T of sentences in L is a theory if and only if it satisfiesthe following conditions:

1) T contains all background axioms;2’) (Closure under conjunction) If P,Q are in T then P ∧Q is in T;2”) (Closure under modus ponens) If P and P → Q are in T then Q is in T.The sentences in a theory are called theorems (in that theory).

Remark 9.6. A proposition is a theorem which is less “important” in the theory. Alemma is a theorem which “helps prove” another theorem. One usually reserves theword theorem for the most important sentences that belong to a theory. For a whilewe will not make any distinction between theorems, propositions, and lemmas. Butlater, in mathematics, we will start making this distinction.

Definition 9.7. If we fix a collection of sentences A,B, ... and we refer to them asspecific axioms then by an axiom we will understand either a background axiom ora specific axiom.

Definition 9.8. Fix a collection of specific axioms A,B, .... Let U be a sentence.By a proof (or a derivation) of U fromA,B, ... we understand a sequence of sentences

PQR......U

with the following property. Let S be one of P,Q,R, .... Then one of the followingholds:

1) S is an axiom.2) S is preceded in the list by sentences S′ and S′′ such that S is S′ ∧ S′′.3) S is preceded in the proof by sentences S′ and S′ → S.

Exercise 9.9. Fix a collection of specific axioms A,B, .... Give a metaproof of thefact that a sentence U has a proof from A,B, ... if and only if there is a sequenceof sentences

PQR......U

with the following property. Let S be one of P,Q,R, .... Then one of the followingholds:

1) S is an axiom.2) S is preceded in the list by (not necessarily different) sentences S′ and S′′

such that (S′ ∧ S′′) → S is an axiom. (We then say S is inferred from S′ and S′′

or that S follows from S′ and S′′.)Sequences as above will still be referred to as proofs.

Definition 9.10. Fix again specific axioms A,B, .... The collection of all sentencesU for which there is a proof of U from A,B, ... is a theory (see the Exercise below)

A FIRST COURSE IN MATHEMATICS 33

and is called the theory with (or generated by the) specific axioms A,B, ... and isdenoted by T(A,B, ...).

Exercise 9.11. Metaprove that T(A,B, ...) is indeed a theory.

Exercise 9.12. Metaprove that any theory is of the form T(A,B, ...).

Remark 9.13. If one removes the background axioms from a system of specificaxioms the theory does not change. So we may (and will) assume our systems ofspecific axioms do not contain background axioms.

Remark 9.14. Two different systems of specific axioms A,B, ... and A′, B′, ... cangenerate the same theory: i.e. T(A,B, ...) may coincide with T(A′, B′, ...).

Exercise 9.15. Let T be a theory in L with specific axioms A,B, .... Explain whyany background axiom and any specific axiom is a theorem.

Exercise 9.16. Assume we have a theory T(A,B, ...) and a sentence U . Metaprovethat U is a theorem in T(A,B, ...) if and only if we have a sequence of sentences

PQR......U

with the following property. Let S be one of P,Q,R, .... Then one of the followingholds:

1) S is a theorem in T(A,B, ...);2) S is preceded in the list by (not necessarily different) sentences S′ and S′′

such that (S′ ∧ S′′)→ S is a theorem in T(A,B, ...).Sequences as above will still be referred to as proofs. In particular this says that

we can use previously proved theorems to prove new theorems.

Remark 9.17. If the theory has, among its theorems (in particular among itsaxioms), the sentence ∃xP (x) then note that the sentence P (cP ) is a theorem. Werefer to cP as the witness for the theorem ∃xP (x). More generally if c′, c′′, c′′′, ...are the witnesses for the sentence ∃x∃y∃z...P (x, y, z, ...) and the latter sentence isa theorem then P (c′, c′′, c′′′, ...) is a theorem.

Exercise 9.18. Metaprove the latter assertion.

Remark 9.19. (Important) If the language L is the witness closure of a languageL0 then the only theorems in Ls that we are morally interested in are the onesthat already belong to Ls0. So, in some sense, one views L as an artificial languagecreated by introducing artificial witnesses. The role of L is to allow one to definethe notion of theory/theorem; but what one is really interested in are the theoremsthat do not contain the artificially introduced witnesses.

Exercise 9.20. Metaprove the above.

Exercise 9.21. Explain why if one erases the last sentence in a proof one obtainsa proof of the next to the last sentence in the original proof.

34 ALEXANDRU BUIUM

Exercise 9.22. Explain why if two of the sentences in a proof have the formS′ → S′′ and S′′ → S′′′ respectively adding S′ → S′′′ to the proof yields a proof.

Hint: ((S′ → S′′) ∧ (S′′ → S′′′))→ (S′ → S′′′) is a tautology in S′, S′′, S′′′.

Exercise 9.23. Explain why if the sentences S′, S′′ appear in a proof then addingS′ ∧ S′′ to the proof yields a proof.

Exercise 9.24. Explain why if S′ appears in a proof and S′ → S is a tautologythen adding S to the proof yields a proof.

Exercise 9.25. Explain why if S′ and S′ → S appear in the proof then adding Sto the proof yields a proof. (Use modus ponens.)

Exercise 9.26. Explain why if P → R and Q→ R appear in a proof then adding(P ∨Q)→ R to the proof yields a proof. (Use the case by case argument.)

Definition 9.27. A theory T is complete if for any sentence P either P is a theoremor ¬P is a theorem; i.e. either P is in T or ¬P is in T.

Definition 9.28. A theory is inconsistent if there is sentence P such that both Pand ¬P are theorems. A theory is consistent if it is not inconsistent; i.e. if for anysentence P either P is not a theorem or ¬P is not a theorem.

Remark 9.29. There are very few examples of theories which can be shown to beconsistent/complete. Dealing with completeness/consistency requires reformulatingthe problem within mathematics (i.e. in mathematical logic) where more tools areavailable. See the last section of the course.

Remark 9.30. Let T be a theory in a language L. So T is contained in Ls. LetF be the collection of all sentences P in Ls such that ¬P is in T. Finally let U bethe collection of all sentences that are neither in T nor in F. There are 3 possiblecases:

1) T is inconsistent (and in particular complete). In this case both T and Fcoincide with Ls and U contains no sentence.

2) T is consistent and incomplete. In this case any sentence is exactly in one ofT,F,U and U contains at least one sentence.

3) T is consistent and complete. In this case any sentence is exactly in one ofT,F hence U contains no sentence.

In case 3 note that we may define truth/falsehood of sentences syntactically asfollows: a sentence P is true if it is in T; and it is false if it is in F. This definitionis a definition in metalanguage (not in a language) and implies that a sentence iseither true or false and cannot be both true and false.

However, most theories that one is interested in (in particular set theory itself)are either in situation 1 or 2. More exactly they are in the following situation:there is no known metaproof that they are consistent; and one can metaprove thatif they are consistent then they are incomplete. (Actually one can even prove asentence in mathematics “expressing” the metasentece above.) So in these theorieswe have no reasonable definition of truth/falsehood.

Due to the discussion above it is safer and simpler to NOT define truth/falsehood;this is the way we follow in this course.

Exercise 9.31. Let T be a theory and let F be the collection of all sentences Psuch that ¬P is in T. Show that if P and Q are in T then P ∧Q is in T; and if at

A FIRST COURSE IN MATHEMATICS 35

least one of P or Q is in F then P ∧Q is in F. Also if exactly one of P and Q is inU then P ∧Q is in U. And if both P and Q are in U then P ∧Q is either in F orin U. The preceding utterances are all metasentences and can be summarized in aT/F/U “table” as follows:

P Q P ∧QT T TT F FF T FF F FU T UU F UT U UF U UU U F or U

The top of this table is entirely analogous to the T/F table of ∧. Show that theT/F tables of ∨,→,↔,¬ have similar analogous T/F/U tables. Note the followingdifference of status between the T/F tables and the T/F/U tables: the T/F tablesare just some meaningless graphical signs in metalanguage; they helped us defineproofs and theories; the T/F/U tables, being notation for metasentences, have ameaning.

Exercise 9.32. Assume neither P nor ¬P is in the theory T(A,B, ...). Give ametaproof that both T(P,A,B, ...) and T(¬P,A,B, ...) are consistent. Hint: As-sume for instance that T(P,A,B, ...) is inconsistent. LetQR...C ∧ ¬C

be a derivation of C ∧ ¬C from P,A,B, .... Then one notes thatAB...P → QP → R...P → (C ∧ ¬C)(C ∨ ¬C)→ ¬PC ∨ ¬C¬P

is a derivation of ¬P from A,B, ... so ¬P is in T(A,B, ...), a contradiction.

Exercise 9.33. Metaprove that if a consistent theory T is not complete then thereexist at least two consistent theories containing T and not coinciding with T.

Definition 9.34. A theory T is maximal if it is consistent and if any consistenttheory containing T coincides with T.

Exercise 9.35. Metaprove that a consistent theory is complete if and only if it ismaximal.

36 ALEXANDRU BUIUM

Remark 9.36. One can ask if any consistent theory is contained in a completetheory. The mirror of this in mathematical logic is a theorem (a consequence ofwhat is called Zorn’s lemma).

10. Proofs

Throughout this section we fix a language L as in the previous section (i.e.equipped with the standard 5 connectives ∧,∨,¬,→,↔, quantifiers ∀,∃, equality=, definitions, and witnesses) and we also fix a theory T in L with (specific) axiomsA,B, .... Recall the definition of a proof (Definition 9.8). In this section we wantto discuss this concept in some detail, and to give the first examples of proofs.

In order to make proofs more intelligible one usually uses labels and for each lineone indicates in parentheses how the line was derived and one uses the followingformat:

Theorem 10.1. U .

Proof.1. P (by...)2. Q (by...)3. R (by...)......635. U (by...)

The above should be viewed as a combination between the metasentence:

U is a theorem with proof: P,Q,R, ..., U

and a metaproof of it (which consists of the explanations (by....),(by...),...).

Here is an example of a theorem and a proof. It is a theorem in any theory. It isreferred to as the quantified modus ponens. Let P (x) and Q(x) be formulas withone free variable x; for each such pair of formulas we have the following:

Theorem 10.2. ((∃xP (x)) ∧ (∀x(P (x)→ Q(x))))→ (∃xQ(x)).

Proof.1. (∃xP (x))→ P (cP ) (quantifier axiom)2. (∀x(P (x)→ Q(x)))→ (P (cP )→ Q(cP )) (quantifier axiom)3. ((∃xP (x))∧ (∀x(P (x)→ Q(x))))→ (P (cP )∧ (P (cP )→ Q(cP ))) (by 1 and 2)4. ((P (cP ) ∧ (P (cP )→ Q(cP ))))→ Q(cP ) (tautology: modus ponens)5. Q(cP )→ ∃xQ(x) (quantifier axiom)6. ((∃xP (x)) ∧ (∀x(P (x)→ Q(x))))→ (∃xQ(x))) (by 3, 4, 5) �

Exercise 10.3. Explain each step in the above proof.

Example 10.4. Theorem 10.2 should be understood as an “infinite” collectionof theorems: for each choice of language and each choice of P (x) and Q(x) onehas a specific theorem (and a specific proof). For instance say that L contains abinary relational predicates †,� and c, b are constants. Then one has the followingtheorem:

((∃x(x † b)) ∧ (∀x((x † b)→ (x�c))))→ (∃x(x�c)).

The proof of Theorem 10.2 becomes, in this setting:

A FIRST COURSE IN MATHEMATICS 37

1. (∃(x † b))→ (cx†b † b)2. (∀x((x † b)→ (x�c)))→ ((cx†b † b)→ (cx†b�c))etc.

Here is another easy example of a theorem which is, again, a theorem in anytheory. Let P (x) and Q(x) be formulas with one free variable x; for each suchformula we have the following:

Theorem 10.5. (∀x(P (x) ∧Q(x)))→ (∀xP (x))

Proof.1. (∀x(P (x) ∧Q(x)))→ (P (cP ) ∧Q(cP )) (quantifier axiom)2. (P (cP ) ∧Q(cP ))→ P (cP ) (tautology)3. P (cP )→ (∀xP (x)) (quantifier axiom)4. (∀x(P (x) ∧Q(x)))→ (∀xP (x)) (by 1,2,3)

Exercise 10.6. Explain each step in the above proof.

The following theorem is, again, a theorem in any theory. Let R(x, y) be aformula with two free variables x, y; for each such formula we have the following:

Theorem 10.7. (∃x∀yR(x, y))→ (∀y∃xR(x, y))

Proof.1. (∃x∀yR(x, y))→ (∀yR(c∀yR(x,y), y)) (quantifier axiom)2. (∀yR(c∀yR(x,y), y))→ R(c∀yR(x,y), c∃xR(x,y)) (quantifier axiom)

3. R(c∀yR(x,y), c∃xR(x,y))→ (∃xR(x, c∃xR(x,y))) (quantifier axiom)4. (∃xR(x, c∃xR(x,y)))→ (∀y∃xR(x, y)) (quantifier axiom)5. (∃x∀yR(x, y))→ (∀y∃xR(x, y)) (tautology plus 1, 2, 3, 4)

Remark 10.8. The converse of Theorem 10.7 is seldom a theorem (unless ourtheory is inconsistent). Here is an example in English that suggests this. TranslateR(x, y) as “x is the father of y”. Then Theorem 10.7 is translated as “if thereis a person who is the father of everybody then everybody has a father”. But theconverse of Theorem 10.7 is translated as “if everybody has a father then there is aperson who is everybody’s father”.

Exercise 10.9. Prove the following sentences:1) (∀x∀yR(x, y))↔ (∀y∀xR(x, y))2) (∃x∃yR(x, y))↔ (∃y∃xR(x, y))3) (∀xP (x))→ (∃xP (x))

The following theorem is, again, a theorem in any theory. Let P (x) be a formulawith one free variable x; for each such formula we have the following:

Theorem 10.10. (¬(∀xP (x)))↔ (∃x(¬P (x))).

Proof.1. (∀xP (x))↔ P (cP ) (quantifier axiom)2. (¬(∀xP (x)))↔ (¬P (cP )) (by 1)3. (∃x(¬P (x))↔ (¬P (c¬P )) (quantifier axiom)4. P (cP )→ P (c¬P ) (quantifier axiom)5. (¬P (c¬P ))→ (¬P (cP )) (by 4)

38 ALEXANDRU BUIUM

6. (¬P (cP ))→ (¬P (c¬P )) (quantifier axiom)7. (¬P (cP ))↔ (¬P (c¬P )) (by 5 and 6)8. (¬(∀xP (x)))↔ (∃x(¬P (x))) (by 2, 3, 7)

Exercise 10.11. Explain each of the steps of the proof above.

Exercise 10.12. Prove the following sentences:1) (¬(∃xP (x)))↔ (∀x(¬P (x)))2) (¬(∀x∃yP (x, y)))↔ (∃x∀y(¬P (x, y)))

Remark 10.13. It should be clear what the rule is to negate sentences that startwith more quantifiers: negation changes ∀ into ∃, changes ∃ into ∀ and changes Pinto ¬P . E.g.

¬(∀x∃y∃z∀u∃vP (x, y, z, u, v))↔ (∃x∀y∀z∃u∀v(¬P (x, y, z, u, v)))

Example 10.14. We have

¬((∀x∃yP (x, y))→ (∃z∀u∀vR(z, u, v)))↔ ¬(¬(∀x∃yP (x, y)) ∨ (∃z∀u∀vR(z, u, v)))↔ (∀x∃yP (x, y)) ∧ ¬(∃z∀u∀vR(z, u, v)))↔ (∀x∃yP (x, y)) ∧ (∀z∃u∃v(¬R(z, u, v)))

Exercise 10.15. Explain all the steps in the above example.

Exercise 10.16. Prove the following sentences:1) (∀x(P (x) ∧Q(x)))↔ ((∀xP (x)) ∧ (∀xQ(x)))2) (∃x(P (x) ∨Q(x)))↔ ((∃xP (x)) ∨ (∃xQ(x)))

Exercise 10.17. Let f and g be two unary functional predicates. Prove the follow-ing sentences. These sentences correspond to what later in set theory will read: thecomposition of two surjections, respectively injections is a surjection, respectivelyan injection.

1) ((∀z∃y(f(x) = y)) ∧ (∀y∃x(g(x) = y)))→ (∀z∃x(f(g(x)) = z))2) ((∀x∀y((f(x) = f(y)) → (x = y))) ∧ (∀x∀y((g(x) = g(y)) → (x = y)))) →

((∀x∀y((f(g(x)) = f(g(y)))→ (x = y))

Exercise 10.18. Prove the following sentence:(¬(∃xP (x)))→ (∀x(P (x)→ Q(x)))

The sentence intuitively says that things that don’t exist have all the conceivableproperties. E.g. since unicorn don’t exist any unicorn is 10 feet long. Here P (x)stands for x is a unicorn and Q(x) stands for x is 10 feet long.

In the next theorem P is a sentence and Q(x) is a formula with free variable x.

Theorem 10.19. (∀x(P → Q(x)))→ (P → (∀xQ(x)))

Proof.1. (∀x(P → Q(x)))→ (P → Q(cQ)) (quantifier axiom)2. Q(cQ)↔ (∀xQ(x)) (quantifier axiom)3. (P → Q(cQ))↔ (P → (∀xQ(x)) (by 2)4. (∀x(P → Q(x)))→ (P → (∀xQ(x))) (by 1 and 3) �

Remark 10.20. Consider the following sentences:

A) (∀xP (x))→ P (t)→ (∃xP (x))→ P (cP )

A FIRST COURSE IN MATHEMATICS 39

B) (¬(∀xP (x)))↔ (∃x(¬P (x))).C) (∀x(P → Q(x)))→ (P → (∀xQ(x)))

A represents the quantifier axioms minus the one involving cP . B and C are Theo-rems 10.10 and 10.19 respectively. Conversely if one defines the witness assignmentsuch that cP is c¬P then the quantifier axioms follow from A and B. Note that thesentences A (with the axiom on cP removed), B, and C are taken as the quantifieraxioms in Manin’s book on Mathematical Logic; the moral of the discussion is thatour quantifier axioms (in a language in which cP coincides with cP ) are equivalentto Manin’s quantifier axioms plus the axiom on cP . Roughly, modulo witnesses,the two systems are equivalent.

Exercise 10.21. Let P (x) be a formula with free variable x and let Q be a sentence.Prove the following sentence:

(∀x(P (x)→ Q))↔ ((∃xP (x))→ Q)

Remark 10.22. Let L be a language (with witnesses), T a theory with axiomsA,B, .... For a sentence U consider the following statements:

1) ¬U is not a theorem.2) U is a theorem; i.e. there is a proof of U . Here we recall that the correctness

of a proof can be checked by a machine.3) There is an algorithm that guarantees that a given machine will decide (at

some unspecified point in time) whether or not there is a proof for U .4) There is an algorithm that guarantees that a given machine will decide (at

some unspecified point in time) whether or not there is a proof for U and will printthat proof in case there is one.

5) There is an algorithm that guarantees that a given machine will decide (withina known interval of time) whether there is a proof for U and will print such a proofif there is one.

Assertion 2 implies 1 (under the assumption of consistency). Also 5 implies4. Also 4 implies 3. Finally 1 and 3 imply 2. The above shows that there is awhole hierarchy of concepts related to proof: morally 1 is the weakest and 5 isthe strongest. However 5 is essentially never the case, while 1 is essentially nevercheckable. Among the concepts above the universally accepted standard of “proof”is provided by 2 which we have therefore called proof.

Remark 10.23. Given a language L we recall that it does not make sense to saythat a sentence in L is true (or false). However one sometimes abusively uses theword truth in relation to sentences of L as follows. If P is a sentence in L then bythe metasentence

(We) prove that P is true

we mean

(We) prove P

Furthermore the metasentence

(We) prove that P is false

means

(We) prove ¬PAs a variant, the metasentece

40 ALEXANDRU BUIUM

(We) give an example of an x such that P (x) is false

means

(We) prove ∃x(¬P (x))

A.s.o.

Remark 10.24. If one is given a translation of L into L′ (as usual, compatiblewith witness assignments) then any proof in L yields, in a natural way, a proof inL′. In particular if U is a theorem in L then the corresponding sentence U ′ in L′ isa theorem in L′.

Exercise 10.25. Formalize the following English sentences, negate their formal-ization, and give the English translation of these formalized negations:

1) If Plato eats at least one nut then Plato is a bird.2) Plato is a bird if and only if he eats at least one nut.3) For any planet there is a sun such that the planet revolves around that sun.4) There exists a sun with no planets revolving around it.

11. Argot

Given a language L and a translation of L into the English language LEng onemay construct a new language Largot, called argotic L (or simply argot, or slang) byreplacing some of the symbols in L by their corresponding symbols in LEng. Thenthere is a natural translation of L into Largot. When we write proofs we use howevera more flexible version of this translation which deserves, maybe, a different namebut which for simplicity we call still call translation. In this more flexible translationsentences may change more dramatically than in usual translation; new sentencesmay actually appear; others may disappear. Everything is subject to rules butrather than spelling out these rules we learn argot and the more flexible translationby example. The above type of translation has two steps: the first (which we alreadyperformed in the previous section) is to convert proofs from a sequence of sentencesin L into a sequence of metasentences containing the sentences themselves plus anexplanation of the inference process; the second step is to convert this sequence ofmetasentences back into a sequence of sentences in argotic L via a process involvingdisquotation (removing quotation marks).

Let us see how this works. As before we fix a theory T in L with axioms A,B, ....

Remark 11.1. Start with a proof:

PQR......U

Such a proof can be viewed (via disquotation) as a text in the language L. (Recallthat disquotation means replacing P,Q, ... by the sentences in L they stand for, withquotation marks deleted). Let us view above however as a text in metalanguage.As already noted such a proof can be made more intelligible by adding labels1, 2, 3, ... and explanations in parentheses showing how each sentence follows fromother sentences; the proof will be then a text in metalanguage that could look like:

A FIRST COURSE IN MATHEMATICS 41

1. P (tautology)2. Q (by axiom E and 1)3. R (by 1 and 2)........66. U (by axiom B and 3)

The above proof can be rewritten as a text in metalanguage:

Proof. From E and since P we get that Q. Since P and since Q it follows thatR........ By axiom B and since R it follows that U . �

We now apply disquotation to the latter text, i.e. replace the symbols E,P,Q, ...by the sentences in L that they stand for without quotation marks; in additionone is allowed to further replace some (or even all) of the symbols appearing inE,P,Q, ... by corresponding words in English; e.g. if

E equals “∀x∃y(x♥y)”

then the proof above becomes a text in argotic L which could look like:

Proof. By axiom [number or name of the axiom] for all x there is a y such thatx♥y; and since .... and ... it follows that ...

The process described above should be viewed as a (generalized concept of)translation of a proof from L into argotic L.

Remark 11.2. As explained in Remark 10.23 one sometimes abusively introducesin metasentences the empty words true, false in reference to sentences. This canbe done in argot as well: one may replace the words “we get” by “we get that ...is true” or “we get that ... holds”, etc. Then the translation of the above proof inargot reads:

Proof. From axiom E and since P we get that Q is true. Since P and since Q istrue it follows that R holds........ By axiom B and since R holds it follows that Uis true. �

We prefer not to do so but one encounters this abuse in mathematical texts quiteoften.

Example 11.3. The statement and proof of Theorem 10.2 can be translated intoargot as follows:

Theorem 11.4. If there exists x such that P (x) and if we know that, for all x,P (x) implies Q(x) it follows that there exists an x such that Q(x).

Proof. We know there exists c′ such that P (c′). Also we know that for allconstants c we have P (c) → Q(c). In particular for our c′ we have P (c′) → Q(c′).Since P (c′) it follows that Q(c′). This ends the proof. �

Note that c′ in the above proof plays the role of cP in the original proof.

Example 11.5. The statement and proof of Theorem 10.5 can be translated intoargot as follows:

Theorem 11.6. Assume that, for all x, P (x) ∧Q(x). Then for all x, P (x).

42 ALEXANDRU BUIUM

Proof. Let c be a constant. We know P (c) ∧ Q(c). In particular P (c). Since cwas arbitrary the conclusion follows. �

Note that unlike with cP (whose role was played in the previous example by a“specific” constant c′) the role of cP is played by an “arbitrary” constant c. Thewords “specific” and “arbitrary” are , of course, vague. Vagueness is inherent inargot.

Exercise 11.7. Translate the statement and proof of Theorem 10.10 into argot.

Remark 11.8. If U is of the form H → C (in which case H is called hypothesisand C is called conclusion) then the sentence U is translated into argot as If Hthen C or Assume H; then C. So

Theorem. H → C

is translated into argot as either of the following:

Theorem. If H then CTheorem. Assume H; then C

Similarly

Theorem. H ↔ C

is translated into argot as:

Theorem. H if and only if C

Remark 11.9. As with theorems and their proofs, definitions too can be expressedin argot. We will do this on a regular basis.

Remark 11.10. Formally metalanguage and argot are similar: they both involvesymbols from the original language plus English words. But metalanguage andargot should not be confused. Metalanguage has a meaning that we care about(coming from the fact that it is “about” the syntax, semantics, and proofs of sen-tences; most of the text in this course is written in metalanguage so we betterrecognize its meaning). On the contrary argot has a meaning that we ignore; intu-itively argot seems to be “about” the constants and variables of the language butit is, in fact, “about nothing”. One way to distinguish formally between metalan-guage and argot is to enforce the following rule: symbols in the original languageappear BETWEEN quotation marks in metalanguage and WITHOUT quotationmarks in argot.

12. Strategies

We discuss a series of standard strategies to prove theorems.As before we fix a theory T in L with specific axioms A,B, ....

Example 12.1. There are two strategies to prove a theorem of the form H → C:“direct proof” and “proof by contradiction”. A direct proof (in the form of sequenceof sentences) may look like this:

1. H → P (by axiom B)2. Q (by axiom A and 1)3. H → R (by 1 and 2)........

A FIRST COURSE IN MATHEMATICS 43

835. H → C (by axiom S and 3)

One usually translats the above proof into argot as follows:

Proof. Assume H. By axiom B we get P . By axiom A it follows that Q. Fromthe latter and by P we get R...... By axiom S and since R it follows that C. �

A proof of H → C by contradiction may look like this:

1. (¬C ∧H)→ P (by axiom S)2. Q (by axiom V and 1)3. (¬C ∧H)→ R (by 1 and 2).... ....691. (¬C ∧H)→ (A ∧ ¬A) (by 357 and 76)692. ¬(¬C ∧H) (by 691)693. H → C (by 692)

One usually translats a proof as above into argot as follows.

Proof. Assume ¬C and H and seek a contradiction. By axiom S we get P . Byaxiom V and P it follows that Q ..... By [here one needs to explicitly state 357 and76] we get A ∧ ¬A, a contradiction. This ends the proof. �

Exercise 12.2. Explain what was used in step 692 above.

Example 12.3. Direct proofs and proofs by contradiction can be given to sentencesU which are not necessarily of the form H → C. A direct proof of U is just a proofthat ends with U . A proof of U by contradiction could proceed as follows:

1. ¬U → P (by axiom A)2. P → Q (tautology)....99. ¬U → D ∧ ¬D (by 1 and 98)100. (¬D ∨D)→ U (by 99)101. U (by 100 and 3)

In argot:

Proof. Assume ¬U and seek a contradiction. Then by axiom A we get P . HenceQ.....Hence D ∧ ¬D, a contradiction. This ends the proof. �

Example 12.4. Here is a strategy to prove a theorem of the form (H ′∨H ′′)→ C;this is called a case by case proof. It may run as follows:

1. H ′ → P (by axiom A)2. H ′′ → Q (by axiom B)3. H ′ ∨H ′′ → P ∨Q (by 1 and 2)........76. P → C (by 55 and 56)77. Q→ C (by 64)78. P ∨Q→ C (by 76 and 77)79. H ′ ∨H ′′ → C (by 3 and 78)

One usually translates a proof as above into argot as follows; note that thistranslation of the proof completely changes the order of the various steps.

44 ALEXANDRU BUIUM

Proof. There are two cases: Case 1 is H ′; Case 2 is H ′′. Assume first that H ′.Then by axiom A we get P ..... By [here one needs to explicitly state 55 and 56]and since P we get C. Now assume that H ′′. By axiom B it follows that Q...... By[here one needs to explicitly state 64] and Q we get C. So in either case we get C.This ends the proof. �

Direct proofs, proofs by contradiction, and case by case proofs can be combined;we shall see this in examples. Here is a generic example:

Example 12.5. A proof of U by contradiction plus case by case may look asfollows. (In the proof below A is an arbitrary sentence. Deciding what A to use isa matter of art rather than science!)

1. (¬U ∧A)→ B (an axiom)2. B → C (by 1 and axiom...)...33. S → (Q ∧ ¬Q) (by Axiom ... and 4,32)34. (¬U ∧A)→ (Q ∧ ¬Q) (by 1,...,33)35. (¬Q ∨Q)→ (U ∨ ¬A) (by 34)36. ¬Q ∨Q (tautology)37. U ∨ ¬A (by 35, 36)38. (¬U ∧ ¬A)→ B′ (an axiom)39. B′ → C ′ (by ... and axiom...)...100. S′ → (Q′ ∧ ¬Q′) (by Axiom ... and ...)101. (¬U ∧ ¬A)→ (Q′ ∧ ¬Q′) (by 38,...,100)102. (¬Q′ ∨Q′)→ (U ∨A) (by 101)103. ¬Q′ ∨Q′ (tautology)104. U ∨A (by 102, 103)105. (U ∨ ¬A) ∧ (U ∨A) (by 37, 104)106. U (by 105)

One usually translates a proof as above into argot as follows.

Proof. Assume ¬U and seek a contradiction. There are two cases: either A or¬A. Assume first A. Then, since ¬U and A, by..., we get B. Hence by ... we getC..... Hence by ... we get S. Hence Q ∧ (¬Q), a contradiction. Now assume ¬A.Then, since ¬U and ¬A we get B′. Hence C ′..... Hence S′. Hence Q′∧ (¬Q′) whichis again a contradiction. This ends the proof.

Exercise 12.6. Explain each step in the above formal proof.

Example 12.7. In order to prove a theorem U of the form P ↔ Q one may precedeas follows:

1. P → S (by...)2. S → Q (by...)3. P → Q (by 1 and 2)4. Q→ T (by...)5. T → P (by...)6. Q→ P (by 4 and 5)7. P ↔ Q (by 3 and 6)

A FIRST COURSE IN MATHEMATICS 45

Alternatively, in argot,

Proof. We need to prove two things: first that P → Q; and next that Q → P .To prove P → Q assume P . By ... we get S. By ... we get Q. So P → Q is proved.To prove Q→ P assume Q. By ... we get T . By ... we get P . So Q→ P is proved.This ends the proof of the theorem. �

Example 12.8. Sometimes a theorem U has the statement:

The following conditions are equivalent:1) P;2) Q;3) R.

What is being meant is that U is

(P ↔ Q) ∧ (P ↔ R) ∧ (Q↔ R)

One proceeds “in a circle” as follows. (We just give the argot version of the proof).

Proof. It is enough to prove 3 things: first that P → Q; then that Q→ R; thenthat R → P . To prove that P → Q assume P . By P we get .... hence we get ....hence we get Q. To prove that Q → R assume Q. By Q we get .... hence we get.... hence we get R. To prove that R → P assume R. By R we get .... hence weget .... hence we get P . �

Exercise 12.9. Explain why, in the last Remark, it is enough to prove P → Q,Q→ R, and R→ P in order to conclude U .

Example 12.10. In order to prove a theorem of the form P ∧ Q one usuallyproceeds as follows:

Proof. We first prove P . By ... and ... it follows that ...; by ... and ... it followsthat P . Next we prove Q. By ... and ... it follows that ...; by ... and ... it followsthat Q. This end the proof. �

Example 12.11. In order to prove a theorem of the form P ∨Q one may proceedby contradiction as follows:

Proof. Assume ¬P and ¬Q and seek a contradiction. By ... and ... it followsthat ...; by ... and ... it follows that A∧¬A, a contradiction. This end the proof. �

Example 12.12. In order to prove a theorem of the form ∀xP (x) one may proceedas follows:

Proof. Let c be arbitrary. By ... it follows that ... and hence P (c). �

Example 12.13. In order to prove a theorem of the form ∃xP (x) one may pro-ceed as follows; this is called a proof by example and it applies only to existentialsentences ∃xP (x) (NOT to universal sentences like ∀xP (x)).

Proof. By ... we know that there exists c such that Q(c). By ... and ... it followsthat ... By ... and ... it follows that P (c). �

In the above c in NOT ARBITRARY but rather an EXAMPLE.

46 ALEXANDRU BUIUM

13. Fallacies

A fallacy is a logical mistake. Here are some typical fallacies:

Example 13.1. Confusing an implication with its converse. Say we want to provethat H → C. A typical mistaken proof would be: Assume C; then by ... we getthat ... hence H. The error consists of having proved Q→ P rather than P → Q.

Example 13.2. Proving a universal sentence by example. Say we want to prove∀xP (x). A typical mistaken proof would be: By ... there exists c such that ... hence... hence P (c). The error consists in having proved ∃xP (x) rather than ∀xP (x).

Example 13.3. Defining a constant twice. Say we want to prove that ¬(∃xP (x))by contradiction. A mistaken proof would be: Assume there exists c such P (c).Since we know that ∃xQ(x) let c be (or define c) such that Q(c). By P (c) and Q(c)we get ... hence ..., a contradiction. The error consists in defining c twice in twounrelated ways: first c plays the role of the specific constant cP ; then c plays therole of cQ. But cP and cQ are not the same. We will see examples of this later; seeExercise 20.28.

Exercise 13.4. Give examples of wrong proofs of each of the above types. If youcan’t now wait until we get to discuss the integers.

Remark 13.5. Later, when we discuss induction we will discuss another typicalfallacy; cf. Example 21.7.

14. Examples

We analyze in what follows a few toy examples of theories and proofs of theoremsin these theories. Later we will present the main example of theory which is settheory (identified with mathematics itself). As usual the witness assignment in ourlanguages will not be given explicitly.

Example 14.1. The first example is what later in mathematics will be referred toas the uniqueness of neutral elements. The language L of the theory has constantse, f, ..., variables x, y, ..., and a binary functional predicate ?. We introduce thefollowing:

Definition 14.2. e is called a neutral element if

∀x((e ? x = x) ∧ (x ? e = x)).

So we added “is a neutral element” as a new relational predicate. We do notconsider any specific axiom. We prove the following:

Theorem 14.3. If e and f are neutral elements then e = f .

The sentence that needs to be proved is: “If H then C”. Recall that in generalfor such a sentence H is called hypothesis and C is called conclusion. Here is adirect proof:

Proof. Assume e and f are neutral elements. Since e is a neutral element itfollows that ∀x(e ? x = x). By the latter e ? f = f . Since f is a neutral element weget ∀x(x ? f = x). So e ? f = e. Hence we get e = e ? f . Hence we get e = f . �

Remark 14.4. Note that we skipped some of the “explanations” showing whatsentences have been used at various stages of the proof; this is common practice.

A FIRST COURSE IN MATHEMATICS 47

Exercise 14.5. Convert the above argot proof into the original (non-argotic) lan-guage. Here by “convert” we understand finding a proof whose translation in argotis the given argot proof.

Here is a proof by contradiction of the same theorem:

Proof. Assume e 6= f and seek a contradiction. So either e 6= e ? f or e ? f 6= f .Since ∀x(e ? x = x) it follows that e ? f = f . Since ∀x(x ? f = x) we get e ? f = e.So we get e = e ? f , a contradiction. �

Exercise 14.6. Convert the above argot proof into the original (non-argotic) lan-guage.

Example 14.7. The next example is related to the famous Pascal wager. Thestructure of Pascal’s argument is as follows. If the reward exists and I believe itexists then I get the reward. If the reward exists and I do not believe it existsthen I do not get the reward. If the reward does not exist but I believe it existsI do not get the reward. Finally if the reward does not exist and I do not believeit exists then I do not get the reward. Pascal’s conclusion is that if he believedthe reward exists then there would be a one chance in 2 that he get the rewardwhereas if he did not believe the reward exists then there would be a zero chancethat he get the reward. So he should believe the reward exists. The next exampleis a variation of Pascal’s wager showing that if one requires “sincere” belief ratherthan just belief based on logic then Pascal will not get the reward. Indeed assumethe specific axioms:

A1) If the reward exists and a person does not believe sincerely in its existencethen that person will not get the reward.

A2) If the reward does not exist then nobody gets the reward.A3) If a person believes the reward exists motivated only by Pascal’s wager then

that person does not believe sincerely.We want to prove the following

Theorem 14.8. If Pascal believes motivated only by his own wager then he willnot get the reward.

All of the above is formulated in the English language L′. We consider a simplerlanguage L and a translation of L into L′.

The new language L contains among its constant p (for Pascal) and contains 4unary relational predicates g, w, s, r whose translation in English is as follows:g is translated as “is the reward”w is translated as “believes motivated only by Pascal’s wager”s is translated as “believes sincerely”r is translated as “gets the reward”The specific axioms areA1) ∀y((∃xg(x)) ∧ (¬s(y))→ ¬r(y))A2) ∀y((¬(∃xg(x)))→ (¬r(y)))A3) ∀y(w(y)→ (¬(s(y)))).In this language Theorem 14.8 is the translation of the following:

Theorem 14.9. If w(p) then ¬r(p).

So to prove Theorem 14.8 in L′ it is enough to prove Theorem 14.9 in L. Wewill do this by using a combination of direct proof and case by case proof.

48 ALEXANDRU BUIUM

Proof of Theorem 14.9. Assume w(p). There are two cases: the first case is∃xg(x); the second case is ¬(∃xg(x)). Assume first that ∃xg(x). Since w(p), byaxiom A3 it follows that ¬s(p). By axiom A1 (∃xg(x)) ∧ (¬s(p))→ ¬r(p)). Hence¬r(p). Assume now ¬(∃xg(x)). By axiom A2 we then get again ¬r(p). So in eithercase we get ¬r(p) which ends the proof.

Exercise 14.10. Convert the above argot proof into the original (non-argotic)language.

Example 14.11. The next example is again a toy example and comes from physics.In order to present this example we do not need to introduce anything physical.But it would help to keep in mind the two slit experiment in quantum mechanics(for which I refer to Feynman’s course, say). Now there are two types of phys-ical theories that can be referred to as phenomenological and explanatory. Theyare intertwined but very different in nature. Phenomenological theories are simplydescriptions of phenomena/effects of (either actual or possible) experiments; ex-amples of such theories are those of Ptolemy, Copernicus, or that of pre-quantumexperimental physics of radiation. Explanatory theories are systems postulatingtranscendent causes that act from behind phenomena; examples of such theoriesare those of Newton, Einstein, or quantum theory. The theory below is a baby ex-ample of the phenomenological (pre-quantum) theory of radiation; our discussion istherefore not a discussion of quantum mechanics but rather it suggests the necessityof introducing quantum mechanics. The language L′ and definitions are those ofexperimental/phenomenological (rather than theoretical/explanatory) physics. Wewill not make them explicit. Later we will move to a simplified language L and willnot care about definitions.

Consider the following specific axioms (which are the translation in English ofthe phenomenological predictions of classical particle mechanics and classical wavemechanics respectively):

A1) If radiation in the 2 slit experiment consists of a beam of particles then theimpact pattern on the photographic plate consists of a series of successive flashesand the pattern has 2 local maxima.

A2) If radiation in the 2 slit experiment is a wave then the impact pattern onthe photographic plate is not a series of successive flashes and the pattern has morethan 2 local maxima.

We want to prove the following

Theorem 14.12. If in the 2 slit experiment the impact consists of a series ofsuccessive flashes and the impact pattern has more than 2 local maxima then in thisexperiment radiation is neither a beam of particles nor a wave.

The sentence reflects one of the elementary puzzles that quantum phenomenaexhibit: radiation is neither particles nor waves but something else! And thatsomething else requires a new theory which is quantum mechanics. (A commonfallacy would be to conclude that radiation is both particles and waves !!!) Ratherthan analyzing the language L′ of physics in which our axioms and sentence arestated (and the semantics that goes with it) let us introduce a simplified languageL as follows.

A FIRST COURSE IN MATHEMATICS 49

We consider the language L with constants a, b, ..., variables x, y, ..., and unaryrelational predicates p, w, f , m. Then there is a translation of L into L′ such that:• p is translated as “is a beam of particles”• w is translated as “is a wave”• f is translated as “produces a series of successive flashes”• m is translated as “produces a pattern with 2 local maxima”.Then we consider the specific axiomsA1) ∀x(p(x)→ (f(x) ∧m(x))).A2) ∀x(w(x)→ (¬f(x)) ∧ ¬m(x))).

Theorem 14.12 above is the translation of the following theorem in L:

Theorem 14.13. ∀x((f(x) ∧ (¬m(x)))→ ((¬p(x)) ∧ (¬w(x)))).

So it is enough to prove Theorem 14.13. The proof below is, as we shall see, acombination of proof by contradiction and case by case.

Proof. We proceed by contradiction. So assume there exists a such that f(a) ∧(¬m(a)) and ¬(¬p(a)∧(¬w(a))) and seek a contradiction. Since ¬(¬p(a)∧(¬w(a)))we get p(a) ∨ w(a). There are two cases. The first case is p(a); the second caseis w(a). We will get a contradiction in each of these cases separately. Assumefirst p(a). Then by axiom A1 we get f(a) ∧m(a), hence m(a). But we assumedf(a) ∧ (¬m(a)), hence ¬m(a), so we get a contradiction. Assume now w(a). Byaxiom A2 we get (¬f(a))∧ (¬m(a)) hence ¬f(a). But we assumed f(a)∧ (¬m(a)),hence f(a), so we get again a contradiction. �.

Exercise 14.14. Convert the above argot proof into the original (non-argotic)language.

Exercise 14.15. Consider the specific axioms A1 and A2 above and also the spe-cific axioms:

A3) ∃x(f(x) ∧ (¬m(x)))A4) ∀x(p(x) ∨ w(x)).Metaprove that the theory with specific axioms A1, A2, A3, A4 is inconsistent.

A3 is translated as saying that in some experiments one sees a series of successiveflashes and, at the same time, one has more that 2 maxima. Axiom A4 is translatedas saying that any type of radiation is either particles or waves. The inconsistencyof the theory says that classical (particle and wave) mechanics is not consistentwith experiment. (So a new mechanics, quantum mechanics, is needed.) Note thatnone of the above discussion has anything to do with any concrete proposal for aquantum mechanical theory; all that the above suggests is the necessity of such atheory.

Example 14.16. The next example is a logical puzzle from the Mahabharata.King Yudhishthira loses his kingdom at a game of dice against Sakuni; then hestakes himself and he loses himself; then he stakes his wife Draupadi and loses hertoo. She objects by saying that her husband could not have staked her because hedid not own her anymore after losing himself. Here is a possible formalization ofher argument.

We use a language with constants i, d, ..., variables x, y, z, ..., the relational binarypredicate “owns”, quantifiers, and equality =. We define a predicate 6= by (x 6=y)↔ (¬(x = y)). Consider the following specific axioms:

A1) For all x, y, z if x owns y and y owns z then x owns z.

50 ALEXANDRU BUIUM

A2) For all y there exists x such that x owns y.A3) For all x, y, z if y owns x and z owns x then y = z.We will prove the following

Theorem 14.17. If i does not own himself then i does not own d.

Proof. We proceed by contradiction. So we assume i does not own i and i ownsd and seek a contradiction. There are two cases: first case is d owns i; the secondcase is d does not own i. We prove that in each case we get a contradiction. Assumefirst that d owns i; since i owns d, by axiom A1, i owns i, a contradiction. Assumenow d does not own i. By axiom A2 we know that there exists j such that j ownsi. Since i does not own i it follows that j 6= i. Since j owns i and i owns d, byaxiom A1, j owns d. But i also owns d. By axiom A3, i = j, a contradiction. �

Exercise 14.18. Convert the above argot proof into the original (non-argotic)language.

Example 14.19. This example illustrates the logical structure of the Newtoniantheory of gravitation that unified Galileo’s phenomenological theory of falling bodies(the physics on Earth) with Kepler’s phenomenological theory of planetary motion(the physics of “Heaven”); Newton’s theory counts as an explanatory theory be-cause its axioms, being truly general, go beyond experiment. The language L inwhich we are going to work has variables x, y, ..., constants S,E,M (translated inEnglish as “Sun, Earth, Moon”), a constant R (translated as “the radius of theEarth”), constants 1, π, r (where r is translated as a particular rock), relationalpredicates p, c, n (translated in English as “is a planet, is a cannon ball, is a num-ber”), a binary relational predicate ◦ (whose syntax is x ◦ y and whose translationin English is “x revolves around the fixed body y”), a binary relational predicatef (where f(x, y) is translated as “x falls freely under the influence of y”), a binaryfunctional predicate d (“distance between the centers of”), a unary functional pred-icate a (“acceleration”), a unary functional predicate T (where T (x, y) translated as“period of revolution of x around y”), binary functional predicates :,× (“division,multiplication”), and all the standard connectives, quantifiers, and parentheses.Note that we have no predicates for mass and force; this is remarkable becauseit shows that the Newtonian revolution has a purely geometric content. Now weintroduce a theory T in L via its special axioms. The special axioms are as follows.First one asks that distances are numbers:

∀x∀y(n(d(x, y)))

and the same for accelerations, and times of revolution. (Note that we view allphysical quantities as measured in centimeters and seconds.) For numbers we askthat multiplication and division of numbers are numbers:

(n(x) ∧ n(y))→ (n(x : y) ∧ n(x× y))

and that the usual laws relating : and × hold. Here are two :

∀x(x : x = 1)∀x∀y∀z∀u((x : y = z : u)↔ (x× u = z × y))

Exercise: write down the all these laws. We sometimes write xy , 1/x, xy, x

2, x3, ...

in the usual sense. The above is a “baby mathematics” and this is all mathematicswe need. Next we introduce an axiom whose justification is in mathematics, indeed

A FIRST COURSE IN MATHEMATICS 51

in calculus; here we ignore the justification and just take this as an axiom. Theaxiom gives a formula for the acceleration of a body revolving in a circle around afixed body. (See the exercise after this example.) Here is the axiom:

A) ∀x∀y((x ◦ y)→ (a(x, y) = 4π2d(x,y)

T 2(x,y)

))To this one adds the following “obvious” axioms

O1) ∀x(c(x)→ d(x,E) = R)O2) M ◦ ER) c(r)K1) ∀x(p(x)→ (x ◦ S))

saying that the distance between cannon balls and the center of the Earth is theradius of the Earth; that the Moon revolves around the Earth; that the rock r is acannon ball; and that all planets revolve around the Sun. (The latter is Kepler’sfirst law in an approximate form; the full Kepler’s first law specifies the shape oforbits as ellipses, etc.) Now we consider the following sentences (NOT AXIOMS!):

G) ∀x∀y((c(x) ∧ c(y))→ (a(x,E) = a(y,E))

K3) ∀x∀y(

(p(x) ∧ p(y))→(d3(x,S)T 2(x,S) = d3(y,S)

T 2(y,S)

))N) ∀x∀y∀z

((f(x, z) ∧ f(y, z))→

(a(x,z)

1/d2(x,z) = a(y,z)1/d2(y,z)

))G represents Galileo’s great empirical discovery that all cannon balls (by whichwe mean here terrestrial airborne objects with no self-propulsion) have the sameacceleration towards the Earth. K3 is Kepler’s third law which is his empiricalgreat discovery that the cubes of distances of planets to the Sun are in the sameproportion as the squares of their periods of revolution. Kepler’s second law aboutequal areas being swept in equal times is somewhat hidden in axiom A above.N is Newton’s law of gravitation saying that the accelerations of any two bodiesmoving freely towards a fixed body are in the same proportion as the inverses ofthe squares of the respective distances to the (center of the) fixed body. Newton’sgreat invention is the creation of a binary predicate f (where f(x, y) is translated inEnglish as “x is in free fall with respect to y”) equipped with the following axioms

F1) ∀x(c(x)→ f(x,E))F2) f(M,E)F3) ∀x(p(x)→ f(x, S))

expressing the idea that cannon balls and the Moon moving relative to the Earthand planets moving relative to the Sun are instances of a the more general predicateexpressing “free falling”. Finally let us consider the definition

g = a(r, E)

and the following sentence:

X) g = 4π2d3(M,E)R2T 2(M,E)

The main results are the following theorems in T:

Theorem 14.20. N → X

Proof. See the exercise after this example.

Theorem 14.21. N → G

52 ALEXANDRU BUIUM

Proof. See the exercise after this example.

Theorem 14.22. N → K3

Proof. See the exercise after this example.

So if one accepts Newton’s N then Galileo’s G and Kepler’s K3 follow, thatis to say that N “unifies” terrestrial physics with the physics of Heaven. Thebeautiful thing is however that N not only unifies known paradigms but “predicts”new “facts”, e.g. X. Indeed one can verify X using experimental (astronomicaland terrestrial physics) data: if one enlarges our language to include numeralsand numerical computations and if one introduces axioms as below (justified bymeasurements) then X becomes a theorem. Here are the additional axioms:

g = 981 (the number of centimeters per second squared representing g)π = 314

100 (approximate value)R =number of centimeters representing the radius of the Earth (measured for

the first time by Eratosthenes using shadows at two points on Earth)d(M,E) =number of centimeters representing the distance from Earth to Moon

(measured using parallaxes).T (M,E) =number of seconds representing the time of revolution of the Moon

(the equivalent of 28 days).

The fact that X is verified with the above data is the miraculous computation doneby Newton that convinced him of the validity of his theory; see the exercise afterthis example.

Remark 14.23. This part of Newton’s early work had a series of defects: it wasbased on the circular (as opposed to elliptical) orbits, it assumed the center ofthe Earth (rather than all the mass of the Earth) as responsible for the effect onthe cannon balls, it addressed only revolution around a fixed body (which is notrealistic in the case of the Moon, since, for instance, the Earth itself is moving), anddid not explain the difference between the d3/T 2 of planets around the Sun andthe corresponding quantity for the Moon and cannon balls relative to the Earth.Straightening these and many other problems is part of the reason why Newtonpostponed publication of his early discoveries. The final theory of Newton involvesthe introduction of absolute space and time, mass, and forces. The natural todevelop it is within mathematics, as mathematical physics; this is essentially theway Newton himself presented his theory in published form. However the aboveexample suggests that the real breakthrough was not mathematical but at the levelof (pre-mathematical) logic.

Exercise 14.24.1) Justify Axiom A above using calculus or even Euclidean geometry plus the

definition of acceleration in an introductory physics course.2) Prove Theorems 14.20, 14.21, and 14.22.3) Verify that with the numerical data for g, π,R, d(M,E), T (M,E) available

from astronomy (find the numbers in astronomy books) the sentence X becomes atheorem. This is Newton’s famous computation.

A FIRST COURSE IN MATHEMATICS 53

15. ZFC

Mathematics is a particular theory Tset (called set theory) in a particular lan-guage Lset (called the language of set theory) with specific axioms called the ZFCaxioms. So Tset will be T(ZFC). We will introduce all of this presently.

Definition 15.1. The language Lset of set theory is the language with variablesx, y, z, ..., constants a, b, c, ..., A,B,C, ...,A,B,C, ..., α, β, γ, ... , no functional pred-icate, a binary relational predicate ∈, connectives ∨,∧,¬,→,↔, quantifiers ∀,∃,equality =, and typographical signs (, ), ,. We also assume a witness assignment onLset and definitions in Lset are given. We take the liberty to add, whenever it isconvenient, new constants and new predicates to Lset together with definitions foreach of these new symbols.

Remark 15.2. In the above definition the witness assignment and the definitionswere left unspecified. If one wants to pin down this concept further one can startwith the language L0 that has variables x, y, z, ..., no constants, no functional pred-icate, a binary relational predicate ∈, connectives ∨,∧,¬,→,↔, quantifiers ∀,∃,equality =, and separators (parentheses (, ) and comma); one assumes no definitionin L0. Then one can take L′ to be the witness closure of L0. Next we let L be ob-tained from L′ by adding new constants a, b, ..., together with definitions for them,and also adding new relational predicates 6=, 6∈, ⊂, 6⊂ defined by

∀x∀y((x 6= y)↔ (¬(x = y)))∀x∀y((x 6∈ y)↔ (¬(x ∈ y)))∀x∀y((x ⊂ y)↔ (∀z((z ∈ x)→ (z ∈ y))))∀x∀y((x 6⊂ y)↔ (¬(x ⊂ y)))

This L is the prototypical example we have in mind for Lset. Even with otherexamples of Lset we will maintain the above definitions of 6=, 6∈,⊂, 6⊂.

Remark 15.3. We recall the fact that Lset being a language it does not makesense to say that a sentence in it (such as, for instance, a ∈ b) is true or false.

Remark 15.4. Note that the collection of sets is “metacountable” in the sensethat it can be arranged in an (unending) sequence of symbols with commas inbetween. Later we will introduce the concept of “countable” set and we will showthat not all sets are countable. This seems to be a paradox (referred to as theSkolem paradox.) Of course this is not going to be a paradox: “metacountable”and “countable” will be two different concepts. The words “metacountable” belongto the metalanguage and can be translated in English in terms of arranging signson a piece of paper; whereas “b is countable” is a definition in set theory; we define“b is countable ↔ C(b)” where C(x) is a certain formula with free variable x in thelanguage of set theory.

Remark 15.5. There is a standard translation of the language Lset of set theoryinto the English language as follows:a, b, ... are translated as “the set a”, “the set b”,...∈ is translated as “belongs to the set” or as “is an element of the set”= is translated as “equals”⊂ is translated as “is a subset”∀ is translated as “for all sets”∃ is translated as “there exists a set”

54 ALEXANDRU BUIUM

while the connectives are translated in the standard way.

Remark 15.6. Once we have a translation of Lset into English we can speak ofargot and translation of Lset into argot; this simplifies comprehension of mathe-matical texts considerably.

Remark 15.7. The standard translation of the language of set theory into English(in the remark above) is standard only by convention. A perfectly good differenttranslation is, for instance, the one in whicha, b, ... are translated as “crocodile a”, “crocodile b”,...∈ is translated as “is dreamt by the crocodile”= is translated as “has the same taste”∀ is translated as “for all crocodiles”∃ is translated as “there exists a crocodile”One could read mathematical texts in this translation; admittedly the English

text that would result from this translation would be somewhat strange.

Remark 15.8. Note that mathematics uses other symbols as well such as

≤, ◦,+,×,∑

an,Z,Q,R,C,Fp,≡, lim an,

∫f(x)dx,

df

dx, ...

These symbols will originally be all sets (hence constants) and will be introducedthrough appropriate definitions (like the earlier definition of an elephant); they willall be defined through the predicate ∈. In particular in the language of sets + or× are NOT originally functional predicates; and ≤ is NOT originally a relationalpredicate. However we will later tacitly enlarge the language of set theory by addingpredicates (usually still denoted by) + or ≤ via appropriate definitions.

We next introduce the (specific) axioms of set theory.

Axiom 15.9. (Singleton axiom)

∀x∃y((x ∈ y) ∧ (∀z((z ∈ y)→ (z = x)))

The translation in argot is that for any set x there is a set y whose only elementis x.

Axiom 15.10. (Unordered pair axiom)

∀x∀x′∃y((x ∈ y) ∧ (x′ ∈ y) ∧ ((z ∈ y)→ ((z = x) ∨ (z = x′))))

In argot the translation is that for any two sets x, x′ there is a set that only hasthem as elements.

Axiom 15.11. (Separation axioms) For any formula P (x) in the language of sets,having a free variable x, we introduce an axiom

∀y∃z∀x((x ∈ z)↔ ((x ∈ y) ∧ (P (x)))

The translation in argot is that for any set y there is a set z whose elements are allthe elements x of y such that P (x).

Axiom 15.12. (Extensionality axiom)

∀u∀v((u = v)↔ ∀x((x ∈ u)↔ (x ∈ v)))

The translation in argot is that two sets u and v are equal if and only if they havethe same elements.

A FIRST COURSE IN MATHEMATICS 55

Axiom 15.13. (Union axiom)

∀w∃u∀x((x ∈ u)↔ (∃t((t ∈ w) ∧ (x ∈ t))))The translation in argot is that for any set w there exists a set u such that for anyx we have that x is an element of u if and only if x is an element of one of theelements of w.

Axiom 15.14. (Empty set axiom)

∃x∀y(y 6∈ x)

The translation in argot is that there exists a set that has no elements.

Axiom 15.15. (Power set axiom)

∀y∃z∀x((x ∈ z)↔ (∀u((u ∈ x)→ (u ∈ y))))

The translation in argot is that for any set y there is a set z such that a set x is anelement of z if and and only if all elements of x are elements of y.

For simplicity the rest of the axis will be formulated in argot only.

Axiom 15.16. (Axiom of choice). For any set w whose elements are pairwisedisjoint sets there is a set that has exactly one element in common with each of thesets in w. (Two sets are disjoint if they have no element in common. The elementsof a set are pairwise disjoint if any two elements are disjoint.)

Axiom 15.17. (Axiom of infinity). There exists a set x such that x contains someelement u and such that for any y ∈ x there exists z ∈ x with the property that y isthe only element of z. Intuitively this axiom guarantees the existence of “infinite”sets.

Axiom 15.18. . (Axiom of foundation). For any set x there exists y ∈ x such thatx and y are disjoint.

One finally adds a technical list of axioms (indexed by formulas P (x, y, z)) aboutthe “images of maps with parameters z”; we shall not state it precisely and we shallnot use it in the sequel.

Axiom 15.19. (Axiom of replacement). If for any z and any u we have thatP (x, y, z) “defines y as a function of x ∈ u” (i.e. for any x ∈ u there exists a uniquey such that P (x, y, z)) then for all z there is a set v which is the “image of thismap” (i.e. v consists of all y’s with the property that there is an x ∈ u such thatP (x, y, z)). Here x, z may be tuples of variables.

Exercise 15.20. Write the axioms of choice, infinity, foundation, and replacementin the language of sets.

Definition 15.21. All of the above axioms are called the ZFC system of axioms(Zermelo-Fraenkel+Choice).

Remark 15.22. Now if set theory is to have a meaning at all one needs to be ableto give a metaproof of the metasentence:

set theory is consistent

Otherwise all sentences of set theory are theorems which would make mathematicsutterly uninteresting. No such metaproof is known! The only argument in favor

56 ALEXANDRU BUIUM

of consistency of set theory seems to be the fact that nobody could prove so fara statement of the form P ∧ ¬P in this theory. And indeed we cannot expect toactually metaprove consistency of set theory because metaproof tools are too softto tackle the question. Once mathematics is introduced and mathematical logic isdeveloped one can revisit/reformulate the consistency problem for set theory as asentence C in set theory whose translation in English will be:

set theory is formally consistent

(See the last section of the course for an explanation of this sentence.) Godelproved that the sentence C is not “formally provable” (in a sense that, again, willbe explained in the last section of the course).

Remark 15.23. Note the important fact that the axioms did not involve constants.In the next section we investigate the constants, i.e. the sets.

16. Sets

We finally start discussing/proving facts about set theory proper (the Level 2set theory in the Introduction).

FROM THIS MOMENT ON ALL PROOFS IN THIS COURSE WILL BEWRITTEN IN ARGOT; ALSO, UNLESS OTHERWISE STATED, ALL PROOFSREQUIRED TO BE GIVEN IN THE EXERCISES MUST BE WRITTEN IN AR-GOT.

Recall that we introduced mathematics/set theory as being a specific theoryT(ZFC) in the language Lset, which we called Tset, where ZFC is a list of axiomsthat was described in the last section.

Recall the following:

Definition 16.1. A set is a constant in the language of set theory.

Notation 16.2. Sets will be denoted by a, b, ..., A,B, ...,A,B, ..., α, β, γ, ....

Notation 16.3. We define a new constant ∅ as being equal to the witness for theaxiom ∃x∀y(y 6∈ x); in other words ∅ is defined by

∅ = c∀y(y 6∈x)

We usually say that ∅ denotes the witness above.

Notation 16.4. If a is a set we introduce a new constant {a} defined to be thewitness for the sentence ∃yP where

P equals “(a ∈ y) ∧ (∀z((z ∈ y)→ (z = a)))”

In other words {a} is defined by

{a} = cP = c(a∈y)∧(∀z((z∈y)→(z=a)));

The sentence ∃yP is a theorem (use the singleton axiom) so the following is atheorem:

(a ∈ {a}) ∧ (∀z((z ∈ {a})→ (z = a)))

We can say (and we will usually say, by abuse of terminology) that {a} is “theunique” set containing a only among its elements; we will often use this kind ofabuse of terminology.

A FIRST COURSE IN MATHEMATICS 57

Notation 16.5. In particular {{a}} denotes the set whose only element is the set{a}, etc.

Notation 16.6. For any two sets a, b with a 6= b denote by {a, b} the set that onlyhas a and b as elements; the set {a, b} exists by the unordered pair axiom. Alsowhenever we write {a, b} we implicitly understand that a 6= b.

Notation 16.7. For any set A and any formula P (x) in the language of sets, havingone free variable x we denote by A(P ) or {a ∈ A;P (a)} the set whose elements arethe elements a ∈ A such that P (a); the set A(P ) equals by definition the witnessfor the separation axiom that corresponds to A and P .

Exercise 16.8. Explain the last claim.

Lemma 16.9. If A = {a} and B = {b, c} then A 6= B.

Proof. We proceed by contradiction. So assume A = {a}, B = {b, c} and A = Band seek a contradiction. Indeed since a ∈ A and A = B, by the extensionalityaxiom we get a ∈ B. Hence a = b or a = c. Assume a = b and seek a contradiction.(In the same way we get a contradiction by assuming a = c.) Since a = b we getB = {a, c}. Since c ∈ B and A = B, by the extensionality axiom we get c ∈ A. Soc = a. Since a = b we get b = c. But by our notation for sets (elements listed aredistinct) we have b 6= c, a contradiction. �

Exercise 16.10. Prove that:1) If {a} = {b} then a = b.2) {a, b} = {b, a}.3) {a} = {x ∈ {a, b};x 6= b}.4) There is a set b whose only elements are {a} and {a, {a}}; so b = {{a}, {a, {a}}}.

Exercise 16.11. Prove that1) A(P ∧Q) = A(P ) ∩A(Q);2) A(P ∨Q) = A(P ) ∪A(Q);3) A(¬P ) = A\A(P );

Notation 16.12. For any two sets A and B we define the set A ∪ B (called theunion of A and B) as the set such that for all c, c ∈ A ∪ B if and only if c ∈ A orc ∈ B; the set A ∪B is a witness for the union axiom.

Definition 16.13. The intersection of two sets A and B is the set

A ∩B = {c ∈ A; c ∈ B}.

The difference is the set

A\B = {c ∈ A; c 6∈ B}.

Exercise 16.14. Prove that if a, b, c are such that (a 6= b)∧ (b 6= c)∧ (a 6= c) thenthere is a set denoted by {a, b, c} whose only elements are a, b, c; in other wordsprove the following sentence:

∀x∀x′∀x′′∃y((x ∈ y)∧(x′ ∈ y)∧(x′′ ∈ y)∧(z ∈ y)→ ((z = x)∨(z = x′)∨(z = x′′)))

Hint: use the singleton axiom, the unordered pair axiom, and the union axiom,applied to the set {{a}, {b, c}}.

58 ALEXANDRU BUIUM

Notation 16.15. Similarly one defines sets {a, b, c, d}, etc. Whenever we write{a, b, c} or {a, b, c, d}, etc, we imply that the elements in each set are pairwiseunequal (pairwise distinct). Also denote by

{a, b, c, ...}any set d such that

(a ∈ d) ∧ (b ∈ d) ∧ (c ∈ d)

So the dots indicate that there may (or may not) be other elements in d other thana, b, c; also note that when we write {a, b, c, ...} we implicitly imply that a, b, c arepairwise distinct.

Definition 16.16. Two sets a and b are disjoint if a ∩ b = ∅.

Exercise 16.17.1) Prove that {∅} 6= ∅.2) Prove that {{∅}} 6= {∅}.

Remark 16.18. By the definition of the predicate ⊂ if A and B are sets then Ais a subset of B if and only if for all c, if c ∈ A then c ∈ B; we write A ⊂ B.

Remark 16.19. A = B if and only if A ⊂ B and B ⊂ A.

Exercise 16.20. Prove that:1) {a, b, c} = {b, c, a}.2) {a, b} 6= {a, b, c}; Hint: use c 6= a, c 6= b;3) {a, b, c} = {a, b, d} if and only if c = d.

Exercise 16.21. Let A = {a, b, c} and B = {c, d}. Prove that1) A ∪B = {a, b, c, d},2) A ∩B = {c}, A\B = {a, b}.

Exercise 16.22. Let A = {a, b, c, d, e, f}, B = {d, e, f, g, h}. Compute1) A ∩B,2) A ∪B,3) A\B,4) B\A,5) (A\B) ∪ (B\A).

Exercise 16.23. Prove the following:1) A ∩B ⊂ A,2) A ⊂ A ∪B,3) A ∩ (B ∪ C) = (A ∩B) ∪ (A ∩ C),4) A ∪ (B ∩ C) = (A ∪B) ∩ (A ∪ C),5) (A\B) ∩ (B\A)− ∅.

Notation 16.24. For any set A we define the set P(A) as the set whose elementsare the subsets of A; we call P(A) the power set of A; P(A) is a witness for (atheorem obtained from) the power set axiom.

Example 16.25. If A = {a, b, c} then

P(A) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}.

Exercise 16.26. Let A = {a, b, c, d}. Write down the set P(A).

Exercise 16.27. Let A = {a, b}. Write down the set P(P(A)).

A FIRST COURSE IN MATHEMATICS 59

Definition 16.28. (Ordered pairs) Let A and B be sets and let a ∈ A, b ∈ B.If a 6= b the ordered pair (a, b) is the set {{a}, {a, b}}. We sometimes say “pair”instead of “ordered pair”. If a = b the pair (a, b) = (a, a) is the set {{a}}. Notethat (a, b) ∈ P(P(A ∪B)).

Notation 16.29. For any sets A and B we define the set A×B as the set (calledthe product of A and B)

{c ∈ P(P(A ∪B);∃x∃y((x ∈ A) ∧ (y ∈ B) ∧ (c = (a, b)))}whose elements are exactly the pairs (a, b) with a ∈ A and b ∈ B; A×B is a witnessfor a theorem following from the separation axioms and the existence of orderedpairs.

Proposition 16.30. (a, b) = (c, d) if and only if a = c and b = d.

Proof. We need to prove that1) If a = c and b = d then (a, b) = (c, d) and2) If (a, b) = (c, d) then a = c and b = d.Now 1) is obvious. To prove 2) assume (a, b) = (c, d).Assume first a 6= b and c 6= d. Then by the definition of pairs we know that

{{a}, {a, b}} = {{c}, {c, d}}.Since {a} ∈ {{a}, {a, b}} it follows (by the extensionality axiom) that {a} ∈{{c}, {c, d}}. Hence either {a} = {c} or {a} = {c, d}. But as seen before {a} 6={c, d}. So {a} = {c}. Since a ∈ {a} it follows that a ∈ {c} hence a = c. Similarlysince {a, b} ∈ {{a}, {a, b}} we get {a, b} ∈ {{c}, {c, d}}. So either {a, b} = {c} or{a, b} = {c, d}. Again as seen before {a, b} 6= {c} so {a, b} = {c, d}. So b ∈ {c, d}.So b = c or b = d. Since a 6= b and a = c we get b 6= c. Hence b = d and we aredone in case a 6= b and c 6= d.

Assume next a = b and c = d. Then by the definition of pairs in this case wehave {{a}} = {{c}} and as before this implies {a} = {c} hence a = c so we aredone in this case as well.

Finally assume a = b and c 6= d. (The case a 6= b and c = d is treated similarly.)By the definition of pairs we get

{{a}} = {{c}, {c, d}}.We get {c, d} ∈ {{a}}. Hence {c, d} = {a} which is impossible, as seen before. Thisends the proof. �

Exercise 16.31. If A = {a, b, c} and B = {c, d} then

A×B = {(a, c), (a, d), (b, c), (b, d), (c, c), (c, d)}.Hint: by the above Proposition the pairs are distinct.

Exercise 16.32. Prove that1) (A ∩B)× C = (A× C) ∩ (B × C),2) (A ∪B)× C = (A× C) ∪ (B × C),

Exercise 16.33. Prove that there does not exist a set T such that for any set Awe have A ∈ T . (In other words there is no set of all sets.) Hint. Assume there issuch a T and derive a contradiction (which is called the Russell paradox; this is ofcourse no paradox.) To derive a contradiction consider the set

S = {A ∈ T ;A 6∈ A}.

60 ALEXANDRU BUIUM

There are two possibilities. First S 6∈ S hence, by the definition of S, and sinceS ∈ T , we get that S ∈ S, a contradiction. The second possibility is that S ∈ Shence, by the definition of S, and since S ∈ T , we get that S 6∈ S, which is again acontradiction.

17. Maps

Definition 17.1. A map of sets F : A→ B (or a function) is a subset F ⊂ A×Bsuch that for any a ∈ A there is a unique b ∈ B with (a, b) ∈ F .

Remark 17.2. We tacitly introduce the new predicate “is a map from ... to ...”by an appropriate definition in the language of sets. Also, if F,A,B are sets (henceconstants) such that “F is a map from A to B” is a theorem we may introduce anew functional predicate (still denoted by F ) by

∀x∀y((F (x) = y)↔ ((x, y) ∈ F ))

We sometimes write a 7→ F (a). Also we sometimes write AF→ B.

Example 17.3. The set

(17.1) F = {(a, a), (b, c)} ⊂ {a, b} × {a, b, c}is a map and F (a) = a, F (b) = c. On the other hand the subset

F = {(a, b), (a, c)} ⊂ {a, b} × {a, b, c}is not a map.

Definition 17.4. For any A the identity map I : A → A is defined as I(a) = a,i.e.

I = IA = {(a, a); a ∈ A} ⊂ A×A.

Definition 17.5. A map F : A→ B is injective (or an injection, or one-to-one) ifF (a) = F (c) implies a = c.

Definition 17.6. A map F : A → B is surjective (or a surjection, or onto) if forany b ∈ B there is a a ∈ A such that F (a) = b.

Example 17.7. The map (17.1) is injective and not surjective.

Exercise 17.8. Give an example of a map which is surjective and not injective.

Exercise 17.9. If F : A → B and G : B → C are two maps then there exists aunique map H : A → C (denoted by H = G ◦ F ) such that H(a) = G(F (a)) forall a. Hint: we let (a, c) ∈ H if and only if there exists b ∈ B with (a, b) ∈ F ,(b, c) ∈ G.

Exercise 17.10. Prove that if F ◦G is surjective then F is surjective. Prove thatif F ◦G is injective then G is injective.

Notation 17.11. By a commutative diagram of sets

AF→ B

U ↓ ↓ VC

G→ D

we mean a collection of sets and maps as above with the property that G◦U = V ◦F .

A FIRST COURSE IN MATHEMATICS 61

Exercise 17.12. Prove that the composition of two injective maps is injective andthe composition of two surjective maps is surjective.

Definition 17.13. A map is bijective (or a bijection) if it is injective and surjective.

Here is a fundamental theorem in set theory:

Theorem 17.14. (Bernstein’s Theorem) If A and B are sets and if there existinjective maps F : A → B and G : B → A then there exists a bijective mapH : A→ B.

The reader may attempt to prove this after he/she gets to the section on induc-tion.

Exercise 17.15. Prove that if F : A → B is bijective then there exists a uniquebijective map denoted by F−1 : B → A such that F ◦F−1 = IB and F−1 ◦F = IA.F−1 is called the inverse of F .

Exercise 17.16. Let F : {a, b, c} → {c, d, e}, F (a) = d, F (b) = c, F (c) = e. Provethat F has an inverse and compute F−1.

Exercise 17.17. Prove that if A and B are sets then there exist maps F : A×B →A and G : A×B → B such that F (a, b) = a and G(a, b) = b for all (a, b) ∈ A×B.(These are called the first and the second projection.) Hint: For G show thatG = {((a, b), c); c = b} ⊂ (A×B)×B is a map.

Exercise 17.18. Prove that (A×B)×C → A× (B×C), ((a, b), c) 7→ (a, (b, c)) isa bijection.

Notation 17.19. Write A×B×C instead of (A×B)×C and write (a, b, c) insteadof ((a, b), c). We call (a, b, c) a triple. Write A2 = A × A and A3 = A × A × A.More generally adopt this notation for arbitrary number of factors. Elements like(a, b), (a, b, c), (a, b, c, d), etc. will be called tuples.

Theorem 17.20. If A is a set then there is no bijection between A and P(A)

Proof. Assume there exists a bijection F : A → P(A) and seek a contradiction.Consider the set

B = {a ∈ A; a 6∈ F (a)} ∈ P(A)

Since F is subjective there exists b ∈ A such that B = F (b). There are two cases:either b ∈ B or b 6∈ B. If b ∈ B then b ∈ F (b) so b 6∈ B, a contradiction. If b 6∈ Bthen b 6∈ F (b) so b ∈ B, a contradiction, and we are done. �

Remark 17.21. Note the similarity between the above argument and the argumentshowing that there is no set having all sets as elements (the “Russell paradox”).

Definition 17.22. Let S be a set of sets and I a set. A family of sets in S indexedby I is a map I → S, i 7→ Ai. We sometimes drop the reference to S. We alsowrite (Ai)i∈I to denote this family. By the union axiom for any such family thereis a set (denoted by

⋃i∈I Ai, called their union) such that for all x we have that

x ∈⋃i∈I Ai if and only if there exists i ∈ I such that x ∈ Ai. Also a set (denoted by⋂

i∈I Ai, called their intersection) exists such that for all x we have that x ∈⋂i∈I Ai

if and only if for all i ∈ I we have x ∈ Ai. A family of elements in (Ai)i∈I is amap I →

⋃i∈I Ai, i 7→ ai, such that for all i ∈ I we have ai ∈ Ai. Such a family of

elements is denoted by (ai)i∈I . One defines the product∏i∈I Ai are the set of all

families of elements (ai)i∈I .

62 ALEXANDRU BUIUM

Exercise 17.23. Check that for I = {i, j} the above definitions of ∪,∩,∏

yieldthe usual definition of Ai ∩Aj , Ai ∩Aj and Ai ×Aj .

Definition 17.24. Let F : A → B be a map and X ⊂ A. Define the image of Xas the set

F (X) = {F (x);x ∈ X} = {y ∈ B;∃x ∈ X, y = F (x)} ⊂ B

If Y ⊂ B define the inverse image of Y as the set

F−1(Y ) = {x ∈ A;F (x) ∈ Y } ⊂ A.

For y ∈ B define

F−1(y) = {x ∈ A;F (x) = y}(Note that F−1(Y ), F−1(y) are defined even if the inverse map F−1 does not exist.)

Exercise 17.25. Let F : {a, b, c, d, e, f, g} → {c, d, e, h}, F (a) = d, F (b) = c,F (c) = e, F (d) = c, F (e) = d, F (f) = c, F (g) = c. Let X = {a, b, c}, Y = {c, h}.Compute F (X), F−1(Y ), F−1(c), F−1(h).

Exercise 17.26. Prove that if F : A → B is a map and X ⊂ X ′ ⊂ A are subsetsthen F (X) ⊂ F (X ′).

Exercise 17.27. Prove that if F : A → B is a map and (Xi)i∈I is a family ofsubsets of A then

F (∪i∈IXi) = ∪i∈IF (Xi),

F (∩i∈IXi) ⊂ ∩i∈IF (Xi).

If in addition F is injective show that

F (∩i∈IXi) = ∩i∈IF (Xi).

Give an example showing that the latter may fail if F is not injective.

Exercise 17.28. Prove that if F : A → B is a map and Y ⊂ Y ′ ⊂ B are subsetsthen F−1(Y ) ⊂ F−1(Y ′).

Exercise 17.29. Prove that if F : A → B is a map and (Yi)i∈I is a family ofsubsets of B then

F−1(∪i∈IYi) = ∪i∈IF−1(Yi),

F−1(∩i∈IYi) = ∩i∈IF−1(Yi).

(So here one does not need injectivity like in the case of unions.)

Notation 17.30. If A and B are sets we denote by BA ⊂ P(A×B) the set of allmaps F : A→ B; sometimes one writes Map(A,B) = BA.

Exercise 17.31. Let 0, 1 be two elements. Prove that the map {0, 1}A → P(A)sending F : A→ {0, 1} into F−1(1) ∈ P(A) is a bijection.

Exercise 17.32. Find a bijection between (CB)A and CA×B . Hint: Send F ∈(CB)A, F : A→ CB , into the set (map)

{((a, b), c) ∈ (A×B)× C; (b, c) ∈ F (a)}.

A FIRST COURSE IN MATHEMATICS 63

Definition 17.33. A topology on a set X is a subset T ⊂ P(X) of the power withthe following properties:

1) ∅ ∈ T and X ∈ T;2) If U, V ∈ T then U ∩ V ∈ T;3) If (Ui)i∈I is a family of subsets Ui ⊂ X and if for all i ∈ I we have Ui ∈ T

then⋃i∈I Ui ∈ T;

A subset U ⊂ X is called open if U ∈ T. A subset Z ⊂ X is called closed if X\Zis open.

Example 17.34. T = P(X) is a topology on X.

Example 17.35. T = {∅, X} ⊂ P(X) is a topology on X.

Exercise 17.36. Prove that if U, V ⊂ X then

T = {∅, U, V, U ∪ V,U ∩ V,X}

is a topology. Find the closed sets of X.

Exercise 17.37. Prove that if (Tj)j∈J is a family of topologies Tj ⊂ P(X) on Xthen

⋂j∈J Tj is a topology on X.

Definition 17.38. If T0 ⊂ P(X) is a subset of the power set then the intersection

T =⋃

T′⊃T0

T′

of all topologies T′ containing T0 is called the topology generated by T0.

Exercise 17.39. Let T0 = {U, V,W} ⊂ P(X). Find explicitly the topology gener-ated by T0. Find all the closed sets in that topology.

Definition 17.40. A topological space is a pair (X,T) consisting of a set X anda topology T ⊂ P(X) on X. Sometimes one writes X instead of (X,T) is T isunderstood from context.

Definition 17.41. Let (X,T) and (X,′ T′) be two topological spaces. A mapF : X → X ′ is continuous if for all V ∈ T′ we have F−1(V ) ∈ T.

Exercise 17.42. If T is a topology on X and T′ is the topology on X ′ defined byT′ = {∅, Y } then any map F : X → X ′ is continuous.

Exercise 17.43. If T is the topology T = P(X) on X and T′ is any topology onX ′ then any map F : X → X ′ is continuous.

Exercise 17.44. Prove that if (X,T), (X ′,T′), (X ′′,T′′) are topological spacesand G : X → X ′, F : X ′ → X ′′ are continuous maps then the compositionF ◦G : X → X ′′ is continuous.

Exercise 17.45. Give an example of two topological spaces (X,T) and (X ′,T′)and of a bijection F : X → X ′ such that F is continuous but F−1 is not continuous.

18. Relations

Definition 18.1. If A is a set then a relation on A is a subset R ⊂ A × A. If(a, b) ∈ R we write aRb.

64 ALEXANDRU BUIUM

Remark 18.2. We tacitly introduce the new predicate “is a relation on ...” byan appropriate definition in the language of sets. Also, if R,A are sets (henceconstants) such that “R is a relation on A” is a theorem we may introduce a newrelational predicate (still denoted by R) by

∀x∀y((xRy)↔ ((x, y) ∈ R))

Definition 18.3. A relation R is called an order if (writing a ≤ b instead of aRb)we have, for all a, b, c ∈ A, that

1) a ≤ a (reflexivity),2) a ≤ b and b ≤ c imply a ≤ c (transitivity),3) a ≤ b and b ≤ a imply a = b (antisymmetry).

Notation 18.4. One writes a < b if a ≤ b and a 6= b.

Remark 18.5. One may introduce a new predicate “is an order relation” in theobvious way. The same can be done with other concepts to be introduced later.

Definition 18.6. An order relation is called total if for any a, b ∈ A either a ≤ bor b ≤ a. Alternatively we say A is totally ordered (by ≤).

Example 18.7. For instance if A = {a, b, c, d} then

R = {(a, a), (b, b), (c, c), (d, d), (a, b), (b, c), (a, c)}is an order but not a total order.

Exercise 18.8. Let R0 ⊂ A × A be a relation and assume R0 is contained in anorder relation R1 ⊂ A×A. Let

R =⋂

R′⊃R0

R′

be the intersection of all order relations R′ containing R0. Prove that R is an orderrelation and it is the smallest order relation containing R0 in the sense that it iscontained in any order relation that contains R0.

Exercise 18.9. Let A = {a, b, c, d, e} and R0 = {(a, b), (b, c), (c, d), (c, e)}. Findan order relation containing R0. Find the smallest order relation R containing R0.Show that R is not a total order.

Exercise 18.10. Let A be a set. For any subsets X ⊂ A and Y ⊂ A write X ≤ Yif and only if X ⊂ Y . This defines a relation on the set P(A). Prove that this is anorder relation. Give an example showing that this is not in general a total order.

Definition 18.11. An ordered set is a pair (A,≤) where A is a set and ≤ is anorder relation on A.

Definition 18.12. Let (A,≤) and (A′,≤′) be ordered sets. A map F : A→ A′ iscalled increasing if for any a, b ∈ A with a ≤ b we have F (a) ≤′ F (b).

Exercise 18.13. Prove that if (A,≤), (A′,≤′), (A′′,≤′′) are ordered sets andG : A→ A′, F : A′ → A′′ are increasing then F ◦G : A→ A′′ is increasing.

Definition 18.14. Let A be a set with an order ≤. We say α ∈ A is a minimalelement of A if for all a ∈ A such that a ≤ α we must have a = α.

Definition 18.15. Let A be a set with an order ≤. We say β ∈ A is a maximalelement of A if for all b ∈ A such that β ≤ b we must have β = b.

A FIRST COURSE IN MATHEMATICS 65

Definition 18.16. Let A be a set with an order ≤ and let B ⊂ A. We say m ∈ Bis a minimum element of B if for all b ∈ B we have m ≤ b. If a minimum elementexists it is unique (check!) and we denote it by min B. Note that min B, if itexists, then by definition it belongs to B.

Definition 18.17. Let A be a set with an order ≤ and B ⊂ A. We say M ∈ B isa maximum element of B if for all b ∈ B we have b ≤ M . If a maximum elementexists it is unique and we denote it by max B. Again, if max B exists then bydefinition it belongs to B.

Definition 18.18. Let A be set with an order ≤ and let B ⊂ A. An element u ∈ Ais called an upper bound for B if b ≤ u for all b ∈ B. We also say that B is boundedfrom above by b. An element l ∈ A is called a lower bound for B if l ≤ b for allb ∈ B; we also say B is bounded from below by l. If the set of upper bounds of Bhas a minimum element we call it the supremum of B and we denote it by sup B;if the set of lower bounds of B has a maximum element we call it the infinimumof B and we denote it by inf B. (Note that if one of sup B and inf B exists thatelement is by definition in A but does not necessarily belong to B.) We say B isbounded if it has both an upper bound and a lower bound.

Exercise 18.19. Consider the set A and the order ≤ defined by the relation Rin Exercise 18.9. Does A have a maximum element? Does A have a minimumelement? Are there maximal elements in A? Are there minimal elements in A?List all these elements in case they exist. Let B = {b, c}. Is B bounded? List allthe upper bounds of B. List all the lower bounds of B. Does the supremum of Bexist? If yes does it belong to B? Does the infimum of B exist? Does it belong toB?

Definition 18.20. A well ordered set is an ordered set (A,≤) such that any non-empty subset B ⊂ A has a minimum element.

Exercise 18.21. Prove that any well ordered set is totally ordered.

Remark 18.22. Later, when we will have introduced the ordered set of integersand the ordered set of rational numbers we will see that the non-negative integersare well ordered but the non-negative rationals are not well ordered.

The following theorems can be proved (but their proof is behind the scope ofthese notes):

Theorem 18.23. (Zorn’s lemma) Assume (A,≤) is an ordered set. Assume thatany non-empty totally ordered subset B ⊂ A has an upper bound in A. Then A hasa maximal element.

Theorem 18.24. (Well ordering principle) Let A be a set. Then there exists anorder relation ≤ on A such that (A,≤) is well ordered.

Remark 18.25. It can be proved that if one removes from the axioms of set theorythe axiom of choice then the axiom of choice, Zorn’s lemma, and the well orderingprinciple are all equivalent.

Exercise 18.26. Let (A,≤) and (B,≤) be totally ordered sets. Define a relation≤ on A×B by

((a, b) ≤ (a′, b′))↔ ((a ≤ a′) ∨ ((a = a′) ∧ (b ≤ b′)))

66 ALEXANDRU BUIUM

Prove that ≤ is an order on A × B (ir is called the lexicographic order) and that(A×B,≤) is totally ordered. (Explain how this order is being used to order wordsin a dictionary).

Definition 18.27. A relation R is called an equivalence relation if (writing a ∼ binstead of aRb) we have, for all a, b, c ∈ A, that

1) a ∼ a (reflexivity),2) a ∼ b and b ∼ c imply a ∼ c (transitivity),3) a ∼ b implies b ∼ a (symmetry);

we also say that ∼ is an equivalence relation.

Exercise 18.28. Prove that if R′ and R′′ are equivalence relations on A thenR′ ∩R′′ is an equivalence relation.

Exercise 18.29. Let R0 ⊂ A×A be a relation and let

R =⋂

R′⊃R0

R′

be the intersection of all equivalence relations R′ containing R0. Prove that R isan equivalence relation and it is the smallest equivalence relation containing R0 inthe sense that it is contained in any other equivalence relation that contains R0.

Definition 18.30. Given an equivalence relation ∼ as above for any a ∈ A we mayconsider the set a := {c ∈ A; c ∼ a} called the equivalence class of a.

Notation 18.31. Sometimes, instead of a, one writes a or [a].

Exercise 18.32. Prove that a = b if and only if a ∼ b.

Exercise 18.33. Prove that:1) if a ∩ b 6= ∅ then a = b;2) A =

⋃a∈A a.

Definition 18.34. If A is a set a partition of A is a family (Ai)i∈I if subsets Ai ⊂ Asuch that:

1) if i 6= j then Ai ∩Aj = ∅2) A =

⋃i∈I Ai.

Exercise 18.35. Let A be a set and ∼ an equivalence relation on it. Prove that:1) There exists a subset B ⊂ A which contains exactly one element of each

equivalence class (such a set is called a system of representatives; hint: use theaxiom of choice).

2) The family (b)b∈B is a partition of A.

Exercise 18.36. Let A be a set and (Ai)i∈I a partition of A. Define a relation Ron A as follows:

R = {(a, b) ∈ A×A;∃i((i ∈ I) ∧ (a ∈ Ai) ∧ (b ∈ Ai))}

Prove that R is an equivalence relation.

Exercise 18.37. Let A be a set. Prove that there is a bijection between the set ofequivalence relations on A and the set of partitions of A. Hint: use the above twoexercises.

A FIRST COURSE IN MATHEMATICS 67

Definition 18.38. The set of equivalence classes

{α ∈ P(A);∃a((a ∈ A) ∧ (α = a)}is denoted by A/ ∼ and is called the quotient of A by the relation ∼.

Example 18.39. For instance if A = {a, b, c} and

R = {(a, a), (b, b), (c, c), (a, b), (b, a)}

then R is an equivalence relation, a = b = {a, b}, c = {c}, and A/ ∼= {{a, b}, {c}}.

Exercise 18.40. Let A = {a, b, c, d, e, f} and R0 = {(a, b), (b, c), (d, e)}. Findthe smallest equivalence relation R containing R0. Call it ∼. Write down the

equivalence classes a, b, c, d, e, f . Write down the set A/ ∼.

Exercise 18.41. Let S be a set of sets. For any sets X,Y ∈ S write X ∼ Y if andonly if here exists a bijection F : X → Y . This defines a relation on S. Prove thatthis is an equivalence relation.

Exercise 18.42. Let S = {A,B,C,D}, A = {a, b}, B = {b, c}, C = {x, y}, D = ∅.Let ∼ be the equivalence relation on S defined in the previous exercise. Write down

the equivalence classes A, B, C, D and write down the set S/ ∼.

Definition 18.43. An affine plane is a pair (A,L) where A is a set and L ⊂ P(A)is a set of subsets of A satisfying a series of axioms which we now explain. It isconvenient to introduce some terminology as follows. A is the called the affineplane. The elements of A are called points. The elements L of L are called lines;so each such L ⊂ A. We say a point P lies on a line L if P ∈ L; we also say that Lpasses through P . We say that two lines intersect if they have a point in common;we say that two lines are parallel if they either coincide or they do not intersect.We say that 3 points are collinear if they lie on the same line. Here are the axiomsthat we impose:

1) There exist 3 points which are not collinear and any line has at least 2 points.2) Any 2 distinct points lie on exactly one line.3) If L is a line and P is a point not lying on L there exists exactly one line

through P which parallel to L.

Remark 18.44. Note that we have not defined 2 or 3 yet; this will be donelater when we introduce integers. The meaning of these axioms is however clearlyexpressible in terms that were already defined. For instance axiom 2 says that forany points P and Q with P 6= Q there exists a line through P and Q; we do notneed to define the symbol 2 to express this. The same holds for the use of thesymbol 3.

Exercise 18.45. Prove that any two distinct non-parallel lines intersect in exactlyone point.

Exercise 18.46. Let A = {a, b} × {a, b} and let L ⊂ P(A) consist of all subsetsof 2 elements; (there are 6 of them). Prove that (A,L) is an affine plane. (Againone can reformulate everything without reference to the symbols 2 or 6; one simplyuses 2 or 6 letters and writes that they are 6=.)

Exercise 18.47. Let A = {a, b, c}×{a, b, c} . Find all subsets L ⊂ P(A) such that(A,L) is an affine plane. (This is tedious !)

68 ALEXANDRU BUIUM

Definition 18.48. A projective plane is a pair (A,L) where A is a set and L ⊂P(A) is a set of subsets of A satisfying a series of axioms which we now explain.Again it is convenient to introduce some terminology as follows. A is the called theprojective plane. The elements of A are called points, P . The elements L of L arecalled lines; so each such L ⊂ A. We say a point P lies on a line L if P ∈ L; we alsosay that L passes through P . We say that two lines intersect if they have a pointin common; we say that two lines are parallel if they either coincide or they do notintersect. We say that 3 points are collinear if they lie on the same line. Here arethe axioms that we impose:

1) There exist 3 points which are not collinear and any line has at least 3 points.2) Any 2 distinct points lie on exactly one line.3) Any 2 distinct lines meet in exactly one point.

Example 18.49. One can attach to any affine plane (A,L) a projective plane

(A,L) as follows. We introduce the relation ‖ on L by letting L ‖ L′ is and only

if L and L′ are parallel. This is an equivalence relation (check!). Denote by L theequivalence class of L. Then we consider the set of equivalence classes, L∞ = L/ ‖;call this set the line at infinity. We let A = A ∪ L∞ (disjoint union). Define a line

in A to be either L∞ or set of the form L = L ∪ {L}. Finally define L to be theset of all lines in A.

Exercise 18.50. Check that (A,L) is a projective plane.

Exercise 18.51. Describe the projective plane attached to the affine plane inExercise 18.46; how many points does it have? How many lines?

19. Operations

Definition 19.1. A binary operation ? on a set A is a map ? : A × A → A,(a, b) 7→ ?(a, b).

Notation 19.2. We usually write a ? b instead of ?(a, b).

Example 19.3. For instance, we write (a ? b) ? c instead of ?(?(a, b), c). Instead of? we sometimes use notation like +,×, ◦, ...

Remark 19.4. We tacitly introduce the new predicate “is a binary operation on...” by an appropriate definition in the language of sets. Also, if ?,A are sets (henceconstants) such that “? is a binary operation on A” is a theorem we may introducea new functional predicate (still denoted by ?) by

∀x∀y∀z((x ? y = z)↔ (((x, y), z) ∈ ?))

Definition 19.5. A unary operation ′ on a set A is a map ′ : A→ A, a 7→ ′(a).

Notation 19.6. We usually write a′ or ′a instead of ′(a). Instead of ′ we sometimesuse notation like −, i, ...

Example 19.7. Let S = {0, 1} where 0, 1 are two symbols (with no meaning).Then there are 3 interesting binary operations on S denoted by ∧,∨,+ (and calledsupremum, infimum, and addition) defined as follows:

0 ∧ 0 = 0, 0 ∧ 1 = 0, 1 ∧ 0 = 0, 1 ∧ 1 = 1;

0 ∨ 0 = 0, 0 ∨ 1 = 1, 1 ∨ 0 = 1, 1 ∨ 1 = 1;

A FIRST COURSE IN MATHEMATICS 69

0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 0.

The ∧ is also denoted by × or · is is referred to as multiplication. + is also denotedby ∆. Also there is a unary operation ¬ on S defined by

¬1 = 0, ¬1 = 0.

Note that if we denote 0 and 1 by F and T then the operations ∧,∨,¬ on {0, 1}correspond exactly to the “logical operations” on F and T defined in the sectionon tables. This is not a coincidence!

Exercise 19.8. Compute ((0 ∧ 1) ∨ 1) + (1 ∧ (0 ∨ (1 + 1)).

Definition 19.9. A Boolean algebra is a tuple

(A,∨,∧,¬, 0, 1)

where ∧,∨ are binary operations, ¬ is a unary operation and 0, 1 ∈ A such that forall a, b, c ∈ A the following axioms are satisfied:

1) a ∧ (b ∧ c) = (a ∧ b) ∧ c, a ∨ (b ∨ c) = (a ∨ b) ∨ c,2) a ∧ b = b ∧ a, a ∨ b = b ∨ a,3) a ∧ 1 = a, a ∨ 0 = a,4) a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c), a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c)5) a ∧ (¬a) = 0, a ∨ (¬a) = 1.

Definition 19.10. A commutative unital ring (or simply a ring) is a tuple

(R,+,×,−, 0, 1)

(sometimes referred to simply as R) where R is a set, 0, 1 ∈ R, +,× are two binaryoperations (write a × b = ab) and − is a unary operation on R such that for anya, b, c ∈ R the following hold:

1) a+ (b+ c) = (a+ b) + c, a+ 0 = a, a+ (−a) = 0, a+ b = b+ a;2) a(bc) = a(bc), 1a = a, ab = ba,3) a(b+ c) = ab+ ac.

Notation 19.11. We write a+ b+ c instead of (a+ b) + c and abc for (ab)c. Wewrite a− b instead of a+ (−b).

Definition 19.12. An element a of a ring R is invertible if there exists a′ ∈ Rsuch that aa′ = 1; this a′ is then easily proved to be unique, it is called the inverseof a, and is denoted by a−1. A ring R is called a field if 0 6= 1 and for any non-zeroelement is invertible.

Definition 19.13. A Boolean ring is a commutative unital ring such that 1 6= 0and for all a ∈ A we have a2 = a.

Exercise 19.14. Prove that in a Boolean ring A we have a+ a = 0 for all a ∈ A.

Exercise 19.15. Prove that1) ({0, 1},∨,∧,¬, 0, 1) is a Boolean algebra.2) ({0, 1},+,×, I, 0, 1) is a Boolean ring and a field (I is the identity map).

Exercise 19.16. Prove that if a Boolean ring A is a field then A = {0, 1}.

70 ALEXANDRU BUIUM

Definition 19.17. Let A be a set and let S = P(A) be the power set of A. Definethe following operations on S:

X ∧ Y = X ∩ YX ∨ Y = X ∪ YX∆Y = (X ∪ Y )\(X ∩ Y )¬X = CX = A\X

Exercise 19.18. Prove that1) (P(A),∨,∧,¬, ∅, A) is a Boolean algebra;2) (P(A),∆,∧, I, ∅, A) is a Boolean ring (I is the identity map).Hint: For any a ∈ A one can define a map ψa : P(A) → {0, 1} by setting

ψa(X) = 1 if and only if a ∈ X. Note that1) ψa(X ∧ Y ) = ψa(X) ∧ ψa(Y ),2) ψa(X ∨ Y ) = ψa(X) ∨ ψa(Y ),3) ψa(X∆Y ) = ψa(X) + ψa(Y ),4) ψa(¬X) = ¬ψa(X).

Next note that X = Y if and only if ψa(X) = ψa(Y ) for all a ∈ A. Use thesefunctions to reduce the present exercise to the previous exercise.

Definition 19.19. Given a subset X ⊂ A one can define the characteristic functionχX : A → {0, 1} by letting χX(a) = 1 if and only if a ∈ X; in other wordsχX(a) = ψa(X).

Exercise 19.20. Prove that1) χX∨Y (a) = χX(a) ∨ χY (a),2) χX∧Y (a) = χX(a) ∧ χY (a),3) χX∆Y (a) = χX(a) + χY (a),4) χ¬X(a) = ¬χX(a).

Definition 19.21. An algebraic structure is a tuple (A, ?, •, ...,¬,−, ..., 0, 1, ...)where A is a set, ?, •, ... are binary operations, ¬,−, ... are unary operations, and1, 0, ... are given elements of A. (Some of these may be missing; for instance wecould have only one binary operation, one given element, and no unary operations.)Assume we are given two algebraic structures

(A, ?, •, ...,¬,−, ..., 0, 1, ...) and (A′, ?′, •′, ...,¬′,−′, ..., 0′, 1′, ...)(with the same number of corresponding operations). A map F : A→ A′ is calleda homomorphism if for all a, b ∈ A we have:

1) F (a ? b) = F (a) ?′ F (b), F (a • b) = F (a) •′ F (b),...2) F (¬a) = ¬′F (a), F (−a) = −′F (a),...3) F (0) = 0′, F (1) = 1′,...

Example 19.22. A map F : A → A′ between two commutative unital rings is acalled a homomorphism (of commutative unital rings) if for all a, b ∈ A we have:

1) F (a+ b) = F (a) + F (b) and F (ab) = F (a)F (b),2) F (−a) = −F (a) (prove that this is automatic !),3) F (0) = 0 (prove that this is automatic !) and F (1) = 1.

Definition 19.23. A subset A ⊂ P(A) is called a Boolean algebra of sets if thefollowing hold:

1) ∅ ∈ A, A ∈ A;2) If X,Y ∈ A then X ∩ Y ∈ A, X ∪ Y ∈ A, CX ∈ A.

A FIRST COURSE IN MATHEMATICS 71

(Hence (A,∨,∧,C, ∅, A) is a Boolean algebra.)

Exercise 19.24. Prove that if A is a Boolean algebra of sets then for any X,Y ∈ A

we have X∆Y ∈ A. Prove that (A,∆,∩, I, ∅, A) is a Boolean ring.

Definition 19.25. A subset B ⊂ P(A) is called a Boolean ring of sets if thefollowing properties hold:

1) ∅ ∈ B, A ∈ B;2) If X,Y ∈ B then X ∩ Y ∈ B, X∆Y ∈ B.

(Hence (A,∆,∨, I, ∅, A) is a Boolean ring.)

Exercise 19.26. Prove that any Boolean ring of sets B is a Boolean algebra ofsets.

Definition 19.27. A commutative unital ordered ring (or simply an ordered ring)is a tuple

(R,+,×,−, 0, 1,≤)

where(R,+,×,−, 0, 1)

is a ring, ≤ is a total order on R, and for all a, b, c ∈ R the following axioms aresatisfied

1) If a < b then a+ c < b+ c;2) If a < b and c > 0 then ac < bc.

Exercise 19.28. Prove that if R is a finite ring then it cannot have a structure ofordered ring. In particular the unique order on {0, 1} with 0 ≤ 1 does not makethe tuple ({0, 1},+,×,−, 0, 1,≤) an ordered ring!

Remark 19.29. We cannot give examples yet of ordered rings; in fact we will laterpostulate the existence of the ordered ring of integers; all other ordered rings willbe constructed from the integers.

Definition 19.30. Let R be an ordered ring and let R+ = {a ∈ R; a ≥ 0}. A finitemeasure space is a triple (A,A, µ) where A is a set, A ⊂ P(A) is a Boolean algebraof sets, and µ : A → R+ satisfying the following property for any X,Y ∈ A withX ∩ Y = ∅ we have

µ(X ∪ Y ) = µ(X) + µ(Y ).

If in addition µ(A) = 1 we say (A,A, µ) is a finite probability measure. We saythat X,Y ∈ A are independent if µ(X ∩ Y ) = µ(X) · µ(Y ).

Exercise 19.31. Prove that in a finite measure space µ(∅) = 0 and for any X,Y ∈A we have

µ(X ∪ Y ) = µ(X) + µ(Y )− µ(X ∩ Y ).

Exercise 19.32. Let (A,∨,∧,¬, 0, 1) be a Boolean algebra. For any a, b ∈ A set

a+ b = (a ∨ b) ∧ (¬(a ∧ b)).Prove that (A,+,∧, I, 0, 1) is a Boolean ring (I the identity map.)

Exercise 19.33. Let (A,+,×,−, 0, 1) be a Boolean ring. For any a, b ∈ A let

a ∨ b = a+ b− aba ∧ b = ab¬a = 1− a.

Prove that (A,∨,∧,¬, 0, 1) is a Boolean algebra.

72 ALEXANDRU BUIUM

Exercise 19.34. Let (X,T) be a topological space and let A ⊂ P(A) be the setof all X ∈ P(A) such that X ∈ T and CX ∈ T. (I.e. A is the collection of all setswhich are both open and closed.) Then A is a Boolean algebra of sets.

20. Integers

Definition 20.1. A well ordered ring is an ordered ring (R,+,×, 0, 1,≤) with 1 6= 0having the property that any non-empty subset of R which is bounded from belowhas a minimum element.

Remark 20.2. If (R,+,×, 0, 1,≤) is a well ordered ring then (R,≤) is not wellordered. But if R>0 = {a ∈ R; a > 0} then (R>0,≤) is a well ordered set.

We have the following remarkable theorem in set theory Tset:

Theorem 20.3. There exists a well ordered ring.

Remark 20.4. The above theorem is formulated in argot; but it should be under-stood as being a sentence Z in Lset of the form

∃r∃s∃p∃o∃u∃l(...)

where we take a variable r to stand for the ring, a variable s for the sum, p forthe product, o for 0, u for 1, l for ≤, and the dots stand for the correspondingconditions in the definition of a well ordered ring, written in the language of sets.The sentence Z is complicated so we preferred to give the theorem not a sentencein Lset but as a sentence in argot. This kind of abuse is very common.

Remark 20.5. We are not going to prove Theorem 20.3 but rather sketch theidea of the proof which is rather involved. For the reader who feels he/she cannotaccept such a basic theorem without proof one can simply proceed as follows: addthis theorem to the ZFC and replace Tset by the theory with axioms this enrichedsystem of axioms. This is what all working mathematicians essentially do anyway.

Notation 20.6. We let Z,+,×, 0, 1,≤ be the witnesses for the sentence Z above;we call Z the ring of integers. In particular the conditions in the definition of rings(associativity, commutativity, etc.) and order (transitivity, etc.) become theoremsfor Z. We also set N = {a ∈ Z; a > 0} and we call N the set of natural numbers.Later we will prove the “essential uniqueness” of Z.

Remark 20.7. The only predicate in the language Lset of sets is ∈ and the con-stants in this language are called sets. In particular when we consider the orderedring of integers (Z,+,×, 0, 1,≤) the symbols Z,+,×, 0, 1,≤,N are all constants(they are sets). In particular +,× are not originally functional predicates and ≤is not originally a relational predicate. But, according to our conventions, we mayintroduce functional predicates (still denoted by +,×) and a relational predicate(still denoted by ≤) via appropriate definitions. (This is because “the set + is aninary operation on Z” is a theorem, etc.)

Exercise 20.8. Prove that 1 ∈ N. Hint: use (−1)× (−1) = 1.

Exercise 20.9. Prove that if a, b ∈ Z and ab = 0 then either a = 0 or b = 0. Givean example of a ring where this is not true. (For the latter consider the Booleanalgebra P(A).)

A FIRST COURSE IN MATHEMATICS 73

Exercise 20.10. Prove that if a ∈ Z then the set {x ∈ Z; a− 1 < x < a} is empty.Hint (actually a complete proof): It is enough to show that S = {x ∈ Z; 0 < x < 1}is empty. Assume S is non-empty and let m = minS. Then 0 < m2 < m, hence0 < m2 < 1 and m2 < m, a contradiction.

Exercise 20.11. Prove that if a ∈ N then a = 1 or a − 1 ∈ N. Conclude thatminN = 1. Hint :use previous exercise.

In what follows we sketch the main idea behind the proof of Theorem 20.3. Webegin with the following:

Definition 20.12. Define a Peano triple to be a triple (N, 1, σ) where N is a set,1 ∈ N , and σ : N → N is a map such that

1) σ is injective;2) σ(N) = N\{1} and3) for any subset S ⊂ N if 1 ∈ S and σ(S) ⊂ S then S = N .The conditions 1,2,3 are called “Peano’s axioms”.

Remark 20.13. Given a well ordered ring one can easily prove there exists a Peanotriple; cf. Exercise 20.14 below. Conversely given a Peano triple one can prove thereexists a well ordered ring; this is tedious and will be addressed in Exercise 20.15.The idea of proof of Theorem 20.3 is to prove in set theory (i.e. ZFC) that thereexists a Peano triple; here the axiom of infinity is crucial. Then the existence of awell ordered ring follows.

Exercise 20.14. Prove that if there exists a well ordered ring Z then σ : N → Nis σ(x) = x+ 1 then (N, 1, σ) is a Peano triple.

Exercise 20.15. This exercise gives some steps towards showing how to constructa well ordered ring from a given Peano triple. Assume (N, 1, σ) is a Peano triple.For y ∈ N let

Ay = {τ ∈ NN ; τ(1) = σ(y),∀x(τ(σ(x)) = σ(τ(x)))}

1) Prove that Ay has at most one element. Hint: if τ, η ∈ Ay and S = {x; τ(x) =η(x)} then 1 ∈ S and σ(S) ⊂ S; so S = N .

2) Prove that for any y, Ay 6= ∅. Hint: If T = {y ∈ N ;Ay 6= ∅} then 1 ∈ T andσ(T ) ⊂ T ; so T = N .

3) By 1 and 2 we may write Ay = {τy}. Then define + on N by x+ y = τy(x).4) Prove that x+ y = y + x and (x+ y) + z = x+ (y + z) on N .5) Prove that if x, y ∈ N , x 6= y, then there exists z ∈ N such that either

y = x+ z or x = y + z.6) Define N ′ = {−} × N , Z = N ′ ∪ {0} ∪ N where 0 and − are two symbols.

extend naturally + to Z.7) Define × on N and then on Z in the same style as for +.8) Define ≤ on N and prove (N,≤) is well ordered. Extend this to Z.9) Prove that (Z,+,×, 0, 1,≤) is a well ordered ring.

From now on we accept Theorem 20.3 (either as a theorem whose proof wesummarily sketched or as an additional axiom for set theory).

74 ALEXANDRU BUIUM

Definition 20.16. Define the natural numbers 2, 3, ..., 9 by

2 = 1 + 13 = 2 + 1

....9 = 8 + 1

Define 10 = 2 × 5. Define 102 = 10 × 10, etc. Define symbols like 423 as being4 × 102 + 2 × 10 + 3, etc. This is called a decimal representation. (We will laterprove that any natural number has a decimal representation.)

Exercise 20.17. Prove that 12 = 9 + 3. Hint (actually a complete proof): wehave:

12 = 10 + 2= 2× 5 + 2= (1 + 1)× 5 + 2= 1× 5 + 1× 5 + 2 = 5 + 5 + 2= 5 + 5 + 1 + 1 = 5 + 6 + 1 = 5 + 7 = 4 + 1 + 7= 4 + 8 = 3 + 1 + 8 = 3 + 9 = 9 + 3

Exercise 20.18. Prove that 18 + 17 = 35. Prove that 17× 3 = 51.

Remark 20.19. In Kant’s Critique of pure reason statements like the ones in theprevious exercise were viewed as synthetic; in contemporary mathematics, hence inthe approach we follow, all these statements are, on the contrary, analytic state-ments. (The definition of analytic/synthetic is taken here in the sense of Leibnizand Kant; there is another pair of notions, a priori/a posteriori, whose definitionin Leibniz is different from the one of Kant; we did not discuss and will not discussthis latter pair of notions here.)

Exercise 20.20. Prove that 7 ≤ 20.

Notation 20.21. For any integers a, b ∈ Z the set {x ∈ Z; a ≤ x ≤ b} willbe denoted, for simplicity, by {a, ..., b}. This set is clearly empty if a > b. Ifother numbers in addition to a, b are specified then the meaning of our notationwill be clear from the context; for instance {0, 1, ..., n} means {0, ..., n} whereas{2, 4, 6, ..., 2n} will mean {2x; 1 ≤ x ≤ n}, etc. A similar convention applies if thereare no numbers after the dots.

Example 20.22. {−2, ..., 11} = {−2,−1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}.

Recall that a subset A ⊂ N is bounded (equivalently bounded from above) ifthere exists b ∈ N such that a ≤ b for all a ∈ A; we say that A is bounded by bfrom above.

Exercise 20.23. Prove that N is not bounded.

Exercise 20.24. Prove that any subset of Z bounded from above has a maximum.Hint: If A is bounded from above by b consider the set {b− x;x ∈ A}.

Definition 20.25. An integer a is even if there exists an integer b such that a = 2b.An integer is odd if it is not even.

Exercise 20.26. Prove that if a is odd then a − 1 is even. Hint: consider theset {b ∈ N; 2b ≥ a}, and let c be the minimum element of S. Then show that2(c− 1) < a. Finally show that this implies a = 2c− 1.

A FIRST COURSE IN MATHEMATICS 75

Exercise 20.27. Prove that if a and b are odd then ab is odd. Hint: write a = 2c+1and b = 2d+ 1 (cf. the previous exercise) and compute (2c+ 1)(2d+ 1).

Exercise 20.28. Consider the following sentence: There is no bijection betweenN and Z. Explain the mistake in the following wrong proof:

Proof. Assume there is a bijection f : N → Z. Define f(x) = x. Then f is notsurjective so it is not a bijection.

Exercise 20.29. Prove that there is a bijection between N and Z.

21. Induction

Let P (x) be a formula in the language Lset of sets, with free variable x. For eachsuch P (x) we have:

Proposition 21.1. (Induction Principle) Assume1) P (1).2) For all n ∈ N if n 6= 1 and P (n− 1) then P (n).

Then for all n ∈ N we have P (n).

The above is expressed, as usual, in argot. The same expressed as a sentence inLset reads:

(P (1) ∧ (∀x((x ∈ N) ∧ (x 6= 0) ∧ P (x− 1))→ P (x)))→ (∀x((x ∈ N)→ P (x)))

We refer to the above as induction on n. For each explicit P (x) this is a gen-uine theorem. Note that the above Proposition does not say “for all P somethinghappens”; that would not be a sentence in the language of sets.

Proof. Let S = {n ∈ N;¬P (n)}. Let m be the minimum of S. By 1) m 6= 1. ByExercise 20.11 m − 1 ∈ N. By minimality of m, we have P (m − 1). By 2) we getP (m), a contradiction. �

Exercise 21.2. Define n2 = n × n and n3 = n2 × n for any integer n. Provethat for any natural n there exists an integer m such that n3 − n = 3m. (Laterwe will say that 3 divides n3 − n.) Hint: proceed by induction on n as follows.Let P (n) be the sentence: for all natural n there exists an integer m such thatn3 − n = 3m. P (1) is true because 13 − 1 = 3 × 0. Assume now that P (n − 1) istrue i.e. (n − 1)3 − (n − 1) = 3q for some integer q and let us check that P (n) istrue i.e. that n3−n = 3m for some integer m. The equality (n−1)3− (n−1) = 3qreads n3 − 3n2 + 3n− 1− n+ 1 = 3q. Hence n3 − n = 3(n2 − n) and we are done.

Exercise 21.3. Define n5 = n3 × n2. Prove that for any natural n there exists aninteger m such that n5 − n = 5m.

Proposition 21.4. If there exists a bijection {1, ..., n} → {1, ...,m} then n = m.

Proof. We proceed by induction on n. Let P (n) be the statement of the Propo-sition. Clearly P (1) is true; cf. the Exercise below. Assume now P (n − 1) is trueand let’s prove that P (n) is true. So consider a bijection F : {1, ..., n} → {1, ...,m};we want to prove that n = m. Let i = F (n) and define the map G : {1, ..., n−1} →{1, ...,m}\{i} by G(j) = F (j) for all 1 ≤ j ≤ n− 1. Then clearly G is a bijection.Now consider the map H : {1, ...,m}\{i} → {1, ...,m− 1} defined by H(j) = j for1 ≤ j ≤ i−1 and H(j) = j−1 for i+1 ≤ j ≤ m. (The definition is correct because

76 ALEXANDRU BUIUM

for any j ∈ {1, ...,m}\{i} either j ≤ i− 1 or j ≥ i+ 1; cf. Exercise 20.10.) ClearlyH is a bijection. We get a bijection

H ◦G : {1, ..., n− 1} → {1, ...,m− 1}.Since P (n− 1) is true we get n− 1 = m− 1. Hence n = m and we are done. �

Exercise 21.5. Check that P (1) is true in the above Proposition.

Remark 21.6. Note the general strategy of proofs by inductions. Say P (n) is“about n objects”. There are two steps. The first step is the verification of P (1)i.e. one verifies the statement “for one object”. For the second step (called theinduction step) one considers a situation with n objects; one “removes” from thatsituation “one object” to get a “situation with n−1 objects”; one uses the “inductionhypothesis” P (n− 1) to conclude the claim for the “situation with n− 1 objects”.Then one tries to “go back” and prove that the claim is true for the situation withn objects. So the second step is performed by “removing” one object from anarbitrary situation with n objects and NOT by adding one object to an arbitrarysituation with n − 1 objects. Below is an example of a fallacious reasoning byinduction based on “adding” instead of “subtracting” an object.

Example 21.7. Here is a WRONG argument for the induction step in the proofof Proposition 21.4.

Let G : {1, ..., n − 1} → {1, ...,m − 1} be any bijection and let F : {1, ..., n} →{1, ...,m} be defined by F (i) = G(i) for i ≤ n − 1 and F (n) = m. Clearly F is abijection. Now by the induction hypothesis n−1 = m−1. Hence n = m. This endsthe proof.

The mistake is that the above does not end the proof: the above argumentonly covers bijections F : {1, ..., n} → {1, ...,m} constructed from bijections G :{1, ..., n − 1} → {1, ...,m − 1} in the special way described above. In other wordsan arbitrary bijection F : {1, ..., n} → {1, ...,m} does not always arise the way wedefined F in the above “proof”. In some sense the mistake we just pointed out isthat of defining the same constant twice (cf. Example 20.28): we were supposedto define the symbol F as being an arbitrary bijection but then we redefined Fin a special way through an arbitrary G. The point is that if G is arbitrary andF is defined as above in terms of G then F will not be arbitrary (because F willalways send n into m). THIS KIND OF MISTAKE IS VERY COMMON WITHBEGINNERS IN MATHEMATICS.

Definition 21.8. A set A is finite if there exists an integer n ≥ 0 and a bijectionF : {1, ..., n} → A. (n is then unique by Proposition 21.4.) We write |A| = n andwe call this number the cardinality of A or the number of elements of A. (Note that|∅| = 0.) If F (i) = ai we write A = {a1, ..., an}. A set is infinite if it is not finite.

Exercise 21.9. Prove that |{2, 4,−6, 9,−100}| = 5.

Exercise 21.10. For any finite sets A and B we have that A ∪B is finite and

|A ∪B|+ |A ∩B| = |A|+ |B|.Hint: Reduce to the case A∩B = ∅. Then if F : {1, ..., a} → A and G : {1, ..., b} →B are bijections prove that H : {1, ..., a + b} → A ∪ B defined by H(i) = F (i) for1 ≤ i ≤ a and H(i) = G(i− a) for a+ 1 ≤ i ≤ a+ b is a bijection.

A FIRST COURSE IN MATHEMATICS 77

Exercise 21.11. For any finite sets A and B we have that A×B is finite and

|A×B| = |A| × |B|.

Hint: Induction on |A|.

Exercise 21.12. Let F : {1, ..., n} → Z be an injective map and write F (i) = ai.We refer to such a map as a (finite) family of numbers. Prove that there exists aunique map G : {1, ..., n} → Z such that G(1) = a1 and G(k) = G(k − 1) + ak for2 ≤ k ≤ n. Hint: induction on n.

Definition 21.13. In the notation of the above Exercise define the (finite) sum∑ni=1 ai as the number G(n). We also write a1 + ...+ an for this sum. If a1 = ... =

an = a the sum a1 + ...+ an is written as a+ ...+ a (n times).

Exercise 21.14. Prove that for any a, b ∈ N we have

a× b = a+ ...+ a (b times) = b+ ...+ b (a times).

Exercise 21.15. Define in a similar way the (finite) product∏ni=1 ai (which is

also denoted by a1...an = a1 × ... × an). Prove the analogues of associativity anddistributivity for sums and products of families of numbers. Define ab for a, b ∈ Nand prove that ab+c = ab × ac and (ab)c = abc.

Exercise 21.16. Prove that if a is an integer and n is a natural number then

an − 1 = (a− 1)(an−1 + an−2 + ...+ a+ 1).

Hint: induction on n.

Exercise 21.17. Prove that if a is an integer and n is an integer then

a2n+1 + 1 = (a+ 1)(a2n − a2n−1 + a2n−2 − ...− a+ 1).

Hint. Set a = −b.

Exercise 21.18. Prove that a subset A ⊂ N is bounded if and only if it is finite.Hint: To prove that bounded sets are finite assume this is false and let b be theminimum natural number with the property that there is a set A bounded fromabove by b and infinite. If b 6∈ A then A is bounded from above by b− 1 (Exercise20.10) and we are done. If b ∈ A then, by minimality of b, that there is a bijectionA\{b} → {1, ...,m} and one constructs a bijection A → {1, ...,m + 1} which isa contradiction. To prove that finite sets are bounded assume this is false andlet n be minimum natural number with the property that there is a finite subsetA ⊂ N of cardinality n which is not bounded. Let F : {1, ..., n} → A be a bijection,ai = F (i). Then {a1, ..., an−1} is bounded from above by some b and conclude thatA is bounded from above by either b or an.

Exercise 21.19. Prove that any subset of a finite set is finite. Hint: use theprevious exercise.

Definition 21.20. Let A be a set and n ∈ N. Define the set An to be the setA{1,...,n} of all maps {1, ..., n} → A. Call

A? =⋃n=1

An

the set of words with letters in A.

78 ALEXANDRU BUIUM

Notation 21.21. If f : {1, ..., n} → A and f(i) = ai we write f as a “tuple”(a1, ..., an) and sometimes as a “word” a1...an.

Exercise 21.22. Show that the maps An ×Am → An+m,

((a1, ..., an), (b1, ..., bm)) 7→ (a1, ..., an, b1, ..., bm)

(called concatenations) are bijections. They induce a non-injective binary operationA? ×A? → A?, (u, v)→ uv. Prove that u(vw) = (uv)w.

22. Rationals

Definition 22.1. For any a, b ∈ Z with b 6= 0 define the fraction ab to be the set of

all pairs (c, d) with c, d ∈ Z, d 6= 0 such that ad = bc. Call a and b the numeratorand the denominator of the fraction a

b . Denote by Q the set of all fractions. So

a

b= {(c, d) ∈ Z× Z; d 6= 0, ad = bc},

Q = {ab

; a, b ∈ Z, b 6= 0}.

Example 22.2.6

10= {(6, 10), (−3,−5), (9, 15), ...} ∈ Q.

Exercise 22.3. Prove that ab = c

d if and only if ad = bc. Hint: assume ad = bcand let us prove that a

b = cd . We need to show that a

b ⊂cd and that c

d ⊂ab . Now if

(x, y) ∈ ab then xb = ay; hence xbd = ayd. Since ad = bc we get xbd = bcy. Hence

b(xd − cy) = 0. Since b 6= 0 we have xd − cy = 0 hence xd = cy hence (x, y) ∈ cd .

We proved that ab ⊂

cd . The other inclusion is proved similarly. So the equality

ab = c

d is proved. Conversely if one assumes ab = c

d one needs to prove ad = bc; Ileave this to the reader.

Exercise 22.4. On the set A = Z× (Z\{0}) one can consider the relation: (a, b) ∼(c, d) if and only if ad = bc. Prove that ∼ is an equivalence relation. Then observethat a

b is the equivalence class

(a, b)

of (a, b). Also observe that Q = A/ ∼ is the quotient of A by the relation ∼.

Notation 22.5. Write a1 = a; this identifies Z with a subset of Q

Definition 22.6. Define ab + c

d = ad+bcbd , a

b ×cd = ac

bd .

Exercise 22.7. Show that the above definition is correct (i.e. if ab = a′

b′ , cd = c′

d′

then ad+bcbd = a′d′+b′c′

b′d′ and similarly for the product.)

Exercise 22.8. Prove that Q (with the operations + and × defined above andwith the elements 0, 1) is a field.

Definition 22.9. For ab ,

cd with b, d > 0 write a

b ≤cd if ad − bc ≤ 0. Also write

ab ≤

cd if a

b ≤cd and a

b 6=cd .

Exercise 22.10. Prove that Q equipped with ≤ is an ordered ring but it is not awell ordered ring.

A FIRST COURSE IN MATHEMATICS 79

Exercise 22.11. Let A be a finite set and define µ : P(A)→ Q by

µ(X) =|X||A|

.

1) (A,P(A), µ) is a finite probability measure space.2) Prove that if X = Y 6= A then X and Y are not independent.3) Prove that if X∩Y = ∅ and X 6= ∅, Y 6= ∅ then X and Y are not independent.4) Prove that if A = B × C, X = B′ × C, Y = B × C ′, B′ ⊂ B, C ′ ⊂ C, then

X and Y are independent.

Exercise 22.12. Prove by induction the following equalities:

1 + 2 + ...+ n =n(n+ 1)

2.

12 + 22 + ...+ n2 =n(n+ 1)(2n+ 1)

6

13 + 23 + ...+ n3 =n2(n+ 1)2

4.

23. Combinatorics

Definition 23.1. For n ∈ N define the factorial of n (read n factorial) by

n! = 1× 2× ...× n ∈ N.

Also set 0! = 1.

Definition 23.2. For 0 ≤ k ≤ n in N define the binomial coefficient(nk

)=

n!

k!(n− k)!∈ Q.

One also reads this “n choose k”.

Exercise 23.3. Prove that (nk

)=

(n

n− k

)and (

n0

)= 1,

(n1

)= n.

Exercise 23.4. Prove that(nk

)+

(n

k + 1

)=

(n+ 1k + 1

).

Hint: direct computation with the definition.

Exercise 23.5. Prove that (nk

)∈ Z.

Hint: fix k and proceed by induction on n; use Exercise 23.4.

80 ALEXANDRU BUIUM

Exercise 23.6. For any a, b in any ring we have

(a+ b)n =

n∑k=0

(nk

)akbn−k.

Here if c is in a ring R and m ∈ N then mc = c+ ...+ c (m times). Hint: Inductionon n and use Exercise 23.4.

Exercise 23.7. (Subsets) Prove that if |A| = n then |P(A)| = 2n. (A set with nelements has 2n subsets.) Hint: induction on n; if A = {a1, ..., an+1} use

P(A) = {B ∈ P(A); an+1 ∈ B} ∪ {B ∈ P(A); an+1 6∈ B}.

Exercise 23.8. (Combinations) Let A be a set with |A| = n, let 0 ≤ k ≤ n andset

Comb(k,A) = {B ∈ P(A); |B| = k}Prove that

|{Comb(k,A)| =(nk

).

I.e. a set of n elements has exactly

(nk

)subsets with k elements. A subset of A

having k elements is called a combination of k elements from the set A.Hint: Fix k and proceed by induction on n. If A = {a1, ..., an+1} use Exercise

23.4 plus the fact that Comb(k,A) can be written as

{B ∈ P(A); |B| = k, an+1 ∈ B} ∪ {B ∈ P(A); |B| = k, an+1 6∈ B}.

Exercise 23.9. (Permutations) For a set A let Perm(A) ⊂ AA be the set of allbijections F : A→ A. Prove that if |A| = n then

|Perm(A)| = n!.

The elements of Perm(A) are called permutations of A so the exercise says that aset of n elements has n! permutations. Hint: Let |A| = |B| = n and let Bij(A,B)be the set of all bijections F : A → B; it is enough to show that |Bij(A,B)| = n!.Proceed by induction on n; if A = {a1, ..., an+1}, B = {b1, ..., bn+1} then use thefact that

Bij(A,B) =

n+1⋃k=1

{F ∈ Bij(A,B);F (a1) = bk}.

For d ∈ N and X a set write Xd be the set of all maps {1, ..., d} → X.

Exercise 23.10. (Combinations with repetition) Let

Combrep(n, d) = |{(x1, ..., xd) ∈ Zd;xi ≥ 0, x1 + ...+ xd = n}Prove that

|Combrep(n, d)| =(n+ d− 1d− 1

)Hint: Let A = {1, ..., n+ d− 1}. Prove that there is a bijection

Comb(d− 1, A)→ Combrep(n, d)

The bijection is given by attaching to any subset

{i1, ..., id−1} ⊂ {1, ..., n+ d− 1}(where i1 < ... < id−1) the tuple (x1, ..., xd−1) where

A FIRST COURSE IN MATHEMATICS 81

1) x1 = |{i ∈ Z; 1 ≤ i < i1}|,2) xk = |{i ∈ Z; ik < i < ik+1}|, for 2 ≤ k ≤ d− 1, and3) xd = |{i ∈ Z; id−1 < i ≤ n+ d− 1}|.

24. Sequences

Definition 24.1. A sequence in a set A is a map F : N→ A. If we write F (n) = anwe also say that a1, a2, , ... is a sequence in A.

Theorem 24.2. (Recursion theorem) Let A be a set, a ∈ A an element, and letF1, F2, ... be a sequence of maps A → A. Then there is a unique map G : N → Asuch that F (1) = a and G(n+ 1) = Fn(G(n)) for all n ∈ N.

Sketch of proof. We construct G as a subset of N × A. Let Y be the set of allsubsets Y ⊂ N×A with the property that

((1, a) ∈ Y ) ∧ (n, x) ∈ Y )→ ∀n((n+ 1, Fn(x)) ∈ Y )

and we defineG =

⋂Y ∈Y

Y.

Then one proves by induction on n that for any n ∈ N there exists unique x ∈ Asuch that (n, x) ∈ G. So G is a map. One then checks that G has the desiredproperty. �

Here are some applications of recursion.

Proposition 24.3. Let (A,≤) be an ordered set that has no maximal element.Then there is a sequence F : N → A such that for all n ∈ N we have F (n) <F (n+ 1).

Proof. Let B = {(a, b) ∈ A × A; a < b} where a < b means, of course, a ≤ band a 6= b. By hypothesis the first projection F : B → A, (a, b) 7→ a is surjective.By the axiom of choice there exists G : A → B such that F ◦ G = IA. ThenG(a) > a for all a. By the recursion theorem there exists F : N → A such thatF (n+ 1) = G(F (n)) for all n and we are done. �

Exercise 24.4. (Uniqueness of the ring of integers) Let Z and Z ′ be two orderedrings satisfying the well ordering axiom. Prove that there exists a unique ringhomomorphism F : Z → Z ′; prove that this F is bijective and increasing. Hint: LetZ+ be the set of all elements in Z which are > 0 and similarly for Z ′. By recursion(which is, of course, valid in any ordered ring satisfying the well ordering axiom:explain!) there is a unique F : Z+ → Z ′+ satisfying F (1) = 1 and F (n+ 1) = F (n).Define F on Z by F (−n)− F (n) for −n ∈ Z+.

Definition 24.5. A set A is countable if there exists a bijection F : N→ A.

Example 24.6. The set of all squares S = {n2;n ∈ N} is countable; indeedF : Z→ S, F (n) = n2 is a bijection.

Exercise 24.7. Any subset of a countable set is countable. Hint: It is enough toshow that any subset A ⊂ N is countable. Let F ⊂ N× N be the set

F = {(x, y); y = min(B ∩ {z ∈ N; z > x})which is of course a map. By the recursion theorem there exists G : N → N suchthat G(n+ 1) = F (G(n)). One checks that G is injective and its image is B.

82 ALEXANDRU BUIUM

Exercise 24.8. Prove that N × N is countable. Hint: One can find injectionsN× N→ N ; e.g. (m,n) 7→ (m+ n)2 +m.

Exercise 24.9. Prove that Q is countable.

Example 24.10. P(N) is not countable. Indeed this is a consequence of the moregeneral theorem we proved that there is no bijection between a set A and its powerset P(A). However it is interesting to give a reformulation of the argument inthis case (Cantor’s diagonal argument). Assume P(N) is countable and seek acontradiction. Since P(N) is in bijection with {0, 1}N we get that there is a bijectionF : N→ {0, 1}N. Denote F (n) by Fn : N→ {0, 1}. Construct a map G : N→ {0, 1}by the formula

G(n) = ¬(Fn(n))

where ¬ : {0, 1} → {0, 1}, ¬0 = 1, ¬1 = 0. (The definition of G does not needthe recursion theorem; one can define G as a “graph” directly (check!)). Since F issurjective there exists m such that G = Fm. In particular:

G(m) = Fm(m) = ¬G(m)

a contradiction.

Remark 24.11. Consider the following statement called the continuum hypothesis:

For any set A if there exists an injection A → P(N) then either there exists aninjection A→ N or there exists a bijection A→ P(N).

Answering this question (raised by Cantor) lead to important investigations in settheory. The answer (given by two theorems of Godel and Cohen in the frameworkof mathematical logic rather than logic) turned out to be rather surprising; see thelast section of this course.

25. Reals

Definition 25.1. (Dedekind). A real number is a subset u ⊂ Q of the set Q ofrational numbers with the following properties:

1) u 6= ∅ and u 6= Q,2) if x ∈ u, y ∈ Q, and x ≤ y then y ∈ u.

Denote by R the set of real numbers.

Example 25.2.1) Any rational number x ∈ Q can be identified with the real number

ux = {y ∈ Q;x ≤ y}.

It is clear that ux = ux′ for x, x′ ∈ Q implies x = x′. We identify any rationalnumber x with ux. So we may view Q ⊂ R.

2) One defines, for instance, for any n ∈ N,√n = {x ∈ Q;x ≥ 0, x2 ≥ n}.

Definition 25.3. If u and v are real numbers we write u ≤ v if and only if v ⊂ u.For u, v ≥ 0 define

u+ v = {x+ y;x ∈ u, y ∈ v}u× v = uv = {xy;x ∈ u, y ∈ v}.

Note that this extends addition and multiplication on the non-negative rationals.

A FIRST COURSE IN MATHEMATICS 83

Exercise 25.4. Naturally extend the definition of addition + and multiplication× of real numbers to the case when the numbers are not necessarily ≥ 0. Provethat (R,+,×,−, 0, 1) is a field. Naturally extend the order ≤ on Q to an order onR and prove that R with ≤ is an ordered ring.

Exercise 25.5. Define the sum and the product of a family of real (or complex)numbers indexed by a finite set. Hint: use the already defined concept for integers(and hence for the rationals).

Exercise 25.6. Prove that (√n)2 = n.

Exercise 25.7. Prove that for any r ∈ R with r > 0 there exists a unique number√r ∈ R such that

√r > 0 and (

√r)2 = r.

Exercise 25.8. Prove that√

2 6∈ Q. Hint: assume there exists a rational number xsuch that x2 = 2 and seek a contradiction. Let a ∈ N be minimal with the property

that x = ab for some b. Now a2

b2 = 2 hence 2b2 = a2. Hence a2 is even. Hence a is

even (because if a were odd then a2 would be odd.) Hence a = 2c for some integerc. Hence 2b2 = (2c)2 = 4c2. Hence b2 = 2c2. Hence b2 is even. Hence b is even.Hence b = 2d for some integer d. Hence x = 2c

2d = cd and c < a. This contradicts

the minimality of a which ends the proof.

Remark 25.9. The above proof is probably one of the “first” proofs by contra-diction in the history of mathematics; this proof appears, for instance, in Aristotle(4th century BC), and it is believed to have been discovered by the Pythagoreans.

The irrationality of√

2 was translated by the Greeks as evidence that arithmetic isinsufficient to control geometry (

√2 is the length of the diagonal of a square with

side 1) and arguably created the first crisis in the history of mathematics, leadingto a separation of algebra and geometry that lasted until Descartes (17th century).

Exercise 25.10. Prove that the set

{r ∈ Q; r > 0, r2 < 2}

has no suppremum in Q.

Remark 25.11. Later we will prove that R is not countable.

Definition 25.12. For a < b in R we define the open interval

(a, b) = {c ∈ R; a < c < b} ⊂ R

(Not to be confused with the pair (a, b) ∈ R × R which is denoted by the samesymbol.) A subset U ⊂ R is called open if for any x ∈ U there exists an openinterval containing x and contained in U : x ∈ (a, b) ⊂ U . Let T ⊂ P(R) be the setof all open sets of R.

Exercise 25.13. Prove that T is a topology on R; we call this the Euclideantopology.

26. Complexes

Once one has the notion of real number that of complex number is easy tointroduce:

84 ALEXANDRU BUIUM

Definition 26.1. A complex number is a pair (a, b) where a, b ∈ R. We denote byC the set of complex numbers. Hence C = R×R. Define the sum and the productof two complex numbers by

(a, b) + (c, d) = (a+ c, b+ d)

(a, b)× (c, d) = (ac− bd, ad+ bc).

Remark 26.2. Identify any real number a ∈ R with the complex number (a, 0) ∈C; hence write a = (a, 0). In particular 0 = (0, 0) and 1 = (1, 0).

Exercise 26.3. Prove that C equipped with 0, 1 above and with the operations+,× above is a field. Also note that the operations + and × on C restricted to Rare the “old” operations + and × on R.

Notation 26.4. We set i = (0, 1).

Remark 26.5. i2 = −1. Indeed

i2 = (0, 1)× (0, 1) = (0× 0− 1× 1, 0× 1 + 1× 0) = (−1, 0) = −1.

Remark 26.6. For any complex number (a, b) = a+ bi. Indeed

(a, b) = (a, 0) + (0, b) = (a, 0) + (b, 0)(0, 1) = a+ bi.

Definition 26.7. For any complex number z = a+ bi we define its absolute value

|z| =√a2 + b2.

Definition 26.8. For any complex number number z = a + bi ∈ C and any realnumber r > 0 we define the open disk with center z and radius r,

D(z, r) = {w ∈ C; |w − z| < r} ⊂ C

A subset U ⊂ C is called open if for any z ∈ U there exists an open disk centeredat z and contained in U . Let T ⊂ P(C) be the set of all open sets of C.

Exercise 26.9. Prove that T is a topology on C; we call this the Euclidean topology.

Exercise 26.10. Prove that C cannot be given the structure of an ordered ring.

27. Residues

First we develop some “arithmetic” in Z.

Definition 27.1. For integers a and b we say a divides b if there exists an integern such that b = an. We write a|b. We also say a is a divisor of b. If a does notdivide b write a 6 |b.

Example 27.2. 4|20; 6 6 |20.

Exercise 27.3. Prove that1) if a|b and b|c then a|c;2) if a|b and a|c then a|b+ c.3) a|b defines an order relation on N but not on Z.

Theorem 27.4. (Euclid division). For any a ∈ Z and b ∈ N there exist uniqueq, r ∈ Z such that a = bq + r and 0 ≤ r < b.

A FIRST COURSE IN MATHEMATICS 85

Proof. We prove the existence of q, r. The uniqueness is left to the reader. Wemay assume a ∈ N. We proceed by contradiction. So assume there exists b anda ∈ N such that for all q, r ∈ Z with 0 ≤ r < b we have a 6= qb + r. Fix such ab. We may assume a is minimum with the above property. If a < b we can writea = 0× b+ a, a contradiction. If a = b we can write a = 1× a+ 0, a contradiction.If a > b set a′ = a − b. Since a′ < a, there exist q′, r ∈ Z such that 0 ≤ r < b anda′ = q′b+ r. But then a = qb+ r, where q = q′ + 1, a contradiction. �

Notation 27.5. For a ∈ Z denote 〈a〉 the set {na;n ∈ Z} of integers divisible by a.For a, b ∈ Z denote by 〈a, b〉 the set {ma+nb;m,n ∈ Z} of all numbers expressibleas a multiple of a plus a multiple of b.

Proposition 27.6. For any integers a, b there exists an integer c such that 〈a, b〉 =〈c〉.

Proof. If a = b = 0 we can take c = 0. Assume a, b are not both 0. Then theset S = 〈a, b〉 ∩ N is non-empty. Let c be the minimum of S. Clearly 〈c〉 ⊂ 〈a, b〉.Let us prove that 〈a, b〉 ⊂ 〈c〉. Let u = ma + nb and let us prove that u ∈ 〈c〉. ByEuclidean division u = cq + r with 0 ≤ r < c. We want to show r = 0. Assumer 6= 0 and seek a contradiction. Write c = m′a+ n′a. Then r ∈ N and also

r = u− cq = (ma+ nb)− (m′a+ n′b)q = (m−m′q)a+ (n− n′q)b ∈ 〈a, b〉.Hence r ∈ S. But r < c. So c is not the minimum of S, a contradiction. �

Proposition 27.7. If a and b are integers and have no common divisor > 1 thenthere exist integers m and n such that ma+ nb = 1.

Proof. By the above Proposition 〈a, b〉 = 〈c〉 for some c ≥ 1. In particular c|aand c|b. The hypothesis implies c = 1 hence 1 ∈ 〈a, b〉. �

One of the main definitions of number theory is

Definition 27.8. An integer p is prime if p > 1 and if its only positive divisors are1 and p.

Proposition 27.9. If p is a prime and a is an integer such that p 6= a then thereexist integers m,n such that ma+ np = 1.

Proof a and p have no common divisor > 1 and we conclude by Proposition27.7. �

Proposition 27.10. (Euclid Lemma) If p is a prime and p|ab for integers a andb then either p|a or p|b.

Proof Assume p|ab, p 6 |a, p 6 |b, and seek a contradiction. By Proposition 27.9ma+np = 1 for some integers m,n and m′b+n′p = 1 for some integers m′, n′. Weget

1 = (ma+ np)(m′b+ n′p) = mm′ab+ p(nm′ + n′m+ nn′).

Since p|ab we get p|1, a contradiction. �

Exercise 27.11. Prove that any integer > 1 can be written uniquely as a productof primes. (This is the Fundamental Theorem of Arithmetic). Hint: prove existenceof decomposition by contradiction by considering the smallest number which is not aproduct of primes. Prove uniqueness by contradiction by assuming n is the smallestinteger such that p1p2... = q1q2... with pi and qj prime, p1 ≤ p2 ≤ ..., q1 ≤ q2 ≤ ....

86 ALEXANDRU BUIUM

Definition 27.12. Fix an integer m 6= 0. Define a relation ≡m on Z by a ≡m b ifand only if m|a − b. Say a congruent to b mod m, instead of a ≡m b one usuallywrites (following Gauss):

a ≡ b mod m.

Example 27.13. 3 ≡ 17 mod 7.

Exercise 27.14. Prove that ≡m is an equivalence relation. Prove that the equiv-alence class a of a consists of all the numbers of the form mb+ a where m ∈ Z.

Example 27.15. If m = 7 then 3 = 10 = {...,−4, 3, 10, 17, ...}.

Definition 27.16. For the equivalence relation ≡m on Z the set of equivalenceclasses Z/ ≡m is denoted by Z/mZ. The elements of Z/mZ are called residueclasses mod m.

Exercise 27.17. Prove that

Z/mZ = {0, 1, ...,m− 1}.So the residue classes mod m are: 0, 1, ...,m− 1. Hint: use Euclid division.

Definition 27.18. Define operations +,×,− on Z/mZ by

a+ b = a+ b

a× b = ab

−a = −1

Exercise 27.19. Check that the above definitions are correct, in other words thatif a = c and b = d then

a+ b = c+ d

ab = cd

−a = −c.Furthermore check that (Z/mZ,+,×,−, 0, 1) is a ring.

Notation 27.20. If p is a prime we write Fp in place of Z/pZ.

Exercise 27.21. Prove that Fp is a field. Hint: use Proposition 27.9.

28. p-adics

p-adic numbers were invented by Hensel at the turn of the 20th century. Theyare “alternative worlds” with respect to the reals and, in a precise sense, the “onlyalternative worlds”.

Definition 28.1. Let p be a prime. Let S be the set of all sequences (an) ofintegers an ∈ Z such that for all n

an ≡ an+1 mod pn.

Say that two sequences (an) and (bn) as above are equivalent (write (an) ∼p (bn))if for all n

an ≡ bn mod pn.

Denote by [an] the equivalence class of the sequence (an). Denote by

Zp = S/ ∼p

A FIRST COURSE IN MATHEMATICS 87

the set of equivalence classes. Then Zp is a ring with addition and multiplicationdefined by

[an] + [bn] = [an + bn]

[an][bn] = [anbn];

0 and 1 in this ring are the classes of the constant 0, respectively 1 sequences. Zp iscalled the ring of p-adic integers. (N.B. Zp is the standard notation for this objectin number theory; however in many books Zp is used to denote what we denoted byFp; one should be aware of this discrepancy between generally accepted notation.)

Exercise 28.2. Check that the ring axioms are satisfied.

Exercise 28.3. Consider a sequence (bn) and the sequence

an = b1 + pb2 + p2b3 + ...+ pn−1bn.

Prove that (an) ∈ S. Also, any element in S can be written like this.

Remark 28.4. There is a surjective natural homomorphism Zp → Fp, [an] 7→ a1.

Remark 28.5. We summarize the main rings of numbers we have encountered sofar; they are the main types of numbers in mathematics:

Z → Q → R → C

Zp

Fp

Exercise 28.6. Prove that there are no ring homomorphisms C → Fp and thereare no ring homomorphisms Fp → C. (Morally the worlds of Fp and C do not“communicate directly”, although they “communicate via Z).

Remark 28.7. There exist (many) injective ring homomorphisms Zp → C butthey are not “natural” in any way.

29. Algebra

Algebra is the study of algebraic structures, i.e. sets with operations on them.We already introduced, and constructed, some elementary examples of algebraicstructures such as rings and, in particular, fields. With rings/fields at our disposalone can study some other fundamental algebraic objects such as groups, matrices,vector spaces, polynomials, etc. In what follows we briefly survey some of these.

In some sense groups are more fundamental than rings and fields; but in orderto be able to look at more interesting examples we found it convenient to postponethe discussion of groups until this point.

88 ALEXANDRU BUIUM

Definition 29.1. A group is a tuple (G, ?,′ , e) consisting of a set G, a binaryoperation ? on G, a unary operation ′ on G (write ′(x) = x′), and an element e ∈ G(called the identity element) such that for any x, y, z ∈ G the following axioms aresatisfied:

1) x ? (y ? z) = (x ? y) ? z;2) x ? e = e ? x = x;3) x ? x′ = x′ ? x = e.

If in addition x ? y = y ? x for all x, y ∈ G we say G is commutative (or Abelian inhonor of Abel).

Notation 29.2. Sometimes one writes e = 1, x ? y = xy, x′ = x−1, x ? ... ? x = xn

(n ≥ 1 times). In the Abelian case one sometimes writes e = 0, x ? y = x + y,x′ = −x, x ? ... ? x = nx (n ≥ 1 times). These notations depend on the context andare justified by the following examples.

Example 29.3. If R is a ring then R is an Abelian group with e = 0, x?y = x+y,x′ = −x. Hence Z,Z/mZ,Zp,Q,R,C are groups “with respect to addition”.

Example 29.4. If R is a field then R× = R\{0} is an Abelian group with e =1, x ? y = xy, x′ = x−1. Hence Q×,R×,C×,F×p are groups “with respect tomultiplication”.

Example 29.5. The set S(X) of bijections σ : X → X from a set X into itself isa (non-Abelian) group with e = 1X (the identity map), σ ?τ = σ ◦ τ (composition),σ−1 =inverse map. If X = {1, ..., n} then one writes Sn = S(X) and call this groupthe symmetric group. If σ(1) = i1, ..., σ(n) = in one usually writes

σ =

(1 2 ... ni1 12 ... in

).

Exercise 29.6. Compute(1 2 3 4 53 2 5 4 3

)◦(

1 2 3 4 55 4 2 1 3

).

Also compute (1 2 3 4 53 2 5 4 3

)−1

Example 29.7. A 2 × 2 matrix with coefficients in a field R is an element A =(a, b, c, d) ∈ R4 = R×R×R×R. We usually write such an A as

A =

(a bc d

)Define the sum and the product of two matrices by(

a bc d

)+

(a′ b′

c′ d′

)=

(a+ a′ b+ b′

c+ c′ d+ d′

)(a bc d

)·(a′ b′

c′ d′

)=

(aa′ + bc′ ab′ + bd′

ca′ + dc′ cb′ + dd′

)Define the product of an element r ∈ R with a matrix A =

(a bc d

)by

r ·(a bc d

)=

(ra rbrc rd

)

A FIRST COURSE IN MATHEMATICS 89

Say a matrix A =

(a bc d

)is invertible if δ = ad−bc 6= 0 and we define its inverse

by

A−1 = δ−1

(d −b−c a

)Define the identity matrix by

I =

(1 00 1

)and the zero matrix by

O =

(0 00 0

)Let M2(R) be the set of all matrices and GL2(R) be the set of all invertible matrices.Then the following are true:

1) M2(R) is a group with respect to addition of matrices;2) GL2(R) is a group with respect to multiplication of matrices;3) (A+B)C = AC +BC and C(A+B) = CA+ CB for any matrices A,B,C;4) There exist matrices A,B such that AB 6= BA.

Exercise 29.8. Prove 1), 2), 3), 4) above.

Example 29.9. Groups are examples of algebraic structures so there is a welldefined notion of homomorphism of groups (or group homomorphism). Accordingto the general definition a group homomorphism is a map between the two groupsF : G→ G′ such that for all a, b ∈ G:

1) F (a ? b) = F (a) ? F (b);2) F (a−1) = F (a)−1 (this is automatic !),3) F (e) = e (this is, again, automatic !).Here we used ? for both operations in G and G′.

Next we discuss vector spaces.

Definition 29.10. Let R be a field. A vector space is an abelian group (V,+,−, 0)together with a map R × V → V , (a, v) 7→ av satisfying the following conditionsfor all a, b ∈ R and all u, v ∈ V :

1) (a+ b)v = av + bv2) a(u+ v) = au+ av3) a(bv) = (ab)v4) 1v = v

Definition 29.11. The elements u1, ..., un ∈ V are linearly independent if when-ever a1, ..., an ∈ R satisfies (a1, ..., an) 6= (0, ..., 0) it follows that a1u1+...+anun 6= 0.

Definition 29.12. The elements u1, ..., un ∈ V generate V if for any u ∈ V thereexist a1, ..., an ∈ R such that u = a1u1 + ..., anun.

Definition 29.13. The elements u1, ..., un ∈ V are a basis of V if they are linearlyindependent and generate V .

Exercise 29.14. If V has a basis u1, ..., un elements then the map Rn → V ,(a1, ..., an) 7→ a1u1 + ...+ anun is bijective. Hint: directly from definitions.

90 ALEXANDRU BUIUM

Exercise 29.15. If V is generated by u1, ..., un then V has a basis consisting of atmost n elements. Hint: considering a subset of {u1, ..., un} minimal with the prop-erty that it generates V we may assume that any subset obtained from {u1, ..., un}does not generate V . We claim that u1, ..., un are linearly independent. Assumenot. Hence there exists (a1, ..., an) 6= (0, ..., 0) such that a1u1 + ... + anun = 0.We may assume a1 = 1. Then one checks that u2, ..., un generate V , contradictingminimality.

Exercise 29.16. Assume R = Fp and V has a basis with n elements. Then|V | = pn.

Theorem 29.17. If V has a basis u1, ..., un and a basis v1, ..., vm then n = m.

Proof. We prove m ≤ n; similarly one has n ≤ m. Assume m > n and seeka contradiction. Since u1, ..., un generate V we may write v1 = a1u1 + ... + anunwith not all a1, ..., an zero. Renumbering u1, ..., un we may assume a1 6= 0. Hencev1, u2, ..., un generates V . Hence v2 = b1v1 + b2u2 + ...+ bnun. But not all b2, ..., bncan be zero because v1, v2 are linearly independent. So renumbering u2, ..., unwe may assume b2 6= 0. So v1, v2, u3, ..., un generates V . Continuing (one needsinduction) we get that v1, v2, ..., vn generates V . So vn+1 = d1vn + ...+ dnvn. Butthis contradicts the fact that v1, ..., vm are linearly independent. �

Exercise 29.18. Give a quick proof of the above theorem in case R = Fp. Hint:We have pn = pm hence n = m.

Exercise 29.19. Give an example of a vector space which, for all n, does not havea basis with n elements.

Definition 29.20. Assume V has a basis u1, ..., un. Then we define the dimen-sion of V to be n; write dimV = n. (The definition is correct due to the aboveProposition.)

Definition 29.21. If V and W are vector spaces a map F : V →W is called linearif for all a ∈ K, u, v ∈ V we have:

1) F (au) = aF (u),2) F (u+ v) = F (u) + F (v).

Exercise 29.22. Prove that F : V → W is a linear map of vector spaces thenV ′ = F−1(0) and V ′′ = F (V ) are vector spaces (with respect to the obviousoperations) and

dimV = dimV ′ + dimV ′′.

Hint: construct corresponding bases.

Here is the link with matrices:

Remark 29.23. If F : V → W is a linear map of vector spaces and v1, ..., vn andw1, ..., wm are bases of V and W respectively then one can write uniquely F (vi) =∑mj=1 aijwj ; one obtains nm numbers aij ∈ R which completely characterize F .

Finally we mention polynomials and algebraic equations.

Definition 29.24. Ler R be a ring and n ∈ Z, n ≥ 0. A function f : R → Ris called a polynomial function of degree n if there exist elements a0, ..., an ∈ R,an 6= 0, such that for all b ∈ R,

f(b) = anbn + ...+ a1b+ a0.

A FIRST COURSE IN MATHEMATICS 91

A function R → R is called a polynomial function if it is either the zero function(the function that sends R into 0) or if it is a polynomial function of some degree.An element c ∈ R is called a root of f (or a zero of f) if f(c) = 0. (Sometimes wesay “a root in R” instead of “a root”.)

The study of roots of polynomial functions is one of the main concerns of algebra.Here are two of the main basic theorems about roots; their proof is beyond the scopeof this course.

Theorem 29.25. (Lagrange) If R is a field then any polynomial function of degreen ≥ 1 has at most n roots in R.

Theorem 29.26. (Gauss) If R = C is the complex field then any polynomialfunction of degree n ≥ 1 has at least one root in C.

Remark 29.27. The main problems about roots are:1) Find the number of roots; in case R = Fp this leads to some of the most subtle

problems in number theory.2) Find (if possible) formulae that compute roots (such as formulae involving

radicals); this leads to Galois theory.

30. Geometry

Geometry is the study of shapes such as lines and planes, or, more generally,curves and surfaces, etc. There are two paths towards this study: the syntheticone and the analytic (or algebraic) one. Synthetic geometry is geometry with-out algebra. Analytic geometry is geometry through algebra. Synthetic geometryoriginates with the Greek mathematics of antiquity (e.g. the treatise of Euclid).Analytic geometry was invented by Fermat and Descartes in the 17th century. Wealready encountered the synthetic approach in the discussion of the affine plane andthe projective plane which were purely combinatorial objects. Here we introducesome of the most elementary structures of analytic geometry.

Definition 30.1. Let R be a field. The affine plane A2 = A2(R) over R is the setR2 = R × R. A point P = (x, y) in the plane is an element of R × R. A subsetL ⊂ R×R is called a line if there exist a, b, c ∈ R such that (a, b) 6= (0, 0) and

L = {(x, y) ∈ R2; ax+ by + c = 0}

We say a point P lies on the line L (or we say L passes through P ) if P ∈ L. Twolines are said to be parallel if they either coincide or their intersection is empty (inthe last case we say they don’t meet). 3 points are collinear if they lie on the sameline.

Notation 30.2. We sometimes write L = L(R) if we want to stress that coordi-nates are in R.

Exercise 30.3. Prove that:1) There exist 3 points which are not collinear.2) For any two distinct points P1 and P2 there exists a unique line L (called

sometimes P1P2) passing through P1 and P2. Hint: If P1 = (x1, y1), P2 = (x2, y2),and if

m = (y2 − y1)(x2 − x1)−1

92 ALEXANDRU BUIUM

then the unique line through P1 and P2 is:

L = {(x, y) ∈ R×R; y − y1 = m(x− x1)}.In particular any two non-parallel distinct lines meet in exactly one point.

3) Given a line L and a point P there exists exactly one line L′ passing throughP and parallel to L. (This is called Euclid’s fifth postulate but in our expositionhere this is not a postulate).

Hence A2 = R2 = R×R together with the set L of all lines (in the sense above)is an affine plane in the sense of Definition 18.43.

Remark 30.4. Not all affine planes in the sense of Definition 18.43 are affineplanes over a field in the sense above. Hilbert proved that an affine plane is theaffine plane over some field if and only if the theorems of Desargues (Parts I andII) and Pappus hold. See below for the “only if direction”.

Exercise 30.5. Prove that any line in Fp × Fp has exactly p points.

Exercise 30.6. How many lines are there in the plane Fp × Fp?

Exercise 30.7. (Desargues’ Theorem, Part I) Let A1, A2, A3, A′1, A

′2, A

′3 be distinct

points in the plane. Also for all i 6= j assume AiAj and A′iA′j are not parallel and

let Pij be their intersection. Assume the 3 lines A1A′1, A2A

′2, A3A

′3 have a point in

common. Then prove that the points L12, L13, L23 are collinear (i.e. line on someline). Hint: Consider the “space” R × R × R and define planes and lines in thisspace. Prove that if two planes meet and don’t coincide then they meet in a line.Then prove that through any two points in space there is a unique line and throughany 3 non-collinear points there is a unique plane. Now consider the projectionR×R×R→ R×R, (x, y, z) 7→ (x, y) and show that lines project onto lines. Nextshow that configuration of points Ai, A

′i ∈ R×R can be realized as the projection

of a similar configuration of points Bi, B′i ∈ R × R × R not contained in a plane.

(Identifying R × R with the set of points in space with zero third coordinate wetake Bi = Ai, B

′i = A′i for i = 1, 2, we let B3 have a nonzero third coordinate and

then we choose B′3 such that the lines B1B′1, B2B

′2, B3B

′3 have a point in common.)

Then prove Desargues in space (by noting that if Qij is the intersection of BiBjwith B′iB

′j then Qij is in the plane containing B1, B2, B3 and also in the plane

containing B′1, B′2, B

′3; hence Qij is in the intersection of these planes which is a

line.) Finally deduce the original plane Desargues by projection.

Exercise 30.8. (Desargues’ Theorem, Part II) Let A1, A2, A3, A′1, A

′2, A

′3 be dis-

tinct points in the plane. Assume the 3 lines A1A′1, A2A

′2, A3A

′3 have a point in

common or they are parallel. Assume A1A2 is parallel to A′1A′2 and A1A3 is parallel

to A′1A′3. Prove that A2A3 is parallel to A′2A

′3. Hint: compute coordinates. There

is an alternative proof that reduces Part II to Part I by using the “projective planeover our field”.

Exercise 30.9. (Pappus’ Theorem) Let P1, P2, P3 be points on a line L and letQ1, Q2, Q3 be points on a line M 6= L. Let A1 be the intersection of P2Q3 withP3Q2, and define A2, A3 similarly. (Assume the lines in question are not parallel.)Then prove that A1, A2, A3 are collinear. Hint for the case L and M meet. Thenone can assume L = {(x, 0);x ∈ R}, M = {(0, y); y ∈ R} (explain why). Let thepoints Pi = (xi, 0), and Qi = (0, yi) and compute the coordinates of Ai. Thencheck that the line through A1 and A2 passes through A3.

A FIRST COURSE IN MATHEMATICS 93

Remark 30.10. One can identify the projective plane (A2,L) attached to theaffine plane (A2,L) with the pair (P2, P2) defined as follows. Let P2 = R3/ ∼ where(x, y, z) ∼ (x′, y′, z′) if and only if there exists 0 6= λ ∈ R such that (x′, y′, z′) =(λx, λy, λz). Denote the equivalence class of (x, y, z) by (x : y : z). Identify a point(x, y) in the affine plane A2 = R2 = R×R with the point (x : y : 1) ∈ P2. Identifya point (x0 : y0 : 0) in the complement P2\A2 with the class of lines in A2 parallelto the line y0x− x0y = 0. This allows one to identify the complement P2\A2 with

the line at infinity L∞ of A2. Hence we get an identification of P2 with A2. Finallydefine a line in P2 as a set of the form

L = {(x : y : z); ax+ by + cz = 0}So under the above identifications,

L = {(x : y : 1); ax+ by + c = 0} ∪ {(x : y : 0); ax+ by = 0} = L ∪ {L}where L is the line in R2 defined by ax+ by + c = 0. Then define P2 to be the setof all lines L in P2. We get an identification of P2 with L.

So far we were concerned with points, lines, planes. Let us tackle “curvedshapes”, e.g. “higher degree curves”.

Definition 30.11. The circle of center (a, b) ∈ R×R and radius r is the set

C(R) = {(x, y) ∈ R×R; (x− a)2 + (y − b)2 = r2}.

Definition 30.12. A line is tangent to a circle if it meets it in exactly one point.(We say that the line is tangent to the circle at that point). Two circles are tangentif they meet in exactly one point.

Exercise 30.13. Prove that for any circle and any point on it there is exactly oneline tangent to the circle at that point.

Exercise 30.14. Prove that:1) A circle and a line meet in at most 2 points.2) Two circles meet in at most 2 points.

Exercise 30.15. How many points does a circle of radius 1 in F13 have? Sameproblem for F11.

Exercise 30.16. Prove that the circle C(R) with center (0, 0) and radius 1 is anAbelian group with e = (1, 0), (x, y)′ = (x,−y), and group operation

(x1, y1) ? (x2, y2) = (x1x2 − y1y2, x1y2 + x2y1).

Exercise 30.17. Consider the circle C(F17). Show that (3, 3), (1, 0) ∈ C(F17) andcompute (3, 3) ? (1, 0) and 2(1, 0) (where the latter is of course (1, 0) ? (1, 0)).

Circles are special cases of conics:

Definition 30.18. A conic is a subset Q ⊂ R×R of the form

Q = Q(R) = {(x, y) ∈ R×R; ax2 + bxy + cy2 + dx+ ey + f = 0}for some (a, b, c, d, e, f) ∈ R× ...×R, where (a, b, c) 6= (0, 0, 0).

We refer to (a, b, c, d, e) as the equation of the quadric and if the correspond-ing quadric passes through a point we say that the equation of the quadric passesthrough the point. We sometimes say “quadric” instead of “equation of the quadric”.

94 ALEXANDRU BUIUM

Exercise 30.19. (Uses the concept of vector space and dimension). Prove that if5 points are given in the plane such that no 4 of them are collinear then there existsa unique quadric passing through these given 5 points. Hint: consider the vectorspace of all (equations of) quadrics that pass through a given set S of points. Nextnote that if one adds a point to S the dimension of this space of quadrics eitherstays the same or drops by one. Since the space of all quadrics has dimension 6 itis enough to show that for r ≤ 5 the quadrics passing through r points are fewerthan those passing through r − 1 of the r points. For r = 4, for instance, this isdone by taking a quadric that is a union of 2 lines.

Definition 30.20. Let R be a field in which 2 := 1 + 1 6= 0, 3 := 1 + 1 + 1 6= 0.A subset Z = Z(R) ⊂ R×R is called an affine elliptic curve if there exist a, b ∈ Rwith 4a3 + 27b2 6= 0 such that

Z(R) = {(x, y) ∈ R×R; y2 = x3 + ax+ b}.We call Z(R) the elliptic curve over R defined by the equation y2 = x3 + ax + b.Next we define the projective elliptic curve defined by the equation y2 = x3 +ax+b.as the set

E(R) = Z(R) ∪ {∞}where ∞ is an element not belonging to Z(R). (We usually drop the word “pro-jective” and we call ∞ the point at infinity on E(R).) If (x, y) ∈ E(R) de-fine (x, y)′ = (x,−y). Also define ∞′ = ∞. Next we define a binary opera-tion ? on E(R) called the chord-tangent operation; we will see that E(R) be-comes a group with respect to this operation. First define (x, y) ? (x,−y) = ∞,∞?(x, y) = (x, y)?∞ = (x, y), and∞?∞ =∞. Also define (x, 0)?(x, 0) =∞. Nextassume (x1, y1), (x2, y2) ∈ E(R) with (x2, y2) 6= (x1,−y1) . If (x1, y1) 6= (x2, y2)we let L12 to be the unique line passing through (x1, y1) and (x2, y2). Recall thatexplicitly

L12 = {(x, y) ∈ R×R; y − y1 = m(x− x1)}where

m = (y2 − y1)(x2 − x1)−1.

If (x1, y1) = (x2, y2) we let L12 be the “line tangent to Z(R) at (x1, y1)” which isby definition given by the same equation as before except now m is defined to be

m = (3x21 + a)(2y1)−1.

(This definition is inspired by the definition of slope in analytic geometry.) Finallyone defines

(x1, y1) ? (x2, y2) = (x3,−y3)

where (x3, y3) is the “third point of intersection of E(R) with L12”; more precisely(x3, y3) is defined by solving the system consisting of the equations defining E(R)and L12 as follows: replacing y in y2 = x3 + ax + b by y1 + m(x − x1) we get acubic equation in x:

(y1 +m(x− x1))2 = x3 + ax+ b

which can be rewritten as

x3 −m2x2 + ... = 0.

x1, x2 are known to be roots of this equation. We define x3 to be the third rootwhich is then

x3 = m2 − x1 − x2;

A FIRST COURSE IN MATHEMATICS 95

so we define

y3 = y1 +m(x3 − x1).

Summarizing, the definition of (x3, y3) is

(x3, y3) =((y2 − y1)2(x2 − x1)−2 − x1 − x2, y1 + (y2 − y1)(x2 − x1)−1(x3 − x1)

)if (x1, y1) 6= (x2, y2), (x1, y1) 6= (x2,−y2) and

(x3, y3) =((3x2

1 + a)2(2y1)−2 − x1 − x2, y1 + (3x21 + a)(2y1)−1(x3 − x1)

)if (x1, y1) = (x2, y2), y1 6= 0.

Then E(R) with the above definitions is an Abelian group.

Exercise 30.21. Check the last statement. (N.B. Checking associativity is a verylaborious exercise.)

Exercise 30.22. Consider the group E(F13) defined by the equation y2 = x3 + 8.Show that (1, 3), (2, 4) ∈ E(F13) and compute (1, 3) ? (2, 4) and 2(2, 4) (where thelatter is of course (2, 4) ? (2, 4)).

Affine elliptic curves are special examples of cubics:

Definition 30.23. A cubic is a subset X = X(R) ⊂ R×R of the form

X(R) = {(x, y) ∈ R×R; ax3 +bx2y+cxy2 +dy3 +ex2 +fxy+gy2 +hx+iy+j = 0}

where (a, b, c, ..., j) ∈ R× ...×R, (a, b, c, d) 6= (0, ..., 0).

As usual we refer to the tuple (a, b, c, ..., j) as the equation of a cubic (or, byabuse, simply a cubic).

Exercise 30.24. (Three cubics theorem) Prove that if two cubics meet in exactly9 points and if a third cubic passes through 8 of the 9 points then the third cubicmust pass through the 9th point. Hint: First show that if r ≤ 8 and r points aregiven then the set of cubics passing through them is strictly larger than the set ofcubics passing through r−1 of the r points. (To show this show first that no 4 of the9 points are on a line. Then in order to find, for instance, a cubic passing throughP1, ..., P7 but not through P8 one considers the cubics Ci = Q1234i+Ljk, {i, j, k} ={5, 6, 7}, where Q1234i is the unique quadric passing through P1, P2, P3, P4, Pi andLjk is the unique line through Pj and Pk. Assume C5, C6, C7 all pass through P8

and derive a contradiction as follows. Note that P8 cannot lie on 2 of the 3 lines Ljkbecause this would force us to have 4 collinear points. So we may assume P8 doesnot lie on either of the lines L57, L67. Hence P8 lies on both Q12345 and Q12346.So these quadrics have 5 points in common. Form here one immediately gets acontradiction) Once this is proved let P1, ..., P9 be the points of intersection of thecubics with equations F and G. We know that the space of cubics passing throughP1, ..., P8 has dimension 2 and contains F and G. So any cubic in this space is alinear combination of F and G, hence will pass through the P9.

Exercise 30.25. (Pascal’s Theorem) Let P1, P2, P3, Q1, Q2, Q3 be points on a conicC. Let A1 be the intersection of P2Q3 with P3Q2, and define A2, A3 similarly.(Assume the lines in question are not parallel.) Then prove that A1, A2, A3 arecollinear. Hint: The cubics

Q1P2 ∪Q2P3 ∪Q3P1 and P1Q2 ∪ P2Q3 ∪ P3Q1

96 ALEXANDRU BUIUM

pass through all of the following 9 points:

P1, P2, P3, Q1, Q2, Q3, A1, A2, A3

On the other hand the cubic C ∪ A2A3 passes through all these points exceptpossibly A1. Then by the Three cubics theorem C ∪ A2A3 passes through A1.Hence A2A3 passes through A1.

Exercise 30.26. Note that Pascal’s Theorem implies Pappus’ Theorem.

Remark 30.27. An extended version of the Three cubics theorem implies theassociativity of the chord-tangent operation on a cubic. (The extended version, tobe explained below follows from the usual version by passing through the “projectiveplane”.) The rough idea is as follows. Let E be the elliptic curve and Q,P,R pointson it. Let

PQ ∪ E = {P,Q,U}

∞U ∪ E = {∞, U, V }

V R ∪ E = {V,R,W}

PR ∩ E = {P,R,X}

∞X ∩ E = {∞, X, Y }.

Here ∞A is the vertical passing through a point A. Note that

Q ? P = V, V ? R = W ′, P ? R = Y.

We want to show that

(Q ? P ) ? R = Q ? (P ? R)

This is equivalent to

V ? R = Q ? Y

i.e. that

W ′ = Q ? Y

i.e. that Q,Y,W are collinear. Now the two cubics

E and PQ ∪WR ∪ Y X

both pass through the 9 points

P,Q,R,U, V,W,X, Y,∞

On the other hand the cubic

Γ = UV ∪ PR ∪QY

passes through all 9 points except W . By the Three cubics theorem (applied topoints which may be∞) we get that Γ passes through W hence QY passes throughW . The above argument only applies to chords and not to tangents. When dealingwith tangents one needs to repeat the argument and look at multiplicities. Somaking the argument rigorous becomes technical.

A FIRST COURSE IN MATHEMATICS 97

31. Analysis

Analysis is the study of “passing to the limit”. The key words are sequences,convergence, limits, and later differential and integral calculus. We survey heresome of the most basic structures of analysis. For any a ∈ R we let |a| be a or −aaccording as a ≥ 0 or a ≤ 0 respectively.

Definition 31.1. A sequence in R is a map F : N → R; if F (n) = an we saya1, a2, ... is the sequence. Or (an) is the sequence. We let F (N) be denoted by{an;n ≥ 1}; this is a subset of R

Definition 31.2. A subsequence of a sequence F : N → R is a sequence of theform F ◦G where G : N→ N is an increasing map. If a1, a2, a3, ... is F then F ◦Gis ak1 , ak2 , ak3 , ... (or (akn)) where G(n) = kn.

Definition 31.3. A sequence (an) is convergent to a0 ∈ R if for any real numberε > 0 there exists an integer N such that for all n ≥ N we have |an − a0| < ε. Wewrite an → a0 and we say a0 is the limit of (an).

Definition 31.4. A sequence is called convergent if there exists a ∈ R such thatthe sequence converges to a.

Exercise 31.5. Prove that an = 1n converges to 0.

Exercise 31.6. Prove that an = n is not convergent.

Exercise 31.7. Prove that an = (−1)n is not convergent.

Definition 31.8. A sequence F is bounded if the set F (N) ⊂ R is bounded.

Definition 31.9. A sequence F is increasing if F is increasing. A sequence F isdecreasing if −F is increasing.

Definition 31.10. A sequence (an) is Cauchy if for any real ε > 0 there exists aninteger N such that for all integers m,n ≥ N we have |an − am| < ε.

Exercise 31.11. Prove that any convergent sequence is Cauchy. Hint: easy.

Exercise 31.12. Prove the following statements in the prescribed order:1) Any Cauchy sequence is bounded.2) Any bounded sequence contains a sequence which is either increasing or de-

creasing.3) Any bounded sequence which is either increasing or decreasing is convergent.4) Any Cauchy sequence which contains a convergent subsequence is itself con-

vergent.5) Any Cauchy sequence is convergent.Hints: For 1 let ε = 1, let N correspond to this ε and get that |an − aN | < 1 for

all n ≥ N ; conclude from here. For 2 consider the sets An = {am;m ≥ n}. If atleast one of these sets has no maximal element we get an increasing subsequence byProposition 24.3. If each An has a maximal element bn then bn = akn for some knand the subsequence akn is decreasing. For 3 we view each an ∈ R as a Dedekindcut ie. as a subset an ⊂ Q; the limit will be either the union of the sets an or theintersection. Statement 4 is easy. Statement 5 follows by combining the previousstatements.

Exercise 31.13. Prove that any subset in R which is bounded from below has aninfimum; and any subset in R which is bounded from above has a supremum.

98 ALEXANDRU BUIUM

Definition 31.14. A function F : R → R is continuous at a point a0 ∈ R if forany sequence (an) converging to a0 we have that the sequence (F (an)) convergesto F (a0).

Exercise 31.15. Prove that a map F : R → R is continuous (for the Euclideantopology on both the source and the target) if and only if it is continuous at anypoint of R.

Definition 31.16. Let (an) be a sequence and sn =∑nk=1 an. If (sn) is convergent

we say∑∞k=1 an is convergent and we also denote by

∑∞k=1 an the limit.

Exercise 31.17. Prove that1)∑∞n=1

1n is not convergent.

2)∑∞n=1

1n2 is convergent.

Exercise 31.18. Let S ⊂ {0, 1}N be the set of all sequences (an) such that thereexist N with an = 1 for all n ≥ N . Prove that the map

{0, 1}N\S → R, (an) 7→∞∑n=1

an2n

is (well defined and) an injective. Conclude that R is not countable.

Real analysis (analysis of sequences, continuity, and other concepts of calculuslike differentiation and integration of functions on R) can be extended to complexanalysis. A more surprising path is towards p-adic analysis (which is crucial tonumber theory). Recall the ring of p-adic numbers Zp whose elements are denotedby [an].

Exercise 31.19. Say that pe divides α = [an] if there exists β = [bn] such that[an] = [pebn]; write pn|α.

Definition 31.20. For any 0 6= α = [an] ∈ Zp let v = v(α) be the unique integersuch that pn|an for n ≤ v and pv+1 6 |av+1. Then define the norm of α by theformula |α| = p−v(α). We also set |0| = 0.

Exercise 31.21. Prove that if α = [an] and β = [bn] then

|α+ β| ≤ max{|α|, |β|}.

Definition 31.22. Consider a sequence [an1], [an2], [an3], ... of elements in Zp whichfor simplicity we denote by α1, α2, α3, ....

1) α1, α2, α3, ... is called a Cauchy sequence if is called Cauchy if for any real (or,equivalently, rational) ε > 0 there exists an integer N such that for all m,m′ ≥ Nwe have |αm − αm′ | ≤ ε.

2) We say that α1, α2, α3, ... converges to some α0 ∈ Zp if for any real (or,equivalently, rational) ε > 0 there exists an integer N such that for all m ≥ N wehave |αm − α0| ≤ ε. We say α0 is the limit of (αn) and we write αn → α0.

Exercise 31.23. Prove that a sequence in Zp is convergent if and only if it isCauchy.

The following is in deep contrast with the case of R:

A FIRST COURSE IN MATHEMATICS 99

Exercise 31.24. Prove that if (αn) is a sequence in Zp with αn → 0 then thesequence

sn =

n∑k=1

αk

is convergent in Zp; the limit of the latter is denoted by∑∞n=1 αn. In particular,

for instance,∑∞n=1 p

n−1 is convergent. Prove that∑∞n=1 p

n−1 is the inverse of 1−pin Zp.

Exercise 31.25. Prove that if α ∈ Zp has |α| = 1 then α is invertible in Zp. Hint:Use the fact that if p does not divide an integer a ∈ Z then there exist integersm,n ∈ A such that ma+ np = 1; then use the previous exercise.

32. Metamodels

In this section we briefly discuss the problem of translating various theories intoset theory. One can give the following definition which is not standard but expressesa standard idea about the interaction between mathematics and various fields ofknowledge (including mathematics itself):

Definition 32.1. Let T be a theory in a language L with specific axioms A,B, ....

Assume in addition that one may be given a “domain” formulaD(x) in Lfset with onefree variable x. (We allow that D be not present.) By a set theoretical metamodel

with domain D (or simply a metamodel) for T we mean data L, T,˜ consisting of

i) a language L obtained from the language of set theory Lset by adding variousconstants and predicates plus definitions for each of the predicates;

ii) a D-translation ˜ of L into the language L and

iii) a theory T in L such that:

1) T contains Tset;2) T contains the translations A, B, ... of the specific axioms A,B, ...;

3) T contains D(t) for all terms t in L without free variables.

Remark 32.2. It is easy to check that the D-translations of all background axiomsof T are contained in T; therefore the D-translation of any theorem in T will be atheorem in T.

Remark 32.3. If D in the above definition is not present then D-translations aresimply translations, condition 3) can be dropped, etc.

Remark 32.4. The concept of metamodel introduced above is completely differentfrom (but also an inspiration for) the concept of model to be introduced in thesection on models. Metamodels are metasets; models in the section on models aresets.

Example 32.5. We illustrate metamodels by considering the “toy” theory T ofgravitation in Example 14.19. This example we will not involve any “domain for-mula D”. Let L be the language in Example 14.19; recall the it contains constantsS,E,M,R, 1, π, g relational predicates p, c, n, ◦, f , functional predicates d, a, T, :,×.We need to first specify a language L. We let L be the language obtained fromLset by adding symbols S, ....T (all the symbols above with a tilde on top) The

translation of L into L is then just “putting ˜ on top of the symbols”. We let F

be a fixed subset of the set of infinitely differentiable maps R→ R3 in the sense ofanalysis; see a textbook in analysis. Then we consider definitions such as:

100 ALEXANDRU BUIUM

1) π = 314 : 100, g = 981,...2) ∀x(n(x)↔ ((x ∈ R)) ∧ (x > 0))

3) ∀x∀y(f(x, y)↔ ((x ∈ F) ∧ (y ∈ R)))4) ◦ expresses movement in a circle (formalized mathematically, i.e. there exists

a center and a radius such that...).5) a(x) is the “absolute value of the second derivative of the map x” (in the

sense of analysis; later one may want to drop the “absolute value”), etc.

Finally T is the theory with specific axioms those of Tset to which one addsthe translations in L of the axioms in T in Example 14.19. The above yields ametamodel of T.

Exercise 32.6. Attempt to make the above more explicit and to complete thedefinitions above. Analyze what becomes of the theorems in Example 14.19 whentranslated in T. There will be inconsistencies related to the fact that the Earthis sometimes viewed as fixed (when the Moon and the cannon balls are lookedat), sometimes viewed as moving (when viewed as a planet). Try to fix theseinconsistencies, for instance by considering two different translations. There isa way out of these problems by replacing the purely kinematic theory T with adynamical theory. (Kinematic means geometric i.e. not invoking forces whereasdynamical means involving forces.)

Remark 32.7. Other physical theories can be given metamodels in a similar way.The logical content of this operation is never made explicit because it is too la-borious, and often leads to contradictions if not done in minute detail. But it isimportant to understand how things work at least in principle. Problems such asthese are not uncommon: set theory acts sometimes like a “straight jacket” forphysical theories.

Exercise 32.8. (For those who took courses in physics)1) Give a metamodel for Newton’s dynamical theory of gravitation (based on

forces).2) Give a metamodel for classical mechanics of particles traveling in a field.3) Give a metamodel for relativistic mechanics of particles moving freely.4) Give a metamodel for a simple situation in quantum theory.

Next we consider the example of Peano arithmetic. The Peano axiom is thesentence in Lset that there exists a Peano triple. But if one tries to write this axiomin a language that only conns, say, a constant 0, binary functional predicates +,×,and unary functional predicate s, together with equality = then one is bound tofail because there is no way to say, in this formal language, that “for any subsetsomething happens”. Nevertheless there is a formal Peano system of axioms in sucha language which we are going to consider below, together with a metamodel of theassociated theory.

Example 32.9. Consider the language Lar with constant 0, binary functionalpredicates +,×, and unary functional predicate s, together with equality = andthe standard connectives and separators. This is called the language of Peano

A FIRST COURSE IN MATHEMATICS 101

arithmetic. We consider the following specific axioms:

A1) ∀x(s(x) 6= 0)A2) ∀x((x 6= 0)→ ∃y(s(y) = x)))A3) ∀x(x+ 0 = x)A4) ∀x∀y(x+ (s(y)) = s(x+ y))A5) ∀x(x× 0 = 0)A6) ∀x∀y(x× s(y) = (x× y) + x)AP ) ∀w((P (0, w) ∧ ∀v(P (v, w))→ P (s(v), w)))→ ∀x(P (x,w)))

where P runs through all formulas in Lfar with the appropriate number of freevariables. So this is a metacountable set of axioms; the latter series of axiomsstands for a weak form of induction.

Exercise 32.10. Let D(x) be the formula “x ∈ N ∪ {0}” in Lfset. Let Tar be thetheory in Lar with specific axioms A1, ..., A6, AP , ...; we call this theory the Peanoarithmetic theory. Show that the theory Tar has a metamodel with domain D inwhich L, T coincide with Lset,Tset.

Here is another example which is a variant of Grothendieck’s universes; this isone of the way out of the logical difficulties of category theory, to be discussed later.

Example 32.11. Fix a set U (i.e. a constant in Tset) referred to in what follows asa universe and introduce a predicate still denoted by U equipped with the definition

∀x(U(x)↔ (x ∈ U))

Let L be obtained from Lset by adding the above predicate. Also let ˜ be thenatural U-translation of formulas from Lset into L. Finally let T be the theoryin the language L generated by the axioms in ZFC, the collection ZFC˜ of theirU-translations, and all sentences of the form U(c) where c is a constant in Lset.The above define a metamodel for Tset with domain U. (So we are dealing herewith a U-translation of the language of set theory essentially into itself which isnot the “identity”! See the exercise below. We will see a similar, but also verydifferent, procedure when we discuss models of set theory in the section on models.)

Intuitively T is obtained from the set theory Tset by adding axioms saying that “alloperations with sets that belong to U lead to sets that belong to U” and axiomssaying that “all named sets in set theory are in U”. Of course the consistency of Tis even more problematic than the consistency of set theory Tset.

Exercise 32.12. Write down the axioms in ZFC . Hint: if S is the singletonaxiom

∀x∃y((x ∈ y) ∧ (∀z((z ∈ y)→ (z = x)))

then its U-translation S is:

∀x((x ∈ U)→ (∃y((y ∈ U) ∧ (x ∈ y) ∧ (∀z((z ∈ U) ∧ (z ∈ y))→ (z = x))))))

33. Categories

Categories are one of the most important unifying concepts of mathematics. Inparticular they allow to create bridges between various parts of mathematics. Herewe will only explore their definition and we give some examples. We begin with thesimpler concept of correspondence.

102 ALEXANDRU BUIUM

Definition 33.1. A correspondence on a set X(0) is a triple (X(1), σ, τ) whereσ, τ : X(1) → X(0) are maps called source and target.

Example 33.2. If R ⊂ A × A is a relation and σ : R → A and τ : R → A aredefined by σ(a, b) = a, τ(a, b) = b then (R, σ, τ) is a correspondence.

Example 33.3. Let X(0) the set of all sets (belonging to a given set u, referredto as the universe) and let X(1) be the set of all triples (A,B, F ) with A,B ∈ X(0)

and F : A→ B a map (where we assume that all maps from A to B belong to u).Let σ(A,B, F ) = A and τ(A,B, F ) = B. This data define a correspondence.

Example 33.4. If in the above example we insist that all F s are bijective, say,then we get another example of a correspondence.

Definition 33.5. Assume (X(1), σ, τ) is a correspondence on X(0). Then one candefine sets

X(2) = {(a, b) ∈ X2; τ(b) = σ(a)}

X(3) = {(a, b, c) ∈ X3; τ(c) = σ(b), τ(b) = σ(a)}, etc.

We have natural maps p1, p2 : X(1) → X(0), p1(a, b) = a, p2(a, b) = b.

Definition 33.6. A small category is a tuple (X(0), X(1), σ, τ, µ, ε) where X(0) is aset, (X(1), σ, τ) is a correspondence on X(0), and µ : X(2) → X(1), ε : X(0) → X(1)

are maps. We postulate that σ ◦ ε = τ ◦ ε = I, the identity of X(0). Also wepostulate that the following diagrams are commutative:

X(2) µ→ X(1)

p1 ↓ ↓ τX(1) τ→ X(0)

X(2) µ→ X(1)

p2 ↓ ↓ σX(1) σ→ X(0)

X(3) µ×1→ X(2)

1× µ ↓ ↓ µX(2) µ→ X(0)

Finally we postulate that the compositions

X(1) 1×(ε◦σ)−−−−−→ X(2) µ→ X(1)

and

X(1) (ε◦τ)×1−−−−−→ X(2) µ→ X(1)

are the identity of X(1), where (1× (ε ◦ σ))(a) = (a, ε(σ(a)) and ((ε ◦ τ)× 1)(a) =(ε(τ(a), a).

The set X(0) is called the set of objects of the category. The set X(1) is calledthe set of arrows or morphisms. The map µ is called composition and we writeµ(a, b) = a ? b. The maps σ and τ are called the source and the target maprespectively. The map ε is called the identity. We set ε(x) = 1x for all x. The firstcommutative diagram says that the target of a ? b is the target of a. The second

A FIRST COURSE IN MATHEMATICS 103

diagram says that the source of a? b is the source of b. In the third diagram (calledassociativity diagram) the map µ×1 is defined as (a, b, c) 7→ (a?b, c) while the map1×µ is defined by (a, b, c) 7→ (a, b?c); the diagram then says that (a?b)?c = a?(b?c).For x, y ∈ X(0) one denotes by Hom(x, y) the set of all morphisms a ∈ X(1) withσ(a) = x and τ(a) = y. We say a ∈ Hom(x, y) is an isomorphism if there existsa′ ∈ Hom(y, x) such that a ? a′ = 1y and a′ ? a = 1x. (Then a′ is unique.) Acategory is called a groupoid if all morphisms are isomorphisms.

In what follows we give some basic examples of categories. Some of the examplesinvolve universes. For those examples we let U be a fixed universe and fix themetamodel L, T,˜ of set theory Tset attached to U; cf. Example 32.11. (So T isobtained from Tset by adding the corresponding axioms related to the universe.) Forthe various examples below that involve universes the correctness of the definitionsdepend on certain claims (typically that certain sets are maps, etc.) Those claims

cannot a be proved a priori in set theory Tset but they are trivially proved in T; thisshows that the correctness of the definition of some of the categories below requiresthe axioms of T hence more axioms than ZFC! The question of consistency of Tbecomes then acute: one has to believe that T is consistent in order to believe thatthe categories below are all well defined.

Example 33.7. Define the category of sets (in a given universe U) denoted by

{sets}

as follows:

X(0) = U,X(1) = {(A,B, F ) ∈ U3;F ∈ BA}σ(A,B, F ) = A, τ(A,B, F ) = B.

The following is a (trivially proved) theorem in T:

(*) The set

µ = {(B,C, F ), (A,B,G), (A,C,H)) ∈ X(2) ×X(1);H = F ◦G}

is a map X(2) → X(1).

(see the Exercise below). Similarly we may define the map ε : X(0) → X(1),ε(A) = IA (identity of A). With above definitions we get a category. In theexamples below we will encounter from time to time the same kind of phenomenon(where the correctness of definitions depends on theorems in T); we will not repeatthe corresponding discussion but we will simply add everywhere the words “in agiven universe”.

Exercise 33.8. Prove that the above sentence (*) is a theorem in T. Hint: use

the fact that the axioms in T involving U imply that if F,G ∈ U then F ◦G ∈ U.

Example 33.9. If in the example above we insist that all F s are bijections we geta category called

{sets + bijections}.

This category is a groupoid.

104 ALEXANDRU BUIUM

Example 33.10. Let A be a set and consider the category denoted by

{bijections of A}

defined as follows. We let X(0) = {x} be a set with one point, we let X(1) be theset of all bijections F : A → A, we let σ and τ be the constant map F 7→ x, welet µ be defined again by µ(F,G) = F ◦G (compositions of functions), and we letε(F ) = IA. This category is a groupoid.

Example 33.11. Define the category

{ordered sets}

as follows. We take X(0) the set of all ordered sets (A,≤) with A in a given universe,we take X(1) to be the set of all triples ((A,≤), (A′,≤′), F ) with ((A,≤), (A′,≤′) ∈X(0) and F : A → A′ increasing, we take µ to be again, composition, and we takeε(A,≤) = IA.

Example 33.12. Let (A,≤) be an ordered set. Define the category

{(A,≤)}

as follows. We let X(0) = A, we let X(1) be the set ≤ viewed as a subset of A×A, welet σ(a, b) = a, τ(a, b) = b, we let µ((a, b), (b, c)) = (a, c), and we let ε(a) = (a, a).

Example 33.13. Equivalence relations give rise to groupoids. Indeed let A be aset with an equivalence relation R ⊂ A × A on it which we refer to as ∼. Definethe category

{(A,∼)}

as follows. We let X(0) = A, we let X(1) = R, we let σ(a, b) = a, τ(a, b) = b, we letµ((a, b), (b, c)) = (a, c), and we let ε(a) = (a, a). This category is a groupoid.

Example 33.14. We fix the type of algebraic structures below. (E.g. two binaryoperations, one unary operation, two given elements.) Define the category

{algebraic structures}

as follows. X(0) is the set of all algebraic structures (A, ?, ...,¬, ..., 1, ...) of the giventype with A in a given universe, X(1) is the set of all triples

((A, ?, ...,¬, ..., 1, ...), (A′, ?′, ...,¬′, ..., 1′, ...), F )

with F a homomorphism, σ and τ are the usual source and target, and ε is theusual identity.

Example 33.15. Here is a variation on the above example. Consider the categoryof rings:

{commutative unital rings}

as follows. The set of objects X(0) is the set of all commutative unital rings(A,+×, 0, 1) (usually referred to as A) in a given universe and the set of arrowsX(1) is the set of all triples (A,B, F ) where A,B are rings and F : A → B is aring homomorphism. Also the target, source and identity are the obvious ones; thecomposition map is the usual composition.

A FIRST COURSE IN MATHEMATICS 105

Example 33.16. Define the category

{topological spaces}

as follows. We let X(0) be the set of all topological spaces (X,T) where X is aset in a given universe; we take X(1) to be the set of all triples ((X,T), (X ′,T′), F )with F : X → X ′ continuous, we let µ be given by usual composition of maps, σand τ the usual source and target maps, and ε the usual identity.

Example 33.17. Define the category of groups:

{groups}

defined as follows. The set X(0) of objects is the set of all groups whose set is in agiven universe, and the set X(1) is the set of all the triples consisting of two groupsG,H and a homomorphism between them. The source, target, and identity are theobvious ones.

Example 33.18. If in the above example we restrict ourselves to groups which areAbelian we get the category of Abelian groups

{Abelian groups}.

Example 33.19. A fixed group can be viewed as a category as follows. Fix agroup (G, ?,′ , e). Then one can consider the small category

{G}

where X(0) = {x} is a set consisting of one element, X(1) = G, µ(a, b) = a ? b, thesource and the target are the constant maps, and ε(x) = e.

Example 33.20. Define the category

{vector spaces}

as follows.The set of objects X(0) is the set of all vector spaces in a given universe;the set X(1) of morphisms consists of all triples (V,W,F ) with F : V →W a linearmap; source, target, and identity are defined in the obvious way.

Exercise 33.21. Check that, in all examples above, the axioms in the definitionof a category are satisfied. (This is long and tedious but straightforward.)

34. Models

We briefly indicate here how one can create a “mirror” of logic within mathemat-ics (indeed largely within algebra). What results is a subject called mathematicallogic (also referred to as formal logic); cf. the Introduction to this course. A spe-cial place in mathematical logic is played by models/model theory which we shallexplain below in some detail. In particular we will discuss models of ZFC itself(and hence of mathematics itself); this concept of model is entirely different fromthat of metamodels in the section on metamodels. We will discuss models of other(simpler) theories as well such that the theory of Peano arithmetic.

The mirror of logic in mathematics that we are considering here is not entirelyaccurate: there is no one to one correspondence between “metamathematical” logicand mathematical logic. But as a general principle if a concept (say crocodile) ispresent in logic its mirror in mathematical logic will acquire the adjective formalin front (e.g. will be called a formal crocodile). The dichotomy non-formal/formal

106 ALEXANDRU BUIUM

indicates the difference between Level 2 and Level 3 theories (cf. the Introduction).Within each levels one has the syntactic/semantic dichotomy; semantic means ofcourse coming from translation whereas syntactic means independent of translation.

Let T be a set. By a T -partitioned set we mean in what follows a set S togetherwith a map S → T . We let St the preimage of t in S. In the definition below wetake

T = {c, f1, f2, f3..., r1, r2, r3, ...}.Also we consider a set

V = {x1, x2, x3, ...}called the set of variables, and a set

W = {∨,∧,¬,∀,∃,=, (, )}called the set of logic symbols. We sometimes write x, y, z, ... instead of x1, x2, x3, ....

Definition 34.1. Let S be a T -partitioned set; the elements of Sc are called con-stant symbols; the elements of Sfn are called n-ary function symbols; the elementsof Srn are called n-ary relation symbols. For any such S we consider the set

ΛS = V ∪W ∪ S.(referred to as the formal language attached to S).Then one considers the set ofwords Λ?S with letters in ΛS . One defines (in an obvious way, imitating the defini-tions in the sections on “metamathematical” logic) what it means for an elementϕ ∈ Λ?S to be an S-formula or an S-formula without free variables (the latter are

referred to as sentences). One denotes by ΛfS ⊂ Λ∗S the set of all S-formulas and

by ΛsS ⊂ ΛfS the set of all S-sentences.

Example 34.2. Assume ρ ∈ Sr3 , a ∈ Sc, and x, z ∈ V . Then

ϕ = ∀x∃z(ρ(x, z, a))

is a formula. But ∃a∀z(ρ(x, z, a)) is not a formula because constants cannot havequantifiers ∀,∃ in front of them. Also ∀x∃z(ρ(x, a)) is not a formula because ρ is“supposed to have 3 arguments”. If � ∈ Sf2 and a, x, z are as above then

ϕ = ∀x∃z(�(z, a) = z)

is a formula.

Remark 34.3. One can define binary operations ∧∗ and ∨∗ on ΛsS and a unaryoperation ¬∗ on ΛsS . E.g. ∧∗ : ΛsS × ΛsS → ΛsS is defined by ϕ ∧∗ ψ = (ϕ) ∧ (ψ).We have ϕ ∧∗ (ψ ∧∗ η) = (ϕ) ∧ ((ψ) ∧ (η)) 6= (ϕ ∧∗ ψ) ∧∗ η. So ΛsS is not a Booleanalgebra with respect to these operations.

Example 34.4. Let M be a set. We let Sc(M) = M . For n ∈ N we set Srn(M) =P(Mn), the set of n-ary relations on M and Sfn(M) ⊂ P(Mn+1) the set of mapsMn →M . We consider the T -partitioned set S(M), union of the above. Then wecan consider the formal language ΛS(M). An assignment in M is a map

µ : V = {x1, x2, x3, ...} →M

For any assignment µ there exists a unique map vM,µ : ΛfS(M) → {0, 1} which is

a homomorphism with respect to ∨∗,∧∗,¬∗, is compatible (in an obvious sense)with ∀,∃, and satisfies obvious properties with respect to relational and functionalsymbols, and also with µ. If ϕ has no free variables we write vM (ϕ) in place ofvM,µ(ϕ).

A FIRST COURSE IN MATHEMATICS 107

Definition 34.5. Next one discusses semantics. By a translation of ΛS into ΛS′

we understand a map

m : S → S′

which is compatible with the partitions (in the sense that m(St) ⊂ S′t for all t ∈ T ).

For any ϕ ∈ ΛfS one can form, in an obvious way, a formula m(ϕ) ∈ ΛfS′ obtainedfrom ϕ by replacing the constants and relational and functional symbols by theirimages under m respectively. So we get a map

m : ΛfS → ΛfS′

which is a homomorphism with respect to ∨∗,∧∗,¬∗, compatible with ∀,∃.An S-structure (or simply a structure if S is understood) is a pair M = (M,m)

where M is a set and m is a translation

m : S → S(M)

So we get a map

m : ΛfS → ΛfS(M)

which is a homomorphism with respect to ∨∗,∧∗,¬∗, compatible with ∀,∃. Fix anassignment µ and set vM,µ = vM . We have a natural map

vM : ΛfS(M) → {0, 1}

which is again a homomorphism compatible with ∀,∃. So we may consider thecomposition

vM = vM ◦m : ΛfS → {0, 1}.We say that a sentence ϕ ∈ ΛsS is satisfied in the structure M if vM(ϕ) = 1. Thisconcept is independent of µ. This is essentially Tarski’s famous syntactic definitionof truth: one can define in set theory the predicate is true in M by the definition:

∀x((x is true in M)↔ ((x ∈ ΛsS) ∧ (vM(x) = 1)))

We will continue to NOT use the word true in what follows. If vM(ϕ) = 1 we alsosay that M is a model of ϕ and we write M |= ϕ. We say a set Φ ⊂ ΛsS of sentencesis satisfied in a structure (write M |= Φ) if all the formulas in Φ are satisfied in thisstructure. We say that a sentence ϕ ∈ ΛsS is a semantic formal consequence of a setof sentences Φ ⊂ ΛsS (and we write Φ |= ϕ) if ϕ is satisfied in any structure in whichΦ is satisfied. Here the word semantic is used because we are using translations; andthe word formal is being used because we are in formal logic rather than in ordinarylogic. A sentence ϕ ∈ ΛsS is valid (or a formal tautology) if it is satisfied in anystructure, i.e. if ∅ |= ϕ. We say a sentence ϕ is satisfiable if there is a structure inwhich it is satisfied. Two sentences ϕ and ψ are semantically formally equivalent ifeach of them is a semantic formal consequence of the other, i.e ϕ |= ψ and ψ |= ϕ;write ϕ ≈ ψ. Note that the quotient ΛsS/ ≈ is a Boolean algebra in a naturalway. Moreover each structure M defines a homomorphism vM : ΛsS/ ≈→ {0, 1} ofBoolean algebras.

Example 34.6. Algebraic structures can be viewed as models. Here is an example.Let

S = {?, ι, e}

108 ALEXANDRU BUIUM

with e a constant symbol, ? a binary function symbol, and ι a unary functionsymbol. Let Φgr be the set of S-formulas Φgr = {ϕ1, ϕ2, ϕ3} where

ϕ1 = ∀x∀y∀z(x ? (y ? z) = (x ? y) ? z)ϕ2 = ∀x(x ? e = e ? x = x)ϕ3 = ∀x(x ? ι(x) = ι(x) ? x = x).

Then a group is simply a model of Φgr above. We also say that Φgr is a set ofaxioms for the formalized theory of groups.

More generally

Definition 34.7. A formalized (or formal) system of axioms is a pair (S,Φ) whereS is a T -partitioned set and Φ is a subset of ΛsS ; then

Θ := {ϕ ∈ ΛsS ; Φ |= ϕ}is called the formal theory with (or generated by) the system (S,Φ).

Exercise 34.8. Write down a set of symbols S and a set of formulas Φ which is aset of axioms for the formalized theory of:

1) commutative unital rings;2) fields;3) ordered sets;4) vector spaces over a given field;5) Boolean algebras;6) sets;

Can one find such a pair S,Φ in the following cases?7) well ordered sets;8) topological spaces;9) the ring of integers.

Definition 34.9. Formalized (or formal) set theory is the formal theory Θset gen-erated by (Sset,Φset) where Sset = {∈} and Φset consists of the ZFC axioms ofLevel 2 set theory viewed as elements of Λset = Λ{∈}.

Remark 34.10. Unlike the language Lset of set theory, which has infinitely manyconstants (e.g. the witnesses), Λset above has no constants! Also, of course, Lsetis a metaset whereas Λset is a set. Similarly the theory Tset is a metaset while theformal theory Θset is a set. However it is possible to see any sentence in Lsset thatdoes not involve constants as a sentence in Λsset and vice-versa. We will assumethis identification from now on.

Formalized set theory is the Level 3 theory in the Introduction. Set theory isthe level 2 theory in the Introduction.

Definition 34.11. A formal theory Θ (generated by a system of axioms (S,Φ)) iscalled formally inconsistent if there exists a sentence ϕ ∈ ΛsS such that ϕ ∈ Θ and¬∗ϕ ∈ Θ. The theory is called formally consistent if it is not formally inconsistent.The theory is called formally complete if for any sentence ϕ ∈ ΛsS either ϕ ∈ Θ or¬∗ϕ ∈ Θ. A theory is called formally incomplete if it is not formally complete.

Let ϕcon ∈ Λsset be the sentence expressing the formal consistency of formal settheory Θset (of Level 3 set theory). Then ϕcon 6∈ Θset is a sentence in Λset andhence it corresponds to a sentence in Lset which we still denote by the same stringof symbols; we will constantly adopt this convention.

A FIRST COURSE IN MATHEMATICS 109

Godel proved the following theorem in Tset.

Theorem 34.12. ϕcon 6∈ Θset.

With regards to the continuum hypothesis we have the following result the twohalves of which are due to Godel and Cohen respectively; this is, again a theoremin Tset. Let ϕCH ∈ Λsset be the sentence expressing the continuum hypothesis.

Theorem 34.13. ϕcon → ((ϕCH 6∈ Θset) ∧ (¬∗ϕCH 6∈ Θset))

Corollary 34.14. If Θset is formally consistent then it is formally incomplete.

Next we want to analyze the integers in mathematical logic.

Remark 34.15. The Peano axiom in Level 2 set theory is the sentence that thereexists a Peano triple. One can view this as a sentence in Λset = Λ{∈} of Level 3formal set theory; indeed the words “there exists a set z such that for any subset y...” can be written as

∃z∀y((∀x(x ∈ y)→ (x ∈ z))→ ...)

But if one tries to write this axiom in the formal language Λar = Λ{+,×,s,0} where+,× are binary functions, s is a unary function, and 0 is a constant then one isbound to fail because there is no way to say, in this formal language, that forany subset something happens. (If one says ∀x then in any S-structure (M,m) thevariable x is translated as an element of M and not as a subset of M .) Neverthelessthere is a formal Peano system of axioms in the formal language Λar which we aregoing to consider below.

Definition 34.16. Consider the set S = Sar = {+,×, s, 0} where +,× are binaryfunctional predicates, s is a unary functional predicate, and 0 is a constant. LetΦar be the sentences in ΛS (called the formal Peano axioms) obtained by viewingthe axioms A1, ..., A6, AP of Tar as elements of ΛS . In other words Φar consists of:

ϕ1 = ∀x(s(x) 6= 0)ϕ2 = ∀x((x 6= 0)→ ∃y(s(y) = x)))ϕ3 = ∀x(x+ 0 = x)ϕ4 = ∀x∀y(x+ (s(y)) = s(x+ y))ϕ5 = ∀x(x× 0 = 0)ϕ6 = ∀x∀y(x× s(y) = (x× y) + x)ϕφ = ∀w((φ(0, w) ∧ ∀v(φ(v, w))→ φ(s(v), w)))→ ∀x(φ(x,w)))

where φ runs through all formulas in Λfar with the appropriate number of freevariables. So this is a metacountable set of axioms; the latter series of axioms standsfor a weak form of induction. The formal system of axioms of Peano arithmeticis the pair (Sar,Φar). The theory Θar generated by (Sar,Φar) is called the Peanoarithmetic.

Here is the famous Godel’s incompleteness theorem of arithmetic. It is a theoremin Tset.

Theorem 34.17. If Θar is formally consistent then it is formally incomplete.

Remark 34.18. Theorems 34.12, 34.13, 34.17 are sentences (or argot translationsof sentences) in Lset. According to our conventions we did not define the reference,meaning, and truth of sentences. In particular, for us, these theorems refer to

110 ALEXANDRU BUIUM

nothing, have no meaning, and it does not make sense to say they are true. But ofcourse, once a philosophical exploration of the reference/meaning/truth of sentencesis undertaken, these theorems can be said to refer to mathematics itself, they canbe ascribed extremely interesting meanings, and they should be regarded as true;none of this is makes sense, however, within mathematics.