introduction to the modal logic


Modal Logic

Summer, 2011

1 Introduction

These notes attempt to summarize a selection of basic material about modal logic in a clear and motivated way.

There are a number of reasons to be interested in modal logic. First, and perhaps most obviously, the subject may be interesting in its own right. Second, one could value it for its relationship with other fields such as linguistics, philosophy, artificial intelligence, and mathematics.

These notes begin with the definition of some common modal languages. In giving a semantics for these languages, we focus on Kripke semantics. A frame for the basic modal language is a collection of worlds with an accessibility relation (a directed graph), and a model on such a frame supplies the additional information of what atomic facts are true at each world (a valuation). In the model context, modal logic can be understood as a fragment of first order logic (via the standard translation). This raises the question of which first order formulas are equivalent to modal formulas, which is answered by Van Benthem's characterization theorem, using the notion of bisimulation.

However, we can also understand modal formulas as asserting something about the underlying frame. For example, $\Diamond\Diamond p \to \Diamond p$ is valid on exactly the transitive frames. So the class of transitive frames is modally definable. This leads to the question of which classes of frames are modally definable, and this is answered by the Goldblatt-Thomason theorem.

Another result we consider is a modal version of Lindström's theorem. The original Lindström theorem yields first order logic as a maximal logic among certain abstract logics possessing the compactness and Löwenheim-Skolem properties, while the version considered here yields modal logic as a maximal logic among certain abstract logics possessing a notion of finite degree and a property involving bisimulations.

These results are abstract ways of characterizing what modal logic is, and what it is capable of expressing. One direction of the Van Benthem theorem is that modal formulas can't tell the difference between bisimilar models, and the other direction says that any first order formula which also can't make such distinctions is a modal formula. The Goldblatt-Thomason theorem describes what modal formulas are capable of expressing from the underlying frame-, rather than model-, perspective. Finally, the Lindström-type theorem suggests that having a notion of finite degree and being invariant under bisimulations in some sense characterize modal logic.

So far we haven't been considering deductive consequence, and we should remedy that. As boolean algebras correspond to propositional logic, so do boolean algebras with operators correspond to modal logics. The Jónsson-Tarski theorem is the generalization of Stone's representation theorem for boolean algebras to boolean algebras with operators. This theorem is one way to approach completeness theorems for modal logics. But we will also use the notion of a canonical model. At any rate, we shall be interested in supplying completeness proofs for a few common modal logics (K, S4, and S5). We shall also consider a couple of examples of incompleteness (KL and $K_t$ThoM). A completeness result is a matching of a semantic notion of consequence ($\Gamma \models \varphi$) with a syntactic notion of consequence ($\Gamma \vdash \varphi$). From the point of view of elucidating the semantic consequence relation, this matching is useful because it yields a compactness theorem and demonstrates recursive enumerability. From the point of view of elucidating the syntactic consequence relation, the matching gives us a more intuitive understanding of the relation. As for incompleteness results, there are two types, which I will describe later.

Finally, we will conclude the notes with a glimpse of modal first order logic. We define the language, give a constant domain semantics, and supply a completeness proof.

The material for these notes comes primarily from Modal Logic by Blackburn, de Rijke, and Venema, but I also made use of A New Introduction to Modal Logic by Hughes and Cresswell, and the Handbook of Modal Logic edited by Blackburn, Van Benthem, and Wolter.

2 Modal Languages and Kripke Semantics

2.1 Definition of Modal Language

What is a modal language? Let $\{\triangle_1, \triangle_2, \ldots\}$ be a collection of modal operator symbols with specified arities ($n_1, n_2, \ldots \in \mathbb{N}$). Typically, we will be interested in the case where there is just one modal operator symbol ($\Diamond$) of arity 1. However, arguments tend to generalize to the case of having more than one operator and operators of arity other than 1 without too much trouble, and there's good reason to do so, as I'll explain with some examples in a moment. Let $\tau = \{\bot, \neg, \vee, \triangle_1, \triangle_2, \ldots\}$. That is, $\tau$ consists of a collection of modal operator symbols of specified arities along with the usual propositional logic operator symbols $\bot$ (falsehood, arity 0), $\neg$ (negation, arity 1), and $\vee$ (disjunction, arity 2). Modal logic for us will always be an extension of propositional logic.

Let $\Phi = \{p_1, p_2, \ldots\}$ be a countable collection of proposition letters (atomic sentences). Let $F$ be the free $\tau$-algebra with $\Phi$ as generators. I.e., $F$ is the smallest set containing $\Phi$ and $\bot$ and closed under the following rules of formation:

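1. If $\varphi \in F$, then so is $\neg\varphi$

2. If $\varphi_1, \varphi_2 \in F$, then so is $\varphi_1 \vee \varphi_2$

3. If $\triangle \in \tau$ has arity $n$, and $\varphi_1, \ldots, \varphi_n$ are in $F$, then $\triangle(\varphi_1, \ldots, \varphi_n) \in F$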

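The elements of $F$ are called terms or formulas or sentences depending on the context, but typically we'll call them modal formulas. For example, if $\triangle_3$ has arity 2 and $\triangle_7$ has arity 1, then $\triangle_3(p_{14}, p_6) \vee \triangle_7\bot$ is a modal formula. The case where $\tau = \{\bot, \neg, \vee, \Diamond\}$ is called the basic modal language. Here the modal formulas, the elements of $F$, are things like $p_1$, $p_{12}$, $\bot$, $\neg\bot$, $p_1 \vee (\Diamond p_{12})$, $\Diamond p_1$, $\Diamond\Diamond p_1$, and so on. We employ the usual abbreviations such as $\top$ for $\neg\bot$, $\Box$ for $\neg\Diamond\neg$, $X \wedge Y$ for $\neg(\neg X \vee \neg Y)$, etc. Further, to each modal operator symbol $\triangle$, regardless of arity, we may define the dual $\nabla$ by: if $\triangle$ has arity $n$, then $\nabla(x_1, \ldots, x_n)$ is the same as $\neg\triangle(\neg x_1, \ldots, \neg x_n)$.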
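The formation rules above describe an inductive datatype, and it can help to see one possible encoding. The following is a minimal sketch in Python, not part of the notes' formal development; the class names (`Prop`, `Bot`, `Not`, `Or`, `Modal`), the helper names (`dual`, `top`, `box`, `conj`), and the operator labels `"dia"`, `"op3"`, `"op7"` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Prop:
    """A proposition letter p_index."""
    index: int


@dataclass(frozen=True)
class Bot:
    """Falsehood (arity 0)."""


@dataclass(frozen=True)
class Not:
    sub: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Modal:
    """A modal operator symbol (named by a string) applied to its arguments."""
    op: str
    args: Tuple["Formula", ...]


Formula = Union[Prop, Bot, Not, Or, Modal]


def dual(op: str, *args: Formula) -> Formula:
    """Dual of an n-ary operator: negate every argument, apply op, negate."""
    return Not(Modal(op, tuple(Not(a) for a in args)))


def top() -> Formula:
    """Truth, abbreviating the negation of falsehood."""
    return Not(Bot())


def box(phi: Formula) -> Formula:
    """Box as the dual of the unary diamond, here labelled "dia"."""
    return dual("dia", phi)


def conj(x: Formula, y: Formula) -> Formula:
    """Conjunction via De Morgan: not (not x or not y)."""
    return Not(Or(Not(x), Not(y)))


# A formula built from a hypothetical binary operator "op3" and unary "op7".
example = Or(Modal("op3", (Prop(14), Prop(6))), Modal("op7", (Bot(),)))
```

Because the dataclasses are frozen, formulas are hashable and compare structurally, which is convenient when collecting them into sets.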


2.2 Examples of Modal Languages

These modal languages can be put to varying uses, but to get started it's useful to have some intuitive notion of them. Consider the basic modal language. If $\varphi$ is some modal formula, then $\Diamond\varphi$ can mean something like "$\varphi$ is possibly true". Similarly, $\Box\varphi$ would then mean "$\varphi$ is necessarily true".

What about modal languages other than the basic one? Why don't we just limit our attention to the basic modal language? Well, for starters there's the basic temporal logic. Here we have two diamonds, that is, there are two modal operator symbols, both of arity 1. We'll write them as $P$ and $F$, and their duals (boxes) will be written $H$ and $G$. So, $P\varphi$ means something like "$\varphi$ was true at some point in the past". $F\varphi$ means something like "$\varphi$ will be true at some point in the future". Similarly, $H\varphi$ means something like "$\varphi$ has been true at every point in the past" and $G\varphi$ means "$\varphi$ is going to be true at every point in the future". (The names are mnemonic: P for past, F for future, H for "has been", G for "going to be".)

What about a modal language with a modal operator symbol of arity more than 1? What can that mean? Well, suppose we're talking about some directed arrows with a modal operator $\triangle$ of arity 2. We could set up our semantics, for example, so that $\triangle(\varphi_1, \varphi_2)$ holds at an arrow $w$ just in case $w$ can be decomposed into two smaller arrows $x_1$ and $x_2$ such that $\varphi_1$ holds at $x_1$ and $\varphi_2$ holds at $x_2$.

2.3 Frames, Models, and Truth

Now that we have some sense of what a modal language is, we turn our attention to one way of capturing the intended meaning mathematically. First fix some modal signature $\tau = \{\triangle_1, \triangle_2, \ldots\}$ (we've omitted reference to the always present propositional symbols). A $\tau$-frame is a set $W$ together with an $(n+1)$-ary relation $R_\triangle$ on $W$ for each $n$-ary modal operator symbol $\triangle \in \tau$. You can think of this relation $R_\triangle$ as deciding which nodes are "close" to other nodes. A valuation $V$ on a frame $W$ is a mapping from $\Phi$ to $\mathcal{P}(W)$, the power set of $W$. One thinks of the valuation as deciding what atomic facts hold at each node. E.g., a node $x$ being in $V(p)$ means that $p$ is true at $x$. A model $M = (W, V)$ is just a frame $W$ together with a valuation $V$.

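Every valuation $V$ can be extended uniquely to a map $\bar{V} : F \to \mathcal{P}(W)$ such that

1. $\bar{V}(p) = V(p)$ for each $p \in \Phi$

2. $\bar{V}(\bot) = \emptyset$

3. $\bar{V}(\neg\varphi) = W \setminus \bar{V}(\varphi)$ for each $\varphi \in F$

4. $\bar{V}(\varphi_1 \vee \varphi_2) = \bar{V}(\varphi_1) \cup \bar{V}(\varphi_2)$ for all $\varphi_1, \varphi_2 \in F$

5. $\bar{V}(\triangle(\varphi_1, \ldots, \varphi_n)) = \{w \in W \mid \exists x_1 \cdots \exists x_n \text{ such that } x_i \in \bar{V}(\varphi_i) \text{ for each } i \text{ and } R_\triangle w x_1 \cdots x_n\}$ for each modal operator symbol $\triangle$ (of arity $n$) and all $\varphi_1, \ldots, \varphi_n \in F$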


This is our recursive truth definition. It tells us which modal formulas are true at which nodes, given a valuation (an assignment of the basic atomic facts). E.g., if $V(p)$ contains a node $x$, then $p$ is considered to be true at $x$. The boolean connectives aren't that exciting: of note is how each $\triangle$ acts as a kind of local (existential) quantification.

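We may also describe our definition of $\bar{V}$ in a more algebraic way as follows. It's essentially the same definition, but couched in different terminology. $\mathcal{P}(W)$ may be considered a $\tau$-algebra in the following way: $\bot$ is interpreted as $\emptyset \in \mathcal{P}(W)$, the empty set; $\neg : \mathcal{P}(W) \to \mathcal{P}(W)$ is complementation; $\vee : \mathcal{P}(W)^2 \to \mathcal{P}(W)$ is union; and for each modal operator symbol $\triangle$ of arity $n$ we have the operation $\triangle : \mathcal{P}(W)^n \to \mathcal{P}(W)$ defined by:

$$\triangle(X_1, \ldots, X_n) := \{w \in W \mid \exists x_1 \in X_1 \cdots \exists x_n \in X_n \text{ such that } R_\triangle w x_1 \cdots x_n\}$$

Then, as $F$ is the free $\tau$-algebra, there exists a unique $\tau$-homomorphism $\bar{V} : F \to \mathcal{P}(W)$ which extends $V : \Phi \to \mathcal{P}(W)$.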
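To make the extension of a valuation concrete, here is a small Python sketch restricted to the basic modal language (one unary diamond). It is only an illustration: the tuple encoding of formulas, the function name `extend`, and the sample model are assumptions, not notation from the notes.

```python
def extend(W, R, V):
    """Return the unique extension of the valuation V to all formulas.

    W is a finite set of nodes, R a set of pairs (the accessibility
    relation), and V a dict from proposition names to sets of nodes.
    Formulas are nested tuples (an encoding assumed for this sketch):
    ("p", name), ("bot",), ("not", f), ("or", f, g), ("dia", f).
    """
    universe = frozenset(W)

    def vbar(phi):
        tag = phi[0]
        if tag == "p":
            return frozenset(V.get(phi[1], ()))
        if tag == "bot":
            return frozenset()
        if tag == "not":
            return universe - vbar(phi[1])      # complementation
        if tag == "or":
            return vbar(phi[1]) | vbar(phi[2])  # union
        if tag == "dia":
            target = vbar(phi[1])
            # w is in the result iff w sees some node of the target set
            return frozenset(
                w for w in universe
                if any((w, x) in R for x in target)
            )
        raise ValueError(f"unknown connective: {tag!r}")

    return vbar


# A three-node chain 0 -> 1 -> 2 with p true only at node 2.
W = {0, 1, 2}
R = {(0, 1), (1, 2)}
V = {"p": {2}}
vbar = extend(W, R, V)

p = ("p", "p")
dia_p = ("dia", p)
box_p = ("not", ("dia", ("not", p)))  # box as the dual of diamond
```

In this model `vbar(dia_p)` contains only node 1, while `vbar(box_p)` contains nodes 1 and 2: node 2 has no successors, so the box holds there vacuously.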

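Let $M = (W, V)$ be a $\tau$-model, let $w \in W$ be a node of the frame, and let $\varphi \in F$ be a modal formula. We write $M, w \models \varphi$ if $w \in \bar{V}(\varphi)$, and in this case we say $\varphi$ is true at $w$ or holds at $w$. We write $M \models \varphi$ if $M, w \models \varphi$ for every $w \in W$. We write $W \models \varphi$ if $(W, V), w \models \varphi$ for every valuation $V$ on $W$ and every $w \in W$. If $(W, V), w \models \varphi$ for all choices of $W$, $V$, and $w$, then we say $\varphi$ is valid. For all of these, we may also replace the single modal formula $\varphi$ by a collection of modal formulas $\Gamma$ in the obvious way. E.g., $M \models \Gamma$ means that $M \models \varphi$ for every $\varphi \in \Gamma$.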

2.4 Semantic Consequence

There are two notions of semantic consequence that we'll have occasion to discuss. One is global in that it depends on all the nodes of the frame and will be indicated with a $g$, whereas the other is local in that it considers each node separately and will be indicated with an $l$. Let $\Gamma$ be a collection of modal formulas, and let $\varphi$ be a modal formula. We write $\Gamma \models_g \varphi$ if: for all models $M$, if $M \models \Gamma$ then $M \models \varphi$. We write $\Gamma \models_l \varphi$ if: for all models $M = (W, V)$ and all nodes $w \in W$, if $M, w \models \Gamma$ then $M, w \models \varphi$. We will be considering the local version somewhat more often, so the default is for $\models$ to mean $\models_l$.

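I would like to give an example showing that global and local consequence are not the same. Of course, local consequence implies global consequence, but not the other way around. Consider $p \models \Box p$. As a global consequence, this is fine: if $p$ holds at every node of a model, then it holds at every successor of every node. It fails as a local consequence: take a node $x$ at which $p$ is true that sees a node $y$ at which $p$ is false.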

2.5 Intuitive Examples

Let's see how these definitions play out in the case of the basic modal language $\{\Diamond\}$. A frame $W$ is simply a directed graph: it is a set, whose elements are called nodes or worlds, with a binary relation $R \subseteq W^2$. If $x, y \in W$ and $Rxy$ then we say that $x$ sees $y$, or that $y$ is accessible from $x$. Truth and falsity of the modal formulas are evaluated at each node. There is a recursive truth definition. A valuation $V : \Phi \to \mathcal{P}(W)$ decides which atomic facts are true at each node. E.g., if $V(p_7)$ contains $x \in W$, then $p_7$ is considered true at $x$. As for the inductive steps, Boolean connectives are handled as usual, internally to each node. E.g., $\neg\varphi$ is true at a node $x$ just in case $\varphi$ is false at $x$. However, to get the truth of $\Diamond\varphi$, we make use of the accessibility relation $R$. We say that $\Diamond\varphi$ is true at a node $x$ just in case $x$ can see some node $y$ such that $\varphi$ is true at $y$. I.e., $x \models \Diamond\varphi$ just in case there exists $y$ such that $Rxy$ and $y \models \varphi$. Similarly, a node $x$ thinks that $\Box\varphi$ is true just in case $\varphi$ is true at every node that $x$ sees. I.e., $x \models \Box\varphi$ just in case for all $y$ such that $Rxy$ we have $y \models \varphi$.

A nice way to think about $\Diamond$ and $\Box$ is that they correspond to $\exists$ and $\forall$, except that they are local versions. Whereas $\exists z\,\varphi(z)$ says that there exists some $z$ somewhere such that $\varphi(z)$, $\Diamond\varphi$ says that there is a nearby node that satisfies $\varphi$.

What are some examples of valid formulas? Well, all of the usual propositional validities are still valid. For an example of a validity involving $\Diamond$: as $\bot$ is interpreted as falsehood and is true at no node, we have $\models \neg\Diamond\bot$. Also, we have $\models \Diamond(\varphi_1 \vee \varphi_2) \leftrightarrow (\Diamond\varphi_1 \vee \Diamond\varphi_2)$ for all modal formulas $\varphi_1$ and $\varphi_2$. If a node $x$ can see some node at which either $\varphi_1$ or $\varphi_2$ is true, then $x$ can either see a node at which $\varphi_1$ is true or it can see a node at which $\varphi_2$ is true, and vice versa.

Let's now consider the basic temporal language $\{P, F\}$. Recall that $P\varphi$ is supposed to mean "$\varphi$ was true at some point in the past", and $F\varphi$ that "$\varphi$ will be true at some point in the future". The duals (boxes) are written $H$ and $G$ respectively. Now, in order to get the intended meaning, it's reasonable to only consider frames in which the binary relations $R_P$ and $R_F$ are converses of each other, i.e. we should have $R_P xy$ iff $R_F yx$. It's actually possible to express this condition modally in the basic temporal language. The frames that validate $p \to GPp$ and $p \to HFp$ are exactly the frames where $R_P$ and $R_F$ are converses of each other.

To see this, first take a moment to internalize what these formulas are saying: $p \to GPp$ is saying that if $p$ is true at some point in time, then at every future point in time after that, it will be true that at some point in the past $p$ was true. $p \to HFp$ says a similar thing in reverse. So, suppose these two formulas are valid on some frame $W$. Let $x, y \in W$ be such that $R_F xy$. We wish to show that $R_P yx$. So we are assuming that $y$ is more in the future than $x$ and we are trying to show $x$ is more in the past than $y$. Let $V$ be a valuation such that $V(p) = \{x\}$. Then $x \models p$ and so by assumption we get $x \models GPp$. Since $y$ is more in the future than $x$, $R_F xy$, we get $y \models Pp$. Since $x$ is the only node which makes $p$ true, we see that we must have $x$ more in the past than $y$, $R_P yx$. A similar argument works to show that $R_P xy$ implies $R_F yx$.

Now, suppose that $R_P$ and $R_F$ are converses of each other. We need to see that the two formulas are valid. Consider some node $x$. Let's try to show that $x \models p \to GPp$ (the other is similar). So assume $x \models p$. Now let $y$ be a node such that $R_F xy$. We need to show that $y \models Pp$. Well, as $R_F$ and $R_P$ are converses, we have $R_P yx$. Thus, as $x \models p$, we have $y \models Pp$.
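The equivalence just argued can also be checked exhaustively on very small frames, which is a useful sanity test though of course not a proof. The sketch below, in Python, enumerates every pair of relations on a two-node set and compares validity of the two temporal axioms (read as "p implies GPp" and "p implies HFp") against the converse condition; all function names are assumptions made for this example.

```python
def powerset(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for mask in range(1 << len(items)):
        yield frozenset(x for i, x in enumerate(items) if (mask >> i) & 1)


def validates_temporal_axioms(W, RP, RF):
    """True iff p -> GPp and p -> HFp hold at every node under every V(p)."""
    for Vp in powerset(W):
        # Nodes where Pp (resp. Fp) holds: they see a p-node via RP (resp. RF).
        Pp = {w for w in W if any((w, y) in RP for y in Vp)}
        Fp = {w for w in W if any((w, y) in RF for y in Vp)}
        for x in Vp:  # only nodes satisfying p can falsify an implication
            # GPp at x: every RF-successor of x satisfies Pp.
            if not all(y in Pp for y in W if (x, y) in RF):
                return False
            # HFp at x: every RP-successor of x satisfies Fp.
            if not all(y in Fp for y in W if (x, y) in RP):
                return False
    return True


def are_converses(W, RP, RF):
    """True iff RP xy holds exactly when RF yx holds."""
    return all(((x, y) in RP) == ((y, x) in RF) for x in W for y in W)


W = (0, 1)
pairs = [(x, y) for x in W for y in W]
agree = all(
    validates_temporal_axioms(W, RP, RF) == are_converses(W, RP, RF)
    for RP in powerset(pairs)
    for RF in powerset(pairs)
)
```

Two nodes keep the search tiny (256 frames); the same code works for larger `W`, at a cost that grows very quickly.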

Now consider a modal language $\{\circ\}$ where $\circ$ is a 2-ary modal operator symbol. Suppose we're interested in talking about directed arrows. The nodes of our frames will themselves be directed arrows. Now, $R_\circ$ is supposed to be a 3-ary relation on the frame. Our intended meaning is that $R_\circ xyz$ iff $x$ is the composition of $y$ and $z$; i.e., iff $y$ ends where $z$ begins and $x$ is the same as $y$ followed by $z$. Then $\varphi_1 \circ \varphi_2$ (written in the familiar infix notation) would be true of those arrows $x$ which can be decomposed into two arrows $y$ and $z$ such that $y \models \varphi_1$ and $z \models \varphi_2$. One axiom we might want to impose on the frames is:

$$[p_1 \circ (p_2 \circ p_3)] \leftrightarrow [(p_1 \circ p_2) \circ p_3]$$

3 The Van Benthem Characterization Theorem

In this section we show how modal formulas about models (frames with valuations) can be understood as first order formulas. However, not every first order formula (in the relevant language) is equivalent to a modal formula. The question is which first order formulas are equivalent to modal formulas. The answer is the ones invariant under bisimulation. This is the Van Benthem characterization theorem. We shall define bisimulation and prove the theorem.

3.1 Standard Translation

Fix some modal signature $\tau = \{\triangle_1, \triangle_2, \ldots\}$. We can convert it into a first order signature as follows: for each modal operator symbol $\triangle$ of arity $n$ we introduce an $(n+1)$-ary relation symbol $R_\triangle$. Furthermore, for each proposition letter $p$ we introduce a unary relation symbol $P$. The result is a first order signature $\{P_1, P_2, \ldots, R_1, R_2, \ldots\}$ whose models are exactly the same things as are models for the modal signature as defined above. In detail, given a modal model $M = (W, V)$ we form a first order model by defining the $R_i$ as the $R_i$ given from the frame $W$ and setting the extension of $P_i$ equal to $V(p_i)$. Likewise, given a first order model we may form a modal model using the same matching.

Now, although these two languages talk about the same things, the first order one is more expressive than the modal one. Every modal formula has an equivalent first order formula. This is the standard translation of the modal formula and we define it recursively. Let's describe the standard translation in the case of the basic modal language for the sake of clarity. We define $ST_x(\varphi)$ and $ST_y(\varphi)$, where $x$ and $y$ are two distinct variables, by induction on $\varphi$.

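1. $ST_z(\bot) = \bot$ for $z = x, y$

2. $ST_z(p) = P(z)$ for $z = x, y$

3. $ST_z(\neg\varphi) = \neg ST_z(\varphi)$ for $z = x, y$

4. $ST_z(\varphi_1 \vee \varphi_2) = ST_z(\varphi_1) \vee ST_z(\varphi_2)$ for $z = x, y$

5. $ST_x(\Diamond\varphi) = \exists y\,(Rxy \wedge ST_y(\varphi))$ and similarly with $x$ and $y$ reversed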

Here's an example of the translation at work: consider $\varphi = \Diamond(p_1 \vee \Diamond p_2)$. Its standard translation is: $ST_x(\varphi) = [\exists y\,(Rxy \wedge (P_1(y) \vee \exists x\,(Ryx \wedge P_2(x))))]$. Note that we've reused the variable $x$, but that's ok.
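The recursive clauses of the standard translation are easy to mechanize. The following Python sketch produces the translation as a plain string for the basic modal language, alternating between the two variables x and y as in the clauses above. The tuple encoding of formulas, the function name `st`, and the ASCII output syntax (`~`, `|`, `&`, `exists`) are assumptions chosen for this illustration.

```python
def st(phi, var="x"):
    """Standard translation of phi with free variable var ("x" or "y").

    Formulas are nested tuples (an encoding assumed for this sketch):
    ("p", i), ("bot",), ("not", f), ("or", f, g), ("dia", f).
    """
    other = "y" if var == "x" else "x"
    tag = phi[0]
    if tag == "bot":
        return "False"
    if tag == "p":
        return f"P{phi[1]}({var})"
    if tag == "not":
        return f"~{st(phi[1], var)}"
    if tag == "or":
        return f"({st(phi[1], var)} | {st(phi[2], var)})"
    if tag == "dia":
        # Quantify over the other variable and translate the body there.
        return f"exists {other} (R({var},{other}) & {st(phi[1], other)})"
    raise ValueError(f"unknown connective: {tag!r}")


# Diamond applied to (p1 or diamond p2).
phi = ("dia", ("or", ("p", 1), ("dia", ("p", 2))))
translated = st(phi)
```

Here `translated` is the string `exists y (R(x,y) & (P1(y) | exists x (R(y,x) & P2(x))))`, which reuses the variable x under the inner quantifier, so two variables suffice for any formula of the basic language.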


To see that we've gotten this translation right, one may prove by induction that the translations are equivalent to the originals. I.e., if $M$ is a model (both a modal and a first order model), then for any $w \in M$ and any modal formula $\varphi$ we have

$$M, w \models \varphi \iff M \models ST_x(\varphi)[w]$$

3.2 Bisimulation

So every modal formula is equivalent to a first order formula, but is the reverse true? No. For example, we have a first order formula asserting there's exactly one element in the model. There is no corresponding modal formula. How can we be sure? Well, let's consider the concept of bisimulation. We'll limit our discussion to the basic modal language.

Let $(M, w)$ be a model with a node selected, and similarly $(M', w')$. Now, a bisimulation $Z$ between these two pointed models is a relation $Z \subseteq M \times M'$ such that

1. $(w, w') \in Z$

2. $(x, x') \in Z$ implies that $x$ and $x'$ agree on atomic facts, i.e. $x \in V(p) \iff x' \in V'(p)$ for each proposition letter $p$

3. If $(x, x') \in Z$ and $y$ is some node such that $Rxy$, then there is some node $y'$ such that $R'x'y'$ and $(y, y') \in Z$

4. If $(x, x') \in Z$ and $y'$ is some node such that $R'x'y'$, then there is some node $y$ such that $Rxy$ and $(y, y') \in Z$

We might say something like "$w$ and $w'$ are modally back-and-forth equivalent", or that they're bisimilar.

Now, if two nodes are bisimilar, then they have to in fact agree on all modal formulas. We can prove this by induction. We show that if $Z$ is a bisimulation then for all $(x, x') \in Z$, $x$ and $x'$ agree on every modal formula $\varphi$, by induction on $\varphi$. The atomic case is obvious, as are the boolean steps. So suppose $x \models \Diamond\varphi$. We wish to show that $x' \models \Diamond\varphi$. As $x \models \Diamond\varphi$, introduce $y$ such that $Rxy$ and $y \models \varphi$. Then by clause 3 in the definition of bisimulation we get a $y'$ such that $R'x'y'$ and $(y, y') \in Z$. By the inductive hypothesis, we get that $y' \models \varphi$ and so $x' \models \Diamond\varphi$. The other direction is similar.

Getting back to our example of showing that there are first order formulas that aren't modal, recall we were trying to show that there's no modal formula which is true of a node just in case it's in a model with exactly one element. We can imagine a model with just one element not connected to itself and then another model with two elements also with an empty accessibility relation (and with the same atomic facts at every node). We can pick one of the nodes in the two element model and see that there's a bisimulation between it and the node in the one element model: locally there's no way to tell these two nodes apart. Since bisimilar nodes agree on all modal formulas, no modal formula can hold in the one element model and fail in the two element model.
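For finite models, bisimilarity can be decided by starting from all pairs of nodes that agree on atomic facts and repeatedly discarding pairs that violate the forth or back condition. The Python sketch below follows that idea; the representation of a model as a triple `(W, R, V)`, the function name `largest_bisimulation`, and the sample models are assumptions made for this illustration.

```python
def largest_bisimulation(M1, M2):
    """Largest relation satisfying the atomic, forth and back clauses.

    Each model is a triple (W, R, V): a finite set of nodes, a set of
    pairs, and a dict from proposition names to sets of nodes. Two nodes
    are bisimilar iff the returned relation contains the pair.
    """
    W1, R1, V1 = M1
    W2, R2, V2 = M2
    props = set(V1) | set(V2)

    def atoms(V, node):
        return frozenset(p for p in props if node in V.get(p, ()))

    # Start from every pair agreeing on atomic facts.
    Z = {(x, y) for x in W1 for y in W2 if atoms(V1, x) == atoms(V2, y)}
    changed = True
    while changed:
        changed = False
        for (x, y) in list(Z):
            # Forth: every successor of x is matched by a successor of y.
            forth = all(
                any((y, b) in R2 and (a, b) in Z for b in W2)
                for a in W1 if (x, a) in R1
            )
            # Back: every successor of y is matched by a successor of x.
            back = all(
                any((x, a) in R1 and (a, b) in Z for a in W1)
                for b in W2 if (y, b) in R2
            )
            if not (forth and back):
                Z.discard((x, y))
                changed = True
    return Z


# One isolated node versus two isolated nodes, no atomic facts anywhere.
one = ({0}, set(), {})
two = ({"a", "b"}, set(), {})
Z_isolated = largest_bisimulation(one, two)

# An isolated node versus a single reflexive node: not bisimilar.
loop = ({0}, {(0, 0)}, {})
Z_loop = largest_bisimulation(one, loop)
```

`Z_isolated` relates the single node to both nodes of the two element model, matching the argument above, while `Z_loop` is empty because the reflexive node has a successor and the isolated node has none.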

Now, it is true that if two nodes $w$ and $w'$ are bisimilar, then they agree about all modal formulas, i.e. they are modally equivalent, but the reverse is not in general true: it is not true that modal equivalence implies bisimilarity. However, under appropriate assumptions it does. We say that a model is M-saturated if every node $w$ has the following property: if $\Sigma$ is some collection of modal formulas which is finitely satisfiable among children of $w$, then there is some child of $w$ which satisfies all of $\Sigma$. I.e., if for every finite subset $\Sigma_0$ of $\Sigma$ there is some $y$ such that $Rwy$ and $y \models \Sigma_0$, then there is some $y$ such that $Rwy$ and $y \models \Sigma$. When the models under consideration are M-saturated, modal equivalence does imply bisimilarity.

Suppose $w \in M$ and $w' \in M'$ are modally equivalent nodes from two M-saturated models $M$ and $M'$. Let

$$Z := \{(x, x') \mid x \in M,\ x' \in M' \text{ and } x \text{ and } x' \text{ are modally equivalent}\}$$

We claim that $Z$ is a bisimulation of $w$ and $w'$. The only thing that really needs to be checked is the back and forth conditions. Suppose $(x, x') \in Z$ and $Rxy$. We need to find a $y'$ such that $R'x'y'$ and $y$ and $y'$ are modally equivalent. Well, let $\Sigma$ be the complete modal type of $y$, i.e. the collection of all modal formulas that $y$ thinks are true. $\Sigma$ is finitely satisfiable among children of $x'$ since $x$ and $x'$ are modally equivalent. In detail, if $\Sigma_0$ is a finite subset of $\Sigma$, then we note that $x \models \Diamond\bigwedge\Sigma_0$ and so $x' \models \Diamond\bigwedge\Sigma_0$. Since $M'$ is M-saturated, we get our desired child $y'$ of $x'$ with $y' \models \Sigma$. As $\Sigma$ contains each modal formula or its negation, $y$ and $y'$ are modally equivalent.

This property of being M-saturated is relevant to our theorem below, as it holds in any $\omega$-saturated model. Recall an $\omega$-saturated model is one in which every consistent type with only finitely many parameters is realized. One such type is $q(x) := \{Rwx\} \cup \{ST_x(\varphi) \mid \varphi \in \Sigma\}$. This type has one parameter ($w$) and it is consistent if $\Sigma$ is finitely satisfiable among the children of $w$.

3.3 Van Benthem Characterization Theorem

A first order formula $\alpha(x)$ is said to be invariant under bisimulations if $w$ and $w'$ being bisimilar implies that $\alpha(w) \leftrightarrow \alpha(w')$. I.e., $\alpha$ can't tell the difference between two bisimilar nodes. We just saw that every formula which is equivalent to a modal formula must be invariant under bisimulations. It turns out that this is the only obstacle to a first order formula being equivalent to a modal formula.

Theorem 1 (Van Benthem Characterization). Let $\alpha(x)$ be a first order formula (in a translated modal signature). Then $\alpha$ is equivalent to a modal formula iff $\alpha$ is invariant under bisimulations.

Proof. One direction of this we already saw. So, let $\alpha(x)$ be invariant under bisimulations. We want to find some modal formula that is equivalent to $\alpha$. Let $MC$ denote all the modal consequences of $\alpha$. That is, $MC := \{ST_x(\varphi) \mid \alpha(x) \models ST_x(\varphi)\}$. If we can show that $MC \models \alpha$, then we're done. Because, if so, then by the compactness theorem for first order logic there is some finite subset $T$ of $MC$ such that $T \models \alpha$ and so $\bigwedge T$ is equivalent to $\alpha$. $\bigwedge T$ is in turn of course equivalent to a modal formula.

So we would like to show $MC \models \alpha$. So let $M$ be some model and $w \in M$ such that $M \models MC[w]$. We show $M \models \alpha[w]$. Let $C(x) := \{ST_x(\varphi) \mid M, w \models \varphi\}$. I.e., $C$ is the complete modal type of $w$. Note that $C(x) \cup \{\alpha(x)\}$ is consistent: otherwise there would be some finite subset $C_0$ of $C$ such that $\alpha \models \neg\bigwedge C_0$ by the compactness theorem for first order logic. Then $\neg\bigwedge C_0$ would be in $MC$, but this contradicts the fact that $M \models MC[w]$, since $M \models C_0[w]$.

So, as $C(x) \cup \{\alpha(x)\}$ is consistent, we may introduce a model $N$ with a node $v \in N$ such that $N \models C[v]$ and $N \models \alpha[v]$. In fact, using methods from model theory (such as an elementary chain argument, or by taking a suitable ultrapower), we may ensure that such an $N$ is $\omega$-saturated. Next, we may introduce an $\omega$-saturated elementary extension $M^*$ of $M$. Since $w$ in $M^*$ and $v$ in $N$ are modally equivalent, and we're dealing with $\omega$-saturated models, it follows that $w$ and $v$ must be bisimilar. From the assumption that $\alpha$ is invariant under bisimulations, we get that $M^* \models \alpha[w]$, whence $M \models \alpha[w]$ as desired.

Once again, the point of this theorem was to determine a theoretical distinction between the first order formulas which are equivalent to a modal formula, and the first order formulas which are inherently non-modal.

4 Definability of Classes of Frames

We continue in this section trying to understand the expressivity of modal language, but from a slightly different perspective. Here we give ourselves a class of frames and ask whether it's possible to give a collection of modal formulas which yields exactly this class. But wait! What does it mean for a modal formula to talk about a frame? Didn't we define things so that we could evaluate the truth/falsity of a modal formula at a model (a frame with a valuation) and a specified node? How could a modal formula say something about a frame with no valuation given? There's actually a standard way to do this, as we defined in a previous section. A frame $W \models \varphi$ iff for every valuation $V$ on $W$ and for every $w \in W$, $(W, V), w \models \varphi$. Another way to think about this definition is in terms of second order logic. We take it that there is a second order universal quantifier ($\forall P$) at the beginning of the formula for each proposition letter $p$ that occurs in the formula. E.g., $p \to \Diamond p$, when considered as a modal formula describing a frame $W$, actually means "for every possible valuation of $p$, the model with this valuation on $W$ makes $p \to \Diamond p$ true everywhere". This is second order since $V(p)$ is a subset of $W$.

4.1 Modal versus Second Order versus First Order

4.1.1 The Basic Layout

Figure 1 shows how modal formulas, first order formulas, and second order formulas are related in the context of describing frames. Every first order formula is of course also a second order formula, so the first order definable classes of frames (the elementary classes) are contained within the second order definable classes. The modal classes are also all second order classes. This is obvious from how we defined how the modal formula talks about a frame: it's the usual first order standard translation, except that we add a second order quantifier at the beginning for each proposition letter $p$ that occurs in the formula.


Figure 1: Classes of Frames

The more interesting part of the picture is the relationship between the elementary classes and the modal classes. It's not a simple relationship as they intersect non-trivially. There are a number of theorems that get at understanding this picture better, but we'll be focusing on just one: the Goldblatt-Thomason theorem.

The Goldblatt-Thomason theorem answers the question of which elementary classes are also modal classes. It says that if $K$ is an elementary class of frames, then $K$ is modally definable iff $K$ is closed under taking bounded morphic images, generated subframes, disjoint unions, and reflects ultrafilter extensions. We'll define all these notions shortly, but we should already note something about the general character of the theorem. We're taking a linguistic notion (whether there exist some modal formulas with such and such properties) and converting it into a structural notion (whether a class of structures is closed under certain structure-building operations). We can say this helps us understand the expressivity of modal logic.

4.1.2 Some Modal Formulas With First Order Correspondents

Before we get to the Goldblatt-Thomason theorem, it will be in our interest to consider some examples. Now, the Goldblatt-Thomason theorem concerns itself with finding the modal out of the first order, but we might ask the reverse as well. Take, for example, the formula $p \to \Diamond p$. Does this modal formula correspond to a first order condition on frames? Remember we're not asking whether the modal formula corresponds to a first order condition on models; we already know the answer to this is yes, and we even have an effective way of going from the modal formula to the first order formula (the standard translation). The question we're asking here is instead whether there is a corresponding first order sentence only in the language of frames (i.e. we can use the relation symbols $R_\triangle$ corresponding to each modal operator $\triangle$, but not any relation symbols $P$ corresponding to the proposition letters $p$).

It turns out that $p \to \Diamond p$ does correspond to a first order sentence. Any sentence that asserts that the frame is reflexive works, e.g. $\forall x\,Rxx$. Let's check that this does indeed match. Suppose $W$ is a frame such that $W \models p \to \Diamond p$. We want to show that $W \models \forall x\,Rxx$. Well, let $w \in W$ be given. We show $Rww$. Let $V$ be a valuation on $W$ such that $V(p) = \{w\}$. Then, as $(W, V), w \models p$ and $W \models p \to \Diamond p$, we get $(W, V), w \models \Diamond p$. Thus there is some $w' \in W$ such that $Rww'$ and $(W, V), w' \models p$. However, we stipulated that $V(p) = \{w\}$, so $w'$ must equal $w$. Thus $Rww$.

Now we show the other direction. Suppose $W \models \forall x\,Rxx$. To show $W \models p \to \Diamond p$, let $w$ be some node in $W$ and $V$ some valuation such that $(W, V), w \models p$. We show that $(W, V), w \models \Diamond p$. Well, since $Rww$, this follows immediately.

So that's nice; we've found a modal formula (sometimes called T) that corresponds to the first order condition of reflexivity. What are some other modal formulas that correspond to a nice first order condition? Well, the same kind of proof as just above can be used to show that $\Diamond\Diamond p \to \Diamond p$ (sometimes called 4) corresponds to transitivity, $\Diamond p \to \Box\Diamond p$ (5) corresponds to right-Euclidean ($\forall x \forall y \forall z\,((Rxy \wedge Rxz) \to Ryz)$), $p \to \Box\Diamond p$ (B) corresponds to symmetry, $\Box p \to \Diamond p$ (D) corresponds to right-unboundedness ($\forall x \exists y\,Rxy$), and $\Diamond p \to \Box p$ (CD) corresponds to every element having a unique $R$-successor or none at all. Some examples of modal formulas not in the basic modal language that have first order correspondents include: (p)(p) pp and (12p 21p) 1(p 2p).
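Frame correspondences of this kind can be spot-checked by brute force on small frames. The Python sketch below enumerates every binary relation on three nodes and every valuation of $p$, and compares validity of each axiom with its first order condition. It assumes the usual diamond-based readings of the axioms (T as "p implies diamond p", 4 as "diamond diamond p implies diamond p", 5 as "diamond p implies box diamond p", B as "p implies box diamond p", D as "box p implies diamond p", CD as "diamond p implies box p"); the function and table names are made up for this example, and agreement on three-node frames is evidence, not a proof.

```python
from itertools import product


def frames(n):
    """Yield every binary relation on range(n) as a set of pairs."""
    pairs = [(x, y) for x in range(n) for y in range(n)]
    for bits in product([False, True], repeat=len(pairs)):
        yield {pr for pr, b in zip(pairs, bits) if b}


def dia(R, n, S):
    """Nodes that see some node of S."""
    return {w for w in range(n) if any((w, y) in R for y in S)}


def box(R, n, S):
    """Nodes all of whose successors lie in S."""
    return {w for w in range(n) if all(y in S for y in range(n) if (w, y) in R)}


def valid(R, n, antecedent, consequent):
    """True iff antecedent -> consequent holds everywhere for every V(p)."""
    for bits in product([False, True], repeat=n):
        P = {w for w in range(n) if bits[w]}
        if not antecedent(R, n, P) <= consequent(R, n, P):
            return False
    return True


# Each axiom as (antecedent, consequent), both mapping V(p) to a truth set.
AXIOMS = {
    "T": (lambda R, n, P: P, lambda R, n, P: dia(R, n, P)),
    "4": (lambda R, n, P: dia(R, n, dia(R, n, P)), lambda R, n, P: dia(R, n, P)),
    "5": (lambda R, n, P: dia(R, n, P), lambda R, n, P: box(R, n, dia(R, n, P))),
    "B": (lambda R, n, P: P, lambda R, n, P: box(R, n, dia(R, n, P))),
    "D": (lambda R, n, P: box(R, n, P), lambda R, n, P: dia(R, n, P)),
    "CD": (lambda R, n, P: dia(R, n, P), lambda R, n, P: box(R, n, P)),
}

# The matching first order frame conditions.
CONDITIONS = {
    "T": lambda R, n: all((x, x) in R for x in range(n)),
    "4": lambda R, n: all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2),
    "5": lambda R, n: all((y, z) in R for (x, y) in R for (x2, z) in R if x == x2),
    "B": lambda R, n: all((y, x) in R for (x, y) in R),
    "D": lambda R, n: all(any((x, y) in R for y in range(n)) for x in range(n)),
    "CD": lambda R, n: all(y == z for (x, y) in R for (x2, z) in R if x == x2),
}

N = 3
mismatches = [
    name
    for name in AXIOMS
    for R in frames(N)
    if valid(R, N, *AXIOMS[name]) != CONDITIONS[name](R, N)
]
```

An empty `mismatches` list means that, on every three-node frame, each axiom is valid exactly when its frame condition holds.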

4.1.3 Some Modal Formulas Without First Order Correspondents

The Gödel-Löb Formula. Given this long list of modal formulas that do have first order correspondents, one might wonder whether all modal formulas have them. The answer is no. Let's consider the Gödel-Löb formula $\Diamond p \to \Diamond(p \wedge \neg\Diamond p)$ (L). (Note this formula is also commonly written as the equivalent (on frames) $\Box(\Box p \to p) \to \Box p$.) I claim (L) doesn't correspond to a first order condition. In fact, the frames that validate (L) are exactly the transitive frames which have no infinite ascending chain $x_0 R x_1 R x_2 R \cdots$ (note the elements of the sequence don't have to be distinct). Given this, (L) can't correspond to a collection $\Sigma$ of first order sentences because of compactness: the collection $\Sigma \cup \{c_0 R c_1 R \cdots R c_n \mid n \in \mathbb{N}\}$, where the $c_i$ are new constant symbols, would be finitely satisfiable, as $\{0, \ldots, n\}$ with $R$ interpreted as strict inequality would work for any finite amount. But then compactness would give a frame satisfying $\Sigma$ with an infinite ascending chain.

So, let's see that (L) defines the transitive, reverse well-founded (in the sense of having no infinite ascending chains) frames. As a preliminary general comment, note that the formula $\Diamond p \to \Diamond(p \wedge \neg\Diamond p)$ could be paraphrased as saying "if $p$ is possible at some point to the right, then there is a point to the right at which $p$ happens but after which no more $p$ can be found". Let $W$ be a transitive, reverse well-founded frame. Let $w$ be a node in $W$ and $V$ a valuation such that $(W, V), w \models \Diamond p$. We show that $(W, V), w \models \Diamond(p \wedge \neg\Diamond p)$. Well, suppose to get a contradiction that $(W, V), w \models \neg\Diamond(p \wedge \neg\Diamond p)$. Then, as $(W, V), w \models \Diamond p$, we may introduce an $x_0 \in W$ such that $Rwx_0$ and $(W, V), x_0 \models p$. Then, using the fact that $w \models \neg\Diamond(p \wedge \neg\Diamond p)$, we get $x_0 \models \Diamond p$. Thus, we may introduce an $x_1 \in W$ such that $Rx_0x_1$ and $x_1 \models p$. Since the frame is transitive, we have $Rwx_1$ and we conclude as before that $x_1 \models \Diamond p$. We continue in this way and obtain an infinite ascending chain $x_0 R x_1 R x_2 \cdots$, violating reverse well-foundedness.

Figure 2: An ad hoc frame for showing (M) is not first order

Now let's see the other direction. Assume that $W$ validates (L). We show that $W$ is transitive and reverse well-founded. Let $Rxy$ and $Ryz$. We show $Rxz$. Let $V$ be a valuation such that $V(p) = \{y, z\}$. Then $x \models \Diamond p$, so by (L) we have $x \models \Diamond(p \wedge \neg\Diamond p)$. Since $y \models \Diamond p$ ($Ryz$) and $z$ is the only other node in $V(p)$, we must have $z \models p \wedge \neg\Diamond p$ and $Rxz$. Now we show that $W$ must be reverse well-founded. Suppose to get a contradiction that $x_0 R x_1 R x_2 R \cdots$ is some infinite ascending sequence. Let $V$ be some valuation such that $\{x_0, x_1, \ldots\} = V(p)$. Then $x_0 \models \Diamond p$, but $x_0 \not\models \Diamond(p \wedge \neg\Diamond p)$, contradicting (L).
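On finite frames the characterization just proved takes a simple form: a finite frame has an infinite ascending chain exactly when some node can reach itself, and in a transitive frame that happens exactly when some node is reflexive. So a finite frame should validate (L), read as "diamond p implies diamond (p and not diamond p)", iff it is transitive and irreflexive. The Python sketch below checks this on all three-node frames; the function names are assumptions made for this illustration, and the check is a sanity test rather than a proof.

```python
from itertools import product


def frames(n):
    """Yield every binary relation on range(n) as a set of pairs."""
    pairs = [(x, y) for x in range(n) for y in range(n)]
    for bits in product([False, True], repeat=len(pairs)):
        yield {pr for pr, b in zip(pairs, bits) if b}


def validates_L(R, n):
    """True iff diamond p -> diamond (p and not diamond p) is valid on the frame."""
    nodes = range(n)

    def dia(S):
        return {w for w in nodes if any((w, y) in R for y in S)}

    for bits in product([False, True], repeat=n):
        P = {w for w in nodes if bits[w]}
        dia_p = dia(P)
        # Truth set of (p and not diamond p) is P minus dia_p.
        if not dia_p <= dia(P - dia_p):
            return False
    return True


def transitive(R):
    return all((x, z) in R for (x, y) in R for (y2, z) in R if y == y2)


def irreflexive(R, n):
    return all((x, x) not in R for x in range(n))


N = 3
L_matches = all(
    validates_L(R, N) == (transitive(R) and irreflexive(R, N))
    for R in frames(N)
)
```

The infinite case is where (L) goes beyond any first order condition; the finite check only confirms the two directions of the argument on small examples.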

The McKinsey Formula. Another interesting example of a modal formula which does not have a first order correspondent is the McKinsey formula $\Box\Diamond p \to \Diamond\Box p$ (M). We show that if it did, then the Löwenheim-Skolem theorem would be violated. Consider the following frame:

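$$W := \{w\} \cup \{v_n, v_{(n,0)}, v_{(n,1)} \mid n \in \mathbb{N}\} \cup \{z_f \mid f : \mathbb{N} \to \{0, 1\}\}$$

The relation $R$ on $W$ is defined by

$$R := \{(w, v_n), (v_n, v_{(n,0)}), (v_n, v_{(n,1)}), (v_{(n,0)}, v_{(n,0)}), (v_{(n,1)}, v_{(n,1)}) \mid n \in \mathbb{N}\} \cup \{(w, z_f), (z_f, v_{(n,f(n))}) \mid n \in \mathbb{N},\ f : \mathbb{N} \to \{0, 1\}\}$$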

See Figure 2 to make sense of this.


First we note that $W \models \Box\Diamond p \to \Diamond\Box p$. This just involves checking each type of node: $w$, $v_n$, $v_{(n,i)}$, and $z_f$. We'll just check $w$ here. Suppose we're supplied with some valuation $V$ such that $(W, V), w \models \Box\Diamond p$. Then, as each $v_n$ is a successor of $w$, and $v_n$ only sees $v_{(n,0)}$ and $v_{(n,1)}$, we have that for every $n \in \mathbb{N}$ either $v_{(n,0)}$ or $v_{(n,1)}$ is in $V(p)$. Thus, we may introduce a function $f : \mathbb{N} \to \{0, 1\}$ such that for every $n \in \mathbb{N}$ we have $v_{(n,f(n))} \in V(p)$. It follows that $(W, V), z_f \models \Box p$ and so $w \models \Diamond\Box p$.

Next, we may apply the downward Löwenheim-Skolem theorem to $W$ to obtain a countable elementary submodel $W' \subseteq W$ which contains $w$, $v_n$, and $v_{(n,i)}$ for each $n \in \mathbb{N}$ and $i \in \{0, 1\}$. I.e., $W'$ is a submodel containing the points just mentioned such that for every first order formula $\varphi(x_1, \ldots, x_m)$ and elements $w_1, \ldots, w_m \in W'$ we have $W' \models \varphi(w_1, \ldots, w_m) \iff W \models \varphi(w_1, \ldots, w_m)$. Since $W$ is uncountable, we may introduce a function $f : \mathbb{N} \to \{0, 1\}$ with $z_f \notin W'$. Let $V$ be a valuation with $V(p) = \{v_{(n,f(n))} \mid n \in \mathbb{N}\}$. We claim $(W', V), w \models \Box\Diamond p$ but $(W', V), w \not\models \Diamond\Box p$. This will imply that $\Box\Diamond p \to \Diamond\Box p$ (M) can't be equivalent to a first order sentence since $W'$ doesn't satisfy (M) any more (recall an elementary submodel must satisfy all the same first order sentences as the original).

Let's first see that w ⊭ ◇□p. Well, the successors of w are the vn and the zf′ that still remain in W′. If zf′ ∈ W′ then f′ ≠ f, and so there is an n ∈ ℕ such that f′(n) ≠ f(n), from which it follows that v(n,f′(n)) ⊭ p based on our definition of V. Hence, zf′ ⊭ □p. As for the vn, simply observe that both v(n,0) and v(n,1) are successors, but only one can have p holding. Thus vn ⊭ □p.

Now we turn our attention to showing that w ⊨ □◇p. It's clear that each vn can see a node where p holds: again, either p holds at v(n,0) or v(n,1). And as long as f′ agrees with f on at least one n, we have zf′ ⊨ ◇p. However, we still need to rule out the possibility that there is a zf′ ∈ W′ such that f′ is the exact opposite of f in the sense that f′(n) = 1 − f(n) for every n. Luckily, we can rule this out using the fact that W′ is an elementary submodel of W. We're able to express in a first order way that if zg ∈ W′, and g′ is the opposite of g, then zg′ ∈ W′. How is this done? Well, note that we're able to say in a first order way (with the parameter w) of some element z that Rwz and there are at least three distinct successors of z. Thus, we're able to say of z that it's one of the zf, because these are the only successors of w that have at least three successors. Thus, a first order statement that works is one that asserts that for all z1, if z1 = zg for some g, then there exists an element z2 for which z2 = zg′ for some g′, and for all y, Rz1y implies ¬Rz2y. The last condition here ensures that g′ is the opposite of g. Since W models this first order sentence, so too must W′, and hence if f′ is the opposite of f, then zf′ ∉ W′ lest zf ∈ W′. We've completed verifying that w ⊨ □◇p.

4.1.4 Some Other Theorems

We've seen that there are some modal formulas that have first order frame correspondents (e.g. ◇◇p → ◇p), and there are some modal formulas that don't (e.g. ◇p → ◇(p ∧ ¬◇p)). Is there some way to tell in general whether a given modal formula has a first order correspondent or not? Well, effectively no. Chagrova's theorem states that it is undecidable whether an arbitrary basic modal formula has a first order correspondent, though we won't discuss this theorem further.

Although we can't get all the modal formulas with first order correspondents in an effective way, there is a large fragment that we can get effectively, the Sahlqvist fragment. This is a decidable collection of modal formulas (defined syntactically) each formula of which has a first order correspondent which can be effectively computed from the modal formula. Further, Kracht's theorem lets us go in the reverse direction: we can syntactically define a decidable collection of first order formulas which are exactly the first order correspondents of the Sahlqvist formulas, and one can go back and forth effectively. We will not consider these results further, except to mention that all of the examples of modal formulas with first order correspondents given so far have been Sahlqvist formulas (though there are other such formulas, such as M 4 and M 4).

The point of these remarks is that there is a good deal of known material, which we won't be talking about here, that helps in understanding the picture in Figure 1 better.

4.2 Some Frame-building Operations

In this section we turn our attention to the frame-building operations that are involved in the Goldblatt-Thomason theorem. Each operation is also naturally an operation that works on models as well as frames, as we'll see, by tacking on what's supposed to happen with the valuations. We will give the definition of each operation and some illustrative examples. Further, we will supply arguments explaining why closure under them is a necessary condition for a class of frames to be modally definable (regardless of whether or not this class is elementary, i.e. first order definable). This is one direction of the Goldblatt-Thomason theorem. The other direction is a slightly weakened converse that says that if a class of frames is closed under these operations, and in addition it is an elementary class, then the class is modally definable. This other direction will be proved in the next section. We will focus on the basic modal language {◇}, though all of this works in the more general case with slight modifications.

4.2.1 Generated Subframes

A frame W′ is called a subframe of a frame W, written W′ ⊆ W, if W′ is a subset of W and for all x, y ∈ W′, R^{W′}xy ⟺ R^{W}xy. W′ is called a generated subframe of W, written W′ ↣ W, if in addition we have for all x ∈ W′ and y ∈ W, if Rxy then y ∈ W′. I.e., if x ∈ W′, then W′ also contains all the W-children of x. A model (W′, V′) is called a submodel of (W, V) if W′ is a subframe of W and for all proposition letters p we have V′(p) = V(p) ∩ W′. The model (W′, V′) is called a generated submodel if in addition W′ is a generated subframe of W.

The motivation for considering subframes and submodels should be apparent: we may wish to consider structures that are smaller pieces of larger structures. But what does the "generated" part do? This ensures that truth at a node in the submodel matches truth in the larger model. We want to keep all the children around since our recursive definition of truth was so sensitive to looking at the children.


We show that if (W′, V′) is a generated submodel of (W, V) then for all modal formulas φ we have that for every w ∈ W′

(W′, V′), w ⊨ φ ⟺ (W, V), w ⊨ φ

We show this by induction on φ. The case that φ is a proposition letter is built into the definition of submodel. The boolean cases are easy to deal with. So consider the case ◇φ where our result is known to hold for φ. Then suppose (W′, V′), w ⊨ ◇φ. Then there is an x ∈ W′ such that Rwx and (W′, V′), x ⊨ φ. By the inductive hypothesis we get (W, V), x ⊨ φ and so (W, V), w ⊨ ◇φ. Now suppose (W, V), w ⊨ ◇φ. Let x ∈ W such that Rwx and (W, V), x ⊨ φ. Since W′ is a generated subframe of W, we get that x ∈ W′, so by the inductive hypothesis (W′, V′), x ⊨ φ, and so (W′, V′), w ⊨ ◇φ.
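The invariance result can be tried out on a small example. The following Python sketch is only an illustration under my own conventions (formulas as nested tuples, hypothetical helper names `truth_set` and `generated_submodel`): it builds the submodel generated by a node, compares truth sets, and also shows how a plain submodel that throws away a child can change the truth of ◇p.

```python
def truth_set(W, R, V, phi):
    """Set of nodes of the model (W, R, V) at which the formula phi holds."""
    op = phi[0]
    if op == "atom":
        return V.get(phi[1], set()) & W
    if op == "not":
        return W - truth_set(W, R, V, phi[1])
    if op == "and":
        return truth_set(W, R, V, phi[1]) & truth_set(W, R, V, phi[2])
    if op == "dia":
        X = truth_set(W, R, V, phi[1])
        return {w for (w, x) in R if x in X}
    raise ValueError(op)

def generated_submodel(W, R, V, roots):
    """Smallest submodel containing the roots and closed under R-children."""
    sub, frontier = set(roots), list(roots)
    while frontier:
        w = frontier.pop()
        for (a, b) in R:
            if a == w and b not in sub:
                sub.add(b)
                frontier.append(b)
    R_sub = {(a, b) for (a, b) in R if a in sub}  # b is in sub automatically
    V_sub = {q: X & sub for q, X in V.items()}
    return sub, R_sub, V_sub

W = {0, 1, 2, 3}
R = {(3, 0), (0, 1), (1, 2), (2, 2)}
V = {"p": {2, 3}}
p = ("atom", "p")
formulas = [p, ("dia", p), ("dia", ("dia", p)),
            ("not", ("dia", ("and", p, ("not", ("dia", p)))))]

# Generated submodel: truth at its nodes matches truth in the big model.
W1, R1, V1 = generated_submodel(W, R, V, roots={1})
invariant = all(truth_set(W1, R1, V1, phi) == truth_set(W, R, V, phi) & W1
                for phi in formulas)

# Plain submodel on {0, 1}: node 1 loses its child 2, so dia p changes at 1.
W2 = {0, 1}
R2 = {(a, b) for (a, b) in R if a in W2 and b in W2}
V2 = {q: X & W2 for q, X in V.items()}
broken = truth_set(W2, R2, V2, ("dia", p)) != truth_set(W, R, V, ("dia", p)) & W2

print(sorted(W1), invariant, broken)
```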

Using this, we get a similar result for frames, but it only goes in one direction. Suppose W′ ↣ W, i.e. W′ is a generated subframe of W. We claim that for every modal formula φ, we have W ⊨ φ ⟹ W′ ⊨ φ. Suppose W′ ⊭ φ. Then let V′ be a valuation on W′ and w ∈ W′ such that (W′, V′), w ⊭ φ. Then define a valuation V on W by V(p) := V′(p). Then (W′, V′) is a generated submodel of (W, V). Thus, (W, V), w ⊭ φ. So W ⊭ φ.

The reverse direction doesn't work, i.e. just because W′ ↣ W it doesn't necessarily follow that W′ ⊨ φ ⟹ W ⊨ φ. Here's a counterexample. Suppose W = (ℕ,



first to be of no help, since we're trying to go the wrong way, but since ⊎_{i∈I} Wi is covered by the Wi it works out. In detail, let W := ⊎_{i∈I} Wi and suppose W ⊭ φ for some modal formula φ. Then introduce a valuation V on W and a node w ∈ W such that (W, V), w ⊭ φ. Then w ∈ Wi for some i ∈ I, and we may define a valuation Vi on Wi by letting Vi(p) := V(p) ∩ Wi. It follows that (Wi, Vi) is a generated submodel of (W, V), and hence (Wi, Vi), w ⊭ φ. We've concluded showing (the contrapositive of the statement) that if Wi ⊨ φ for each i ∈ I, then W ⊨ φ too, as desired.

We see from this result that the class of finite frames is not modally definable, as it's not closed under disjoint union: we're allowing the index set I to be any size we want, including infinite. Note that the class of finite frames is closed under generated subframes, so it is necessary to consider closure under disjoint unions over and above closure under generated subframes.

4.2.3 Bounded Morphic Images

We call a function f : W1 → W2 between frames a bounded morphism if

1. For all x, x′ ∈ W1, R1xx′ ⟹ R2f(x)f(x′) (f is a homomorphism)

2. For all x ∈ W1 and all y ∈ W2, if R2f(x)y, then there exists an x′ ∈ W1 such that R1xx′ and f(x′) = y (f is surjective among children)

A function between models (W1, V1) and (W2, V2) is called a bounded morphism if in addition it satisfies x ∈ V1(p) ⟺ f(x) ∈ V2(p) for all x ∈ W1 and all proposition letters p.

Now, these conditions might seem slightly unnatural at first, but they are exactly what's needed to make an inductive proof work that for every modal formula φ, for every x ∈ W1, (W1, V1), x ⊨ φ ⟺ (W2, V2), f(x) ⊨ φ. Let's see how this proof goes to see where the assumptions come in. We do it by induction on φ.

The base case involving the proposition letters is covered exactly by the condition that x ∈ V1(p) ⟺ f(x) ∈ V2(p). The boolean cases are straightforward. Consider ◇φ. Suppose (W1, V1), x ⊨ ◇φ. We may introduce an x′ ∈ W1 such that R1xx′ and (W1, V1), x′ ⊨ φ. By the inductive hypothesis, (W2, V2), f(x′) ⊨ φ, and by the homomorphic condition on f we have R2f(x)f(x′). Thus, (W2, V2), f(x) ⊨ ◇φ. Now suppose (W2, V2), f(x) ⊨ ◇φ. Introduce a y ∈ W2 such that R2f(x)y and (W2, V2), y ⊨ φ. By the surjective among children condition, we may introduce an x′ ∈ W1 such that R1xx′ and f(x′) = y. Using the inductive hypothesis we get (W1, V1), x′ ⊨ φ and so (W1, V1), x ⊨ ◇φ as desired.

I'd like to say a few more words about these conditions defining a bounded morphism. Perhaps the more typical definition which we might expect would be a strong homomorphism which would satisfy x ∈ V1(p) ⟺ f(x) ∈ V2(p) and R1xy ⟺ R2f(x)f(y). However, this would not be enough to ensure that the above inductive proof would go through. We would need a surjective strong homomorphism for it to work. But then again, these conditions would be overkill since, as we saw, we only need surjective among children for the inductive proof to go through.


A frame W2 is called a bounded morphic image of another frame W1 if there is some surjective bounded morphism f : W1 → W2. We write W1 ↠ W2 in this case. Closure under bounded morphic images is another condition relevant to the Goldblatt-Thomason theorem. If φ is some modal formula, and W1 ⊨ φ, and W1 ↠ W2, then it follows that W2 ⊨ φ. I.e., modally definable classes of frames are closed under bounded morphic images. To see this, suppose W2 ⊭ φ and so let V2 be some valuation on W2 and y ∈ W2 such that (W2, V2), y ⊭ φ. Then define a valuation V1 on W1 by letting V1(p) := {x ∈ W1 | f(x) ∈ V2(p)}, where f : W1 → W2 is some surjective bounded morphism. Then f is actually also a bounded morphism of models. Since f is surjective, we may introduce an x ∈ W1 such that f(x) = y. By the inductive lemma above, we have (W1, V1), x ⊭ φ. So W1 ⊭ φ.

Consider the class of strongly asymmetric frames, i.e. those frames which validate ∀x∀y(Rxy → ¬Ryx). An example of such a frame is W1 = (ℕ, S), i.e., the natural numbers with the usual successor relation as the accessibility relation. An example of a frame which is not strongly asymmetric is a two element frame W2 = {e, o} where the accessibility relation is {(e, o), (o, e)}. However, there is a surjective bounded morphism from W1 to W2 which shows that the condition "strongly asymmetric" is not modally definable. It can be easily checked that the function defined by f(2n) = e and f(2n + 1) = o works. Note also that "strongly asymmetric" is closed under generated subframes and disjoint unions, so we do indeed have a new closure condition. Further, all three are still needed since surjective bounded morphisms preserve the existence of reflexive elements and finiteness.
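The two defining conditions of a bounded morphism are easy to check mechanically on finite frames. Since (ℕ, S) is infinite, the sketch below uses a finite stand-in of my own choosing, a directed 4-cycle, which is also strongly asymmetric; the parity map onto the two element frame {e, o} plays the role of f. The function names are hypothetical.

```python
def is_bounded_morphism(W1, R1, W2, R2, f):
    """Check the homomorphism and 'surjective among children' conditions for f: W1 -> W2."""
    forth = all((f[x], f[y]) in R2 for (x, y) in R1)
    back = all(any((x, x2) in R1 and f[x2] == y for x2 in W1)
               for x in W1 for (u, y) in R2 if u == f[x])
    return forth and back

def strongly_asymmetric(R):
    """Rxy implies not Ryx, for all x and y."""
    return all((y, x) not in R for (x, y) in R)

# Finite stand-in for (N, S): a directed cycle 0 -> 1 -> 2 -> 3 -> 0.
W1 = {0, 1, 2, 3}
R1 = {(i, (i + 1) % 4) for i in W1}

# The two element frame from the text.
W2 = {"e", "o"}
R2 = {("e", "o"), ("o", "e")}

# Parity map: even nodes go to e, odd nodes go to o.
f = {i: ("e" if i % 2 == 0 else "o") for i in W1}

surjective = set(f.values()) == W2
print(strongly_asymmetric(R1), strongly_asymmetric(R2),
      is_bounded_morphism(W1, R1, W2, R2, f), surjective)
```

So the 4-cycle is strongly asymmetric while its bounded morphic image is not, which is the same phenomenon as in the (ℕ, S) example.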

4.2.4 Ultrafilter Extensions

Ultrafilter extensions are an instance of a more general construction called ultrafilter frames. However, as an introduction to the concept, and because our use of ultrafilter frames will presently be limited to ultrafilter extensions, we will phrase things only for the case of ultrafilter extensions right now. Later, when we look at canonical frames for normal modal logics, we will think about the more general notion of ultrafilter frame and see how both canonical frames and ultrafilter extensions are instances of it.

Ultrafilters. To define the ultrafilter extension, we first need to know what an ultrafilter is. Let I be any set. As usual, we use the notation P(I) to denote the power set of I. A (proper) filter U on P(I) is a collection of subsets of I (i.e. U ⊆ P(I)) such that

1. ∅ ∉ U (proper)

2. If X, Y ∈ U then X ∩ Y ∈ U (closed under intersection)

3. If X ∈ U and Y ∈ P(I) and X ⊆ Y, then Y ∈ U (closed under superset)

An ultrafilter is a filter that satisfies the additional condition that for every X ∈ P(I), either X ∈ U or I \ X ∈ U (maximality). A subset C of P(I) that has the property that any finite collection X1, . . . , Xn ∈ C has non-empty intersection is called consistent. Any consistent subset C may be extended to a filter by closing off under finite intersections and supersets.
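For a finite set I the definition can simply be tested against every family of subsets. The following Python sketch (my own illustration, with hypothetical helper names) does this for a three element set and finds exactly three ultrafilters, one for each element i, namely the family of all sets containing i.

```python
from itertools import combinations

def powerset(I):
    items = sorted(I)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def is_ultrafilter(U, I):
    """Check properness, closure under intersection and superset, and maximality."""
    I = frozenset(I)
    subsets = powerset(I)
    if frozenset() in U:
        return False
    if any(X & Y not in U for X in U for Y in U):
        return False
    if any(X.issubset(Y) and Y not in U for X in U for Y in subsets):
        return False
    return all(X in U or (I - X) in U for X in subsets)

def ultrafilters(I):
    """Brute force over all families of subsets; only feasible for very small I."""
    subsets = powerset(I)
    return [frozenset(fam)
            for k in range(len(subsets) + 1)
            for fam in combinations(subsets, k)
            if is_ultrafilter(frozenset(fam), I)]

def generated_by_point(I, i):
    """The family of all subsets of I containing i."""
    return frozenset(X for X in powerset(I) if i in X)

I = {0, 1, 2}
found = set(ultrafilters(I))
print(len(found), found == {generated_by_point(I, i) for i in I})
```

The search space is doubly exponential in the size of I, so this is only meant for toy cases.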


A commonly used theorem in model theory (and elsewhere), called the ultrafilter theorem, is that every consistent subset of P(I) may be extended to an ultrafilter. It can be readily proved using the axiom of choice.

As an example of an ultrafilter, let i be any element of I. Let U := {X ⊆ I | i ∈ X}. This is called the principal ultrafilter generated by i, and we'll denote it πi. The existence of non-principal ultrafilters follows from the ultrafilter theorem: we may take the collection of all cofinite subsets of some infinite set I and observe that this collection is consistent. Any ultrafilter extending it can't be principal as it won't contain any finite sets: if it were generated by i then it would contain, e.g., {i}.

Propositions. Let W be any frame. (As usual we'll limit ourselves to the basic modal language to keep things simple.) Note that we may define an operation ◇ : P(W) → P(W) by setting

◇X := {w ∈ W | there is some x such that Rwx and x ∈ X}

I.e., ◇X consists of those w ∈ W that can see something in X. Similarly, we may define an operation □ : P(W) → P(W) by setting

□X := {w ∈ W | for all y ∈ W such that Rwy we have y ∈ X}

I.e., □X consists of those w ∈ W that can only see things in X. We may also define an operation ¬ : P(W) → P(W) by setting ¬X := W \ X, the complement. From these definitions it follows that for all X ∈ P(W) we have ◇X = ¬□¬X and □X = ¬◇¬X. Similarly, ¬¬X = X, □(X ∩ Y) = □X ∩ □Y, and X ⊆ Y ⟹ □X ⊆ □Y.
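These operations on propositions translate directly into set comprehensions. The sketch below (an illustration only, on a small frame of my own choosing) implements ◇, □ and ¬ as functions on subsets of W and checks the listed identities for every pair of subsets.

```python
from itertools import combinations

def dia(W, R, X):
    """Nodes that can see something in X."""
    return {w for w in W if any((w, x) in R for x in X)}

def box(W, R, X):
    """Nodes that can only see things in X."""
    return {w for w in W if all(y in X for y in W if (w, y) in R)}

def neg(W, X):
    """Complement relative to W."""
    return W - X

def subsets(W):
    items = sorted(W)
    return [set(c) for k in range(len(items) + 1) for c in combinations(items, k)]

W = {0, 1, 2, 3}
R = {(0, 1), (0, 2), (1, 2), (2, 2)}  # node 3 is a dead end

duality = all(dia(W, R, X) == neg(W, box(W, R, neg(W, X))) and
              box(W, R, X) == neg(W, dia(W, R, neg(W, X)))
              for X in subsets(W))
meets = all(box(W, R, X & Y) == box(W, R, X) & box(W, R, Y)
            for X in subsets(W) for Y in subsets(W))
monotone = all(box(W, R, X).issubset(box(W, R, Y))
               for X in subsets(W) for Y in subsets(W) if X.issubset(Y))
print(duality, meets, monotone)
```

Note that a dead end such as node 3 belongs to □X for every X, including the empty set, and to no ◇X.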

Looking at things this way motivates the definition of a proposition as a subset of W. We have a proposition-building operation ◇, which takes a proposition X ⊆ W and outputs another proposition ◇X ⊆ W. Similarly for □, ¬, etc. The function V : F → P(W) is just a way of associating a proposition to every formula in the modal language. An ultrafilter, in this light, can be thought of as a maximal, consistent belief state. I.e., the ultrafilter doesn't believe in falsity (∅ ∉ U), if the ultrafilter believes two propositions then it believes their conjunction, and it decides for every proposition whether to believe it or its negation.

The ultrafilter extension is a way of taking a model M = (W, V) and forming a new model ue M = (ue W, ue V), whose elements are the ultrafilters on P(W), such that ue M, U ⊨ φ iff V(φ) ∈ U. If we think of an ultrafilter as a belief state, then we can reword this as saying that φ holds at a belief state just in case the proposition defined by φ in the original model is one of the propositions believed. Let's go through the details of how this is accomplished.

Definition of the Ultrafilter Extension. Let W be a frame. We will define a new frame called the ultrafilter extension of W, written ue W. The elements of the frame are the ultrafilters on P(W). The accessibility relation R^ue is defined as follows:

R^ue U1U2 ⟺ ∀X ⊆ W [(□X ∈ U1) → (X ∈ U2)]

I.e., the ultrafilter U1 can see the ultrafilter U2 just in case, whenever □X ∈ U1, X ∈ U2 for every proposition X ∈ P(W). I.e., {X | □X ∈ U1} ⊆ U2. What does this mean in terms of our (loose) belief state analogy? Well, if you're at a belief state U1, you might think it's reasonable to move to another belief state U2 so long as all the things you thought were necessary before (□X ∈ U1) are still true (X ∈ U2).

It's possible to rephrase this definition of the accessibility relation in terms of ◇. R^ue U1U2 iff X ∈ U2 implies that ◇X ∈ U1. In our loose analogy, this is saying that if you're at some belief state U1, you can only move to another belief state U2 so long as you consider everything in U2 at least possible. Here's an argument showing that these two definitions are equivalent. Suppose {X | □X ∈ U1} ⊆ U2. We show {◇X | X ∈ U2} ⊆ U1. Let X ∈ U2. Suppose to get a contradiction that ◇X ∉ U1. Then ¬◇X ∈ U1. As we noted above, ¬◇ = □¬. I.e., □¬X ∈ U1. By assumption, we get ¬X ∈ U2, a contradiction. The other direction is similar.

The ultrafilter extension of a model M = (W, V) is a new model ue M = (ue W, ue V) such that ue W is the ultrafilter extension of W (as frames) and ue V(p) := {U | V(p) ∈ U}. Under this definition, we can show that for every modal formula φ, we have for every U ∈ ue W:

ue M, U ⊨ φ ⟺ V(φ) ∈ U

We show this by induction on φ. The base case of proposition letters is by definition. The boolean cases follow from the ultrafilter properties of U and the recursive definition of V. The ◇-case is also straightforward but slightly more tricky. First suppose U ⊨ ◇φ. We show V(◇φ) ∈ U. Well, let U′ be an ultrafilter such that R^ue UU′ and U′ ⊨ φ. By the induction hypothesis, we have V(φ) ∈ U′. Since we know {◇X | X ∈ U′} ⊆ U, we get ◇V(φ) ∈ U. By the inductive definition of V, we have ◇V(φ) = V(◇φ). So V(◇φ) ∈ U.

Now for the other direction. Suppose V(◇φ) ∈ U. We show that U ⊨ ◇φ. We need to produce an ultrafilter U′ such that R^ue UU′ and U′ ⊨ φ. To produce U′, we will use the ultrafilter theorem. We will show that the collection {X | □X ∈ U} ∪ {V(φ)} is consistent, allowing us to introduce (by the ultrafilter theorem) an ultrafilter U′ extending it. Since {X | □X ∈ U} ⊆ U′, we then get R^ue UU′, and since V(φ) ∈ U′, we get, by the induction hypothesis, U′ ⊨ φ. So it only remains to show this collection is consistent. As □(X1 ∩ ⋯ ∩ Xn) = □X1 ∩ □X2 ∩ ⋯ ∩ □Xn, {X | □X ∈ U} is closed under intersections, and so we only need to show that X ∩ V(φ) ≠ ∅ for any X ∈ P(W) such that □X ∈ U. Well, suppose that X ∩ V(φ) = ∅. Then V(φ) ⊆ ¬X. Thus ◇V(φ) ⊆ ◇¬X. Since U is closed under superset, and ◇V(φ) = V(◇φ) ∈ U, we have ◇¬X ∈ U. I.e., ¬□X ∈ U, hence □X ∉ U.

We've finished showing that the ultrafilter extension ue M of M has the property that for any modal formula φ and for any ultrafilter U ∈ ue M, we have ue M, U ⊨ φ iff the set of nodes at which φ is true in M is in U. In particular, we get ue M, πx ⊨ φ iff M, x ⊨ φ for all x ∈ M where πx denotes the principal ultrafilter generated by x. In fact, one can check that x ↦ πx is actually an embedding of frames in the sense that R^ue πx1 πx2 ⟺ Rx1x2.
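The embedding claim can be checked directly on a small frame. The Python sketch below is my own illustration with hypothetical function names: it restricts attention to principal ultrafilters, tests both the □-based and the ◇-based description of the accessibility relation of the ultrafilter extension, and confirms that the principal ultrafilter of x sees that of y exactly when Rxy. On a finite frame every ultrafilter is principal, so there the ultrafilter extension is just a copy of the frame; the construction only produces something new for infinite frames.

```python
from itertools import combinations

def powerset(W):
    items = sorted(W)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def principal(W, x):
    """Principal ultrafilter generated by x: all subsets of W containing x."""
    return frozenset(X for X in powerset(W) if x in X)

def box(W, R, X):
    return frozenset(w for w in W if all(y in X for y in W if (w, y) in R))

def dia(W, R, X):
    return frozenset(w for w in W if any((w, x) in R for x in X))

def sees_box(W, R, U1, U2):
    """U1 sees U2 iff every X with box X in U1 lies in U2."""
    return all(X in U2 for X in powerset(W) if box(W, R, X) in U1)

def sees_dia(W, R, U1, U2):
    """Equivalent description: every X in U2 has dia X in U1."""
    return all(dia(W, R, X) in U1 for X in U2)

W = {0, 1, 2}
R = {(0, 1), (1, 2), (2, 2)}

embedding = all(
    sees_box(W, R, principal(W, x), principal(W, y)) == ((x, y) in R)
    and sees_dia(W, R, principal(W, x), principal(W, y)) == ((x, y) in R)
    for x in W for y in W)
print(embedding)
```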

Modally Definable Classes Reflect Ultrafilter Extensions. In terms of the Goldblatt-Thomason theorem, the relevant fact is that modally definable classes K of frames reflect ultrafilter extensions. This means that if K is modally definable and ue W ∈ K, then W ∈ K too. Notice that we use the word reflect because the closure property here is in perhaps the opposite direction than expected. It's not the case that W ∈ K implies that ue W ∈ K, i.e. closed under taking ultrafilter extensions; instead it's that ue W ∈ K implies W ∈ K. To see that this is true, let φ be some modal formula such that W ⊭ φ. Then introduce a valuation V on W and a node w ∈ W such that (W, V), w ⊭ φ. Then it follows that (ue W, ue V), πw ⊭ φ, so ue W ⊭ φ. I.e., we've seen that ue W ⊨ φ implies W ⊨ φ.

An Example. Consider the class of frames K = {W | W ⊨ ∀x∃y(Rxy ∧ Ryy)}. I.e., the class of frames which have the property that every node has a child which can see itself. Although, as can be checked, this class is closed under generated subframes, disjoint unions, and bounded morphic images, it does not reflect ultrafilter extensions. Here is an example showing failure of this. Let W = (ℕ,



Concerning the Independence of the Four Closure Conditions. We've already seen an example showing that reflecting ultrafilter extensions is not implied by the other three conditions. Further, the example ∀x∀y(Rxy → ¬Ryx) shows that closure under bounded morphic images is not implied by the other three, as can be checked. Also, the "finite" example works similarly for closure under disjoint union. However, the example ∃xRxx doesn't work for showing closure under generated subframes is not implied by the other three, since this condition does not reflect ultrafilter extensions, as is demonstrated by the examples just above. I imagine there is some example that can be made here, but I'm still not quite sure. My best thought so far is the condition that for every x there is an infinite descending chain starting at x: i.e., there are y0, y1, y2, . . . such that ⋯ R y2 R y1 R y0 R x. This is not closed under taking generated subframes, yet is closed under taking disjoint unions and bounded morphic images. The problem is just that I'm not sure whether it reflects ultrafilter extensions. Independence is not essential to show for the purposes of the Goldblatt-Thomason theorem, but it would make things nice and clean.

4.3 Goldblatt-Thomason Theorem

In the last few subsections we've seen four different frame-building operations: generated subframes, disjoint unions, bounded morphic images, and ultrafilter extensions. We saw that if K is a modally definable class of frames, i.e. there are modal formulas Λ such that K = {W | W ⊨ Λ}, then K is closed under taking generated subframes, disjoint unions, bounded morphic images, and reflects ultrafilter extensions. This is one half of the Goldblatt-Thomason theorem. We now turn to proving the other direction.

A frame F is said to be point-generated if there is some w ∈ F such that for every x ∈ F, x is a descendant of w, i.e. there exists a finite sequence w = w0, w1, w2, . . . , wn = x such that Rwiwi+1 for each i. We say that w generates F. Given any frame F and any w ∈ F, we may always construct the smallest generated subframe Fw of F generated by w, which consists of w and all its descendants. In fact, F is a bounded morphic image of the disjoint union of all these generated subframes. I.e. ⊎_{w∈F} Fw ↠ F. The surjective bounded morphism that works here is the union of the inclusion mappings. The reason for this observation will become apparent shortly.
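Here is a small Python sketch of this observation, with names of my own invention. It computes each point-generated subframe Fw, forms their disjoint union by tagging every node of Fw with w, and checks that the union of the inclusion maps, sending the tagged node (w, x) to x, is a surjective bounded morphism onto F.

```python
def generated_by(W, R, w):
    """Nodes of the subframe generated by w: w together with all its descendants."""
    seen, stack = {w}, [w]
    while stack:
        a = stack.pop()
        for (x, y) in R:
            if x == a and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def disjoint_union_of_point_generated(W, R):
    """Disjoint union of all F_w; the pair (w, x) is the copy of x inside F_w."""
    nodes, edges = set(), set()
    for w in W:
        Fw = generated_by(W, R, w)
        nodes |= {(w, x) for x in Fw}
        edges |= {((w, x), (w, y)) for (x, y) in R if x in Fw}
    return nodes, edges

def is_surjective_bounded_morphism(W1, R1, W2, R2, f):
    forth = all((f[x], f[y]) in R2 for (x, y) in R1)
    back = all(any((x, x2) in R1 and f[x2] == y for x2 in W1)
               for x in W1 for (u, y) in R2 if u == f[x])
    onto = {f[x] for x in W1} == set(W2)
    return forth and back and onto

F = {0, 1, 2}
R = {(0, 1), (1, 0), (2, 0)}

U_nodes, U_edges = disjoint_union_of_point_generated(F, R)
inclusions = {(w, x): x for (w, x) in U_nodes}  # union of the inclusion maps
print(is_surjective_bounded_morphism(U_nodes, U_edges, F, R, inclusions))
```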

Theorem 2. Let K be an elementary class of frames (i.e. a class of frames definable by first order formulas). Then K is modally definable iff K is closed under taking generated subframes, disjoint unions, bounded morphic images, and reflects ultrafilter extensions.

Proof. We've already seen one direction of the proof, so we concentrate here on the other direction. Let K be an elementary class of frames which is closed under taking generated subframes, disjoint unions, bounded morphic images, and reflects ultrafilter extensions. We hope to find a collection Λ of modal formulas such that K = {F | F ⊨ Λ}. Well, let's define Λ := {φ | K ⊨ φ}, i.e. Λ consists of the modal formulas that every frame in K validates. By definition, we have F ∈ K implies F ⊨ Λ. Thus, we just need to show that F ⊨ Λ implies that F is in K.

First we note that it suffices to show that F ⊨ Λ implies F ∈ K for point-generated F. To see this, assume we've shown this implication for point-generated frames, and let's try to show it for frames in general. Let F be some frame, not necessarily point-generated, such that F ⊨ Λ. Then each generated subframe Fw also has Fw ⊨ Λ. By our assumption, we get that each Fw ∈ K. Then, as K is closed under disjoint unions and bounded morphic images, and ⊎_{w∈F} Fw ↠ F, we get F ∈ K too.

So, if we can show for any frame F which is point-generated that F ⊨ Λ implies F ∈ K, then we'll be done with the proof. So let F be some frame, let w ∈ F be a generator for F, and assume F ⊨ Λ.

This won't be a short proof, so first let me try to give a brief summary of how things will go. We are trying to show that F ∈ K. We will do this by showing that ue F ∈ K, and cite the assumption that K reflects ultrafilter extensions. But how will we show ue F ∈ K? We will find some frame N* ∈ K and some surjective bounded morphism f which witnesses that N* ↠ ue F, and cite the assumption that K is closed under bounded morphic images. But what are N* and f going to be? That gets a bit more complicated. We are going to make the frame F into a model in a language with lots of proposition letters. We'll call this model M. Of course the generator w is still in M. Then we'll find a model N with a point w′ ∈ N such that (N, w′) suitably matches (M, w) except that the underlying frame of N is in K. Then we'll find a certain extension of N to a larger model N* which nevertheless has very similar properties to N and is still in K. Then we'll define, with the help of all the extra proposition letters, a function f : N* → ue M. We'll show that f is a surjective bounded morphism by carefully checking each condition. Then, simply forgetting about the extra stuff yields ue F as a bounded morphic image of a frame in K. Figure 3 tries to give a quick visual summary of our method of proof.

Let Φ = {pX | X ⊆ F}. That is, Φ is a collection of proposition letters, one for each subset of F. We make F into a Φ-model in a natural way. We define the valuation V by setting V(pX) := X for each X ⊆ F. I.e., a node x ∈ F thinks pX is true iff x ∈ X. We have a model M = (F, V) which is point-generated by w ∈ M. Let Δ be the collection of modal formulas φ such that M, w ⊨ φ. Δ is of course a collection of modal formulas in the expanded language with all the proposition letters in Φ.

We claim that there is a Φ-model N and a point w′ ∈ N such that the underlying frame of N is in K, and such that N, w′ ⊨ Δ. As we're assuming that K is an elementary class, we may introduce a collection Σ of first order sentences in the frame language ({=, R}) such that K = {F | F ⊨ Σ}. Now, the modal formulas Δ may be considered as first order formulas with one free variable (say x) via the standard translation. However, these first order formulas Δ(x) are in the Φ-model language ({=, R} ∪ {PX | X ⊆ F}). Nonetheless, as the frame language is a sublanguage of the model language, the first order sentences Σ may also be considered to be sentences in this Φ-model language.

We will show that Σ ∪ Δ(x) is finitely satisfiable. This implies, by the compactness theorem for first order logic, that there is some model N ⊨ Σ ∪ Δ(x). Letting w′ be the interpretation of x, we get our desired model N and point w′ ∈ N. The underlying frame of N is in K since N ⊨ Σ. N, w′ ⊨ Δ since N ⊨ Δ(x)[w′].


Figure 3: A pictorial representation of the proof

So let's see that Σ ∪ Δ(x) is finitely satisfiable. Suppose it weren't, to get a contradiction. Then Σ ⊨ ¬⋀Δ0(x) for some finite subset Δ0(x) of Δ(x). This is equivalent to saying that, for every frame W ∈ K, we have W ⊨ ¬⋀Δ0. Since Δ0 is finite, only finitely many proposition letters occur in it, and so this ¬⋀Δ0 is equivalent to a modal formula in the usual modal language. Thus, we have ¬⋀Δ0 ∈ Λ, by the definition of Λ. However, this contradicts the fact that M, w ⊨ ⋀Δ0, since F ⊨ Λ by our assumption on F.

We've concluded showing that we may introduce a model N with w′ ∈ N such that N, w′ ⊨ Δ, the Φ-modal type of w ∈ M, and such that the underlying frame of N is in K. In fact, we may assume that N is generated by w′ because modal truth is preserved under taking generated submodels (regardless of how many proposition letters there are), and K is closed under taking generated subframes.

Now, using a fact from model theory, we may introduce an ω-saturated elementary extension N* of N. Recall that ω-saturated implies M-saturated which in turn yields that modal equivalence is a bisimulation. N* being an elementary extension of N means that for every first order formula φ(x1, . . . , xn) and any elements w1, . . . , wn ∈ N, we have N ⊨ φ(w1, . . . , wn) ⟺ N* ⊨ φ(w1, . . . , wn). In particular this holds for sentences and for formulas of one free variable. Thus, in particular, we have N* ⊨ Σ, so the underlying frame of N* is still in K, and the modal type of w′ in N* agrees with that of w′ in N. I.e. N, w′ ⊨ φ ⟺ N*, w′ ⊨ φ.

The fact that N is generated by w′ (and M by w) allows us to observe the following lemmas about every Φ-modal formula φ:

1. M ⊨ φ iff N* ⊨ φ

2. φ is satisfiable in M implies that φ is satisfiable in N*

To prove the first of these, we note the following chain of equivalences:

N* ⊨ φ ⟺ N ⊨ φ
⟺ N, w′ ⊨ □^n φ for all n ∈ ℕ
⟺ M, w ⊨ □^n φ for all n ∈ ℕ
⟺ M ⊨ φ

The second may be proven as follows:

φ is satisfiable in M ⟺ M, w ⊨ ◇^n φ for some n ∈ ℕ
⟺ N, w′ ⊨ ◇^n φ for some n ∈ ℕ
⟺ N*, w′ ⊨ ◇^n φ for some n ∈ ℕ
⟹ φ is satisfiable in N*

Now we're ready to define a mapping f : N* → ue M. We set, if s ∈ N*,

f(s) := {X ⊆ M | s ⊨ pX}

Recall our hope is that f is a surjective bounded morphism. Thus, there are a bunch of things to check about f. Here is a list of the things we need to do to keep them straight:

(a) Check that f is well-defined in the sense that f(s) is indeed an ultrafilter

(b) Show that f is homomorphic

(c) Show that f is surjective among children

(d) Show that f is surjective

Naturally, we check (a) first. Let's see why ∅ ∉ f(s) first of all. By the definition of f, we have ∅ ∈ f(s) iff s ⊨ p_∅. But we have M ⊨ ¬p_∅ so by lemma (1) above we have N* ⊨ ¬p_∅. In particular, we have s ⊨ ¬p_∅, so ∅ ∉ f(s).

Next, let's check that f(s) is closed under intersection. Let X and Y be elements of f(s) so that s ⊨ pX ∧ pY. Since M ⊨ (pX ∧ pY) → p_{X∩Y}, N* also satisfies this, and so s ⊨ p_{X∩Y}. Thus, X ∩ Y ∈ f(s) as desired.

Now we check that f(s) is closed under superset. Let X ∈ f(s) and let Y ⊆ F such that X ⊆ Y. We show that Y ∈ f(s). We know s ⊨ pX. Further, we have M ⊨ pX → pY, so N* ⊨ pX → pY and hence s ⊨ pX → pY. Thus s ⊨ pY, and Y ∈ f(s).

Finally we check maximality. Let X ⊆ F. We show either X ∈ f(s) or ¬X ∈ f(s). Well, M ⊨ pX ∨ p_{¬X}, so we have N* modelling this and hence s ⊨ pX ∨ p_{¬X}. Thus, either X ∈ f(s) or ¬X ∈ f(s).


We've finished showing (a), that f(s) is an ultrafilter for any s ∈ N*. Next we show (b) that f is homomorphic. Let s1 and s2 be elements of N* such that Rs1s2. We want to show that R^ue f(s1)f(s2). That is, we need to show that {◇X | X ∈ f(s2)} ⊆ f(s1). So let s2 ⊨ pX. We show that s1 ⊨ p_{◇X}. Well, M ⊨ ◇pX → p_{◇X}, and hence so does N* by lemma (1), so we just need s1 ⊨ ◇pX. But this follows from Rs1s2 and s2 ⊨ pX.

Now we consider (c) that f is surjective among children. It suffices to show that {(s, f(s)) | s ∈ N*} is a bisimulation. Since N* is ω-saturated it is also M-saturated. Also, ue M, being an ultrafilter extension, is automatically M-saturated. Thus, it suffices to show that f(s) = U iff s and U are modally equivalent, since in the context of M-saturated structures modal equivalence is a bisimulation. Suppose first that f(s) = U. As M ⊨ φ ↔ p_{V_M(φ)}, so too does N* ⊨ φ ↔ p_{V_M(φ)} for every Φ-modal formula φ. Thus

s ⊨ φ ⟺ s ⊨ p_{V_M(φ)}
⟺ V_M(φ) ∈ f(s)
⟺ V_M(φ) ∈ U
⟺ U ⊨ φ

Now suppose s and U are modally equivalent. Then for X ⊆ F

X ∈ f(s) ⟺ s ⊨ pX
⟺ U ⊨ pX
⟺ X = V_M(pX) ∈ U

Thus, {(s, f(s)) | s ∈ N*} is a bisimulation and so f is surjective among children. (Note that this argument also showed that s and f(s) agree on proposition letters.)

Finally we show (d) that f is surjective. Let U ⊆ P(F) be an ultrafilter. We need to find an s ∈ N* such that f(s) = U. In other words, we need to find an s such that s ⊨ pX iff X ∈ U. Well, let Γ := {pA | A ∈ U}. Note, to prevent confusion, that the pA can be considered either as modal formulas or as the equivalent first order standard translations STx(pA).

We claim that Γ is finitely satisfiable in N*. Let pA1, . . . , pAn be a finite collection of formulas in Γ. Then, as U is an ultrafilter, we know that A1 ∩ ⋯ ∩ An ≠ ∅. Thus, the formula pA1 ∧ ⋯ ∧ pAn is satisfiable in M. By lemma (2) it follows that pA1 ∧ ⋯ ∧ pAn is satisfiable in N* too.

Since Γ is finitely satisfiable in N*, and N* is ω-saturated, all of Γ is satisfied by some element s ∈ N*. We claim f(s) = U. Let A ∈ U. Then pA ∈ Γ, so s ⊨ pA, so A ∈ f(s). Now let A ∉ U. Then ¬A ∈ U, so p_{¬A} ∈ Γ, so s ⊨ p_{¬A}, so ¬A ∈ f(s), and so finally A ∉ f(s) as f(s) is an ultrafilter.

We've completed showing that f : N* → ue M is a surjective bounded morphism of models. As the underlying frame of N* is in K and K is closed under bounded morphic images, we have that ue F is in K. Finally, as K reflects ultrafilter extensions, F must be in K too.


So we've completed showing the Goldblatt-Thomason theorem. Once again, this theorem tells us which first order conditions on frames are also modal conditions on frames. It relates the expressivity of first order logic and modal logic.

The proof we gave above of the Goldblatt-Thomason theorem was model-theoretic. However, it's also possible to give an algebraic proof using Birkhoff's theorem. This theorem states that a class of algebras is equationally definable iff the class is closed under taking homomorphic images, subalgebras, and products.

5 A Modal Version of Lindström's Theorem

In the last section we focussed on modal formulas talking about frames, but now we will switch back to considering how modal formulas describe models. Remember that the truth or falsity of a modal formula is evaluated at a node of a model. Thus, the structures of interest to us really are pointed models: pairs consisting of a model and a specified node. Pointed models are the things that make modal formulas true or false.

One way to think about modal formulas is that each one separates out a class of pointed models. I.e., if $\phi$ is a modal formula, then it defines a class $\{(M, w) \mid M, w \models \phi\}$ of pointed models. Two modal formulas are considered equivalent if they define the same class of pointed models. That said, it makes sense to actually identify the modal formula with this class. I.e., we can think of a modal formula not just as a finite sequence of symbols in a certain language, but as a class of pointed models.

This kind of thinking actually works for any language and model type, e.g., first order languages, second order languages, propositional logic, etc. You take some language $L$ and have in mind associated $L$-models. Then, each $L$-sentence defines a class of $L$-models, and two such sentences are equivalent if they define the same class. So, again, we can identify the sentences with classes of models.

This motivates the definition of a sentence as a class of models. This is an abstract type of sentence that isn't tied down to any sort of syntactic structure; it only depends on what type of models you're dealing with presently. For example, we've already seen how a modal formula $\phi$ and its first order standard translation $ST_x(\phi)$ talk about the same type of model and are equivalent, even though they are in different languages. This equivalence lies in the fact that they define the same class of pointed models.

One upshot of looking at things in this light is that it allows us to compare different logics that could have vastly different syntactic structure. Further, it provides a convenient system for trying to understand what characterizes modal logic. We've seen that modal logic is inherently local. We wonder if there's some way to capture this impression by comparison with other logics in a formal way. The answer is a (tentative) yes, and for it we turn to a modal version of Lindström's theorem.


5.1 Abstract Logics

We have seen already that there are different modal languages: the basic modal language is $\{\Diamond\}$, but we also noted the basic temporal language $\{F, P\}$ and an arrow language $\{\circ\}$ which consists of a 2-ary modal operator symbol. However, it's not just the operator symbols that can vary; the proposition letters can vary too. Typically we've just fixed one countable-sized set of proposition letters $\Phi$. However, it's nice to be able to vary which proposition letters we're using. We already saw an example of this in the Goldblatt-Thomason theorem, where we introduced a potentially uncountable-sized set of proposition letters $\{p_A \mid A \subseteq F\}$. In the following we'll also have occasion to consider finite sized sets of proposition letters. These observations motivate the following definition.

A modal signature is a pair $\sigma = (\tau, \Phi)$ where $\tau = \{\triangle_1, \triangle_2, \ldots\}$ is a collection (finite or infinite) of modal operators of specified arities (in $\mathbb{N}$), and $\Phi$ is a collection (finite or infinite) of proposition letters.

Our notion of frame and model is of course relative to the signature. If $\tau$ is a collection of modal operator symbols of specified arities, then a $\tau$-frame is a set $W$ together with an $(n+1)$-ary relation $R_\triangle$ for each $\triangle \in \tau$ of arity $n$. E.g., if $\tau = \{\Diamond\}$, then a $\tau$-frame is a set $W$ with a 2-ary relation $R_\Diamond$ on it. If $\sigma = (\tau, \Phi)$ is a modal signature, then a $\sigma$-model is a pair $M = (W, V)$ where $W$ is a $\tau$-frame and $V : \Phi \to P(W)$ is a mapping from $\Phi$ to the power set of the frame. A $\sigma$-pointed-model is a pair $(M, m)$ such that $M$ is a $\sigma$-model and $m \in M$ is a node of the underlying frame. E.g., if $\sigma = (\{\Diamond\}, \{p, q\})$, then a $\sigma$-pointed-model is a set $M$ together with a 2-ary relation $R_\Diamond$ on $M$, plus two specified subsets $V(p)$ and $V(q)$ of $M$, plus a specified point $m \in M$.
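To make these definitions concrete, here is a minimal Python sketch of a pointed model for a finite signature. It is only an illustration and not part of the formal development: the class name `PointedModel`, its field names, and the operator name `"dia"` (standing for the basic modal operator) are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class PointedModel:
    arities: dict    # operator name -> arity n          (the operator part of the signature)
    letters: set     # proposition letters               (the letter part of the signature)
    worlds: set      # the underlying set W
    relations: dict  # operator name -> set of (n+1)-tuples of worlds
    valuation: dict  # proposition letter -> set of worlds
    point: object    # the designated node

    def check(self):
        """Return True if well formed, otherwise raise ValueError."""
        if self.point not in self.worlds:
            raise ValueError("designated point is not a world")
        for op, n in self.arities.items():
            for tup in self.relations.get(op, set()):
                # an operator of arity n is interpreted by an (n+1)-ary relation
                if len(tup) != n + 1 or not set(tup) <= self.worlds:
                    raise ValueError(f"bad tuple {tup!r} for operator {op!r}")
        for p in self.letters:
            if not self.valuation.get(p, set()) <= self.worlds:
                raise ValueError(f"V({p!r}) is not a subset of the worlds")
        return True

# A pointed model for the signature with one unary operator and letters p, q.
example = PointedModel(
    arities={"dia": 1},
    letters={"p", "q"},
    worlds={0, 1, 2},
    relations={"dia": {(0, 1), (1, 2)}},
    valuation={"p": {0, 2}, "q": {1}},
    point=0,
)
```

Calling `example.check()` confirms that the relation for a 1-ary operator consists of pairs, as the definition of a frame requires.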

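If $\sigma = (\tau, \Phi)$ and $\sigma' = (\tau', \Phi')$ are two modal signatures, then we say $\sigma$ is a subsignature of $\sigma'$, written $\sigma \subseteq \sigma'$, if $\tau \subseteq \tau'$ and $\Phi \subseteq \Phi'$. If $\sigma \subseteq \sigma'$, and $M$ is a $\sigma'$-pointed-model, then there is a natural way to get a corresponding $\sigma$-pointed-model $M \upharpoonright \sigma$: we just forget about the extra structure of $M$.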
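The forgetting operation just described can be sketched in a few lines of Python. This is an illustration under assumed conventions: a pointed model is represented as a plain dictionary with hypothetical keys `"worlds"`, `"relations"`, `"valuation"`, and `"point"`, and the function name `reduct` is ours.

```python
def reduct(model, sub_operators, sub_letters):
    """Forget every relation and valuation entry outside the subsignature."""
    return {
        "worlds": set(model["worlds"]),
        "relations": {op: set(r) for op, r in model["relations"].items()
                      if op in sub_operators},
        "valuation": {p: set(v) for p, v in model["valuation"].items()
                      if p in sub_letters},
        "point": model["point"],
    }

# A pointed model for the temporal signature ({F, P}, {p, q}) ...
big = {
    "worlds": {0, 1},
    "relations": {"F": {(0, 1)}, "P": {(1, 0)}},
    "valuation": {"p": {0}, "q": {1}},
    "point": 0,
}
# ... and its reduct to the subsignature ({F}, {p}).
small = reduct(big, sub_operators={"F"}, sub_letters={"p"})
```

The worlds and the designated point are untouched; only the interpretation of the dropped symbols disappears.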

Modal formulas are also signature-dependent. Let $\sigma = (\tau, \Phi)$. The $\sigma$-modal formulas are all the syntactic objects you obtain by starting with the atomic proposition letters, and then by closing off under the operators in $\tau$ and the boolean operators, say $\bot$ (0-ary), $\neg$ (1-ary), and $\vee$ (2-ary). E.g., if $\tau = \{\Diamond\}$ and $\Phi = \{p, q\}$ then examples of modal formulas include: $p$, $q$, $\bot$, $\Diamond p$, $\Diamond(q \vee \bot)$, etc.

We can associate to each $\sigma$-modal formula $\phi$ a class of $\sigma$-pointed-models in the usual way. In fact, in this section, we shall identify this syntactic formula with this class and say that $\phi$ is a class of $\sigma$-pointed-models. In this light, a bare-bones view of modal logic would be to think of it as a function from modal signatures $\sigma$ to collections of classes of $\sigma$-pointed-models. I.e., if $L_M$ denotes modal logic, then $L_M(\sigma)$ consists of all the classes of $\sigma$-pointed-models definable by some $\sigma$-modal formula. I.e., $L_M(\sigma)$ consists of all the sentences that are expressible by some $\sigma$-modal formula. Of course we do lose some information when we look at modal logic in this light, but still a large amount of information remains. E.g., $p \vee q$ and $q \vee p$ are now considered the same sentence, but we still know they're distinct from $p \wedge q$, say.
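The identification of a formula with the class of pointed models it defines can be illustrated on a finite sample. The following Python sketch makes several assumptions: it handles only the basic modal language, it represents formulas as nested tuples (a convention chosen here), and it replaces the proper class of all pointed models by the 512 pointed models on the worlds {0, 1} over the letters p and q. Agreement on this sample only illustrates the idea; it does not prove equivalence.

```python
from itertools import product

def holds(rel, val, w, f):
    """Truth of formula f at world w (basic modal language only)."""
    if isinstance(f, str):                      # proposition letter
        return w in val[f]
    if f[0] == "not":
        return not holds(rel, val, w, f[1])
    if f[0] == "or":
        return holds(rel, val, w, f[1]) or holds(rel, val, w, f[2])
    if f[0] == "and":                           # included for convenience
        return holds(rel, val, w, f[1]) and holds(rel, val, w, f[2])
    if f[0] == "dia":
        return any(holds(rel, val, y, f[1]) for (x, y) in rel if x == w)
    raise ValueError(f"unknown formula: {f!r}")

def small_pointed_models(size=2, letters=("p", "q")):
    """Enumerate every pointed model on the worlds 0..size-1, in a fixed order."""
    worlds = range(size)
    pairs = [(x, y) for x in worlds for y in worlds]
    for rel_bits in product([False, True], repeat=len(pairs)):
        rel = {pair for pair, bit in zip(pairs, rel_bits) if bit}
        for val_bits in product([False, True], repeat=size * len(letters)):
            val = {p: {w for w in worlds if val_bits[i * size + w]}
                   for i, p in enumerate(letters)}
            for w in worlds:
                yield rel, val, w

def extension(f):
    """Indices of the sampled pointed models at which f is true."""
    return {i for i, (rel, val, w) in enumerate(small_pointed_models())
            if holds(rel, val, w, f)}

same = extension(("or", "p", "q")) == extension(("or", "q", "p"))
different = extension(("or", "p", "q")) != extension(("and", "p", "q"))
```

Here `same` records that the two disjunctions pick out the same sampled models, while `different` records that the conjunction picks out another set.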

Let's reiterate our point of view. We consider a logic to be some way of assigning to every signature $\sigma$ under consideration a collection of classes of $\sigma$-models. Every class of $\sigma$-models is called a sentence. We forget all the information about a logic except which classes of models it can discern. Some logics are able to express more sentences than others. For example, second order logic can express the sentence "finite" (the class of all finite models) whereas first order logic cannot.

We will be focussing on (abstract) logics that deal with modal signatures, because we're interested in understanding modal logic better. We will want to compare modal logic (viewed as an abstract logic) with other logics, in an attempt to answer the question of how modal logic fits in with other logics one could define in the vicinity. As such, it will be useful to give a definition of a "logic" that rules out some especially quirky ones. The definition of logic given below at least ensures that sentences have a finite nature, in that they only depend on finitely many symbols of the signature. We'll also require that if a sentence is expressible in some signature then it's still expressible in any larger signature.

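Definition of a Logic. If $\sigma$ is a modal signature, then a class of $\sigma$-pointed-models is called a $\sigma$-sentence. A logic $L$ is a function from modal signatures $\sigma$ to collections of $\sigma$-sentences such that:

1. (finite sentences) For every modal signature $\sigma$, if $\phi \in L(\sigma)$, then there is a finite subsignature $\sigma'$ of $\sigma$ and a sentence $\phi' \in L(\sigma')$ such that for every $\sigma$-pointed-model $M$, $M \in \phi \iff M \upharpoonright \sigma' \in \phi'$.

2. (expansions) For every modal signature $\sigma$ and sentence $\phi \in L(\sigma)$, if $\sigma'$ is some modal signature with $\sigma \subseteq \sigma'$, then there is some sentence $\phi' \in L(\sigma')$ such that for every $\sigma'$-pointed-model $M$, $M \in \phi' \iff M \upharpoonright \sigma \in \phi$.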

One might additionally require other reasonable principles, such as containing the boolean connectives, but we will not need to do this for what follows. (The restriction could be phrased as follows: if $\phi_1, \phi_2 \in L(\sigma)$ then $\phi_1 \wedge \phi_2 \in L(\sigma)$, etc.)

The Quintessential Example. The quintessential example of the type of logic just defined is the usual modal logic, which we'll write as $L_M$. This indeed can be thought of as a function from modal signatures $\sigma$ to collections of $\sigma$-sentences, as we've already observed. In detail, $L_M(\sigma)$ is defined to consist of the classes of $\sigma$-pointed-models that are definable by some (actual, syntactic) $\sigma$-modal formula.

Comparing Two Logics. Let $L_1$ and $L_2$ be two logics. We write $L_1 \le L_2$ if for every modal signature $\sigma$ and sentence $\phi \in L_1(\sigma)$ we have $\phi \in L_2(\sigma)$. I.e., every sentence that's expressible in the first logic $L_1$ is expressible in the second logic $L_2$ as well. This means that $L_2$ can express just as much as $L_1$ and possibly more. It follows from our definition of logic that if $L_1 \le L_2$ and $L_2 \le L_1$ then $L_1 = L_2$. That is, two logics are the same if they can express the same sentences.


Our modal version of Lindström's theorem below puts two extra conditions on a logic (having a notion of finite degree and being invariant under bisimulations, both to be defined shortly) that in a weak sense characterize modal logic. What is this weak sense? Well, we'll be proving that if $L$ is any logic that satisfies these two extra constraints and is at least as expressive as the usual modal logic, then it actually is the same as the usual modal logic after all. In other words, if $L$ is some logic such that $L_M \le L$ and $L$ satisfies the two extra conditions (involving finite degree and bisimulations) then $L = L_M$. Another way to say this is that $L_M$ is a maximal logic among logics possessing these two extra properties. This result, then, suggests (tentatively) that having a notion of finite degree and being invariant under bisimulations are important properties from the point of view of understanding modal logic.

One reason to hedge one's bets and say words like "tentatively" here is that the theorem only yields modal logic as a maximal logic with these two properties. Indeed, I expect that it's not the case that it's unique in this regard. For example, the original first order Lindström theorem states that first order logic is a maximal logic (here "logic" is defined a bit differently of course) among compact, Skolem logics. Yet there are known examples of logics other than first order which possess this same property (of course they're incomparable to first order logic). So a Lindström theorem doesn't uniquely pinpoint the logic in question in general, but it does pinpoint it half-way, or from one direction so to speak. (It should be remarked that I've thought about how to adapt the first order example I know of to the modal case, but it doesn't seem to adapt well. It would be nice to think up some modal example.)

Before we get to the theorem proper, we have to define these two extra conditions, having a notion of finite degree and invariance under bisimulations. We'll also prove a few lemmas that are potentially of some independent interest, though we won't make use of them in these notes other than in the Lindström theorem.

5.2 Lemmas

To understand what we mean by having a notion of finite degree, let's first look at the case of the quintessential logic, $L_M$. We will inductively define a function $\deg$ from modal formulas (in any signature) to $\mathbb{N}$ as follows:

1. We set $\deg(p) = 0$ for each proposition letter $p$. Similarly, we set $\deg(\bot) = 0$.

2. We set $\deg(\neg\phi) = \deg(\phi)$.

3. We set $\deg(\phi_1 \vee \phi_2) = \max(\deg(\phi_1), \deg(\phi_2))$.

4. Finally, we set $\deg(\triangle\phi_1 \cdots \phi_n) = \max(\deg(\phi_1), \ldots, \deg(\phi_n)) + 1$ for each modal operator symbol $\triangle$ of arity $n$.

That is, the degree of atomic formulas is 0, and the degree goes up by one each time you introduce a new modal operator symbol; otherwise it doesn't go up. In other words, the degree is the maximum nesting depth of modal operators in the formula (the modal analogue of quantifier depth). E.g., the degree of $\neg(p \vee q)$ is 0, while the degree of $\Diamond\Diamond(p) \vee \Diamond(p \vee \bot)$ is 2.
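The inductive definition of degree translates directly into a recursive function. In this Python sketch the tuple representation of formulas and the operator names (`"not"`, `"or"`, `"and"`, `"dia"`, `"circ"`) are assumptions made for illustration; any tuple whose head is not a boolean connective is treated as a modal operator applied to its arguments. For a 0-ary modal operator the sketch returns 1, which is one possible reading of clause 4.

```python
BOOLEAN = {"not", "or", "and"}

def deg(f):
    """Modal degree of a formula given as a nested tuple."""
    if isinstance(f, str) or f == ("bot",):     # proposition letter or bottom
        return 0
    op, *args = f
    inner = max((deg(a) for a in args), default=0)
    # boolean connectives keep the degree, modal operators raise it by one
    return inner if op in BOOLEAN else inner + 1

d0 = deg(("not", ("or", "p", "q")))
d2 = deg(("or", ("dia", ("dia", "p")), ("dia", ("or", "p", ("bot",)))))
d_binary = deg(("circ", ("dia", "p"), "q"))     # a hypothetical 2-ary operator
```

The first formula contains no modal operator, the second nests two of them, and the binary operator in the third adds one level on top of its deepest argument.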


We've said that modal truth evaluation is local, and the degree function just defined is actually a measure of how local each formula is. A formula of degree 0 only depends on the properties of the current node for its evaluation. A formula of degree 1 involves looking at children of the current node in addition to looking at the current node. A formula of degree 2 involves looking at children of the children of the current node, and so on.

Going hand in hand with this definition of degree for the usual modal logic is the definition of height of a node in a pointed model, and truncating a pointed model at some specific height. Let $(M, w)$ be a pointed model. Let $x$ and $y$ be nodes of $M$. Then $x$ is said to be a child of $y$ if there exist a modal operator symbol $\triangle$ (of arity $n$) and some $i \in \{1, 2, \ldots, n\}$ and some nodes $z_1, \ldots, z_{i-1}, z_{i+1}, \ldots, z_n$ such that $R_\triangle\, y z_1 \cdots z_{i-1}\, x\, z_{i+1} \cdots z_n$. In this case $y$ is said to be a parent of $x$. The definitions of descendant and ancestor are as expected. The height of a node $x \in M$ is defined as the smallest number $n$ such that there is a sequence of nodes $w = w_0, w_1, \ldots, w_{n-1}, w_n = x$ such that for each $i \in \{0, 1, \ldots, n-1\}$, $w_{i+1}$ is a child of $w_i$. If there is no such sequence, then the height is defined to be $\infty$. For short, we may write the height of $x$ as $h(x)$.

Given any pointed model $(M, w)$ we can always truncate so that only the nodes of a certain height or less are left. We write $(M, w)\upharpoonright n$ for the submodel $\{x \in M \mid h(x) \le n\}$ of $M$. Note that as $h(w) = 0$, $w$ will never be lost by truncating in this way. Thus, $(M, w)\upharpoonright n$ can be made into a pointed model in a natural way too, by using $w$ as the designated node. However, we will abuse notation, and also use $(M, w)\upharpoonright n$ for what we might otherwise write as $((M, w)\upharpoonright n, w)$. Hopefully context will make clear which is meant.
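Height and truncation are easy to compute for finite models by a breadth-first search from the designated node. The sketch below assumes the basic modal language (a single 2-ary relation given as a set of pairs); the function names `heights` and `truncate` are ours, and nodes of infinite height are simply absent from the returned dictionary.

```python
from collections import deque

def heights(rel, w):
    """Map each node reachable from w to its height; unreachable nodes are omitted."""
    h = {w: 0}
    queue = deque([w])
    while queue:
        x = queue.popleft()
        for (a, b) in rel:
            if a == x and b not in h:           # b is a child of x seen for the first time
                h[b] = h[x] + 1
                queue.append(b)
    return h

def truncate(worlds, rel, val, w, n):
    """Restrict the model to the nodes of height at most n."""
    h = heights(rel, w)
    keep = {x for x in worlds if x in h and h[x] <= n}
    return (keep,
            {(a, b) for (a, b) in rel if a in keep and b in keep},
            {p: s & keep for p, s in val.items()})

# A chain 0 -> 1 -> 2 -> 3 plus an isolated node 4 (which has infinite height).
worlds = {0, 1, 2, 3, 4}
rel = {(0, 1), (1, 2), (2, 3)}
val = {"p": {3, 4}}
h = heights(rel, 0)
cut = truncate(worlds, rel, val, 0, 2)
```

Truncating at height 2 keeps the nodes 0, 1, 2 and drops both the node of height 3 and the unreachable node.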

The relationship between truncation at height $n$ and modal formulas of degree at most $n$ is very tight, as demonstrated by the following lemma.

Lemma 3. Let $(M, w)$ be a pointed model and $n \in \mathbb{N}$. Then for all modal formulas $\phi$ of degree no more than $n$, we have

$$(M, w) \models \phi \iff (M, w)\upharpoonright n \models \phi.$$

Proof. We actually prove by (reverse) induction on $k = n, n-1, \ldots, 1, 0$ that for all $x \in M$ with $h(x) \le k$, and for all modal formulas $\phi$ of degree no more than $n - k$,

$$M, x \models \phi \iff (M, w)\upharpoonright n,\ x \models \phi.$$

The case where $k = 0$ and $x = w$ gives us the lemma as stated. We prove it by induction on $k$, except that we start with $k = n$ and work our way down towards $k = 0$.

For the base case where $k = n$, we note that formulas of degree $n - n = 0$ cannot have modal operator symbols in them, and so their truth only depends on the valuation of the node in question. By the definition of submodel, the valuation at each node remains the same, so the submodel must agree at every node with the original model on formulas of degree 0.


Now assume we've shown the equivalence for $k \le n$ and we show it for $k - 1 \ge 0$. Let $x \in M$ with $h(x) \le k - 1$. Since the formulas of degree no more than $n - k + 1$ are boolean combinations of formulas of the form $\triangle\phi_1 \cdots \phi_l$ where $\phi_1, \ldots, \phi_l$ are formulas of degree no more than $n - k$ (and the atomic formulas), we may restrict attention to just these formulas $\triangle\phi_1 \cdots \phi_l$. Of course $M, x \models \triangle\phi_1 \cdots \phi_l$ iff there are $y_1, \ldots, y_l \in M$ such that $R_\triangle\, x y_1 \ldots y_l$ and $M, y_i \models \phi_i$ for each $i$. Now note that if there are such $y_i$, then their height is at most $k$ because $x$ has height at most $k - 1$, so they are also in $(M, w)\upharpoonright n$. Further, by inductive hypothesis, as $h(y_i) \le k$ and $\phi_i$ has degree no more than $n - k$, we have $(M, w)\upharpoonright n,\ y_i \models \phi_i$ for each $i$. Thus, $(M, w)\upharpoonright n,\ x \models \triangle\phi_1 \cdots \phi_l$. The other direction is essentially the same argument except that we don't have to argue for the $y_i$ being in $M$.
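Lemma 3 can be checked mechanically on a small example. The following Python sketch is an illustration under assumptions of our own: it uses the basic modal language only, formulas are nested tuples, and the model is a four-element chain chosen for the example. It also shows that the degree bound cannot simply be dropped: a formula of degree 3 can change truth value when the model is truncated at height 2.

```python
def holds(rel, val, w, f):
    """Truth of formula f at world w (basic modal language only)."""
    if isinstance(f, str):
        return w in val.get(f, set())
    if f[0] == "not":
        return not holds(rel, val, w, f[1])
    if f[0] == "or":
        return holds(rel, val, w, f[1]) or holds(rel, val, w, f[2])
    if f[0] == "dia":
        return any(holds(rel, val, y, f[1]) for (x, y) in rel if x == w)
    raise ValueError(f"unknown formula: {f!r}")

def deg(f):
    if isinstance(f, str):
        return 0
    if f[0] == "dia":
        return 1 + deg(f[1])
    return max(deg(g) for g in f[1:])

def truncate(rel, val, w, n):
    """Relation and valuation of the truncation at height n."""
    height = {w: 0}
    for k in range(1, n + 1):
        layer = {y for (x, y) in rel if height.get(x) == k - 1 and y not in height}
        for y in layer:
            height[y] = k
    keep = set(height)
    return ({(x, y) for (x, y) in rel if x in keep and y in keep},
            {p: s & keep for p, s in val.items()})

def agrees(rel, val, w, f, n):
    """Does f have the same truth value at w before and after truncating at n?"""
    trel, tval = truncate(rel, val, w, n)
    return holds(rel, val, w, f) == holds(trel, tval, w, f)

rel = {(0, 1), (1, 2), (2, 3)}                  # the chain 0 -> 1 -> 2 -> 3
val = {"p": {2, 3}}
dia3 = ("dia", ("dia", ("dia", "p")))
formulas = ["p", ("dia", "p"), ("not", ("dia", ("dia", "p"))),
            ("dia", ("or", "p", ("dia", "p"))), dia3]
lemma_ok = all(agrees(rel, val, 0, f, n)
               for n in range(4) for f in formulas if deg(f) <= n)
```

On this model `lemma_ok` records agreement for every listed formula whose degree is at most the truncation height, while `agrees(rel, val, 0, dia3, 2)` fails because the only witness for the degree-3 formula sits at height 3.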

Finite Degree Notion. So we've seen that there is a function $\deg$ which takes modal formulas to natural numbers such that for every modal formula $\phi$ and every pointed model $(M, w)$ we have

$$(M, w) \models \phi \iff (M, w)\upharpoonright\deg(\phi) \models \phi.$$

We can also phrase this not in terms of the syntactic modal formulas, but instead in terms of the abstract sentences, though in the case of modal logic we have to make some choices, because two syntactic modal formulas of different degrees can still yield the same sentence, i.e. class of pointed models.

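The general definition of a logic $L$ having a finite degree notion is as follows: there exists some operation $\deg$ with codomain $\mathbb{N}$ such that for every modal signature $\sigma$, for every sentence $\phi \in L(\sigma)$, and for every $\sigma$-pointed-model $(M, w)$, we have

$$(M, w) \in \phi \iff (M, w)\upharpoonright\deg(\phi) \in \phi.$$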

The usual modal logic $L_M$ is an example of a logic having a finite degree notion. We may define the degree $\deg(\phi)$ of a sentence $\phi$ to be, say, the smallest degree of a modal formula defining $\phi$. E.g., the degree of $\phi = \{(M, w) \mid (M, w) \models \Diamond\bot\}$ is 0, since this sentence is defined by the zero-degree formula $\bot$.

Invariance Under Bisimulations. The other notion that will help us partially characterize modal logic is invariance under bisimulations. Since we only defined bisimulations for the basic modal language before, and to make sure the definition is fresh in our minds, let's define it in the general case of any modal signature.

Let $\sigma = (\tau, \Phi)$ be a modal signature. Let $(M, w)$ and $(M', w')$ be two $\sigma$-pointed-models. We say $(M, w)$ and $(M', w')$ are bisimilar, written $(M, w) \mathrel{\underline{\leftrightarrow}} (M', w')$, if there is a bisimulation $Z \subseteq M \times M'$ that contains $(w, w')$. A bisimulation is a relation $Z \subseteq M \times M'$ such that

1. (atomic facts match) For all $(x, x') \in Z$ and for all $p \in \Phi$, we have $x \in V_M(p)$ iff $x' \in V_{M'}(p)$.

2. (forth) For all $(x, x') \in Z$, if $\triangle \in \tau$ is of arity $n$ and there are $y_1, \ldots, y_n \in M$ such that $R^M_\triangle\, x y_1 \cdots y_n$, then there are $y'_1, \ldots, y'_n \in M'$ such that $R^{M'}_\triangle\, x' y'_1 \cdots y'_n$ and $(y_i, y'_i) \in Z$ for each $i \in \{1, \ldots, n\}$.

3. (back) This is the same as the forth condition except going the other way. In detail, for all $(x, x') \in Z$, if $\triangle \in \tau$ is of arity $n$ and there are $y'_1, \ldots, y'_n \in M'$ such that $R^{M'}_\triangle\, x' y'_1 \cdots y'_n$, then there are $y_1, \ldots, y_n \in M$ such that $R^M_\triangle\, x y_1 \cdots y_n$ and $(y_i, y'_i) \in Z$ for each $i \in \{1, \ldots, n\}$.

A logic $L$ is said to be invariant under bisimulations if it satisfies the following property: For every modal signature $\sigma$, and for every sentence $\phi \in L(\sigma)$, if $(M, w) \in \phi$ and $(M, w) \mathrel{\underline{\leftrightarrow}} (M', w')$, then $(M', w') \in \phi$ too.

We've already seen that $L_M$, the usual modal logic, is invariant under bisimulations, because we saw that if two nodes are bisimilar, then they are also modally equivalent, i.e. satisfy the same modal formulas. (We actually only saw the proof of this for the basic modal language, but the proof is essentially the same in the more general case.)
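For finite models the three bisimulation clauses can be checked directly. The sketch below is restricted to the basic modal language (one 2-ary relation per model), which is an assumption made to keep the example short; the function names and the example models are ours.

```python
def successors(rel, x):
    return {b for (a, b) in rel if a == x}

def is_bisimulation(Z, rel1, val1, rel2, val2, letters):
    """Check the atomic, forth, and back clauses for every pair in Z."""
    for (x, x2) in Z:
        # atomic facts match
        if any((x in val1.get(p, set())) != (x2 in val2.get(p, set()))
               for p in letters):
            return False
        s1, s2 = successors(rel1, x), successors(rel2, x2)
        # forth: every child of x is matched by some child of x2
        if not all(any((y, y2) in Z for y2 in s2) for y in s1):
            return False
        # back: every child of x2 is matched by some child of x
        if not all(any((y, y2) in Z for y in s1) for y2 in s2):
            return False
    return True

# A single reflexive node versus a two-element cycle, with p true everywhere.
loop_rel, loop_val = {("a", "a")}, {"p": {"a"}}
cycle_rel, cycle_val = {("b0", "b1"), ("b1", "b0")}, {"p": {"b0", "b1"}}
Z = {("a", "b0"), ("a", "b1")}
bisimilar = is_bisimulation(Z, loop_rel, loop_val, cycle_rel, cycle_val, {"p"})

# A dead end cannot be matched with the reflexive node: the forth clause fails.
dead_rel, dead_val = set(), {"p": {"c"}}
not_bisim = is_bisimulation({("a", "c")}, loop_rel, loop_val, dead_rel, dead_val, {"p"})
```

The first pair of models shows that bisimilar pointed models need not have the same size, which is exactly the kind of difference a bisimulation-invariant logic cannot see.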

n-complete formulas. Before we proceed to the modal Lindström theorem, we have a few more concepts and lemmas to introduce. The first order of business is to see how in the usual modal logic, if one is dealing with a finite signature (finitely many modal operator symbols and finitely many proposition letters), then single formulas become powerful at describing a pointed model in full detail up to any desired depth. These are called $n$-complete formulas, which we'll introduce after the following necessary lemma.

Lemma 4. Let $\sigma = (\tau, \Phi)$ be a finite modal signature. Let $n \in \mathbb{N}$. There are finitely many formulas of degree at most $n$ up to logical equivalence.

Proof. We prove this by induction on $n$. Consider the case $n = 0$. Here we just have the boolean combinations of the proposition letters. If there are $k$ proposition letters, then there are $2^{2^k}$ distinct boolean combinations, up to equivalence. (Each atomic fact $p \in \Phi$ can be true or false, so a complete state description is a subset of $\Phi$. There are $2^k$ such complete state descriptions. However, the boolean combinations express each possible combination of such complete states. E.g., $p \vee q$ can be identified with the set of complete states that contain $p$ or $q$. Since there are $2^k$ complete states, there are $2^{2^k}$ sets of complete states, and so this is how many boolean combinations there are.) Anyways, $2^{2^k}$ is finite.

Now suppose there are finitely many formulas of degree no more than $n$ up to equivalence, and we'll try to show the same is true for $n + 1$. Now, every formula of degree $n + 1$ is a boolean combination of formulas of degree no more than $n$ and formulas of the form $\triangle\phi_1 \cdots \phi_m$, where $\triangle \in \tau$ has arity $m$ and the $\phi_i$ have degree no more than $n$. Thus, it suffices to show that there are only finitely many formulas of the form $\triangle\phi_1 \cdots \phi_m$ where the $\phi_i$ have degree no more than $n$. We only need to show this for a particular $\triangle$, since $\tau$ is finite. We note that, by our inductive definition of truth, if $\phi_i$ is equivalent to $\phi'_i$ for each $i$, then $\triangle\phi_1 \cdots \phi_m$ is equivalent to $\triangle\phi'_1 \cdots \phi'_m$. If $k$ is the number of formulas of degree at most $n$ up to equivalence, then it follows that there are at most $k^m$ formulas of the form $\triangle\phi_1 \cdots \phi_m$ where each $\phi_i$ has degree at most $n$, up to equivalence. Finally, just as in the base case, finitely many formulas have only finitely many boolean combinations up to equivalence.
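The count $2^{2^k}$ in the base case can be confirmed by brute force for small $k$. In this Python sketch a degree-0 formula is represented by its meaning, namely the set of complete states at which it is true (this representation and the function name are choices made for the example). Starting from the meanings of the proposition letters and of $\bot$, we close off under complement (negation) and union (disjunction) and count what we get.

```python
from itertools import combinations

def count_degree_zero_sentences(letters):
    """Number of boolean combinations of the letters, up to equivalence."""
    # a complete state is the set of letters that are true in it
    states = [frozenset(c) for r in range(len(letters) + 1)
              for c in combinations(letters, r)]
    everything = frozenset(states)
    meanings = {frozenset(s for s in states if p in s) for p in letters}
    meanings.add(frozenset())                   # the meaning of bottom
    while True:
        new = ({everything - m for m in meanings}               # negations
               | {a | b for a in meanings for b in meanings})   # disjunctions
        if new <= meanings:                     # nothing new: closure reached
            return len(meanings)
        meanings |= new

counts = [count_degree_zero_sentences(["p", "q", "r"][:k]) for k in range(4)]
```

For $k = 0, 1, 2, 3$ the list `counts` can be compared against $2^{2^k}$, i.e. 2, 4, 16, 256.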


Lemma 5. Let $\sigma$ be a finite modal signature. Let $n \in \mathbb{N}$. There is a finite collection of satisfiable modal formulas $\phi_1, \ldots, \phi_m$, each of degree at most $n$, that are pairwise contradictory, collectively exhaustive, and each decides the truth of all formulas of degree $n$ or less. I.e.,

1. (pairwise contradictory) For each $i, j \in \{1, \ldots, m\}$ with $i \neq j$, the formula $\neg(\phi_i \wedge \phi_j)$ is valid.

2. (collectively exhaustive) The formula $\bigvee_{i \in \{1, \ldots, m\}} \phi_i$ is valid.

3. (complete) If $\psi$ is a $\sigma$-modal formula of degree at most $n$, then for each $i \in \{1, \ldots, m\}$, either $\phi_i \to \psi$ or $\phi_i \to \neg\psi$ is valid.

These formulas are called the $n$-complete $\sigma$-modal formulas, and are unique up to equivalence.

Proof. By the previous lemma, we may list all finitely many formulas of degree at most $n$ as $\{\psi_i \mid i \in K\}$ where $K$ is of course some finite set. For each subset $J \subseteq K$, we may introduce a formula of degree at most $n$

$$\phi_J := \Big(\bigwedge_{i \in J} \psi_i\Big) \wedge \Big(\bigwedge_{i \in K \setminus J} \neg\psi_i\Big).$$

Since there are finitely many subsets of $K$, there are finitely many such formulas $\phi_J$. Now, some of the $\phi_J$ may be unsatisfi
