

A I  L

Madhavan Mukund S P Suresh


Chennai Mathematical Institute Chennai Mathematical Institute
E-mail: [email protected] E-mail: [email protected]
Abstract

These are lecture notes for an introductory course on logic aimed at graduate students
in Computer Science. The notes cover techniques and results from propositional logic,
modal logic, propositional dynamic logic and first-order logic. The notes are based on
a course taught to first year PhD students at SPIC Mathematical Institute, Madras,
during August–December.
Contents

1 Propositional Logic
  1.1 Syntax
  1.2 Semantics
  1.3 Axiomatisations
  1.4 Maximal Consistent Sets and Completeness
  1.5 Compactness and Strong Completeness

2 Modal Logic
  2.1 Syntax
  2.2 Semantics
  2.3 Correspondence Theory
  2.4 Axiomatising valid formulas
  2.5 Bisimulations and expressiveness
  2.6 Decidability: Filtrations and the finite model property
  2.7 Labelled transition systems and multi-modal logic

3 Dynamic Logic
  3.1 Syntax
  3.2 Semantics
  3.3 Axiomatising valid formulas

4 First-Order Logic
  4.1 Syntax
  4.2 Semantics
  4.3 Formalisations in first-order logic
  4.4 Satisfiability: Henkin's reduction to propositional logic
  4.5 Compactness and the Löwenheim-Skolem Theorem
  4.6 A Complete Axiomatisation
  4.7 Variants of the Löwenheim-Skolem Theorem
  4.8 Elementary Classes
  4.9 Elementarily Equivalent Structures
  4.10 An Algebraic Characterisation of Elementary Equivalence
  4.11 Decidability
Chapter 1

Propositional Logic

1.1 Syntax
We begin with a countably infinite set of atomic propositions P = { p0 , p1 , . . .} and
two logical connectives ¬ (read as not) and ∨ (read as or).
The set Φ of formulas of propositional logic is the smallest set satisfying the following
conditions:

• Every atomic proposition p is a member of Φ.

• If α is a member of Φ, so is (¬α).

• If α and β are members of Φ, so is (α ∨ β).

We shall normally omit parentheses unless we need to explicitly clarify the structure of
a formula. We follow the convention that ¬ binds more tightly than ∨. For instance,
¬α ∨ β stands for ((¬α) ∨ β).

Exercise .. Show that Φ is a countably infinite set. ⊣

The fact that Φ is the smallest set satisfying this inductive definition provides us
with the principle of structural induction.

Structural induction principle Let S be a set such that:

• Every atomic proposition p is a member of S.

• If α is a member of S, so is (¬α).

• If α and β are members of S, so is (α ∨ β).

Then, Φ ⊆ S.

1.2 Semantics
To assign meaning to formulas, we begin by assigning meaning to the atomic proposi-
tions. Let ⊤ denote the truth value true and ⊥ the truth value false.

• A valuation v is a function v : P → {⊤, ⊥}.

We can also think of a valuation as a subset of P —if v : P → {⊤, ⊥}, then v can be
viewed as the subset { p | v( p) = ⊤} of P . Thus, the set of all valuations is 2^P , the set of all subsets
of P .
We extend each valuation v : P → {⊤, ⊥} to a map v̂ : Φ → {⊤, ⊥} as follows:

• For p ∈ P , v̂( p) = v( p).

• For α of the form ¬β, v̂(α) = ⊤ if v̂(β) = ⊥, and v̂(α) = ⊥ otherwise.

• For α of the form β ∨ γ , v̂(α) = ⊥ if v̂(β) = v̂(γ ) = ⊥, and v̂(α) = ⊤ otherwise.

The principle of structural induction can be used to formally argue that v̂ is well-defined
(that is, v̂ is indeed a function and is defined for all formulas).
Just as v can be defined as a subset of P , v̂ can be defined as a subset of Φ—namely,
v̂ = {α | v̂(α) = ⊤}.

Exercise .. We saw that every subset of P defines a valuation v. Does every subset
of Φ define an extended valuation Vb ? ⊣

Since every valuation v gives rise to a unique extension v̂, we shall always denote
v̂ as just v.

Derived connectives It will be convenient to introduce some additional connectives


when discussing propositional logic.

α ∧ β  =def  ¬(¬α ∨ ¬β)
α ⊃ β  =def  ¬α ∨ β
α ≡ β  =def  (α ⊃ β) ∧ (β ⊃ α)

The connective ∧ is read as and, ⊃ as implies and ≡ as if and only if.
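Continuing the illustrative Python representation above (again an aside, not part of the notes), the derived connectives are literally abbreviations: functions that build formulas using only ¬ and ∨.

def Not(a):        return ('not', a)
def Or(a, b):      return ('or', a, b)
def And(a, b):     return Not(Or(Not(a), Not(b)))              # α ∧ β =def ¬(¬α ∨ ¬β)
def Implies(a, b): return Or(Not(a), b)                        # α ⊃ β =def ¬α ∨ β
def Iff(a, b):     return And(Implies(a, b), Implies(b, a))    # α ≡ β =def (α ⊃ β) ∧ (β ⊃ α)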

Exercise .. Express v(α∧β), v(α ⊃ β) and v(α ≡ β) in terms of v(α) and v(β).

Exercise .. According to the Pigeonhole Principle, if we try to place n+1 pigeons
in n pigeonholes, then at least one pigeonhole must have two or more pigeons. For
i ∈ {1, 2, . . . , n+1} and j ∈ {1, 2, . . . , n}, let the atomic proposition pij denote that
the i-th pigeon is placed in the j-th pigeonhole. Write down a formula expressing
the Pigeonhole Principle. What is the length of your formula as a function of n? ⊣

Satisfiability and validity A formula α is said to be satisfiable if there is a valuation v


such that v(α) = ⊤. We write v ⊨ α to indicate that v(α) = ⊤.
The formula α is said to be valid if v ⊨ α for every valuation v. We write ⊨ α
to indicate that α is valid. We also refer to valid formulas of propositional logic as
tautologies.

Example  Let p be an atomic proposition. The formula p is satisfiable. The formula
p ∨ ¬ p is valid. The formula p ∧ ¬ p is not satisfiable.

The following observation connects the notions of satisfiability and validity.

Proposition .. Let α be a formula. α is valid iff ¬α is not satisfiable.

In applications of logic to computer science, a central concern is to develop algo-


rithms to check for satisfiability and validity of formulas. The preceding remark shows
that the two notions are dual: an algorithm which tests validity of formulas can be
converted into one for testing satisfiability and vice versa.

In principle, testing the validity of a formula α involves checking its truth value
across an uncountable number of valuations. However, it is sufficient to look at the
effect of valuations on the atomic propositions mentioned in α.
Let us define Voc(α), the vocabulary of α, as follows:

• For p ∈ P , Voc( p) = { p}.

• If α = ¬β, then Voc(α) = Voc(β).

• If α = β ∨ γ , then Voc(α) = Voc(β) ∪ Voc(γ ).

Proposition .. Let α be a formula and v1 , v2 be valuations. If v1 and v2 agree on


Voc(α) then v1 (α) = v2 (α).

This justifies the familiar algorithm for testing validity: build a truth-table for the
propositions mentioned in α and check if all rows yield the value ⊤.
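The truth-table algorithm is easy to make precise. The following Python sketch (an illustration in the same assumed tuple representation as the earlier sketches, not part of the notes) computes Voc(α) and then checks all 2^|Voc(α)| valuations over it, which suffices by the preceding proposition.

from itertools import product

def evaluate(v, phi):                        # v̂(phi), as in the earlier sketch
    if isinstance(phi, str):
        return phi in v
    if phi[0] == 'not':
        return not evaluate(v, phi[1])
    return evaluate(v, phi[1]) or evaluate(v, phi[2])

def voc(phi):
    """Voc(phi): the set of atomic propositions mentioned in phi."""
    if isinstance(phi, str):
        return {phi}
    return set().union(*(voc(sub) for sub in phi[1:]))

def is_valid(phi):
    """phi is valid iff every valuation over Voc(phi) makes it true."""
    atoms = sorted(voc(phi))
    return all(evaluate({p for p, b in zip(atoms, bits) if b}, phi)
               for bits in product([False, True], repeat=len(atoms)))

def is_satisfiable(phi):
    """By the duality noted above: phi is satisfiable iff ¬phi is not valid."""
    return not is_valid(('not', phi))

assert is_valid(('or', 'p', ('not', 'p')))                        # p ∨ ¬p is a tautology
assert not is_satisfiable(('not', ('or', ('not', 'p'), 'p')))     # ¬(¬p ∨ p), i.e. p ∧ ¬p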

1.3 Axiomatisations
Though we have a straightforward algorithm for testing validity of formulas in propo-
sitional logic, such algorithms do not exist for more complicated logical systems. In
particular, there is no such algorithm for first-order logic.
However, it is still possible to effectively enumerate all the valid formulas of first-
order logic. One way of presenting such an enumeration is through an axiomatisation
of the logic. To prepare the ground for studying axiomatisations of more complex
logics, we begin with an axiomatisation for propositional logic.

Axiom System AX The axiom system AX consists of three axioms and one inference
rule.

(A) α ⊃ (β ⊃ α)
(A) (α ⊃ (β ⊃ γ )) ⊃ ((α ⊃ β) ⊃ (α ⊃ γ ))
(A) (¬β ⊃ ¬α) ⊃ ((¬β ⊃ α) ⊃ β)
α, α ⊃ β
(Modus Ponens, or MP)
β

The rule MP is read as follows—from α and α ⊃ β, infer β. It is important to note
that these are axiom schemes—that is, they are not actual formulas but templates which
can be instantiated into real formulas by consistently substituting concrete formulas
for α, β and γ . For instance, if p, q ∈ P , p ⊃ (q ⊃ p) is an instance of axiom (A1).
An alternate way to present such an axiomatisation is to list the axioms as concrete
formulas and have an additional inference rule to permit uniform substitution of new
formulas into an existing formula.

Derivations A derivation of α using the axiom system AX is a finite sequence of
formulas β1 , β2 , . . . , βn such that:

• βn = α

• For each i ∈ {1, 2, . . . , n}, βi is either an instance of one of the axioms (A1)–(A3),
or is obtained by applying the rule (MP) to formulas β j , βk , where j , k <
i —that is, βk is of the form β j ⊃ βi .

We write ⊢AX α to denote that α is derivable using the axiom system AX and say that
α is a thesis of the system. We will normally omit the subscript AX.

Here is an example of a derivation using our axiom system.

1. ( p ⊃ (( p ⊃ p) ⊃ p)) ⊃ (( p ⊃ ( p ⊃ p)) ⊃ ( p ⊃ p))    Instance of (A2)
2. p ⊃ (( p ⊃ p) ⊃ p)                                        Instance of (A1)
3. ( p ⊃ ( p ⊃ p)) ⊃ ( p ⊃ p)                                From 1 and 2 by MP
4. p ⊃ ( p ⊃ p)                                              Instance of (A1)
5. p ⊃ p                                                     From 3 and 4 by MP
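Checking such a derivation is a purely mechanical task, which the following Python sketch illustrates (an aside in the same assumed representation, not the notes' own formalism). Each step carries its justification: an axiom label (A1)–(A3) together with the formulas substituted for α, β, γ, or 'MP' with the 0-based positions of the two earlier steps it uses. The five-step derivation of p ⊃ p above is replayed at the end.

def Not(a): return ('not', a)
def Implies(a, b): return ('or', Not(a), b)        # α ⊃ β abbreviates ¬α ∨ β

def axiom_instance(just):
    """Build the formula named by ('A1', α, β), ('A2', α, β, γ) or ('A3', α, β)."""
    if just[0] == 'A1':
        a, b = just[1], just[2]
        return Implies(a, Implies(b, a))
    if just[0] == 'A2':
        a, b, c = just[1], just[2], just[3]
        return Implies(Implies(a, Implies(b, c)),
                       Implies(Implies(a, b), Implies(a, c)))
    if just[0] == 'A3':
        a, b = just[1], just[2]
        return Implies(Implies(Not(b), Not(a)), Implies(Implies(Not(b), a), b))
    raise ValueError(just)

def check_derivation(steps):
    """steps: list of (formula, justification); ('MP', i, j) uses earlier steps i and j,
    where step j must be (step i) ⊃ (current formula)."""
    for n, (phi, just) in enumerate(steps):
        if just[0] == 'MP':
            i, j = just[1], just[2]
            ok = i < n and j < n and steps[j][0] == Implies(steps[i][0], phi)
        else:
            ok = phi == axiom_instance(just)
        if not ok:
            return False
    return True

p = 'p'
pp = Implies(p, p)
derivation = [
    (Implies(Implies(p, Implies(pp, p)), Implies(Implies(p, pp), pp)), ('A2', p, pp, p)),
    (Implies(p, Implies(pp, p)),                                        ('A1', p, pp)),
    (Implies(Implies(p, pp), pp),                                       ('MP', 1, 0)),
    (Implies(p, pp),                                                    ('A1', p, p)),
    (pp,                                                                ('MP', 3, 2)),
]
assert check_derivation(derivation)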

Exercise .. Show that (¬β ⊃ ¬α) ≡ (α ⊃ β) is a thesis of AX. ⊣

The axiom system we have presented is called a Hilbert-style axiomatisation. There


are several other ways of presenting axiomatisations. One common alternative to
Hilbert-style systems is the sequent calculus notation due to Gentzen. Typically, Hilbert-
style axiomatisations have a large number of axioms and very few inference rules, while
sequent calculi have very few axioms and a large number of inference rules. Sequent
calculi are often easier to work with when searching for derivations, but are also more
complicated from a technical point of view. We shall look at sequent calculi later,
when we come to first-order logic.
Another fact worth remembering is that the axiom system AX defined here is just
one of many possible Hilbert-style axiom systems for propositional logic.

The main technical result we would like to establish is that the set of formulas
derivable using AX is precisely the set of valid formulas of propositional logic.

Theorem .. For all formulas α, ⊢ α iff ⊨ α.

We break up the proof of this theorem into two parts. The first half is to show that
every thesis of AX is valid. This establishes the soundness of the axiom system.
Lemma .. (Soundness) For all formulas α, if ⊢ α then ⊨ α.

P If ⊢ α, then we can exhibit a derivation β1 , β2 , . . . , βn of α. Formally, the


proof of the lemma is by induction on the length of this derivation. Since every formula
in the sequence β1 , β2 , . . . , βn is either an instance of one of the axioms or is obtained
by applying the rule (MP), it suffices to show that all the axioms define valid formulas
and that (MP) preserves validity—in other words, if α is valid and α ⊃ β is valid, then
β is valid. is is straightforward and we omit the details. ⊣

The other half of Theorem .. is more difficult to establish. We have to argue that
every valid formula is derivable. Formally, this would show that our axiomatisation is
complete.
We follow the approach of the logician Leon Henkin and attack the problem in-
directly. Consider the contrapositive of the statement we want to prove—that is, if a
formula α is not a thesis, then it is not valid.

Consistency We write ⊬ α to denote that α is not a thesis. We say that α is consistent


(with respect to AX ) if ⊬ ¬α.

Exercise ..
(i) Show that α ∨ β is consistent iff either α is consistent or β is consistent.

(ii) Show that if α∧β is consistent then both α and β are consistent. Is the converse
true?

(iii) Suppose that ⊢ α ⊃ β. Which of the following is true?

(a) If α is consistent then β is consistent.


(b) If β is consistent then α is consistent. ⊣

By Proposition .. we know that α is not valid iff ¬α is satisfiable. Suppose we can


show the following.

Lemma .. (Henkin) For all formulas β, if β is consistent then β is satisfiable.

We can then argue that our axiomatisation is complete. Consider a formula β


which is not derivable. It can be shown that ¬¬β ⊃ β is a thesis. If β is not derivable,
neither is ¬¬β—otherwise, we can use the rule MP to derive β from ¬¬β ⊃ β.
Since ⊬ ¬(¬β), ¬β is consistent. By Lemma .., ¬β is satisfiable. Hence, by
Proposition .., β is not valid.

1.4 Maximal Consistent Sets and Completeness


To prove Lemma .., we extend the notion of consistency from a single formula
to sets of formulas. A finite set of formulas X = {α1 , α2 , . . . , αn } is consistent if the
formula α1 ∧α2 ∧. . .∧αn is consistent—that is, ⊬ ¬(α1 ∧α2 ∧. . .∧αn ). An arbitrary set
of formulas X ⊆ Φ is consistent if every finite subset of X is consistent. (Henceforth,
Y ⊆fin X denotes that Y is a finite subset of X .)
A maximal consistent set (MCS) is a consistent set which cannot be extended by
adding any formulas. In other words, X ⊆ Φ is an MCS iff X is consistent and for
each formula α ∉ X , X ∪ {α} is inconsistent.

Lemma .. (Lindenbaum) Every consistent set can be extended to an MCS.

P Let X be an arbitrary consistent set. Let α0 , α1 , α2 , . . . be an enumeration of


Φ.
We define an infinite sequence of sets X0 , X1 , X2 , . . . as follows.

• X0 = X

• For i ≥ 0, Xi+1 = Xi ∪ {αi } if Xi ∪ {αi } is consistent, and Xi+1 = Xi otherwise.

Each set in this sequence is consistent, by construction, and X0 ⊆ X1 ⊆ X2 ⊆ · · · .
Let Y = ⋃i≥0 Xi . We claim that Y is an MCS extending X . To establish this, we
have to show that Y is consistent and that it is maximal.
If Y is not consistent, then there is a subset Z ⊆fin Y which is inconsistent. Let
Z = {β1 , β2 , . . . , βn }. We can write Z as {αi1 , αi2 , . . . , αin } where the indices cor-
respond to our enumeration of Φ. Then it is clear that Z ⊆fin X j+1 in the sequence
X0 ⊆ X1 ⊆ X2 ⊆ · · · ⊆ Y , where j = max(i1 , i2 , . . . , in ). This implies that X j+1 is
inconsistent, which is a contradiction.

Having established that Y is consistent, we show that it is maximal. Suppose that


Y ∪ {β} is consistent for some formula β ∉ Y . Let β = α j in our enumeration of
Φ. Since α j ∉ Y , α j was not added at step j+1 in our construction. This means that
X j ∪ {α j } is inconsistent. In other words, there exists Z ⊆fin X j such that Z ∪ {α j }
is inconsistent. Since X j ⊆ Y , we must have Z ⊆fin Y as well, which contradicts the
assumption that Y ∪ {α j } is consistent. ⊣

Maximal consistent sets have a rich structure which we shall exploit to prove com-
pleteness.

Lemma .. Let X be a maximal consistent set. en:

(i) For all formulas α, α ∈ X iff ¬α ∉ X .

(ii) For all formulas α, β, α ∨ β ∈ X iff α ∈ X or β ∈ X .

We postpone the proof of these properties and first show how they lead to com-
pleteness.

Maximal consistent sets and valuations Let X be an MCS. Define the valuation vX to
be the set { p ∈ P | p ∈ X }—in other words, vX ( p) = ⊤ iff p ∈ X .

Proposition .. Let X be an MCS. For all formulas α, vX ⊨ α iff α ∈ X .

P e proof is by induction on the structure of α.


Basis: α = p, where p ∈ P . en, vX ⊨ p iff (by the definition of vX ) p ∈ X .
Induction step: There are two cases to consider—when α is of the form ¬β and when
α is of the form β ∨ γ .
(α = ¬β) vX ⊨ ¬β iff (by the definition of valuations) vX ⊭ β iff (by the induction
hypothesis) β ∉ X iff (by the properties satisfied by MCSs) ¬β ∈ X .
(α = β ∨ γ ) vX ⊨ β ∨ γ iff (by the definition of valuations) vX ⊨ β or vX ⊨ γ iff (by
the induction hypothesis) β ∈ X or γ ∈ X iff (by the properties satisfied by MCSs)
β∨γ ∈ X. ⊣

Thus, every MCS X defines a canonical valuation vX which satisfies precisely those
formulas that belong to X . (Conversely, every valuation also defines an MCS in a
canonical way: given a valuation v, Xv = {α | v ⊨ α}. It is not difficult to establish
that the valuation vXv generated by Xv is exactly the same as v.)

Proposition .. immediately yields a proof of Henkin’s lemma.


Proof of Lemma ..: Let α be a consistent formula. By Lindenbaum’s Lemma,
{α} can be extended to an MCS X . By Proposition .., vX ⊨ α since α ∈ X . Thus,
α is satisfiable. ⊣
To complete our argument, we have to prove Lemma ...
Proof Sketch of Lemma ..: Let X be an MCS.

(i) For every formula α, we have to show that α ∈ X iff ¬α ∉ X .
We first show that {α, ¬α} ̸⊆ X . For this, we need the fact that α ⊃ ¬¬α and
¬¬α ⊃ α are both derivable using AX. We omit these derivations.
We know that α ⊃ α, or, equivalently, ¬α ∨ α is a thesis. From this, we can
derive ¬¬(¬α ∨ α). But ¬(¬α ∨ α) is just α ∧ ¬α, so we have ¬(α ∧ ¬α) as a
thesis. is means that {α, ¬α} is inconsistent, whence it cannot be a subset of
X (recalling that X is consistent).
Next we show that at least one of α and ¬α is in X . Suppose neither formula
belongs to X . Since X is an MCS, there must be sets B ⊆fin X and C ⊆fin X
such that B ∪ {α} is inconsistent and C ∪ {¬α} is inconsistent. Let B =
{β1 , β2 , . . . , βn } and C = {γ1 , γ2 , . . . , γm }. Let β̂ abbreviate the formula β1 ∧
β2 ∧ . . . ∧ βn and γ̂ abbreviate the formula γ1 ∧ γ2 ∧ . . . ∧ γm . Then, we have
⊢ ¬(α ∧ β̂) and ⊢ ¬(¬α ∧ γ̂ ). Rewriting ∧ in terms of ∨, this is equivalent to
⊢ ¬α ∨ ¬β̂ and ⊢ ¬¬α ∨ ¬γ̂ . From this, we can conclude that ⊢ α ⊃ ¬β̂ and
⊢ ¬α ⊃ ¬γ̂ .
We now use the fact that (α ⊃ β) ⊃ ((δ ⊃ γ ) ⊃ ((α∨δ) ⊃ (β∨γ ))) is a thesis.
(Once again, we omit the derivation.) Instantiating this with α = α, δ = ¬α,
β = ¬β̂ and γ = ¬γ̂ we can derive (α ∨ ¬α) ⊃ (¬β̂ ∨ ¬γ̂ ). Since ⊢ α ∨ ¬α,
we get ⊢ ¬β̂ ∨ ¬γ̂ . By rewriting ∨ in terms of ∧, we can derive ¬(β̂ ∧ γ̂ ). But
this implies that (B ∪ C ) ⊆fin X is inconsistent, which is a contradiction.

(ii) e proof of the second part follows in a similar manner, assuming the deriv-
ability of appropriate formulas. We omit the details. ⊣

1.5 Compactness and Strong Completeness


Often, we are not interested in absolute validity, but in restricted validity. Rather than
asking whether a formula α is always true, we ask whether α is true in all valuations

which satisfy certain properties. One way of restricting the class of valuations under
consideration is to specify a set of formulas X and only look at those valuations where
X is true. If α is true wherever the formulas from X are true, then α is a logical
consequence of X .

Logical consequence Let X be a set of formulas and v a valuation. We write v ⊨ X to


denote that v ⊨ β for every formula β ∈ X . A formula α is a logical consequence of X ,
written X ⊨ α, if for every valuation v such that v ⊨ X it is also the case that v ⊨ α.
The notion of logical consequence is central to the way we formalise mathematics.
For instance, when we study algebraic structures such as groups, we first formulate ax-
ioms which characterise groups. Any theorem we prove about groups can be rephrased
as a statement which is a logical consequence of these axioms: in other words, the
theorem is true whenever the group axioms are also true.
As with validity, we now look at a syntactic approach to logical consequence.

Derivability Let X be a set of formulas. We say that a formula α is derivable from X ,


written X ⊢ α if there exists a sequence α1 , α2 , . . . , αn of formulas such that αn = α and
for i ∈ {1, 2, . . . , n}, αi is either a member of X , or an instance of one of the axioms
(A)–(A) of AX, or is derived from α j , αk , j , k < i , using the inference rule MP.
(Notice that unlike axioms, we cannot use the formulas in X as templates to generate
new formulas for use in a derivation. e formulas in X are concrete formulas and
must be used “as is”.)
The theorem we would like to prove is the following.

Theorem .. (Strong Completeness) Let X ⊆ Φ and α ∈ Φ. Then, X ⊨ α iff
X ⊢ α.

It is possible to prove this directly using a technique similar to the one used to
prove the soundness and completeness of AX (see Exercise ..). However, we will
prove it indirectly using two auxiliary results which are of independent interest—the
Deduction eorem and the Compactness eorem.
We begin with the Deduction Theorem, which is a statement about derivability.

Theorem .. (Deduction) Let X ⊆ Φ and α, β ∈ Φ. Then, X ∪ {α} ⊢ β iff X ⊢
α ⊃ β.

P (⇐) Suppose that X ⊢ α ⊃ β. en, by the definition of derivability, X ∪


{α} ⊢ α ⊃ β as well. Since α ∈ X ∪ {α}, X ∪ {α} ⊢ α. Applying MP, we get
X ∪ {α} ⊢ β.

(⇒) Suppose that X ∪ {α} ⊢ β. Then, there is a derivation β1 , β2 , . . . , βn of β.
The proof is by induction on n.
If n = 1, then β is either an instance of an axiom or a member of X ∪ {α}.
If β is an instance of an axiom, then X ⊢ β as well. Further, from axiom (A1),
X ⊢ β ⊃ (α ⊃ β). Applying MP, we get X ⊢ α ⊃ β.
If β ∈ X , there are two cases to consider. If β ∈ X \ {α}, then X ⊢ β. Once
again we have X ⊢ β ⊃ (α ⊃ β) and hence X ⊢ α ⊃ β. On the other hand, if β = α,
we have X ⊢ α ⊃ α from the fact that α ⊃ α is derivable in AX.
If n > 1, we look at the justification for adding βn = β to the derivation. If βn is
an instance of an axiom or a member of X ∪ {α}, we can use the same argument as in
the base case to show X ⊢ α ⊃ β.
On the other hand, if βn was derived using MP, there exist βi and β j , with
i , j < n such that β j is of the form βi ⊃ βn . By axiom (A2), X ⊢ (α ⊃ (βi ⊃
βn )) ⊃ ((α ⊃ βi ) ⊃ (α ⊃ βn )). By the induction hypothesis, we know that X ⊢ α ⊃
(βi ⊃ βn ) and X ⊢ α ⊃ βi . Applying MP twice, we get X ⊢ α ⊃ βn . ⊣

The Deduction Theorem reflects a method of proof which is common in mathemat-


ics—proving that property x implies property y is equivalent to assuming x and in-
ferring y.
The second step in proving Strong Completeness is the Compactness Theorem,
which is a statement about logical consequence. To prove this we need the following
lemma about trees, due to König.

Lemma .. (König) Let T be a finitely branching tree—that is, every node has a finite
number of children (though this number may be unbounded). If T has infinitely many
nodes, then T has an infinite path.

P Let T be a finitely branching tree with infinitely many nodes. Call a node x
in T bad if the subtree rooted at x has infinitely many nodes. Clearly, if a node x is
bad, at least one of its children must be bad: x has only finitely many children and if
all of them were good, the subtree rooted at x would be finite.
We now construct an infinite path x0 x1 x2 . . . in T . Since T has an infinite number
of nodes, the root of T is a bad node. Let x0 be the root of T . It has at least one bad
successor. Pick one of the bad successors of x0 and designate it x1 . Pick one of the bad
successors of x1 and designate it x2 , and so on. ⊣

eorem .. (Compactness) Let X ⊆ Φ and α ∈ Φ. en X ⊨ α iff there exists


Y ⊆fin X , Y ⊨ α.

Figure .: The tree T in the proof of Lemma .. (the root is the empty valuation ∅;
its two children map p1 to ⊤ and ⊥ respectively; the four nodes at the next level
additionally map p2 to ⊤ or ⊥; and so on)

We shall first prove the following related result. Let X be a set of formulas. We
say that X is satisfiable if there exists a valuation v such that v ⊨ X .

Lemma .. (Finite satisfiability) Let X ⊆ Φ. en, X is satisfiable iff every Y ⊆fin


X is satisfiable.

P (⇒) Suppose X is satisfiable. en, there is a valuation v such that v ⊨ X .


Clearly, v ⊨ Y for each Y ⊆fin X as well.
(⇐) Suppose X is not satisfiable. We have to show that there exists Y ⊆fin X
which is not satisfiable.
Assume that our set of atomic propositions P is enumerated { p1 , p2 , . . .}. Let
P0 = ∅ and for i ∈ {1, 2, . . .}, let Pi = { p1 , p2 , . . . , pi }. For i ∈ {1, 2, . . .}, let Φi be
the set of formulas generated using only atomic propositions from Pi and let Xi =
X ∩ Φi .
We construct a tree T whose nodes are valuations over the sets Pi , i ∈ {0, 1, 2, . . .}.
More formally, the set of nodes is given by {v | ∃i ∈ {0, 1, 2, . . .}. v : Pi → {⊤, ⊥}}.
The root of T is the unique function ∅ → {⊤, ⊥}.
The relation between nodes is given as follows. Let v : Pi → {⊤, ⊥}. Then v has
two children v ′ , v ′′ : Pi +1 → {⊤, ⊥}, where v ′ extends v to Pi +1 by setting pi +1 to
⊤ and v ′′ extends v to Pi +1 by setting pi +1 to ⊥. More formally, for each p ∈ Pi ,
v ′ ( p) = v ′′ ( p) = v( p) and v ′ ( pi +1 ) = ⊤ and v ′′ ( pi+1 ) = ⊥. (See Figure .).
Observe that T is a complete infinite binary tree. The nodes at level i of the tree
consist of all possible valuations over Pi —there are precisely 2^i such valuations for

each i. Notice that if v j at level j is an ancestor of vi at level i then vi agrees with v j


on the atomic propositions in P j .
The infinite paths in T are in 1-1 correspondence with valuations over P . Let
π = v0 v1 v2 . . . be an infinite path in the tree. e valuation vπ : P → {⊤, ⊥} is
given by pi 7→ vi ( pi ) for i ∈ {1, 2, . . .}. Conversely, given a valuation v : P →
{⊤, ⊥}, we can define a unique path πv = v0 v1 v2 . . . by setting v0 to be the root of
T and vi : Pi → {⊤, ⊥} to be the restriction of v to Pi —that is, for all p ∈ Pi ,
vi ( p) = v( p). It is easy to verify that these two maps are inverses of each other.
Let us call a node v : Pi → {⊤, ⊥} in T bad if v(β) = ⊥ for some β ∈ Xi . Clearly, if v is
bad, then so is every valuation in the subtree rooted at v. We prune T by deleting
all bad nodes which also have bad ancestors. (Equivalently, along any path in T , we
retain only those nodes up to and including the first bad node along the path.) It is not
difficult to verify that the set of nodes which remains forms a subtree T ′ of T all of
whose leaf nodes are bad and all of whose non-leaf nodes are not bad.
We claim that T ′ has only a finite number of nodes. Assuming that this is true, let
the set of leaf nodes of T ′ be {v1 , v2 , . . . , v m }. Since each vi is bad, there is a corre-
sponding formula βi ∈ X such that vi (βi ) = ⊥. We claim that {β1 , β2 , . . . , β m } ⊆fin
X is not satisfiable. Consider any valuation v. The corresponding path πv must pass
through one of the nodes in {v1 , v2 , . . . , v m }, say v j . But then, vπv (β j ) = v j (β j ) =
⊥. Thus, v ⊭ {β1 , β2 , . . . , β m }.
To see why T ′ must be finite, suppose instead that it has an infinite set of nodes.
Then, by König's Lemma, it contains an infinite path π = v0 v1 v2 . . . such that none
of the nodes along this path is bad. The path π is also an infinite path in T . We
know that π defines a valuation vπ . Consider any formula β ∈ X . Then β ∈ X j for
some j ∈ {1, 2, . . .}, so vπ (β) = v j (β) = ⊤. Thus, vπ ⊨ X , which contradicts our
assumption that X is not satisfiable. ⊣
We can now complete our proof of compactness.
Proof of eorem .. (Compactness):
(⇐) If Y ⊆fin X and Y ⊨ α then it is clear that X ⊨ α. For, if v ⊨ X , then v ⊨ Y
as well and, by the assumption that Y ⊨ α, v ⊨ α as required.
(⇒) For all Z ⊆ Φ and all β ∈ Φ, it is clear that Z ⊨ β iff Z ∪ {¬β} is not
satisfiable.
Suppose X ⊨ α. Then, X ∪ {¬α} is not satisfiable. By Lemma .., there is a
subset Y ⊆fin X ∪ {¬α} such that Y is not satisfiable. Thus, (Y \ {¬α}) ∪ {¬α} is not
satisfiable either, where (Y \ {¬α}) ⊆fin X . This implies that Y \ {¬α} ⊨ α. ⊣
With the Deduction Theorem and the Compactness Theorem behind us, we can
prove Strong Completeness.

Proof of eorem .. (Strong Completeness):


To show that X ⊢ α implies X ⊨ α is routine. Conversely, suppose that X ⊨ α. By
compactness, there is a finite subset Y ⊆fin X such that Y ⊨ α. Let Y = {β1 , β2 , . . . , β m }.
It is then easy to see that β1 ⊃ (β2 ⊃ · · · (β m ⊃ α) · · · ) is valid. Hence, by the
completeness theorem for propositional logic, ⊢ β1 ⊃ (β2 ⊃ · · · (β m ⊃ α) · · · ).
Applying the Deduction Theorem m times we get {β1 , β2 , . . . , β m } ⊢ α. Since
{β1 , β2 , . . . , β m } ⊆ X , it follows that X ⊢ α. ⊣
Observe that we could alternatively derive compactness from strong completeness.
If X ⊨ α then, by strong completeness, X ⊢ α. We let Y ⊆fin X be the subset of
formulas actually used in the derivation of α. Thus, Y ⊢ α as well. By the other half
of strong completeness, Y ⊨ α.
We conclude our discussion of propositional logic with two exercises. The first
leads to an alternative proof of compactness which is more along the lines of the com-
pleteness proof for propositional logic. The second exercise leads to a direct proof of
strong completeness.

Exercise .. (Compactness)


Let X be a set of formulas. X is said to be a finitely satisfiable set (FSS) if every Y ⊆fin X
is satisfiable.
Equivalently, X is an FSS if there is no finite subset {α1 , α2 , . . . , αn } of X such
that ¬(α1 ∧ α2 ∧ . . . ∧ αn ) is valid.
(Note that if X is an FSS we are not promised a single valuation v which satisfies
every finite subset of X . Each finite subset could be satisfied by a different valuation).
Show that:

(i) Every FSS can be extended to a maximal FSS.

(ii) If X is a maximal FSS then:

(a) For every formula α, α ∈ X iff ¬α ∉ X .
(b) For all formulas α, β, (α ∨ β) ∈ X iff (α ∈ X or β ∈ X ).

(iii) Every maximal FSS X generates a valuation vX such that for every formula α,
vX ⊨ α iff α ∈ X .

From these facts conclude that:

(iv) Any FSS X is simultaneously satisfiable (that is, for any FSS X , there exists vX
such that vX ⊨ X ).

(v) For all X and all α, X ⊨ α iff there exists Y ⊆fin X such that Y ⊨ α. ⊣

Exercise .. (Strong Completeness)


We define a new notion of consistency. A set X is said to be consistent if there is no
formula α such that X ⊢ α and X ⊢ ¬α.
Show that:

(i) X is consistent iff every finite subset of X is consistent.

(ii) Every consistent set X can be extended to a maximal consistent set (MCS).

(iii) Every MCS X generates a valuation vX such that for all formulas α, vX ⊨ α iff
α ∈ X.

(iv) Every consistent set X is satisfiable: that is, there exists a valuation vX such that
vX ⊨ X .

(v) If X ⊨ α then X ∪ {¬α} is not consistent.

(vi) Use the Deduction Theorem to show that if X ⊨ α then X ⊢ ¬α ⊃ (β ∧
¬β) for some formula β.

Conclude that if X ⊨ α then X ⊢ α. ⊣


Chapter 2

Modal Logic

In propositional logic, a valuation is a static assignment of truth values to atomic propo-


sitions. In computer science applications, atomic propositions describe properties of
the current state of a program. It is natural to expect that the truth of an atomic
proposition varies as the state changes. Modal logic is a framework to describe such a
situation.¹
The basic idea in modal logic is to look at a collection of possible valuations si-
multaneously. Each valuation represents a possible state of the world. Separately, we
specify how these “possible worlds” are connected to each other. We then enrich our
logical language with a way of referring to truth across possible worlds.

2.1 Syntax
As in propositional logic, we begin with a countably infinite set of atomic propositions
P = { p0 , p1 , . . .} and two logical connectives ¬ (read as not) and ∨ (read as or). We
add a unary modality □ (read as box).
The set Φ of formulas of modal logic is the smallest set satisfying the following:

• Every atomic proposition p is a member of Φ.

• If α is a member of Φ, so is (¬α).

• If α and β are members of Φ, so is (α ∨ β).


¹Traditional modal logic arose out of philosophical enquiries into the nature of necessary and con-
ditional truth. We shall concentrate on the technical aspects of the subject and avoid all discussion of the
philosophical foundations of modal logic.

• If α is a member of Φ, so is (□α).

As before, we omit parentheses if there is no ambiguity. The derived propositional
connectives ∧, ⊃ and ≡ are defined as before. In addition, we have a derived modality
◊ (read diamond) which is dual to the modality □, defined as follows: ◊α  =def  ¬□¬α.

2.2 Semantics
Frames A frame is a structure F = (W , R), where W is a set of possible worlds and
R ⊆ W × W is the accessibility relation. If w R w ′ , we say that w ′ is an R-neighbour
of w.
In more familiar terms, a frame is just a directed graph over the set of nodes W . We
do not make any assumptions about the set W —not even the fact that it is countable.

Models A model is a pair M = (F ,V ) where F = (W , R) is a frame and V : W → 2^P


is a valuation.²
Recall that a propositional valuation v : P → {⊤, ⊥} can also be viewed as a
set v ⊆ P consisting of those atomic propositions p such that v( p) = ⊤. We have
implicitly used this when defining valuations in modal logic. Formally, V is a function
which assigns a propositional valuation to each world in W —in other words, for each
w ∈ W , V (w) : P → {⊤, ⊥}. Thus, V is actually a function of the form W →
(P → {⊤, ⊥}), which we abbreviate as V : W → 2^P .

Satisfaction e notion of truth is localised to each world in a model. We write


M , w ⊨ α to denote that α is true at the world w in the model M . e satisfaction
relation is defined inductively as follows.

M,w ⊨ p iff p ∈ V (w) for p ∈ P


M , w ⊨ ¬α iff M , w ⊭ α
M , w ⊨ α ∨ β iff M , w ⊨ α or M , w ⊨ β
M , w ⊨ □α iff for each w ′ ∈ W , if w R w ′ then M , w ′ ⊨ α
Thus, M , w ⊨ □α if every world accessible from w satisfies α. Notice that if w is
isolated—that is, there is no world w ′ such that w R w ′ —then M , w ⊨ □α for every
formula α.
²The semantics we describe here was first formalised by Saul Kripke, so these models are often called
Kripke models in the literature.

Exercise .. Verify that M , w ⊨ ◊α iff there exists w ′ , w R w ′ and M , w ′ ⊨ α. ⊣

Satisfiability and validity As usual, we say that α is satisfiable if there exists a frame F =
(W , R) and a model M = (F ,V ) such that M , w ⊨ α for some w ∈ W . The formula
α is valid, written ⊨ α, if for every frame F = (W , R), for every model M = (F ,V )
and for every w ∈ W , M , w ⊨ α.

Example  Here are some examples of valid formulas in modal logic.

(i) Every tautology of propositional logic is valid. Consider a tautology α and a


world w in a model M = ((W , R),V ). Since the truth of α depends only on
V (w), and α is true under all propositional valuations, M , w ⊨ α.

(ii) e formula □(α ⊃ β) ⊃ (□α ⊃ □β) is valid. Consider a model M =


((W , R),V ) and a world w ∈ W . Suppose that M , w ⊨ □(α ⊃ β). We must
argue that M , w ⊨ □α ⊃ □β. Let M , w ⊨ □α. en we must show that
M , w ⊨ □β. In other words, we must show that every R-neighbour w ′ of w
satisfies β. Since we assumed M , w ⊨ □(α ⊃ β), we know that M , w ′ ⊨ α ⊃ β.
Moreover, since M , w ⊨ □α, M , w ′ ⊨ α. By the semantics of the connective ⊃,
it follows that M , w ′ ⊨ β, as required.

(iii) Suppose that α is valid. Then, □α must also be valid. Consider any model
M = ((W , R),V ) and any w ∈ W . To check that M , w ⊨ □α we have to
verify that every R-neighbour of w satisfies α. Since α is valid, M , w ′ ⊨ α for
all w ′ ∈ W . So, every R-neighbour of w does satisfy α and M , w ⊨ □α.

Exercise .. e argument given in part (i) of Exercise  applies only to non-modal
instances of propositional tautologies—for instance, the explanation does not justify
the validity of the formula □α ∨ ¬□α. Show that all substitution instances of propo-
sitional tautologies are valid formulas in modal logic. ⊣

As in propositional logic, one of our central concerns in modal logic is to be able to


decide when formulas are satisfiable (or, dually, valid). Notice that unlike the truth-
table based algorithm for propositional logic, there is no obvious decision procedure
for satisfiability in modal logic. To check satisfiability of a formula α, though it suffices
to look at valuations over the vocabulary of α, we also have to specify an underlying
frame. ere is no a priori bound on the size of this frame.

Later in this section we will describe a sound and complete axiomatisation for
modal logic. is will give us an effective way of enumerating all valid formulas. After
that, we will encounter a technique by which we can bound the size of the underlying
frame required to satisfy a formula α. But, we first examine an aspect of modal logic
which does not have any counterpart in propositional logic.

2.3 Correspondence Theory


The modalities □ and ◊ can be used to describe interesting properties of the accessibil-
ity relation R of a frame F = (W , R). This area of modal logic is called correspondence
theory.
Let α be a formula of modal logic. With α, we identify a class of frames Cα as
follows:

F = (W , R) ∈ Cα iff for every valuation V over W , for every world


w ∈ W and for every substitution instance β of α, ((W , R),V ), w ⊨ β.

In other words, when defining Cα , we interpret α as a template, much like an ax-


iom scheme. Notice that for any frame F = (W , R) which does not belong to Cα ,
we can find a valuation V , a world w and a substitution instance β of α such that
((W , R),V ), w ⊭ β.

Characterising classes of frames We say a class of frames C is characterised by the for-


mula α if C = Cα .

We now look at some examples of frame conditions which can be characterised by
formulas of modal logic.

Proposition .. e class of reflexive frames is characterised by the formula □α ⊃ α.

P We first show that every reflexive frame belongs to C□α⊃α . Let M = ((W , R),V )
be a model where R is reflexive. Consider any world w ∈ W . Suppose that M , w ⊨
□α. We have to show that M , w ⊨ α as well. Since M , w ⊨ □α, every R-neighbour
of w satisfies α. But R is reflexive, so w is an R-neighbour of itself. Hence, M , w ⊨ α.
Conversely, we show that every non-reflexive frame does not belong to C□α⊃α . Let
F = (W , R) be a frame where for some w ∈ W , it is not the case that w R w. Choose
a proposition p and define a valuation V as follows: V (w) = ∅ and V (w ′ ) = { p} for
all w ′ ̸= w. Clearly, (F ,V ), w ⊨ □ p but (F ,V ), w ⊭ p. Hence w fails to satisfy the
substitution instance □ p ⊃ p of the formula □α ⊃ α. ⊣
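The counterexample construction in the second half of this proof can be tried out concretely. The following sketch (reusing the illustrative representation of the earlier code, with the evaluator repeated in compressed form; not part of the notes) checks the instance □p ⊃ p, written as ¬□p ∨ p, at the single world of a non-reflexive frame and of a reflexive one.

def holds(W, R, V, w, phi):
    """M, w ⊨ phi for formulas built from atoms, 'not', 'or', 'box'."""
    if isinstance(phi, str): return phi in V[w]
    if phi[0] == 'not':      return not holds(W, R, V, w, phi[1])
    if phi[0] == 'or':       return holds(W, R, V, w, phi[1]) or holds(W, R, V, w, phi[2])
    return all(holds(W, R, V, u, phi[1]) for u in W if (w, u) in R)   # 'box'

box_p_implies_p = ('or', ('not', ('box', 'p')), 'p')    # □p ⊃ p

# Non-reflexive: a single world w with no arrows and V(w) = ∅.
# □p holds vacuously at w but p fails, so the instance is refuted.
assert not holds({'w'}, set(), {'w': set()}, 'w', box_p_implies_p)

# Reflexive: the same world with a self-loop; here □p is false at w
# (w sees itself and p fails there), so □p ⊃ p holds.
assert holds({'w'}, {('w', 'w')}, {'w': set()}, 'w', box_p_implies_p)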

Proposition .. e class of transitive frames is characterised by the formula □α ⊃


□□α.

P We first show that every transitive frame belongs to C□α⊃□□α . Let M =
((W , R),V ) be a model where R is transitive. Consider any world w ∈ W . Sup-
pose that M , w ⊨ □α. We have to show that M , w ⊨ □□α as well.
For this, we have to show that every R-neighbour w ′ of w satisfies □α. Consider
any R-neighbour w ′ of w. If w ′ has no R-neighbours, then it is trivially the case that
M , w ′ ⊨ □α. On the other hand, if w ′ has R-neighbours, then we must show that
each R-neighbour of w ′ satisfies α. Let w ′′ be an R-neighbour of w ′ . Since w R w ′
and w ′ R w ′′ , by transitivity w ′′ is also an R-neighbour of w. Since we assumed that
M , w ⊨ □α, it must be the case that M , w ′′ ⊨ α, as required.
Conversely, we show that every non-transitive frame does not belong to C□α⊃□□α .
Let F = (W , R) be a frame where for some w, w ′ , w ′′ ∈ W , w R w ′ and w ′ R w ′′
but it is not the case that w R w ′′ . Choose a proposition p and define a valuation
V as follows: V (ŵ) = { p} if w R ŵ, and V (ŵ) = ∅ otherwise.
Since w ′′ is not an R-neighbour of w, V (w ′′ ) = ∅. This means that M , w ′ ⊭ □ p, for
w ′′ is an R-neighbour of w ′ and M , w ′′ ⊭ p. Therefore, M , w ⊭ □□ p, since w ′ is an
R-neighbour of w. On the other hand, M , w ⊨ □ p by the definition of V . Hence,
M , w ⊭ □ p ⊃ □□ p, which is an instance of □α ⊃ □□α. ⊣

The characteristic formula for transitivity can dually be written ◊◊α ⊃ ◊α. This form
represents transitivity more intuitively—the formula says that if w R w ′ R w ′′ and w ′′
satisfies α, there exists an R-neighbour ŵ of w satisfying α. If R is transitive, w ′′ is a
natural candidate for ŵ. Similarly, α ⊃ ◊α is the dual (and more appealing) form of
the characteristic formula for reflexivity. We have used the □ forms of these formulas
because they are more standard in the literature.

Proposition .. e class of symmetric frames is characterised by the formula α ⊃


□◊α.

P We first show that every symmetric frame belongs to Cα⊃□◊α . Let M =
((W , R),V ) be a model where R is symmetric. Consider any world w ∈ W . Sup-
pose that M , w ⊨ α. We have to show that M , w ⊨ □◊α as well.
For this, we have to show that every R-neighbour w ′ of w satisfies ◊α. Consider
any R-neighbour w ′ of w. Since R is symmetric, w is an R-neighbour of w ′ . We
assumed that M , w ⊨ α so M , w ′ ⊨ ◊α, as required.

Figure .: The Euclidean condition: if w R w ′ and w R w ′′ , then w ′ R w ′′ and w ′′ R w ′

Conversely, we show that every non-symmetric frame does not belong to Cα⊃□◊α .
Let F = (W , R) be a frame where for some w, w ′ ∈ W , w R w ′ but it is not the case
that w ′ R w. Choose a proposition p and define a valuation V as follows:
V (ŵ) = ∅ if w ′ R ŵ, and V (ŵ) = { p} otherwise.

By construction M , w ′ ⊭ ◊ p. Hence, since wRw ′ , M , w ⊭ □◊ p. On the other hand,


M , w ⊨ p by the definition of V , so M , w ⊭ p ⊃ □◊ p, which is an instance of the
formula α ⊃ □◊α. ⊣

We say that an accessibility relation R over W is Euclidean if for all w, w ′ , w ′′ ∈ W ,


if w R w ′ and w R w ′′ then w ′ R w ′′ and w ′′ R w ′ (see Figure .).

Proposition .. e class of Euclidean frames is characterised by the formula ◊α ⊃


□◊α.

P We first show that every Euclidean frame belongs to C◊α⊃□◊α . Let M =
((W , R),V ) be a model where R is Euclidean. Consider any world w ∈ W . Suppose
that M , w ⊨ ◊α. We have to show that M , w ⊨ □◊α as well.
Let w ′ be an R-neighbour of w. We must show that M , w ′ ⊨ ◊α. Since M , w ⊨
◊α, there must exist wα such that w R wα and M , wα ⊨ α. Since R is Euclidean,
w ′ R wα as well, so M , w ′ ⊨ ◊α as required.
Conversely, we show that every non-Euclidean frame does not belong to C◊α⊃□◊α .
Let F = (W , R) be a frame where for some w, w ′ , w ′′ ∈ W , w R w ′ and w R w ′′ but
one of w ′ R w ′′ and w ′′ R w ′ fails to hold. Without loss of generality, assume that it
is not the case that w ′′ R w ′ .

Choose a proposition p and define a valuation V such that V (w ′ ) = { p} and
V (ŵ) = ∅ for all ŵ ̸= w ′ . Then, since w R w ′ , M , w ⊨ ◊ p by the definition of V .
On the other hand, by construction M , w ′′ ⊭ ◊ p, so M , w ⊭ □◊ p. So, M , w ⊭ ◊ p ⊃
□◊ p, which is an instance of ◊α ⊃ □◊α. ⊣

Notice that if R is Euclidean, for all w ′ , if there exists w such that w R w ′ , then
w ′ R w ′ . It is not difficult to verify that if R is reflexive and Euclidean then R is in
fact an equivalence relation. On the other hand, if R is symmetric and transitive then
it is also Euclidean.
A frame (W , R) is said to be converse well-founded if for all nonempty subsets X
of W , there exists a maximal element x of X , i.e. x is in X and for all y in X , it is not
the case that x R y.

Proposition .. e class of transitive, converse well-founded frames is characterised by


the formula □(□α ⊃ α) ⊃ □α.

P We first show that every transitive and converse well-founded frame is a model
of □(□α ⊃ α) ⊃ □α, i.e., it belongs to C□(□α⊃α)⊃□α . Let M = ((W , R),V ) be a
model where R is transitive and converse well-founded. Consider any world w ∈ W .
Suppose that M , w ⊨ □(□α ⊃ α). We have to show that M , w ⊨ □α as well.
For this, we have to show that every R-neighbour w ′ of w satisfies α. Consider
any R-neighbour w ′ of w. Since w satisfies □(□α ⊃ α), w ′ satisfies □α ⊃ α. Thus,
to show that every R-neighbour w ′ of w satisfies α it suffices to show that w ′ satisfies
□α.
Consider the set X of worlds x such that w R x. Since R is transitive, whenever
x is an element of X and x R y, we also have w R y and hence y is in X . A path
in W is any finite sequence ρ = w0 , w1 , . . . , wn of worlds (n ≥ 0) such that wi R wi+1
for all i with 0 ≤ i < n. The length of such a path, denoted len(ρ), is defined to be
n. A path ρ = w0 , w1 , . . . , wn is said to be an x-path (for x ∈ W ) if x = w0 . For any
node x ∈ W , define the height of x, denoted ht(x) to be sup{len(ρ) | ρ is an x-path}.
The height of a given world is in general an ordinal. But the following useful property
holds: whenever x R y then ht(y) < ht(x).
For all x ∈ X , we prove by transfinite induction on ht(x) that x satisfies □α (and
hence α). e base case is when ht(x) is 0, which means that there is no y ∈ W such
that x R y. But then x vacuously satisfies □α. For the induction step, consider an
arbitrary world x in X . For all y ∈ W such that x R y, y ∈ X and ht(y) is strictly less
than ht(x). erefore by the induction hypothesis every R-neighbour y of x satisfies
□α (and hence α), and hence x satisfies □α (and hence α).
Thus every R-neighbour w ′ of w satisfies α, and hence w satisfies □α.

Conversely, consider a frame F = (W , R) which is not transitive. This means that
there are three worlds of W , w, w ′ , and w ′′ such that w R w ′ , w ′ R w ′′ , but not
w R w ′′ . Choose a proposition p and define a valuation V as follows:
V (ŵ) = { p} if w R ŵ and ŵ ̸= w ′ , and V (ŵ) = ∅ otherwise.

Clearly, for all ŵ in W such that w R ŵ and ŵ ̸= w ′ , ŵ satisfies p and hence □ p ⊃ p.
On the other hand w ′ does not satisfy □ p (since it has an R-neighbour, namely w ′′ ,
which does not satisfy p) and hence satisfies □ p ⊃ p vacuously. Since all R-neighbours
of w satisfy □ p ⊃ p, w satisfies □(□ p ⊃ p). On the other hand, clearly w does not
satisfy □ p.
Consider now a frame F = (W , R) which is transitive but not converse well-
founded. is means that there is a subset X of W with no R-maximal world, i.e.
for all x in X , there is a y in X such that x R y. Choose a world w in X , choose a
proposition p and define a valuation V as follows:
¨
; if wb ∈ X
b =
V (w)
{ p} otherwise

Clearly, for all ŵ ∉ X , ŵ satisfies p and hence □ p ⊃ p. On the other hand, every ŵ
in X has an R-neighbour in X (which does not satisfy p) and hence ŵ does not satisfy
□ p and thus satisfies □ p ⊃ p. Thus w satisfies □(□ p ⊃ p). But clearly w does not
satisfy □ p. ⊣

Exercise .. What classes of frames are characterised by the following formulas?

(i) ◊α ⊃ □α.

(ii) ◊α ⊃ ◊◊α.

(iii) α ⊃ □α. ⊣

Are there natural classes of frames which cannot be characterised in modal logic?
We will see later that irreflexive frames form one such class. But first, we return to the
notions of satisfiability and validity and look for a completeness result.

2.4 Axiomatising valid formulas


Validity revisited We said earlier that a formula α is valid if for every frame F =
(W , R), every model M = (F ,V ) and every world w, M , w ⊨ α. In light of our
discussion of correspondence theory we can refine this notion by restricting the range
over which we consider frames.
Let C be a class of frames. We say that a formula α is C -valid if for every frame
F = (W , R) from the class C , for every model M = (F ,V ) and for every world w,
M , w ⊨ α. We denote the fact that α is C -valid by ⊨C α.
Let F represent the class of all frames. Then, the set of F -valid formulas is the
same as the set of valid formulas according to our earlier definition. In other words,
the notions ⊨F α and ⊨ α are equivalent.
Dually, we say that a formula α is C -satisfiable if there is a frame F = (W , R) in
the class C , a model M = (F ,V ) and a world w, such that M , w ⊨ α. Once again, a
formula is F -satisfiable iff it is satisfiable according to our earlier definition.

Completeness for the class F


Consider the following axiom system.

Axiom System K

Axioms
(A) All tautologies of propositional logic.
(K) □(α ⊃ β) ⊃ (□α ⊃ □β).

Inference Rules

         α,  α ⊃ β                      α
(MP)    ───────────          (G)    ───────
             β                         □α

e axiom (A) is an abbreviation for any set of axioms which are sound and com-
plete for Propositional Logic—in particular, we could instantiate (A) with the axioms
(A)–(A) of the system AX discussed in the previous section.
As usual, we say that α is a thesis of System K ³, denoted ⊢K α, if we can derive
α using the axioms (A) and (K) and the inference rules (MP) and (G). Once again,
we will omit the subscript and write ⊢ α if there is no confusion about which axiom
system we are referring to.
³The name K is derived from Saul Kripke.

The result we want to establish is the following.


eorem .. For all formulas α, ⊢K α iff ⊨F α.
As usual, one direction of the proof is easy.
Lemma .. (Soundness of System K ) If ⊢K α then ⊨F α.
P As we observed in the previous section, it suffices to show that each axiom is
F -valid and that the inference rules preserve F -validity. is is precisely what we
exhibited in Example  and Exercise ... ⊣
As in Propositional Logic, we use a Henkin-style argument to show that every F -valid
formula is derivable using System K.

Consistency As before, we say that a formula α is consistent with respect to System K


if ⊬K ¬α. A finite set of formulas {α1 , α2 , . . . , αn } is consistent if the conjunction α1 ∧
α2 ∧ · · · ∧ αn is consistent. Finally, an arbitrary set of formulas X is consistent if every
finite subset of X is consistent.

Our goal is to prove the following.


Lemma .. Let α be a formula which is consistent with respect to System K. en, α is
F -satisfiable.
As we saw in the case of Propositional Logic, this will yield as an immediate corol-
lary the result which we seek:
Corollary .. (Completeness for System K ) Let α be a formula which is F -valid.
Then, ⊢K α.

Maximal Consistent Sets


As before, we say that a set of formulas X is a maximal consistent set or MCS if X is con-
sistent and for all α ∉ X , X ∪ {α} is inconsistent. As we saw earlier, by Lindenbaum's
Lemma, every consistent set of formulas can be extended to an MCS.
We will once again use the properties of MCSs established in Lemma ... In
addition, the following properties of MCSs will prove useful.
Lemma .. Let X be a maximal consistent set.
(i) If β is a substitution instance of an axiom, then β ∈ X .
(ii) If α ⊃ β ∈ X and α ∈ X , then β ∈ X .
P e proof is routine and is left as an exercise. ⊣

The canonical model
When we studied propositional logic, we saw that each maximal consistent set defines a
“propositional world”. In modal logic, we have to construct frames with many propo-
sitional worlds. In fact, we generate a frame with all possible worlds, with a suitable
accessibility relation.

Canonical model e canonical frame for System K is the pair FK = (WK , RK ) where:

• WK = {X | X is an MCS}.

• If X and Y are MCSs, then X RK Y iff {α | □α ∈ X } ⊆ Y .

The canonical model for System K is given by MK = (FK ,VK ) where for each
X ∈ WK , VK (X ) = X ∩ P .

Exercise .. We can dually define RK using the modality ◊ rather than □. Verify
that X RK Y iff {◊α | α ∈ Y } ⊆ X . ⊣

The heart of the completeness proof is the following lemma.

Lemma .. For each MCS X ∈ WK and for each formula α ∈ Φ, MK , X ⊨ α iff


α ∈ X.

P As usual, the proof is by induction on the structure of α.


Basis: If α = p ∈ P , then MK , X ⊨ p iff p ∈ VK (X ) iff p ∈ X , by the definition of
VK .
Induction step:
α = ¬β: en MK , X ⊨ ¬β iff MK , X ⊭ β iff (by the induction hypothesis) β ∈
/X
iff (by the fact that X is an MCS) ¬β ∈ X .
α = β ∨ γ : en MK , X ⊨ β ∨ γ iff MK , X ⊨ β or MK , X ⊨ γ iff (by the induction
hypothesis) β ∈ X or γ ∈ X iff (by the fact that X is an MCS) β ∨ γ ∈ X .
α = □β: We analyse this case in two parts:
(⇐) Suppose that □β ∈ X . We have to show that MK , X ⊨ □β. Consider any
MCS Y such that X RK Y . Since □β ∈ X , from the definition of RK it follows that
β ∈ Y . By the induction hypothesis MK , Y ⊨ β. Since the choice of Y was arbitrary,
MK , X ⊨ □β.

(⇒) Suppose that MK , X ⊨ □β. We have to show that □β ∈ X . Suppose that


□β ∉ X . Then, since X is an MCS, ¬□β ∈ X . We show that this leads to a
contradiction.

Claim Y0 = {γ | □γ ∈ X } ∪ {¬β} is consistent.

If we assume the claim, we can extend Y0 to an MCS Y . Clearly, X RK Y .


Since ¬β ∈ Y , β ∉ Y . By the induction hypothesis, MK , Y ⊭ β. This means that
MK , X ⊭ □β which contradicts our initial assumption that MK , X ⊨ □β.
To complete the proof, we must verify the claim.

Proof of claim Suppose that Y0 is not consistent. Then, there exists


{γ1 , γ2 , . . . , γn }, a finite subset of Y0 , such that γ1 ∧ γ2 ∧ · · · ∧ γn ∧ ¬β is
inconsistent. Let us denote γ1 ∧ γ2 ∧ · · · ∧ γn by γ̃ .
We then have the following sequence of derivations:
⊢ ¬(γ̃ ∧ ¬β) By the definition of consistency
⊢ ¬γ̃ ∨ β Tautology of propositional logic (Axiom A)
⊢ γ̃ ⊃ β Definition of ⊃
⊢ □(γ̃ ⊃ β) Inference rule G
⊢ □γ̃ ⊃ □β Axiom K plus one application of MP
⊢ ¬(□γ̃ ∧ ¬□β) Tautology of propositional logic (Axiom A)
We can easily show that ⊢ □(γ ∧ δ) ≡ (□γ ∧ □δ).
In one direction, since ⊢ γ ∧ δ ⊃ γ is a tautology of propositional logic,
we can use the rule G to get ⊢ □(γ ∧ δ ⊃ γ ). From axiom K and one
application of MP, ⊢ □(γ ∧ δ) ⊃ □γ . Symmetrically, it follows that
⊢ □(γ ∧ δ) ⊃ □δ. So, ⊢ □(γ ∧ δ) ⊃ (□γ ∧ □δ).
Conversely, ⊢ γ ⊃ (δ ⊃ (γ ∧ δ)) from propositional logic. By applying
axiom K and MP a couple of times, we obtain ⊢ □γ ⊃ (□δ ⊃ □(γ ∧δ)),
from which it follows that ⊢ (□γ ∧ □δ) ⊃ □(γ ∧ δ).
We can extend this argument to show that ⊢ □(δ1 ∧ δ2 ∧ · · · ∧ δn ) ≡
(□δ1 ∧ □δ2 ∧ · · · □δn ) for all n.
From the last line in our derivation above, it then follows that ⊢ ¬(□γ1 ∧
□γ2 ∧ · · · □γn ∧ ¬□β). Thus the set {□γ1 , □γ2 , . . . , □γn , ¬□β} is in-
consistent. But this is a finite subset of X , which means that X is itself
inconsistent, contradicting the fact that X is an MCS.

From the preceding result, the proof of Lemma .. is immediate.


Proof of Lemma ..: Let α be a formula which is consistent with respect to
System K. By Lindenbaum’s Lemma, α can be extended to a maximal consistent set
Xα . By the preceding result M , Xα ⊨ α, so α is F -satisfiable. ⊣
Once we have proved Lemma .., we immediately obtain a proof of complete-
ness (Corollary ..) using exactly the same argument as in propositional logic.
It is worth pointing out one important difference between the canonical model
constructed for System K and the models constructed when proving completeness for
propositional logic. In propositional logic, to satisfy a consistent formula α, we build a
valuation v which depends on α. On the other hand, the construction of the canonical
model for System K is independent of the choice of α. Thus, every consistent formula
α is satisfied within the model MK .

Completeness for other classes of frames


Can we axiomatise the set of C -valid formulas for a class of frames C which is properly
included in F ? To do this, we use the characteristic formulas which we looked at when
discussing correspondence theory.

Reflexive frames
System T is the set of axioms obtained by adding the following axiom scheme to Sys-
tem K.

(T) □α ⊃ α

Lemma .. System T is sound and complete with respect to the class of reflexive frames.

P To show that System T is sound with respect to reflexive frames, we only need
to verify that the new axiom (T) is sound for this class of frames—the other axioms
and rules from System K continue to be sound. e soundness of axiom (T) follows
from Proposition ...
To show completeness, we must argue that every formula which is consistent with
respect to System T can be satisfied in a model based on a reflexive frame. To es-
tablish this, we follow the proof of completeness for System K and build a canonical
model MT = ((WT , RT ),VT ) for System T which satisfies the property described in
Lemma ... We just need to verify that the resulting frame (WT , RT ) is reflexive.
For any MCS X , we need to verify that X RT X or, in other words, that {α | □α ∈
X } ⊆ X . Consider any formula □α ∈ X . Since □α ⊃ α is an axiom of System T,

□α ⊃ α ∈ X , by Lemma .. (i). From Lemma .. (ii), it then follows that α ∈ X ,
as required. ⊣

Transitive frames
System 4 is the set of axioms obtained by adding the following axiom scheme to Sys-
tem K.

(4) □α ⊃ □□α

Lemma .. System  is sound and complete with respect to the class of transitive frames.

P We know that the axiom () is sound for the class of transitive frames from
Proposition ... is establishes the soundness of System .
To show completeness, we must argue that every formula which is consistent with
respect to System  can be satisfied in a model based on a transitive frame. Once again,
we can build a canonical model M4 = ((W4 , R4 ),V4 ) for System  which satisfies the
property described in Lemma ... We just need to verify that the resulting frame
(W4 , R4 ) is transitive.
In other words, if X , Y, Z are MCSs such that X R4 Y and Y R4 Z, we need
to verify that X R4 Z—that is, we must show that {α | □α ∈ X } ⊆ Z. Consider
any formula □α ∈ X . Since □α ⊃ □□α is an axiom of System , it follows from
Lemma .. that □□α ∈ X . Since X R4 Y , it must be the case that □α ∈ Y .
Further, since Y R4 Z it must be the case that α ∈ Z, as required. ⊣

Exercise .. e System B is obtained by adding the following axiom to System K.

(B) α ⊃ □◊α.
Verify that System B is sound and complete with respect to symmetric frames. ⊣

Combinations of frame conditions


By combining the characteristic formulas for different frame conditions, we obtain
completeness for smaller classes of frames.
Reflexive and transitive frames


e System S is obtained by adding the axioms (T) (for reflexivity) and () (for tran-
sitivity) to System K.

Lemma .. System S is sound and complete with respect to the class of reflexive and
transitive frames.

P Since System T is sound for the class of reflexive frames and System  is sound
for the class of transitive frames, it follows that System S is sound for the class of
reflexive and transitive frames.
To show completeness, as usual we build a canonical model M S4 = ((WS4 , RS4 ),VS4 )
satisfying the property in Lemma ... Using the argument in the proof of Lemma ..,
it follows that RS4 is reflexive. Similarly, from the proof of Lemma .. it follows that
RS4 is transitive. ⊣

Equivalence relations
e System S is obtained by adding the following axioms to System K.
(T) □α ⊃ α
() ◊α ⊃ □◊α.
We have already seen that (T) is the axiom for reflexivity, while () characterises Eu-
clidean frames.

Exercise ..

(i) Show that System S5 is sound and complete for the class of frames whose acces-
sibility relation is an equivalence relation.

(ii) Show that the axioms (4) and (B) can be derived in System S5. ⊣

. Bisimulations and expressiveness


Intuitively, it is clear that models which have “similar” structure satisfy the same modal
logic formulas. For instance, if we choose the same valuation for all worlds in the
two frames shown in Figure ., it seems evident that no formula can distinguish the
resulting pair of models.
To formalise this notion, we introduce bisimulations.
Figure .: A pair of similar frames: a single world w related to itself, and an infinite chain
w1 , w2 , w3 , w4 , . . . in which each wi is related to wi+1 .

Bisimulation Let M1 = ((W1 , R1 ),V1 ) and M2 = ((W2 , R2 ),V2 ) be a pair of models.
A bisimulation is a relation ∼ ⊆ W1 × W2 satisfying the following conditions.

(i) If w1 ∼ w2 and w1 R1 w1′ then there exists w2′ such that w2 R2 w2′ and w1′ ∼ w2′ .

(ii) If w1 ∼ w2 and w2 R2 w2′ then there exists w1′ such that w1 R1 w1′ and w1′ ∼ w2′ .

(iii) If w1 ∼ w2 then V1 (w1 ) = V2 (w2 ).

Notice that the empty relation is a trivial example of a bisimulation. Two worlds which
are related by a bisimulation satisfy exactly the same formulas.
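The three conditions above are easy to check mechanically on finite models. Here is a small Python sketch (the representation and the function name are ours, purely for illustration): a model is a triple (W, R, V) with R a set of pairs and V a dictionary mapping each world to the set of atomic propositions true there.

def is_bisimulation(sim, M1, M2):
    # sim is a set of pairs (w1, w2) with w1 in W1 and w2 in W2.
    (W1, R1, V1), (W2, R2, V2) = M1, M2
    for (w1, w2) in sim:
        if V1[w1] != V2[w2]:                       # condition (iii)
            return False
        for (u, u1) in R1:
            if u == w1:                            # condition (i): match each R1-step
                if not any((w2, v2) in R2 and (u1, v2) in sim for v2 in W2):
                    return False
        for (u, u2) in R2:
            if u == w2:                            # condition (ii): match each R2-step
                if not any((w1, v1) in R1 and (v1, u2) in sim for v1 in W1):
                    return False
    return True

# Example: a single reflexive world is bisimilar to a two-world cycle with the
# same valuation everywhere.
M1 = ({0}, {(0, 0)}, {0: frozenset({'p'})})
M2 = ({0, 1}, {(0, 1), (1, 0)}, {0: frozenset({'p'}), 1: frozenset({'p'})})
print(is_bisimulation({(0, 0), (0, 1)}, M1, M2))   # True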

Lemma .. Let ∼ be a bisimulation between M1 = ((W1 , R1 ),V1 ) and M2 = ((W2 , R2 ),V2 ).
For all w1 ∈ W1 and w2 ∈ W2 , if w1 ∼ w2 , then for all formulas α, M1 , w1 ⊨ α iff
M2 , w2 ⊨ α.

P As usual, the proof is by induction on the structure of α.

Basis: Suppose α = p ∈ P . By the definition of bisimulations, we know that V1 (w1 ) =
V2 (w2 ). Hence, M1 , w1 ⊨ p iff M2 , w2 ⊨ p.

Induction step: e propositional cases α = ¬β and α = β ∨ γ are easy, so we omit


them and directly consider the case α = □β.
(⇒) Suppose that M1 , w1 ⊨ □β. We must show that M2 , w2 ⊨ □β as well. For
this, we must argue that M2 , w2′ ⊨ β for each world w2′ such that w2 R2 w2′ . Since
∼ is a bisimulation, for each such w2′ there exists a world w1′ such that w1 R1 w1′ and
w1′ ∼ w2′ . Since M1 , w1 ⊨ □β, it follows that M1 , w1′ ⊨ β. Since w1′ ∼ w2′ , by the
induction hypothesis, it follows that M2 , w2′ ⊨ β. Since w2′ was an arbitrarily chosen
R2 -neighbour of w2 , we have M2 , w2 ⊨ □β, as required.
(⇐) Suppose that M2 , w2 ⊨ □β. We must show that M1 , w1 ⊨ □β as well. e
argument is symmetric to the earlier one and we omit the details. ⊣
We can use bisimulations to show that certain classes of frames cannot be charac-
terised in modal logic.

Lemma .. e class of irreflexive frames cannot be characterised in modal logic.

P Let α be a formula that characterises the class of irreflexive frames. Consider
the pair of frames in Figure .. Since the first frame is not irreflexive, there should be
a valuation V and an instance β of α such that β is not satisfied at w under V .
Let us define a valuation V ′ on the second model such that for each wi , V ′ (wi ) =
V (w). We can clearly set up a bisimulation between the two models by relating w to
each of the worlds wi . is means that w satisfies exactly the same formulas as each
of the worlds wi . In particular, β is not satisfied at each wi . is is a contradiction
because the second model is irreflexive and β is an instance of the formula α which
we claimed was a characteristic formula for irreflexive frames. ⊣

Exercise .. We say that a frame (W , R) is “non-connected” if there are worlds w


and w ′ such that it is not the case that w(R ∪ R−1 )∗ w ′ . In other words, we convert
(W , R) into an undirected graph by ignoring the orientation of edges in R. e frame
is “non-connected” if there are two nodes in the resulting undirected graph which are
not reachable from each other.
Show that there is no axiom which characterises the class of “non-connected”
frames. ⊣

Antisymmetry
We have seen that irreflexivity cannot be characterised in modal logic. Another natural
frame condition which is beyond the expressive power of modal logic is antisymmetry.
Recall that a relation R on W is antisymmetric if w R w ′ and w ′ R w imply that
w = w ′.

Lemma .. Let α be a formula which is satisfiable over the class of reflexive and transi-
tive frames. en, α is satisfiable in a model based on an reflexive, transitive and antisym-
metric frame.

P Let M = ((W , R),V ) be a model where R is reflexive and transitive. We


describe a technique called bulldozing, due to Krister Segerberg, for constructing a
new model M b = ((W
c, R),
b Vb ), where Rb is reflexive, transitive and antisymmetric, such
b and M satisfy the same formulas.
that M
Consider the frame (W , R). If R is not antisymmetric, there are two distinct worlds w and
w ′ in W such that w R w ′ and w ′ R w. The idea is to break each loop of this kind
by making infinitely many copies of w and w ′ and arranging these copies alternately
in a chain. We then verify that the new model which we construct is bisimilar to the
original model.
Formally, we say that X ⊆ W is a cluster if X × X ⊆ R—in a cluster, every world
can “see” every other world.
Let Cl be the class of maximal clusters of W —that is, X ∈ Cl if X is a cluster and
for each w ∉ X , (X ∪ {w}) × (X ∪ {w}) ̸⊆ R. Since R is reflexive, every singleton
{w} is a cluster. It follows that the set Cl of maximal clusters is not empty and that
every world w ∈ W belongs to some maximal cluster in Cl. In fact, W is partitioned
into maximal clusters.
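As an illustration of this step of the construction, here is a small Python sketch (the function name is ours) that computes the maximal clusters of a finite reflexive and transitive frame, using the fact that two worlds lie in the same maximal cluster exactly when they can see each other.

def maximal_clusters(W, R):
    # R is a set of pairs; assumes R is reflexive and transitive on W.
    clusters, placed = [], set()
    for w in W:
        if w in placed:
            continue
        cluster = {v for v in W if (w, v) in R and (v, w) in R}
        clusters.append(cluster)
        placed |= cluster            # for a preorder, the clusters partition W
    return clusters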
For each X ∈ Cl, define WX = X × N, where N is the set {0, 1, 2, . . .} of natural
numbers. us WX contains infinitely many copies of each world from X . For each
set WX , we define an accessibility relation within WX . For this, we first fix an arbitrary
total order ≤X on X . For X ∈ Cl, RX ⊆ WX × WX is then defined as follows:

RX = {((w, i ), (w, i )) | w ∈ X and i ∈ N}
   ∪ {((w, i ), (w ′ , i )) | w, w ′ ∈ X and w ≤X w ′ }
   ∪ {((w, i ), (w ′ , j )) | w, w ′ ∈ X and i < j }

We then define a relation across maximal clusters based on the original accessibility
relation R:

R′ = ⋃ {WX × WY | X ̸= Y and for some w ∈ X and w ′ ∈ Y , w R w ′ }

Finally, we can define the new frame (Ŵ , R̂) corresponding to (W , R).

• Ŵ = ⋃X ∈Cl WX .

• R̂ = R′ ∪ ⋃X ∈Cl RX .

It can be verified that R̂ is reflexive, transitive and antisymmetric (Exercise ..).
Each world in Ŵ is of the form (w, i ) where w ∈ X for some maximal cluster
X ∈ Cl and i ∈ N. We extend (Ŵ , R̂) to a model by defining V̂ ((w, i )) = V (w) for
all w ∈ W and i ∈ N.
We define a relation ∼ ⊆ Ŵ × W as follows:

∼ = {((w, i ), w) | w ∈ W , i ∈ N}
We claim that ∼ is a bisimulation between M̂ and M . From the definition of V̂ , we
have V̂ ((w, i )) = V (w) for all w ∈ W and i ∈ N, so the third condition in the
definition of bisimulations is trivially satisfied.
Suppose that (w, i ) ∼ w and (w, i ) R̂ (w ′ , j ). We must show that w R w ′ . If
w and w ′ belong to the same maximal cluster X , then w R w ′ because all elements
in X are R-neighbours of each other. On the other hand, if w ∈ X and w ′ ∈ Y for
distinct clusters X and Y , it must be the case that (w, i ) R′ (w ′ , j ). This means that
we have w1 ∈ X and w1′ ∈ Y such that w1 R w1′ . Since w R w1 and w1′ R w ′ , from
the transitivity of R it follows that w R w ′ .
Conversely, suppose that (w, i ) ∼ w and w R w ′ . We exhibit a world (w ′ , j )
such that (w, i ) R̂ (w ′ , j ). If w and w ′ belong to the same maximal cluster X , we
just choose (w ′ , j ) such that i < j . Then, by the definition of RX , (w, i ) RX (w ′ , j ),
so (w, i ) R̂ (w ′ , j ) as well. On the other hand, if w ∈ X and w ′ ∈ Y for distinct
maximal clusters X and Y , then (w, i ) R′ (w ′ , j ) for all j ∈ N, so once again we can
pick a (w ′ , j ) such that (w, i ) R̂ (w ′ , j ).
Thus, ∼ is a bisimulation between M̂ , whose frame is reflexive, transitive and antisym-
metric, and M , whose frame is reflexive and transitive. Hence, for any world w ∈ W
and any formula α, M , w ⊨ α iff M̂ , (w, i ) ⊨ α for all i ∈ N. In other words, every
formula which is satisfiable in the class of reflexive and transitive frames is also satisfiable
in the class of reflexive, transitive and antisymmetric frames. ⊣

Exercise .. Show that the relation R b constructed in the proof of Lemma .. is
reflexive, transitive and antisymmetric. ⊣

Corollary .. e class of antisymmetric frames cannot be characterised in modal logic.

P Let α be a formula characterizing the class of antisymmetric frames. Let (W , R)


be a frame where R is reflexive and transitive but not antisymmetric. en, there exists
an instance β of α and a valuation V over (W , R) such that M , w ⊨ ¬β for some
w ∈ W . By Lemma .., we can convert M into a model M b = ((Wc, R),
b Vb ) where
b is reflexive, transitive and antisymmetric, such that M , wb ⊨ ¬β for some wb ∈ W
R c.
is is a contradiction, since β was assumed to be an instance of the formula α which
characterises antisymmetric frames. ⊣

We have already seen that the system S4 is sound and complete for the class of
reflexive, transitive frames. This class is very close to the class of partial orders, which are
ubiquitous in computer science. The fact that antisymmetry cannot be characterised in
modal logic means that modal logic cannot distinguish between reflexive and transitive
frames (often called preorders) and reflexive, transitive and antisymmetric frames (or
partial orders).

Corollary .. e system S is sound and complete for the class of partial orders.

P Since partial orders are reflexive and transitive, S is certainly sound for this
class of frames. We already know that every formula which is consistent with respect
to S is satisfiable in a preorder. e bulldozing construction described in the proof
of Lemma .. shows that every formula satisfiable over a preorder is also satisfiable
over a partial order. ⊣

. Decidability: Filtrations and the finite model property


ough we have looked at sound and complete axiomatisations of different classes of
frames, we have yet to establish any results concerning decidability. e basic tech-
nique for showing decidability is to prove that any formula which is satisfiable is in
fact satisfiable in a finite model.

Finite model property Let A be an axiom system which is sound and complete with
respect to a class of frames C . e system A has the finite model property if for all
formulas α, ⊬A α implies there is a model M = (F ,V ) based on a finite frame F =
(W , R) ∈ C such that for some w ∈ W , M , w ⊨ ¬α.
Since A is sound and complete for the class C , this is equivalent to demanding that
any formula which is satisfiable in the class C is in fact satisfiable in a model based on
a finite frame from the class C .
Assuming that we can effectively decide whether or not a given finite frame belongs
to the class C , we can systematically enumerate all finite models built from frames in the
class C . As a consequence, the finite model property allows us to enumerate the set
of formulas satisfiable within the class C . On the other hand, the completeness of the
axiom system A allows us to enumerate the set of formulas which are valid in this class
of frames.
To check whether a formula α is valid, we interleave these enumerations. If α is
valid, it will be enumerated as a thesis of the system A. On the other hand, if α is
not valid, its negation ¬α must be satisfiable, so ¬α will appear in the enumeration of
formulas satisfiable over C . us, the finite model property yields a decision procedure
for validity (and, dually, satisfiability).
Subformulas Let α be a formula. The set of subformulas of α, denoted sf(α), is the
smallest set of formulas such that:

• α ∈ sf(α).

• If ¬β ∈ sf(α) then β ∈ sf(α).

• If β ∨ γ ∈ sf(α) then β ∈ sf(α) and γ ∈ sf(α).

• If □β ∈ sf(α) then β ∈ sf(α).

Exercise .. Show that the size of the set sf(α) is bounded by the length of α. More
formally, for a formula α, define |α|, the length of α, to be the number of symbols in
α. Show that if |α| = n then |sf(α)| ≤ n. Give an example where |sf(α)| < |α|. ⊣
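As a concrete reading of this definition, modal formulas can be represented as nested tuples and sf computed by a direct recursion; the representation below is our own sketch, and the recursion terminates because every clause only produces proper subformulas.

def sf(alpha):
    # Formulas: ('p', name) | ('not', a) | ('or', a, b) | ('box', a)
    result = {alpha}
    tag = alpha[0]
    if tag in ('not', 'box'):
        result |= sf(alpha[1])
    elif tag == 'or':
        result |= sf(alpha[1]) | sf(alpha[2])
    return result

# For instance, the formula box(p0 or not p0) has 4 subformulas:
print(len(sf(('box', ('or', ('p', 'p0'), ('not', ('p', 'p0')))))))   # 4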


For a set X of formulas, we write sf(X ) to denote the set ⋃α∈X sf(α). A set of
formulas X is said to be subformula-closed (or just sf-closed ) if X = sf(X ).
Let M = ((W , R),V ) and M ′ = ((W ′ , R′ ),V ′ ) be a pair of models. We have
already seen that if we can set up a bisimulation ∼ between M and M ′ , then for each
pair of worlds (w, w ′ ) ∈∼, the worlds w and w ′ satisfy the same formulas. Often, we
are willing to settle for a weaker relationship between w and w ′ —we do not require
them to agree on all formulas, but only on formulas from a fixed set X . For sf-closed
subsets X , this can be achieved using filtrations.

Filtrations Let M = ((W , R),V ) and M ′ = ((W ′ , R′ ),V ′ ) be a pair of models and X
an sf-closed set of formulas. An X -filtration from M to M ′ is a function f : W → W ′
such that:

(i) For all w, w ′ ∈ W , if w R w ′ then f (w) R f (w ′ ).

(ii) e map f is surjective.

(iii) For all p ∈ P ∩ X , p ∈ V (w) iff p ∈ V ′ ( f (w)).

(iv) If ( f (w), f (w ′ )) ∈ R′ , then for each formula of the form □α in X , if M , w ⊨
□α then M , w ′ ⊨ α.

In a filtration, we have a weaker requirement on the inverse image of f than in
a bisimulation. We do not demand that w R w ′ whenever f (w) R′ f (w ′ ). We
only insist that w and w ′ be “semantically” related up to the formulas in X . It is quite
possible that (w, w ′ ) ∉ R and hence, for some □β ∉ X , M , w ⊨ □β while M , w ′ ⊭ β.
Lemma .. Let f be an X -filtration from M = ((W , R),V ) to M ′ = ((W ′ , R′ ),V ′ )


where X is an sf-closed set of formulas. en, for all α ∈ X and for all w ∈ W , M , w ⊨ α
iff M ′ , f (w) ⊨ α.

P e proof is by induction on the structure of α.


Basis If α = p ∈ P ∩ X , then M , w ⊨ p iff p ∈ V (w) iff (by the definition of
X -filtrations) p ∈ V ′ ( f (w)) iff M , f (w) ⊨ p.
Induction step e propositional cases α = ¬β and α = β ∨ γ are easy, so we omit
them and directly consider the case α = □β.
(⇒) Suppose M , w ⊨ □β. To show that M ′ , f (w) ⊨ □β, we must show that
for each w ′ with f (w)R′ w ′ , M ′ , w ′ ⊨ β. Fix an arbitrary w ′ such that f (w)R′ w ′ .
Since f is surjective, there is a world w ′′ ∈ W such that w ′ = f (w ′′ ). From the
last clause in the definition of filtrations, it follows that M , w ′′ ⊨ β. Since X is sf-
closed, β ∈ X . From the induction hypothesis, we have M ′ , f (w ′′ ) ⊨ β or, in other
words, M ′ , w ′ ⊨ β. Since w ′ was an arbitrary R′ -neighbour of f (w), it follows that
M ′ , f (w) ⊨ □β.
(⇐) Suppose that M ′ , f (w) ⊨ □β. To show that M , w ⊨ □β, we must show that
for each w ′ with w R w ′ , M , w ′ ⊨ β. Fix an arbitrary w ′ such that w R w ′ . From
the first clause in the definition of filtrations, it follows that f (w)R′ f (w ′ ). Since
M ′ , f (w) ⊨ □β, it must be the case that M ′ , f (w ′ ) ⊨ β. Since β ∈ X , from the
induction hypothesis we have M , w ′ ⊨ β. Since w ′ was an arbitrary R-neighbour of
w, it follows that M , w ⊨ □β. ⊣

Recall that our goal is to establish the finite model property for a class of frames
C —whenever a formula α is satisfiable over C , then there is a model for α based on
a finite frame from the class C .
Our strategy will be as follows: given a formula α and an arbitrary model M for
α, define an sf-closed set of formulas Xα and a finite model Mα such that α ∈ Xα and
there is an Xα -filtration from M to Mα . Lemma .. then tells us that α is satisfied in
Mα . Since this procedure applies uniformly to all satisfiable formulas α over the given
class of frames, it follows that this class of frames has the finite model property.
Defining Xα is easy—we set Xα = sf(α). To construct Mα , we have to define a
frame (Wα , Rα ) and a valuation Vα : Wα → 2P .
We define Wα and Vα in a uniform manner for all classes of frames. To define
Wα , we begin with the following equivalence relation ≃α on W : w ≃α w ′ if for
each β ∈ Xα , M , w ⊨ β iff M , w ′ ⊨ β. In other words, w ≃α w ′ iff the worlds w
and w ′ satisfy exactly the same formulas from the set Xα . We use [w] to represent the
equivalence class of w with respect to the relation ≃α —that is, [w] = {w ′ | w ′ ≃α w}.
Let Wα = {[w] | w ∈ W }. Observe that Wα is finite whenever Xα is finite. Since
Xα = sf(α), we know that Xα is finite (recall Exercise ..).
Defining Vα is simple: for each [w] ∈ Wα , Vα ([w]) = ⋂w ′ ∈[w] V (w ′ ).
Defining Rα is more tricky: in general, this relation has to be defined taking into
account the class of frames under consideration. We now show how to define “suit-
able” Rα for some of the classes of frames for which we have already shown complete
axiomatisations.

Lemma .. e axiom system K has the finite model property.

P Recall that system K is sound and complete for the class F of all frames.
From our discussion of the finite model property, it suffices to show that any formula
satisfiable over F is in fact satisfiable over a finite frame in F .
Let α be a satisfiable formula and let M = ((W , R),V ) be a model for α—-that
is, for some wα ∈ W , M , wα ⊨ α. Let Xα = sf(α) and define Wα and Vα as described
earlier. Define Rα as follows:

Rα = {([w], [w ′ ]) | For each formula β ∈ Xα , if M , w ⊨ □β then M , w ′ ⊨ β}

Let Mα = ((Wα , Rα ),Vα ).


Fix the function f : W → Wα such that w 7→ [w] for each w ∈ W . We claim
that f is an Xα -filtration from M to Mα —for this, we have to verify that f satisfies
properties (i)–(iv) in the definition of filtrations.
It is clear that f is surjective (property (ii)).
To verify property (iii) we have to show that for each p ∈ P ∩ Xα and for each
w ∈ W , p ∈ V (w) iff p ∈ Vα ([w]). Since the worlds in [w] agree on all formulas in
Xα , it follows that p ∈ V (w) iff for each w ′ ≃α w, p ∈ V (w ′ ), iff p ∈ ⋂w ′ ∈[w] V (w ′ ),
iff (by the definition of Vα ) p ∈ Vα ([w]).
Property (i) demands that (w, w ′ ) ∈ R implies ([w], [w ′ ]) ∈ Rα . By the defini-
tion of Rα , ([w], [w ′ ]) ∈ Rα if for each β ∈ Xα , whenever M , w ⊨ □β, M , w ′ ⊨ β
as well. is is immediate from the fact that (w, w ′ ) ∈ R.
Finally, property (iv) states that whenever ([w], [w ′ ]) ∈ Rα , for each formula
□β ∈ Xα , if M , w ⊨ □β then M , w ′ ⊨ β. is follows directly from the definition
of Rα .
Having established that f is an Xα -filtration from M to Mα , it follows that Mα , [wα ] ⊨
α. us Mα is a finite model for α, as required. ⊣

Lemma .. e axiom system T has the finite model property.


P Recall that system T is sound and complete for the class of reflexive frames.
Let α be a formula satisfiable at a world wα in a model M = ((W , R),V ) where (W , R)
is a reflexive frame. We have to exhibit a finite model for α based on a reflexive frame.
Define Xα and Mα = ((Wα , Rα ),Vα ) as in the proof of Lemma ... We have
already seen that f : w 7→ [w] then defines an Xα -filtration from M to Mα . To
complete the proof of the present lemma, it suffices to show that the frame (Wα , Rα )
is reflexive.
Since R is reflexive, we have (w, w) ∈ R for each w ∈ W . By property (i) of
filtrations, (w, w) ∈ R implies ([w], [w]) ∈ Rα . Since f is surjective, it then follows
that Rα is reflexive as well. (Notice that this argument actually establishes that any
filtration from a reflexive model M to a model M ′ preserves reflexivity.) ⊣

Lemma .. e axiom system S has the finite model property.

P Recall that S is sound and complete for the class of reflexive and transitive
frames. Let α be a formula satisfiable at a world wα in a model M = ((W , R),V )
where (W , R) is reflexive and transitive. We have to exhibit a finite model for α based
on a reflexive and transitive frame.
Let Xα = sf(α) and define Wα and Vα in terms of ≃α as usual. Let Rα be defined
as follows:

Rα = {([w], [w ′ ]) | For each formula □β ∈ Xα , if M , w ⊨ □β then M , w ′ ⊨ □β}

Let Mα = ((Wα , Rα ),Vα ).


As usual, we define f : W → Wα by w 7→ [w]. We have already seen that such a
function satisfies properties (ii) and (iii) in the definition of a filtration.
We have to verify that f satisfies properties (i) and (iv) with the new definition of
Rα . To show property (i), we have to verify that if (w, w ′ ) ∈ R then ([w], [w ′ ]) ∈
Rα . Suppose that M , w ⊨ □β for some □β ∈ Xα . Since (W , R) is transitive, M , w ⊨ □β ⊃ □□β, so
M , w ⊨ □□β as well. Since (w, w ′ ) ∈ R, M , w ′ ⊨ □β. Thus ([w], [w ′ ]) ∈ Rα .
For property (iv), we have to show that if ([w], [w ′ ]) ∈ Rα then for each formula
of the form □β in Xα , if M , w ⊨ □β, then M , w ′ ⊨ β. From the definition of Rα ,
we know that if M , w ⊨ □β, then M , w ′ ⊨ □β as well. Since (W , R) is reflexive,
M , w ′ ⊨ □β ⊃ β, so M , w ′ ⊨ β as required.
Having established that f is an Xα -filtration from M to Mα , it remains to prove
that the frame (Wα , Rα ) is reflexive and transitive. Recall that (W , R) is assumed to be
a reflexive and transitive frame. We have already remarked in the proof of the previous
lemma that any filtration from a reflexive model preserves reflexivity, so it is immediate
that (Wα , Rα ) is a reflexive frame.
To show transitivity, suppose that ([w1 ], [w2 ]) and ([w2 ], [w3 ]) belong to Rα .
We have to show that ([w1 ], [w3 ]) ∈ Rα as well. is means that for each formula
□β in Xα , we have to show that if M , w1 ⊨ □β then M , w3 ⊨ □β. Suppose that
M , w1 ⊨ □β. Since ([w1 ], [w2 ]) ∈ Rα , we know that M , w2 ⊨ □β. Now, since
([w2 ], [w3 ]) ∈ Rα , it follows that M , w3 ⊨ □β as well. ⊣

Exercise ..

(i) Recall that the axiom system B is sound and complete for the class of symmetric
frames. Show that B has the finite model property. Define Rα as follows:

Rα = {([w], [w ′ ]) | For each formula □β ∈ Xα , (i) if M , w ⊨ □β then M , w ′ ⊨ β


(ii) if M , w ′ ⊨ □β then M , w ⊨ β}

(ii) Recall that the axiom system S5 is sound and complete for the class of frames
based on equivalence relations. Show that S5 has the finite model property.
Define Rα as follows:

Rα = {([w], [w ′ ]) | For each formula □β ∈ Xα , M , w ⊨ □β iff M , w ′ ⊨ □β}

Small model property In all the finite models we have constructed, we have defined Wα
to be the set of equivalence classes generated by the relation ≃α . Since the size of sf(α)
is bounded by |α|, it follows that |Wα | is bounded by 2^|α| . Thus, when we establish
the finite model property using the equivalence relation ≃α , we in fact derive a bound
on the size of a finite model for α. As a result, we establish a stronger property, which
we call the small model property.
More formally, we say that a class of frames C has the small model property if
there is a function fC : N → N such that for each formula α satisfiable over the class
C , there is a model for α over C whose size is bounded by fC (|α|). For instance, in
the examples we have seen, fC (|α|) = 2^|α| .
The small model property gives us a more direct decidability argument—to check
if α is satisfiable, we just have to enumerate all models of size at most fC (|α|). To
show that this is possible, we first observe that the number of frames in this subclass
is bounded. To bound the number of models based on this finite set of frames, notice
that it suffices to consider valuations restricted to the finite set of atomic propositions
which occur in α. Thus, given a finite frame, there are only finitely many different
valuations possible over that frame.
This decision procedure has the advantage of giving us a bound on the complexity
of the decision problem. This bound is just the bound on the number of different
models which can be generated whose size is at most fC (|α|).

Exercise .. In the examples we have seen (axiom systems K, T etc.) verify that the
satisfiability of a formula α can be checked in time which is doubly exponential in |α|.
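As a rough sketch of one way to obtain such a bound (the constants are not meant to be tight): a model of size at most 2^|α| consists of a binary relation on a set of at most 2^|α| worlds together with a valuation over the atomic propositions occurring in α. There are at most 2^(2^|α| · 2^|α|) = 2^(2^(2|α|)) candidate relations and at most 2^(|α| · 2^|α|) candidate valuations, so the total number of models to examine—and hence the running time of the naive enumeration—is bounded by a function doubly exponential in |α|.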

. Labelled transition systems and multi-modal logic


Transition systems A transition system is a pair (S, →) where S is a set of states and
→ ⊆ S × S is a transition relation. Transition systems are a general framework to de-
scribe computing systems. States describe configurations of the system—for instance,
the contents of the disk, memory and registers of a computer at a particular instant.
e transition relation then describes when one configuration can follow another—for
instance the effect of executing a machine instruction which affects some of the mem-
ory, register or disk locations and leaves the rest of the configuration untouched.
It is clear that a transition system has exactly the same structure as a frame (W , R)
in modal logic. Hence, we can use modal logic to describe properties of transition
systems. is is one of the main reasons why modal logic is interesting to computer
scientists.
Often, we are interested in a more structured representation of the configuration
space of a computing system—in particular, we not only want to record that a transi-
tion is possible from a configuration s to a configuration s ′ but we also want to keep
track of the “instruction” which caused this change of configuration. is leads us to
the notion of labelled transition systems.

Labelled transition systems A labelled transition system is a triple (S, Σ, →) where S is a


set of states, Σ is a set of actions and → ⊆ S × Σ × S is a labelled transition relation.
The underlying structure in a finite automaton is a familiar example of a labelled
transition system, where the set of states is finite.
How can we reason about labelled transition systems in the framework of modal
logic? One option is to ignore the labels and consider the derived transition relation
⇒ = {(s , s ′ ) | ∃a ∈ Σ : (s , a, s ′ ) ∈ →}. We can then reason about the frame (S, ⇒)
using the modalities □ and ◊. is approach is clearly not satisfactory because we
have lost all information about the labels of actions within our logic. A more faithful
translation involves the use of multi-modal logics.

Multi-modal logics A multi-relational frame is a structure (W , R1 , R2 , . . . , Rn ) where


Ri is a binary relation on W for each i ∈ {1, 2, . . . , n}. A multi-relational frame can
be viewed as the superposition of n normal frames (W , R1 ), (W , R2 ), …, (W , Rn ), all
defined with respect to the same set of worlds.
To reason about a multi-relational frame, we define a multi-modal logic whose syn-
tax consists of a set P of atomic propositions, the boolean connectives ¬ and ∨ and a
set of n modalities □1 , □2 , …, □n .
To define the semantics of multi-modal logic, we first fix a valuation V : W → 2P
as before. We then define the satisfaction relation M , w ⊨ α. e propositional cases
are the same as for standard modal logic. e only difference is in the semantics of the
modalities. For each i ∈ {1, 2, . . . , n}, we define

M , w ⊨ □i α iff for each w ′ ∈ W , if w Ri w ′ then M , w ′ ⊨ α

us, the modalities {□i }i ∈{1,2,...,n} are used to “independently” reason about the re-
lations {Ri }i ∈{1,2,...,n} . We can then use the theory we have developed to describe
properties of each of these relations. For instance, the multi-relational frames where
the axioms □3 α ⊃ α and □7 α ⊃ □7 □7 α are valid correspond to the class where R3
is reflexive and R7 is transitive. We can express interdependencies between different
relations using formulas which combine these modalities. For instance, the formula
α ⊃ ◊5 ◊2 β indicates that a world which satisfies α has an R5 -neighbour which in
turn has an R2 -neighbour where β holds.
We have seen how to characterise classes of frames using formulas from modal
logic. We can extend this idea in a natural way to characterise classes of multi-relational
frames.

Exercise .. Consider the class of multi-relational frames (W , R1 , R2 ) where R2 =


R−1
1
. Describe axioms to characterise this class. (Hint: e combined relation R1 ∪ R2
is a symmetric relation on W . Work with suitable modifications of axiom (B). You
may use more than one axiom.) ⊣
To reason about labelled transition systems in this framework, we have to massage
(S, Σ, →) into a multi-relational frame. To achieve this, we define a relation →a ⊆
S × S for each a ∈ Σ as follows:

→a = {(s , s ′ ) | (s , a, s ′ ) ∈ →}

It is then clear that the multi-relational frame (S, {→a }a∈Σ ) describes the same struc-
ture as the original labelled transition system (S, Σ, →).
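For instance (an illustrative sketch, with names of our own choosing), the passage from a labelled transition system to the family of relations →a is a one-liner in Python:

def to_multi_relational(S, Sigma, trans):
    # trans is a set of triples (s, a, s'); the result maps each action a to its relation ->_a.
    return {a: {(s, t) for (s, b, t) in trans if b == a} for a in Sigma}

# Example: a two-state system where action 'c' moves 0 -> 1 and 'b' loops on 1.
print(to_multi_relational({0, 1}, {'b', 'c'}, {(0, 'c', 1), (1, 'b', 1)}))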
To reason about the structure (S, {→a }a∈Σ ), we have modalities □a (read as Box a)
and ◊a (read as Diamond a) for each a ∈ Σ. Traditionally, the modality □a is written
[a] and the modality ◊a is written 〈a〉.
When reasoning about labelled transition systems, the set of atomic propositions
P corresponds to properties which distinguish one configuration of the system from
each other. For instance, we could have an atomic proposition to denote that “memory
location  is unused” or that “the printer is busy”. In these notes, we will not go into
the details of how to model a computing system in terms of such a logic.
Assuming we have an abstract encoding of system properties in terms of atomic
propositions, we can now reason about the dynamic behavior of the system. For in-
stance, we can assert M , s ⊨ [c]〈b 〉α to denote that in the state s , any c-transition will
lead to a state from where we can use a b -transition to realise the property described
by α. In particular, if α is just the constant ⊤, this formula asserts that a b -transition
is enabled after any c-transition.
Unfortunately, we still do not have the expressive power we need to make non-
trivial statements about programs. For instance, we cannot say that after a c-transition,
we can eventually reach a state where a b -transition is enabled. Or that we have reached
a portion of the state space where henceforth only a and d transitions are possible.
For this, we need to move from modal logic to dynamic logic, which is the topic
of discussion in the next section.
Chapter 

Dynamic Logic

Dynamic logic is a multi-modal logic where the modalities are indexed not by un-
interpreted letters, but by programs, which have structure. e relationship between
different programs also forms an integral part of the logic.

. Syntax
As in propositional logic, we begin with a countably infinite set of atomic propositions
P = { p0 , p1 , . . .} and two logical connectives ¬ (read as not) and ∨ (read as or). We
also begin with a countably infinite set of atomic actions A = {a0 , a1 , . . .}.
e set Φ of formulas of dynamic logic and the set Π of programs are simultane-
ously defined by induction as the smallest sets satisfying the following:

• Every atomic proposition p is a member of Φ.

• If α is a member of Φ, so is (¬α).

• If α and β are members of Φ, so is (α ∨ β).

• If α is a member of Φ and π is a member of Π, then ([π]α) is a member of Φ.

• Every atomic action a is a member of Π.

• If π1 and π2 are members of Π, so are (π1 + π2 ) and (π1 · π2 ).

• If π is a member of Π, so is (π∗ ).

• If α is a member of Φ, (α?) is a member of Π.


As before, we omit parentheses if there is no ambiguity. The derived propositional
connectives ∧, ⊃ and ≡ are defined as before. In addition, we have a derived modality
〈π〉 which is dual to the modality [π], defined as follows: 〈π〉α =def ¬[π]¬α.
Informally, [π]α is true in a world w iff all worlds w ′ which one ends up in after
executing program π in w satisfy α. The programs π1 + π2 , π1 · π2 , and π∗ denote
nondeterministic choice between π1 and π2 , sequential composition of π1 and π2 ,
and arbitrary iteration of π, respectively. The program α? executed at world w is just
a skip if α is true at w and an abort otherwise.

. Semantics
Frames A frame is just a labelled transition system F = (W , A , →). For each a in
A , define −→a ⊆ W × W to be the set of pairs (w, w ′ ) such that (w, a, w ′ ) belongs
to →. If w −→a w ′ we say that w ′ is an a-neighbour of w.

Models A model is a pair M = (F ,V ) where F = (W , A , →) is a frame and V :
W → 2^P is a valuation.

Satisfaction e notion of truth is localised to each world in a model. We write


M , w ⊨ α to denote that α is true at the world w in the model M . e satisfaction
π
relation and the relations −→ for each π in Π are defined by simultaneous induction
π
as follows. We say that w ′ is a π-neighbour of w if w −→w ′ .

M,w ⊨ p iff p ∈ V (w) for p ∈ P


M , w ⊨ ¬α iff M,w ⊭ α
M,w ⊨ α ∨β iff M , w ⊨ α or M , w ⊨ β
π
M , w ⊨ [π]α iff for each w ′ ∈ W , if w −→w ′ then M , w ′ ⊨ α
π1 +π2 π1 π2
w −→ w ′ iff w −→w ′ or w −→w ′
π1 ·π2 π1 π2
w −→ iff for some w ′′ ∈ W , w −→w ′′ and w ′′ −→w ′
π∗ π ∗
w −→w ′ iff w −→ w ′ , where R∗ denotes the reflexive transitive closure of R
α?
w −→w ′ iff w = w ′ and M , w ⊨ α

Thus, M , w ⊨ [π]α if every π-neighbour of w satisfies α. Notice that if w is π-
isolated—that is, there is no world w ′ such that w −→π w ′ —then M , w ⊨ [π]α for
every formula α. We say that a sequence of worlds w0 , w1 , . . . , wn (n ≥ 0) is a π-path
if wi −→π wi+1 for all i such that 0 ≤ i < n. Such a path is said to be of length n. It is
said to be from w to w ′ if w0 = w and wn = w ′ . w ′ is said to be π-reachable from
w if there is a π-path from w to w ′ . Notice that w −→π∗ w ′ iff w ′ is π-reachable from
w. Thus M , w ⊨ [π∗ ]α iff every world w ′ that is π-reachable from w satisfies α.

Satisfiability and validity As usual, we say that α is satisfiable if there exists a frame
F = (W , A , →) and a model M = (F ,V ) such that M , w ⊨ α for some w ∈ W . e
formula α is valid, written ⊨ α, if for every frame F = (W , A , →), for every model
M = (F ,V ) and for every w ∈ W , M , w ⊨ α.

Example  Here are some examples of valid formulas in dynamic logic.

(i) Every substitution instance of a tautology of propositional logic is valid. e


details are trivial.

(ii) e formula [π](α ⊃ β) ⊃ ([π]α ⊃ [π]β) is valid. Consider a model M =


((W , A , →),V ) and a world w ∈ W . Suppose that M , w ⊨ [π](α ⊃ β).
We must argue that M , w ⊨ [π]α ⊃ [π]β. Let M , w ⊨ [π]α. en we
must show that M , w ⊨ [π]β. In other words, we must show that every π-
neighbour w ′ of w satisfies β. Since we assumed M , w ⊨ [π](α ⊃ β), we
know that M , w ′ ⊨ α ⊃ β. Moreover, since M , w ⊨ [π]α, M , w ′ ⊨ α. By the
semantics of the connective ⊃, it follows that M , w ′ ⊨ β, as required.

(iii) e formula [π1 + π2 ]α ≡ ([π1 ]α ∧ [π2 ]α) is valid. Consider a model M =


((W , A , →),V ) and a world w ∈ W . Now M , w ⊨ [π1 + π2 ]α iff (by seman-
π1 +π2
tics) M , w ′ ⊨ α for all π1 + π2 -neighbours w ′ of w iff (by definition of −→ )
M , w ′ ⊨ α for all w ′ that are either π1 -neighours or π2 -neighbours of w iff (by
semantics) M , w ⊨ ([π1 ]α ∧ [π2 ]α).

(iv) e formula [π1 · π2 ]α ≡ [π1 ][π2 ]α is valid. Consider a model M = ((W , A , →


),V ) and a world w ∈ W . Now M , w ⊨ [π1 · π2 ]α iff (by semantics) M , w ′ ⊨
π1 ·π2
α for all π1 · π2 -neighbours w ′ of w iff (by definition of −→) M , w ′ ⊨ α
for all w ′ that are π2 -neighours of some π1 -neighbour w ′′ of w iff (by se-
mantics) M , w ′′ ⊨ [π2 ]α for all π1 -neighbours w ′′ of w iff (by semantics)
M , w ⊨ [π1 ][π2 ]α.

(v) e formula [π∗ ]α ≡ α∧[π][π∗ ]α is valid. Consider a model M = ((W , A , →


),V ) and a world w ∈ W . Now M , w ⊨ [π∗ ]α iff (by semantics) every world
w ′ π-reachable from w satisfies α iff (by definition of π-reachability) w satisfies
α and for all π-neighbours w ′′ of w, all worlds w ′ π-reachable from w ′′ satisfy
α iff (by semantics) w satisfies α and every π-neighbour w ′′ of w satisfies [π∗ ]α


iff (by semantics, again) M , w ⊨ α ∧ [π][π∗ ]α.

(vi) e formula (α ∧ [π∗ ](α ⊃ [π]α)) ⊃ [π∗ ]α is valid. Consider a model


M = ((W , A , →),V ) and a world w ∈ W . Suppose M , w ⊨ α and M , w ⊨
[π∗ ](α ⊃ [π]α). For any world w ′ of W that is π-reachable from w, define the
π-height of w ′ (with respect to w) as the length of the shortest π-path from w
to w ′ . We prove by induction on the π-height that every world w ′ π-reachable
from w satisfies α, thereby showing that M , w ⊨ [π∗ ]α. Consider any world
w ′ whose π-height is zero. It follows that w ′ = w and therefore M , w ′ ⊨ α.
Consider any world w ′ whose π-height is a non-zero number n. Clearly, there
π
is a world w ′′ with π-height n − 1 such that w ′′ −→w ′ . Now, by induction
hypothesis, M , w ′′ ⊨ α. But since M , w ⊨ [π∗ ](α ⊃ [π]α), it follows that
M , w ′′ ⊨ α ⊃ [π]α. erefore M , w ′′ ⊨ [π]α, and hence M , w ′ ⊨ α.

(vii) e formula [α?]β ≡ (α ⊃ β) is valid. Consider a model M = ((W , A , →


),V ) and a world w ∈ W . Now M , w ⊨ [α?]β iff M , w ′ ⊨ β for all α?-
α?
neighbours w ′ of w iff (since w −→w ′ iff w = w ′ and M , w ⊨ α) whenever w
satisfies α it also satisfies β iff (by semantics) M , w ⊨ α ⊃ β.

(viii) Suppose that α is valid. en, [π]α must also be valid. Consider any model
M = ((W , A , →),V ) and any w ∈ W . To check that M , w ⊨ [π]α we have
to verify that every π-neighbour of w satisfies α. Since α is valid, M , w ′ ⊨ α
for all w ′ ∈ W . So, every π-neighbour of w does satisfy α and M , w ⊨ [π]α.

. Axiomatising valid formulas


Consider the following axiom system.

Axioms
(A) All tautologies of propositional logic.
(A) [π](α ⊃ β) ⊃ ([π]α ⊃ [π]β).
(A) [π1 + π2 ]α ≡ ([π1 ]α ∧ [π2 ]α).
(A) [π1 · π2 ]α ≡ [π1 ][π2 ]α.
(A) [π∗ ]α ≡ (α ∧ [π][π∗ ]α).
(A) (α ∧ [π∗ ](α ⊃ [π]α)) ⊃ [π∗ ]α.
(A) [α?]β ≡ (α ⊃ β).
Inference Rules
α, α ⊃ β α
(MP) (G)
β [π]α

As usual, we say that α is a thesis, denoted ⊢ α, if we can derive α using the axioms
(A) to (A) and the inference rules (MP) and (G). It is easily seen that ⊢ [π](α∧β) ≡
([π]α ∧ [π]β).
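Since the derivation is left implicit, here is one way it might go (a sketch, mirroring the corresponding argument for System K); it uses only propositional tautologies, the rule (G), the distribution axiom [π](α ⊃ β) ⊃ ([π]α ⊃ [π]β) and MP.
⊢ (α ∧ β) ⊃ α    Tautology of propositional logic
⊢ [π]((α ∧ β) ⊃ α)    Rule (G)
⊢ [π](α ∧ β) ⊃ [π]α    Distribution axiom and MP
Symmetrically, ⊢ [π](α ∧ β) ⊃ [π]β, which gives the left-to-right implication. Conversely, start from the tautology ⊢ α ⊃ (β ⊃ (α ∧ β)), apply (G), and use the distribution axiom and MP twice to obtain ⊢ [π]α ⊃ ([π]β ⊃ [π](α ∧ β)), from which ⊢ ([π]α ∧ [π]β) ⊃ [π](α ∧ β) follows by propositional logic.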
The result we want to establish is the following.

Theorem .. For all formulas α, ⊢ α iff ⊨ α.

As usual, one direction of the proof is easy.

Lemma .. (Soundness) If ⊢ α then ⊨ α.

P As we observed earlier, it suffices to show that each axiom is valid and that the
inference rules preserve validity. is is precisely what we exhibited in Example . ⊣

As in Propositional Logic and Modal Logic, we use a Henkin-style argument to show


that every valid formula is derivable in our axiom system, but we do not construct a
canonical model. It is technically much simpler to directly construct a finite model for
each consistent formula.

Consistency We say that a formula α is consistent if ⊬¬α. A finite set of formulas


{α1 , α2 , . . . , αn } is consistent if the conjunction α1 ∧α2 ∧· · ·∧αn is consistent. Finally,
an arbitrary set of formulas X is consistent if every finite subset of X is consistent.

Our goal is to prove the following.

Lemma .. Every consistent formula is satisfiable.

As we saw in the case of Propositional Logic, this will yield as an immediate corol-
lary the result we seek:

Corollary .. (Completeness for dynamic logic) Let α be a valid formula. en
⊢ α.
Atoms
Instead of working with maximal consistent sets as we did in modal logic, we work
with certain subsets of subformulas of the formula of interest. We first make precise
the notion of subformula of a formula. e definition is not completely obvious – it
has some aspects which are motivated by the proof of completeness. For convenience,
in the rest of the section, we will fix a consistent formula α0 and try to construct a model in
which it is satisfied.

Subformulas Let α be formula. e set of subformulas of α, denoted sf(α), is the


smallest set of formulas such that:

• α ∈ sf(α).

• If ¬β ∈ sf(α) then β ∈ sf(α).

• If β ∨ γ ∈ sf(α) then β ∈ sf(α) and γ ∈ sf(α).

• If [a]β ∈ sf(α) (for a ∈ A ) then β ∈ sf(α).

• If [π1 + π2 ]β ∈ sf(α) then [π1 ]β ∈ sf(α) and [π2 ]β ∈ sf(α).

• If [π1 · π2 ]β ∈ sf(α) then [π1 ][π2 ]β ∈ sf(α).

• If [π∗ ]β ∈ sf(α) then [π][π∗ ]β ∈ sf(α) and β ∈ sf(α).

• If [β?]γ ∈ sf(α) then β ∈ sf(α) and γ ∈ sf(α).

Exercise .. Show that the size of the set sf(α) is bounded by the square of the length
of α. More formally, for a formula α, define |α|, the length of α, to be the number
of symbols in α. Show that if |α| = n then |sf(α)| ≤ n 2 . Give an example where
|sf(α)| < |α|2 . ⊣
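Read algorithmically (our own sketch, with an illustrative tuple representation), the clauses above describe a worklist computation; note that, unlike the purely structural recursion for modal logic, the clause for [π∗ ]β re-introduces the larger formula [π][π∗ ]β, so the code must remember what it has already processed.

def sf_pdl(alpha0):
    # Formulas: ('p',n) | ('not',a) | ('or',a,b) | ('box',prog,a)
    # Programs: ('atom',n) | ('choice',p,q) | ('seq',p,q) | ('star',p) | ('test',a)
    closure, todo = set(), [alpha0]
    while todo:
        alpha = todo.pop()
        if alpha in closure:
            continue
        closure.add(alpha)
        tag = alpha[0]
        if tag == 'not':
            todo.append(alpha[1])
        elif tag == 'or':
            todo += [alpha[1], alpha[2]]
        elif tag == 'box':
            prog, body = alpha[1], alpha[2]
            ptag = prog[0]
            if ptag == 'atom':
                todo.append(body)
            elif ptag == 'choice':
                todo += [('box', prog[1], body), ('box', prog[2], body)]
            elif ptag == 'seq':
                todo.append(('box', prog[1], ('box', prog[2], body)))
            elif ptag == 'star':
                todo += [('box', prog[1], alpha), body]
            elif ptag == 'test':
                todo += [prog[1], body]
    return closure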

It is convenient in what follows to work with negation-closed sets of formulas. For
any formula α, we define ᾱ to be β if α is of the form ¬β, and ¬α otherwise. We
define the closure of a formula α, denoted cl(α), to be the set {β, β̄ | β ∈ sf(α)}. Note
that the size of cl(α) is at most twice that of sf(α). In what follows, we will freely use
the fact that ⊢ ¬α ≡ ᾱ, and loosely talk of ¬α belonging to a particular set when we
actually mean that ᾱ belongs to that set. For the rest of the section, we fix cl to be
cl(α0 ).
An atom is a maximal consistent subset of cl—that is, a consistent subset A of cl such
that for every α ∈ cl \ A, the set A ∪ {α} is not consistent. It can be easily seen that the atoms are
exactly the sets of the form X ∩ cl for some MCS X . The set of all atoms is denoted by
AT.
As we saw earlier, by Lindenbaum’s Lemma, every consistent set of formulas can
be extended to an MCS. In particular, there is an MCS X containing α0 , and hence
(by the observation in the previous paragraph), an atom A0 containing α0 .
We will use the following properties of atoms.

Lemma .. Let A be an atom. en:

(i) For all formulas ¬α ∈ cl, α ∈


/ A iff ¬α ∈ A.

(ii) For all formulas α ∨ β ∈ cl, α ∨ β ∈ A iff α ∈ A or β ∈ A.

(iii) If α ∈ cl is a thesis, then α ∈ A.

(iv) If A is an atom and if A ∪ {α} is consistent for α ∈ cl, then α ∈ A.

(v) If ⊢ (α1 ∧ · · · ∧ αn ) ⊃ β, αi ∈ A for each i ≤ n and β ∈ cl, then β ∈ A.

(vi) For all formulas [π1 + π2 ]α ∈ cl, [π1 + π2 ]α ∈ A iff [π1 ]α ∈ A and [π2 ]α ∈
A.

(vii) For all formulas [π1 · π2 ]α ∈ cl, [π1 · π2 ]α ∈ A iff [π1 ][π2 ]α ∈ A.

(viii) For all formulas [π∗ ]α ∈ cl, [π∗ ]α ∈ A iff α ∈ A and [π][π∗ ]α ∈ A.

(ix) For all formulas [α?]β ∈ cl, [α?]β ∈ A iff α ∈


/ A or β ∈ A.

P e proof is routine and is left as an exercise. ⊣

For any finite set of formulas A = {α1 , . . . , αn }, define Â to be α1 ∧ · · · ∧ αn , and
for any finite collection V = {A1 , . . . , Am } of finite sets of formulas, define V̂ to be
Â1 ∨ · · · ∨ Âm . We first present the following useful properties related to ÂT.

Lemma .. ⊢ ÂT.

Proof Let AT = {A1 , . . . , Ar }. Suppose it is not the case that ⊢ ÂT. Then ¬ÂT is
consistent. In other words, ¬Â1 ∧ · · · ∧ ¬Âr is consistent. By Lindenbaum’s lemma,
there is a maximal consistent set X such that ¬Â1 ∧ · · · ∧ ¬Âr ∈ X . This means that
for all i : 1 ≤ i ≤ r , ¬Âi ∈ X . Let B = X ∩ cl. Since X is a maximal consistent
set, B is an atom, i.e. B ∈ AT. Let it be Ak for some k : 1 ≤ k ≤ r . But then
Âk ∈ X , contradicting the consistency of X . Therefore it cannot be the case that ¬ÂT
is consistent, and so ⊢ ÂT. ⊣

Lemma .. Let U ⊆ AT and let V = AT \ U . Then ⊢ Û ≡ ¬V̂ .

P Let U = {A1 , . . . , Am } and V = {B1 , . . . , Bn }. Further for all i : 1 ≤ i ≤ m


and j : 1 ≤ j ≤ n, let αi j ∈ Ai \ B j . e following derivation shows that if ⊢ U b ∨ Vb
then ⊢ Ub ≡ ¬Vb . (e line number ℓ denotes 2n(i −1)+2 j +1 and the line number
ij
ℓ′i j denotes ℓi j + 1 for 1 ≤ i ≤ m and 1 ≤ j ≤ n. Note that ℓ11 = 3.)
1. U b ∨ Vb Assumption.
b
2. ¬V ⊃ U b , PL.
c b
ℓ11 . (A1 ⊃ α11 ) ∧ (B1 ⊃ ¬α11 ) α11 ∈ A1 \ B1 .
′ c
ℓ11 . A1 ⊃ ¬B1 b ℓ11 , PL.

···

ℓi j . (Abi ⊃ αi j ) ∧ (Bbj ⊃ ¬αi j ) αi j ∈ Ai \ B j .


ℓ′ . Ab ⊃ ¬Bb
ij i j ℓi j , PL.

···

Ó ⊃ α ) ∧ (B
ℓ mn . (A c ⊃ ¬α ) α mn ∈ Am \ Bn .
m mn n mn
Ó ⊃ ¬B
ℓ′mn . A c ℓ mn , PL.
m n
ℓ′mn + 1. (Ac ∨ ··· ∨ A
Ó ) ⊃ (¬Bb ∧ · · · ∧ ¬B)
bℓ′11 , . . . , ℓ′mn , PL.
1 m 1
ℓ′mn + 2. U b ⊃ ¬Vb ℓ′mn + 1, def. of U b , Vb , PL.
ℓ′ + 3. U b ≡ ¬Vb , ℓ′mn + 2, PL.
mn
Now it follows from definitions of U and V and by Lemma .. that ⊢ U b ∨ Vb .
b ≡ ¬Vb .
From the above derivation ⊢ U ⊣

Lemma .. Let α ∈ cl, and let U denote the set {A ∈ AT | α ∈ A}. Then ⊢ α ≡ Û .

Proof Let U be {A1 , . . . , Am } and let V = AT \ U be {B1 , . . . , Bn }. Then we have
the following derivation.
1. (Â1 ⊃ α) ∧ · · · ∧ (Âm ⊃ α)    α ∈ Ai for 1 ≤ i ≤ m.
2. Û ⊃ α    1, def. of Û , PL.
3. (B̂1 ⊃ ¬α) ∧ · · · ∧ (B̂n ⊃ ¬α)    α ∉ B j , and hence ¬α ∈ B j , for 1 ≤ j ≤ n.
4. (α ⊃ ¬B̂1 ) ∧ · · · ∧ (α ⊃ ¬B̂n )    3, PL.
5. α ⊃ ¬V̂    4, def. of V̂ , PL.
6. Û ≡ ¬V̂    Lemma ...
7. α ⊃ Û    5, 6, PL.
8. α ≡ Û    2, 7, PL.
This completes the proof of the lemma. ⊣
Lemma .. Suppose α and β are formulas and π is a program such that for all A ∈
b or ⊢ A
AT, either ⊢ α ⊃ [π]¬A b ⊃ β. en ⊢ α ⊃ [π]β.

P Let AT be {A1 , . . . , Ar }. Consider an arbitrary atom Ai . If ⊢ Abi ⊃ β, then we


have the following sequence of derivations.
⊢ Abi ⊃ β Assumption
⊢ [π](Abi ⊃ β) Applying rule (G)
⊢ α ⊃ [π](Abi ⊃ β) from the above, by propositional logic
If, on the other hand, ⊢ α ⊃ [π]¬Ab , then we have the following sequence of
i
derivations.
⊢ ¬Abi ⊃ (Abi ⊃ β) Propositional Logic
⊢ [π]¬Abi ⊃ [π](Abi ⊃ β) Axiom (A), Rule (G), and propositional logic
⊢ α ⊃ [π]¬Abi Assumption
⊢ α ⊃ [π](Ab ⊃ β)
i Propositional Logic
us, for all i : 1 ≤ i ≤ r , we have ⊢ α ⊃ [π](Abi ⊃ β). Now by propositional
logic and the fact that ⊢ [π](γ ∧δ) ≡ ([π]γ ∧[π]δ), we immediately see that ⊢ α ⊃
[π]((Ac ⊃ β)∧· · ·∧(A c ⊃ β)). It easily follows that ⊢ α ⊃ [π]((A
c ∨· · ·∨ A
c ) ⊃ β).
1 r 1 r
c=A
But then AT c ∨ ··· ∨ Ac , and by Lemma .. ⊢ AT, c so it immediately follows
1 r
that ⊢ α ⊃ [π]β, as desired. ⊣

The atom graph for α0


e atom graph e atom graph for α0 is defined to be F = (AT, A , →) where, for all
a
A, B ∈ AT and a ∈ A , A−→B iff A b ∧ 〈a〉Bb is consistent.
e atom model is given by M = (F ,V ) where for each A ∈ AT, V (A) = A ∩ P .
π
Given M , the various −→’s for different programs π is defined in the standard manner,
as described in Section ..
e heart of the completeness proof is the following lemma.

Lemma .. For each atom A ∈ AT and for each formula α ∈ cl, M , A ⊨ α iff α ∈ A.
In particular, M , A0 ⊨ α0 .

P e proof is by induction on the length of the formula. We precisely define


the length of a formula below. e notion is carefully defined to ensure that as many
formulas in sf(α) as possible end up having length strictly less than that of α. e
notions |α| for a formula α and |π| for a program π are defined by simultaneous
induction as follows: | p| = 1 for p ∈ P , |¬α| = |α| + 1, |α ∨ β| = |α| + |β| + 1,
|[π]α| = |π| + |α|; |a| = 1 for a ∈ A , |π1 + π2 | = |π1 · π2 | = |π1 | + |π2 | + 1,
|π∗ | = |π| + 1, |α?| = |α| + 1.
Note that the definition ensures that |[π1 ]α| < |[π1 + π2 ]α| and |[π1 ][π2 ]α| <
|[π1 · π2 ]α|, for instance. It can be easily checked that all appeals to the induction
hypothesis in the following proof are proper.
In what follows, we prove three claims by simultaneous induction.

(i) For each atom A ∈ AT and for each formula α ∈ cl, M , A ⊨ α iff α ∈ A.

(ii) For any two atoms A and B, and any program π which “occurs” in α0 —more
formally, any π such that [π]α ∈ cl for some α—if A −→π B and [π]α ∈ A, then
α ∈ B.

(iii) For any two atoms A and B, and any program π which occurs in α0 , if Â ∧ 〈π〉B̂
is consistent then A −→π B.

Proof of (i)
Basis: If α = p ∈ P ∩ cl, then M , A ⊨ p iff p ∈ V (A) iff p ∈ A, by the definition of
V.
Induction step:
α = ¬β ∈ cl: en M , A ⊨ ¬β iff M , A ⊭ β iff (by the induction hypothesis) β ∈
/A
iff (by the fact that A is an atom) ¬β ∈ A.
α = β ∨ γ ∈ cl: en M , A ⊨ β ∨ γ iff M , A ⊨ β or M , A ⊨ γ iff (by the induction
hypothesis) β ∈ A or γ ∈ A iff (by the fact that A is an atom) β ∨ γ ∈ X .
α = [π]β ∈ cl: We analyse this case in two parts:
(⇐) Suppose that [π]β ∈ A. We have to show that M , A ⊨ [π]β. Consider any
π
atom B such that A−→B. By (ii), we know that β ∈ B. By induction hypothesis,
M , B ⊨ β. Since B is an arbitrary π-neighbour of A, M , A ⊨ [π]β, as desired.
π
(⇒) Suppose M , A ⊨ [π]β. is means that for all atoms B such that A−→B,
M , B ⊨ β. In other words, for all atoms B such that M , B ⊭ β, it is not the case
π b ⊃ [π]¬B.
that A−→B. By (iii), this implies that for all such B, ⊢ A b By induction
hypothesis on β, M , B ⊭ β iff β ∈ / B. us our earlier statement is equivalent to
saying that for all atoms B such that β ∈ / B, ⊢ A b ⊃ [π]¬B. b By the properties of
atoms, this is the same as saying that for all atoms B such that ¬β ∈ B, ⊢ Ab ⊃ [π]¬B.
b
b
By propositional logic, axiom (A) and rule (G), we can see that ⊢ A ⊃ [π]¬U , b
where U is the set of all atoms which contain ¬β. But by Lemma .., we see that
⊢Ub ≡ ¬β. erefore ⊢ A b ⊃ [π]β. But A b is a conjunction of formulas belonging to
A and [π]β ∈ cl, and A is an atom, so it follows that [π]β ∈ B, as desired.
Proof of (ii)
Basis: Suppose π = a ∈ A and A and B are atoms. Let [a]α ∈ A and α ∈ / B. en
b b
¬α ∈ B. Now it is easy to see that ⊢ A ⊃ [a]α and ⊢ B ⊃ ¬α. us ⊢ α ⊃ ¬B, b and
hence by rule (G), ⊢ [a]α ⊃ [a]¬B. b ⊃ [a]¬B.
b erefore it follows that ⊢ A b But this
a a
means that it is not the case that A−→B. us we see that if A−→B and [a]α ∈ A
then α ∈ B.
Induction step:
π = π1 + π2 : For any atom A, [π1 + π2 ]α ∈ A iff [π1 ]α ∈ A and [π2 ]α ∈ A. Now
π1 +π2 π1 π2
A −→ B iff A−→B or A−→B. In either case it follows from induction hypothesis
that α ∈ B.
π1 ·π2
π = π1 · π2 : For any atom A, [π1 · π2 ]α ∈ A iff [π1 ][π2 ]α ∈ A. Now A −→B
π1 π2
iff there exists another atom C such that A−→C and C −→B. Now by induction
hypothesis it follows that [π2 ]α ∈ C and again by induction hypothesis it follows that
α ∈ B.
π = π1∗ : For any atom A, [π1∗ ]α ∈ A iff α ∈ A and [π1 ][π1∗ ]α ∈ A. Consider any
π1∗
atom such that A−→B. is means that there exists a sequence of atoms A0 , . . . , Ak
π1
(k ≥ 0) such that A = A0 , B = Ak and for all i : 0 ≤ i < k, Ai −→Ai +1 . We prove by
induction that [π1∗ ]α ∈ Ai for all i : 0 ≤ i ≤ k. In particular, [π1∗ ]α ∈ Ak = B and
hence α ∈ B, as desired.
Now for the induction. Clearly [π1∗ ]α ∈ A0 . Suppose [π1∗ ]α ∈ Ai . en
π1
[π1 ][π1∗ ]α ∈ Ai . But since Ai −→Ai +1 , we can apply the induction hypothesis on
π1
−→ to conclude that [π1∗ ]α ∈ Ai+1 , as desired.
π = β?: For any atom A, [β?]α ∈ A iff β ∈
/ A or α ∈ A. By applying (i) on β, β ∈
/A
β?
iff M , A ⊭ β. Now A−→B iff M , A ⊨ β and A = B. is tells us that β ∈ A and
hence it has to be the case that α ∈ A = B.
Proof of (iii)
a
Basis: For a ∈ A , it immediately follows from the definition of −→ that whenever
b ∧ 〈a〉B,
A b A−→B.a

Induction step:
π = π1 + π2 : Suppose π1 + π2 occurs in α0 . We prove the desired claim in the
π1 +π2 π1
contrapositive form. It is not the case that A −→ B iff it is not the case that A−→B
π2
and it is not the case that A−→B. But by induction hypothesis, this implies that
⊢Ab ⊃ [π ]¬Bb and ⊢ A b ⊃ [π ]¬B.
b It immediately follows from Axiom (A) that
1 2
⊢Ab ⊃ [π + π ]¬B. b
1 2

π = π1 · π2 : Suppose π1 · π2 occurs in α0 . We prove the desired claim in the con-


π1 ·π2
trapositive form. It is not the case that A −→B iff it is not the case that there exists
π1 π2
an atom C such that A−→C and C −→B. But by induction hypothesis, this im-
plies that for all atoms C , ⊢ Ab ⊃ [π ]¬C
b or ⊢ C
b ⊃ [π ]¬B. b But we can appeal to
1 2
b
Lemma .. now—with A in place of α, [π2 ]¬Bb in place of β, and π1 in place of
π—and conclude that ⊢ A b ⊃ [π ][π ]¬B.b But now it follows from Axiom (A) that
1 2
⊢A b ⊃ [π · π ]¬B, b as desired.
1 2

π = α?: Suppose α? occurs in α0 . We prove the desired claim in the contrapositive


α?
form. It is not the case that A−→B iff it is either the case that M , A ⊭ α or it is the case
that A ̸= B. In the first case, by (i) applied to α, α ∈/ A and hence ¬α ∈ A (A being
b
an atom and α being in cl). erefore ⊢ A ⊃ ¬α and, sure enough, ⊢ A b ⊃ (α ⊃ ¬B). b
In the second case, it is clear that there is some β ∈ A such that ¬β ∈ B. It therefore
follows that ⊢ Ab ⊃ ¬Bb and therefore ⊢ A b ⊃ (α ⊃ ¬B).
b So in both cases it is clear that
b b
⊢ A ⊃ (α ⊃ ¬B). But by Axiom (A), this is the same as saying that ⊢ A b ⊃ [α?]¬B, b
as desired.
π = π1∗ : Suppose π1∗ occurs in α0 . We prove the desired claim in the contrapositive
π1∗
form. Suppose it is not the case that A−→B. Define U to be the set of all π1 -reachable
π1∗
worlds from A, i.e. U = {C ∈ AT | A−→C }. Clearly B ̸= C for all C ∈ U . Now
b ⊃ ¬D.
for any two distinct atoms C and D, it is easy to see that ⊢ C b erefore it
b ⊃ ¬B.
follows that ⊢ U b Now suppose we prove that ⊢ U b ⊃ [π ]U b . en it follows
1
by axiom (A), rule (G), and propositional logic that ⊢ Ub ⊃ [π∗ ]U b⊃ U
b . But ⊢ A b
1
(since A ∈ U ) and ⊢ U b ⊃ ¬B.


b erefore by axiom (A), rule (G) and propositional
b ⊃ [π∗ ]¬B,
logic it follows that ⊢ A b as desired. It is only left to verify the following
1
claim.

b ⊃ [π ]U
Claim ⊢ U b.
1

Proof Let V = AT\ U . By Lemma .., ⊢ U b ≡ ¬Vb . us it suffices to


b b
show that ⊢ U ⊃ [π1 ]¬V . Consider any C ∈ U and D ∈ V . en it
is clear that D is not a π1 -neighbour of Y . (If it were, then by definition
of U , D would also belong to U , which is a contradiction.) e fact
π1
that D is not a π1 -neighbour of C and the induction hypothesis on −→
immediately imply that C b ∧ 〈π 〉Db is not consistent. In other words,
1
⊢C b ⊃ [π ]¬D. b But this holds for every C ∈ U and D ∈ V . us by
1
axiom (A), rule (G) and propositional logic, ⊢ Ub ⊃ [π ]¬Vb , and the
1
claim follows.

is completes the proof of Lemma .., and hence of Lemma ... ⊣

Once we have proved Lemma .., we immediately obtain a proof of complete-
ness (Corollary ..) using exactly the same argument as in propositional logic. Note
that we not only have completeness but also the small model property for dynamic
logic, as follows: whenever α is satisfiable it is consistent (by soundness), whence it is
satisfied in the atom model for α (which is of size at most 2^(2·|α|²) ). Thus we also see that
the satisfiability problem for dynamic logic is decidable.
Chapter 

First-Order Logic

Consider typical structures which we come across in mathematics and computer sci-
ence—graphs, groups, monoids, rings, fields, …. A graph, for instance, is a set of
vertices with a binary relation on this set which defines the edges. A group is a set
equipped with a special constant (identity) and a binary function on the set which
is associative. In general, all these structures consist of an underlying set of elements
together with relations and functions defined over this set which satisfy certain prop-
erties.
First-order logic provides a natural framework for talking about such structures.
In first-order logic, we begin by fixing abstract symbols to denote relations, functions
and constants. ese can then be combined using the usual propositional connectives
built up from ¬ (not) and ∨ (or). In addition, first-order logic provides the means to
quantify over elements¹ in the structure—we have the existential quantifier ∃ (read as
“there exists”) and its dual, the universal quantifier ∀ (read as “for all”). e logic also
has the symbol ≡, denoting equality, as a primitive construct.²
Defining the precise syntax and semantics of first-order logic is a little more in-
volved than for propositional or modal logics. Before getting into the details, let us
look at an informal example.

Groups in first-order logic As we know, a group is a structure (G, +, 0) where G is a set,


0 ∈ G is a special element called the identity and + : G × G → G is a binary operation
¹e “first” in first-order logic refers to the limitation placed on the quantifiers. In first-order logic,
we can only quantify over single elements of the underlying set. In second-order logic, we can quantify
over functions and relations. In third-order logic we can quantify over sets of function etc.
²We use ≡ in the logical language rather than = to avoid any confusion between syntactic references
to equality and “real” equality over sets.


such that the following properties hold:

• e operation + is associative.

• e constant 0 is a right-identity for the operation +.

• Every element in G has a right-inverse—that is, for each x ∈ G we can find


another element y ∈ G such that x + y = 0.

To formalise this in first-order logic, we have to first fix the symbols in the language.
We choose a function symbol op which takes two arguments and a constant symbol ϵ.
We can then write the following formulas.

(G) ∀x∀y∀z op(op(x, y), z) ≡ op(x, op(y, z))

(G) ∀x op(x, ϵ) ≡ x

(G) ∀x∃y op(x, y) ≡ ϵ

To assign meaning to these formulas, we fix a set S and map the symbol op to a


binary function f on S and ϵ to an element s of S. e symbol ≡ is assumed to be
interpreted as equality over the set S. e formula (G) then captures the fact that the
function f denoted by op is associative. e next formula expresses that the element s
denoted by ϵ acts as a right identity for the function f . e last formula postulates the
existence of a right inverse for each element in S. If the set S, the function f assigned
to op and the element s assigned to ϵ “satisfy” the formulas (G)–(G), we say that the
structure (S, f , s ) is a model for (G)–(G). It should be clear that any model (S, f , s )
of (G)–(G) is in fact a group. Conversely, any group (G, +, 0) can be made a model
of (G)–(G) by assigning + to be the function³ denoted by op and 0 to be the element
denoted by ϵ. us, in a precise logical sense, the formulas (G)–(G) describe groups:
a structure (S, f , s ) is a group iff it is a model of (G)–(G).
Our goal is to explore the extent to which first-order logic can capture properties
of mathematical structures. While several properties can be naturally described in the
logic, we shall see that various useful properties cannot. In the process of arriving at
these results, we shall formally analyse first-order logic as we have done other logics so
far—we shall explore issues such as compactness, completeness and decidability.
³Notice that though we normally write the group operation + in infix notation as x + y, it is just a
binary function and can just as well be written +(x, y).
. Syntax 

. Syntax
First-order languages To define the formulas of first-order logic, we have to first fix
the underlying language. A first-order language is a triple L = (R, F , C ) where R =
{r1 , r2 , . . .} is a countable set of relation symbols, F = { f1 , f2 , . . .} is a countable set
of function symbols and C = {c1 , c2 , . . .} is a countable set of constant symbols. Each
symbol r ∈ R and f ∈ F is associated with an arity, denoted #(r ) or #( f ), indicating
how many arguments the symbol takes. We also fix a countable set Var = {v1 , v2 , . . .}
of variables. We shall use x, y, z, . . . to denote typical elements of Var.

e set of first-order formulas over a first-order language L is built up from atomic


formulas using the propositional connectives ¬ and ∨ and the existential quantifier ∃.
To define atomic formulas, we first have to define the terms of the language L.

Terms Let L = (R, F , C ) be a first-order language. e set of terms over L is the


smallest set satisfying the following conditions:

• Every constant symbol c ∈ C is a term.

• Every variable x ∈ Var is a term.

• Let t1 , t2 , . . . , tn be terms over L and let f ∈ F be a function symbol of arity n


. en f (t1 , t2 , . . . , tn ) is a term.

A term which does not contain any variables is called a closed term. Notice that if L
contains no function symbols, then the only terms over L are constants from C and
variables from Var.
As we described before in an informal way, to define the semantics of first-order
logic we have to fix a structure with respect to which the formulas of the language
are interpreted. is interpretation will map each term to a unique element of the set
underlying the structure. It is helpful to think of terms as the “names” which we can
generate within L to talk about elements in the structure we are interested in.

Atomic formulas Let L = (R, F , C ) be a first-order language. e atomic formulas over


L are defined as follows:

• Let r ∈ R be a relation symbol of arity n and let t1 , t2 , . . . , tn be terms over L.


en, r (t1 , t2 , . . . , tn ) is an atomic formula.

• Let t1 and t2 be terms over L. en, t1 ≡ t2 is an atomic formula.


. Semantics 

Atomic formulas play the role of atomic propositions in propositional logic. e


first type of atomic formula asserts that the n-tuple denoted by 〈t1 , t2 , . . . , tn 〉 is part
of the relation denoted by r while the second type of atomic formula asserts that two
different terms t1 and t2 are in fact just different “names” for the same element. Both
these types of statements can be unambiguously labelled as true or false once we have
fixed a structure and the interpretation of the symbols in the language within that
structure.

Formulas Having defined the atomic formulas, we can then define ΦL , the set of first-
order formulas over L. e set ΦL is the smallest set satisfying the following conditions:

• Every atomic formula over L belongs to ΦL .

• If φ ∈ ΦL then ¬φ ∈ ΦL .

• If φ, ψ ∈ ΦL then φ ∨ ψ ∈ ΦL .

• If φ ∈ ΦL and x ∈ Var, then ∃x φ ∈ ΦL .

As usual, we may use parentheses to disambiguate the structure of a formula. We


can define derived propositional connectives ∧, ⊃ and ≡ using ¬ and ∨ in the standard
way. In addition, we define the dual of ∃ as follows:
def
∀x φ = ¬∃x ¬φ

. Semantics
As we saw informally earlier, to give meaning to a first-order formula over a language
L = (R, F , C ), we have to fix a set S and assign a relation over S to each relation
symbol in R, a function over S to each function symbol in F and an element of S to
each constant symbol in C .

First-order structures Let L = (R, F , C ) be a first-order language. A first-order structure


for L is a pair M = (S, ι) where S is a non-empty set and ι is a function defined over
R ∪ F ∪ C such that:

• For each relation symbol r ∈ R with #(r ) = n, ι(r ) is an n-ary relation over
S—that is, ι(r ) ⊆ S n .

• For each function symbol f ∈ F with #( f ) = n, ι( f ) is an n-ary function over


S—that is, ι( f ) : S n → S.
. Semantics 

• For each constant symbol c ∈ C , ι(c) is an element of S—that is, ι(c) ∈ S.

For convenience, we often denote ι(r ), ι( f ) and ι(c) by r M , f M and c M respec-


tively. We also refer to a first-order structure for L as an L-structure.
Once we define a first-order structure, we fix the meaning of the symbols in the
first-order language. However, we also have to assign meanings to the variables in Var.
Once this is done, we can assign meaning to all formulas in the language.

Interpretation Let L = (R, F , C ) be a first-order language. An interpretation of L is a


pair I = (M , σ) where M = (S, ι) is a first-order structure for L and σ : Var → S
is an assignment of elements of S to variables in Var. In informal usage, we say that an
interpretation or a structure has a certain cardinality when we mean that the associated
underlying set has that cardinality.
Let σ : Var → S be an assignment. We denote by σ[x1 7→ s1 , x2 7→ s2 , . . . , xn 7→
sn ] the modified assignment σ ′ where σ ′ (xi ) = si for i ∈ {1, 2, . . . , n} and σ ′ (z) =
σ(z) for all variables z ∈ / {x1 , x2 , . . . , xn }. For an interpretation I = (M , σ), we use
I [x1 7→ s1 , x2 7→ s2 , . . . , xn 7→ sn ] to denote the modified interpretation (M , σ[x1 7→
s1 , x2 7→ s2 , . . . , xn 7→ sn ]).

We mentioned earlier that terms are names for elements in the structure. We can
now make this statement precise. Once we fix an interpretation I , each term t over
L maps to a unique element t I of S. Let I = (M , σ) where M = (S, ι). en:

• If t is a constant c ∈ C , t I = c M .

• If t is a variable x ∈ Var, t I = σ(x).

• If t is of the form f (t1 , t2 , . . . , tn ) where f ∈ F , then t I = f M (t1I , t2I , . . . , tnI ).

Satisfaction relation Let L = (R, F , C ) be a first-order language and let I be an inter-


pretation for L. e notion of a formula φ ∈ ΦL being satisfied under the interpreta-
tion I = (M , σ) is denoted I ⊨ φ and is defined as follows:

• I ⊨ t1 ≡ t2 if t1I = t2I .

• I ⊨ r (t1 , t2 , . . . , tn ) if (t1I , t2I , . . . , tnI ) ∈ r M .

• I ⊨ ¬φ if I ⊭ φ.

• I ⊨ φ ∨ ψ if I ⊨ φ or I ⊨ ψ.

• I ⊨ ∃x φ if there is an element s ∈ S such that I [x 7→ s ] ⊨ φ.


. Semantics 

Exercise .. Verify that the semantics of ∀x φ is as follows:

I ⊨ ∀x φ if for each element s ∈ S, I [x 7→ s ] ⊨ φ.

As usual, we say that a first-order formula φ ∈ ΦL is satisfiable if there is an in-


terpretation I based on an L-structure M such that I ⊨ φ. Similarly, a formula
φ ∈ ΦL is valid if for every L-structure M and every interpretation I based on M ,
I ⊨ φ. A model of φ is an interpretation satisfying φ.

Bound and free variables Before looking at examples of how to describe properties of
structures in first-order logic, let us look closer at the role that variables play in defining
the meaning of a formula.
As we saw above, we need to augment an L-structure M with an assignment σ in
order fully specify the meaning of formulas. In principle, σ fixes a value for all variables
in Var. However, for a fixed formula φ, we only need to know the values fixed by σ
for those variables mentioned in φ.
More precisely, we only need σ to fix values of variables which are not “quantified”
within φ. In a formula of the form ∃x ψ or ∀x ψ, the value assigned by σ to x is
irrelevant in fixing the meaning of the overall formula—the semantics of the quantifiers
forces us to look at all possible assignments for x in order to give meaning to the
formula.
Formally, in a formula of the form ∃x ψ the scope of the quantifier ∃x is the
formula ψ. We say that a variable x is free in φ if it does not occur within the scope
of a quantifier ∃x. Otherwise, x is said to be bound. For a formula φ, the set of free
variables of φ, denoted FV(φ), is defined inductively as follows:

• If φ is an atomic formula r (t1 , t2 , . . . , tn ), FV(φ) is the set of variables men-


tioned in {t1 , t2 , . . . , tn }.

• If φ is an atomic formula t1 ≡ t2 , FV(φ) is the set of variables mentioned in


{t1 , t2 }.

• FV(¬φ) = FV(φ).

• FV(φ ∨ ψ) = FV(φ) ∪ FV(ψ).

• FV(∃x φ) = FV(φ) \ {x}.


. Semantics 

In the rest of the notes, we often write φ(x1 , x2 , . . . , xk ) to denote that FV(φ) ⊆
{x1 , x2 , . . . , xk }.
e following proposition, analogous to Proposition .. of Propositional Logic,
formalises the fact that the meaning of a formula does not depend on that portion of
the assignment which lies outside its set of free variables.

Proposition .. Let L be a first-order language and φ ∈ ΦL . Let M be an L-structure


and σ, σ ′ be a pair of assignments which agree on FV(φ). en (M , σ) ⊨ φ iff (M , σ ′ ) ⊨
φ.

In other words, to give meaning to a formula φ(x1 , x2 , . . . , xn ), it is sufficient to fix


a structure M and an assignment for the variables x1 , x2 , . . . , xn which are potentially
free in φ, rather than specifying an assignment σ over all variables. us, we can write
(M , [x1 7→ s1 , x2 7→ s2 , . . . , xn 7→ sn ]) ⊨ φ to indicate that (M , σ) ⊨ φ for every
assignment σ which assigns xi the value si for i ∈ {1, 2, . . . , n}.

Sentences A sentence is a first-order formula with no free variables. e formulas


(G)–(G) which we wrote earlier to describe properties of groups are all sentences.
From the preceding discussion, it is clear that the meaning of a sentence is fixed once
we fix an L-structure for the language L—assignments play no role in defining the
meaning of a sentence.

Corollary .. Let L be a first-order language and φ ∈ ΦL a sentence. Let M be an


L-structure and σ, σ ′ any pair of assignments. en, (M , σ) ⊨ φ iff (M , σ ′ ) ⊨ φ.

In other words, for a sentence φ and an L-structure M it makes sense to directly write
M ⊨ φ. As usual, if X is a set of sentences, we write M ⊨ X to denote that M ⊨ φ
for each sentence φ ∈ X ..

Logical consequence We formalise the notion of logical consequence in first-order logic


in the same way that we have for propositional logic. Let X be a set of first-order
sentences over L. We say that a sentence φ is a logical consequence of X , denoted
X ⊨ φ, if it is the case that for every structure M , if M ⊨ X then M ⊨ φ.

us, for instance, the first-order formulas which are valid over all groups are just
those formulas which are logical consequences of the sentences (G)–(G) which we
used to characterise groups.
We end this section with some notation about variables and some assumptions
about substitution. Given a formula φ(x1 , x2 , . . . , xn ), where {x1 , x2 , . . . , xn } ⊆ FV(φ),
. Formalisations in first-order logic 

and terms t1 , t2 , . . . , tn , the formula φ(x1 , x2 , . . . , xn )[x1 7→ t1 , x2 7→ t2 , . . . , xn 7→ tn ]


is obtained by substituting uniformly for xi by ti in φ for i ∈ {1, 2, . . . , n}. In the
process, it may be that a variable in one of the terms ti accidentally “intrudes” into the
scope of a quantifier in φ. For instance, consider the formula φ(x) = ∃y ¬(x ≡ y) and
t = y. If we blindly substitute x by t , we end up with the formula ∃y ¬(y ≡ y), which
is clearly not what was intended. In such cases, we assume that the bound variables
in φ are renamed to avoid clashes—in the preceding example, ∃y ¬(x ≡ y) [x 7→ y]
would result in a formula of the form ∃z ¬(y ≡ z). We shall not go into the precise
definition of this renaming operation, but it should be intuitively clear from the exam-
ple. Henceforth, we implicitly assume that such renaming is performed whenever we
substitute a term for a free variable in a formula. We frequently abbreviate the formula
φ(x1 , x2 , . . . , xn )[x1 7→ t1 , x2 7→ t2 , . . . , xn 7→ tn ] as φ(t1 , t2 , . . . , tn ).

. Formalisations in first-order logic


We have seen, informally, how to represent groups in terms of first-order logic. Now
that we have the precise syntax and semantics of the logic in place, let us look at some
more examples of how to describe properties of structures in the logic.

Groups revisited
As we saw earlier, the three sentences (G)–(G) characterize groups, in the sense that
any structure M = (S, f , s ) which is a model for (G)–(G) defines a group over the
set S with group operation f and identity s .
In groups, the cancellation law holds. is says that for any three elements x, y, z
in the group, if x ◦ z = y ◦ z, then x = y. Recall that the language we chose for
groups consisted of a binary function symbol op and a constant ϵ. In this language,
the cancellation law can be stated as follows:
def
φc = ∀x ∀y ∀z (op(x, z) ≡ op(y, z) ⊃ x ≡ z)

Since the cancellation law φc holds in all groups, we would expect that (G1), (G2), (G3) ⊨
φc .
An element g in a group (G, +, 0) such that g ̸= 0 and |g + g + {z
· · · + g = 0 is
}
n times

said to be of order n. We can formulate the fact that a group has no elements of order
two as follows:
def
ψ = ¬∃x (¬(x ≡ ϵ) ∧ op(x, x) ≡ ϵ)
. Formalisations in first-order logic 

In other words, if M = (S, f , s ) is a model for (G)–(G) and M ⊨ ψ, then M is a


group which has no elements of order two.
An abelian group is one in which the group operation is commutative. is is
simple to state:

(Ab) ∀x ∀y op(x, y) ≡ op(y, x)

us, the set of sentences {(G1), (G2), (G3), (Ab)} characterize abelian groups.
Lest we get the impression that all interesting properties of groups can be captured
easily in first-order logic, let us consider torsion groups. A group (G, +, 0) is said to be
a torsion group if every element of G has finite order—that is, for each g ∈ G, there is
a natural number n ≥ 1 such that |g + g + {z
· · · + g = 0. To formalize this in a “natural
}
n times

way”, we would have to write a formula of the form


∀x (x ≡ ϵ ∨ op(x, x) ≡ ϵ ∨ op(op(x, x), x) ≡ ϵ ∨ · · · )
is is an infinite formula and is not permitted by our syntax. We shall show later that
we cannot capture this property in first-order logic, even if we are permitted an infinite
set of formulas to replace this single formula of infinite width.

Equivalence relations
Let r be a binary relation symbol in the language. We can force r to be interpreted as
an equivalence relation through the following three sentences.

• ∀x r (x, x)

• ∀x ∀y (r (x, y) ≡ r (y, x))

• ∀x ∀y ∀z ((r (x, y) ∧ r (y, z)) ⊃ r (x, z))

It should be clear that in any structure M , these three sentences would force r M
to be reflexive, symmetric and transitive.

Order
Ordered structures occur frequently in mathematics. A strict linear order < over a
set S is a non-empty binary relation which is irreflexive and transitive and which has
the property that any two distinct elements in S are related by <. For instance, the
less-than ordering over the set of natural numbers is a strict linear order.
Using the same symbol < to denote the ordering relation within our language, we
can axiomatise linear order using the following sentences.
. Formalisations in first-order logic 

• ∀x ¬(x < x)

• ∀x ∀y ∀z ((x < y ∧ y < z) ⊃ x < z)

• ∀x ∀y (x < y ∨ x = y ∨ y < x)

Fields
Recall that a field is a structure (F , +, ·, 0, 1) where:

• (F , +, 0) is an abelian group.

• · is a associative, commutative operation over F with identity 1 such that 0 ̸= 1


and every element other than 0 has a right-inverse with respect to ·.

• e operation · distributes over the operation +.

Exercise .. Using a first-order language with two binary function symbols and two
constants, axiomatise fields. ⊣

Questions of cardinality
We can make assertions about the size of structures in first-order logic. Consider the
sentence
def
φ≥2 = ∃x ∃y ¬(x ≡ y)
Clearly, any structure which models φ≥2 must have at least two distinct elements in the
underlying set. We can easily generalize this formula to φ≥n for any natural number
n as follows: ∨
def
φ≥n = ∃x1 ∃x2 · · · ∃xn ¬(xi ≡ x j )
i ̸= j

Conversely, the negation ¬φ≥2 = ∀x ∀y (x ≡ y) asserts that the underlying struc-


ture has at most one element. (In fact, since we only deal with non-empty structures,
¬φ≥2 asserts that the structure has exactly one element).
We can thus combine formulas of the form φ≥n and ¬φ≥m to tightly bound the
range of elements in the structure.
Alternatively, we can use the infinite family of sentences {φ≥2 , φ≥3 , . . .} to specify
that the structure we are interested is not finite.
. Formalisations in first-order logic 

Modal logic as a fragment of first-order logic


As a final example of formalisation in first-order logic, let us look at how to embed
modal logic within first-order logic. In order to achieve this, we have to show how to
translate models and formulas of modal logic into first-order logic in such a way that
a model M = (F ,V ) satisfies a formula α iff the structure M corresponding to M
satisfies the formula α̂ corresponding to α.
Let P = { p0 , p2 , p2 , . . .} be the set of atomic propositions which are used in
defining the formulas of modal logic. e first-order language L which we use to
embed formulas over P will have a binary relation symbol r , to describe the under-
lying modal frame, and unary relation symbols {P0 , P1 , P2 , . . .} to describe the valua-
tion. us, an L-structure M would consist of a set S, which constitute the “possible
worlds”, together with a relation r M , describing the accessibility relation, and subsets
P0M , P1M , P2M , . . . describing the valuations V ( p0 ),V ( p1 ),V ( p2 ), . . ..
We inductively define a translation {α 7→ α̂(x)}, where x is a variable, for all modal
logic formulas over P as follows:

def
• For pi ∈ P , p̂i (x) = Pi (x), where x is a variable.
def
• If α = ¬β, then α̂(x) = ¬β̂(x).
def
• If α = β ∨ γ , then α̂(x) = β̂(x) ∨ γ̂ (x).
def
• If α = □β, then α̂(x) = ∀y (r (x, y) ⊃ β̂(y)).

Proposition .. Let α be a modal logic formula over P . en, α is satisfiable iff α̂(x)
is first-order satisfiable.

P (⇒) Suppose that M = (F ,V ), with F = (W , R) such that for w ∈ W ,


M , w ⊨ α. We use W as the underlying set of our structure M and set r M = R and
PiM = V ( pi ) for each pi ∈ P . We can then establish that (M , [x 7→ w]) ⊨ α̂(x) by
induction on the structure of α.
For brevity, we only consider one case in detail, when α is of the form □β. Recall
that α̂(x) is then given by ∀y (r (x, y) ⊃ β̂(y)). Since M , w ⊨ □β, we know that for
all elements w ′ ∈ W such that w R w ′ , M , w ′ ⊨ β. From the induction hypothesis
and the fact that r M = R, it follows that for each y such that [y 7→ w ′ ] and w R w ′ ,
(M , [y 7→ w ′ ]) ⊨ β̂(y). From the semantics of the universal quantifier, it then follows
that (M , [x 7→ w]) ⊨ α̂(x).
. Formalisations in first-order logic 

(⇐) Conversely, suppose that there is a structure M based on a set S such that
for some s ∈ S, (M , [x 7→ s ]) ⊨ α̂(x). We must show that α is satisfiable. We fix
our frame to be (S, r M ) and for each pi ∈ P , we fix V ( pi ) = PiM . Once again, by
induction on the structure of α, we can establish that M , s ⊨ α. We omit the details.

Our translation from modal logic to first-order logic allows us to reduce some
questions about modal logic to the framework of modal logic. For instance, by the
preceding proposition, questions about the satisfiability or validity of a formula α in
modal logic can be phrased in terms of the first-order satisfiability or first-order validity
of the corresponding formula α̂(x).
We can even reduce more sophisticated questions to first-order logic. For instance,
if we want to check whether a formula α is satisfiable over a frame whose accessibil-
ity relation is an equivalence relation, we can check the simultaneous satisfiability of
α̂(x) along with the three first-order sentences we saw earlier which capture the fact
that the relation r is an equivalence relation. In general, questions about “relativised
satisfiability” can be reduced to first-order logic whenever the properties demanded of
the accessibility relation can be captured using first-order sentences.
We can even talk about satisfiability with respect to classes of frames which cannot
be axiomatised in modal logic—for instance, the sentence ∀y (¬r (y, y)) describes the
class of irreflexive frames, which cannot be described in modal logic. In other words,
the formula ∀y (¬r (y, y)) ∧ α̂(x) is satisfiable iff α is satisfiable over an irreflexive
frame.
e disadvantage with reducing questions about modal logic to first-order logic is
that first-order logic is too powerful from a computational point of view—for instance,
we shall observe later that satisfiability is undecidable for first-order logic. On the other
hand, we showed that for many systems of modal logic, satisfiability is in fact decidable.

Exercise .. Let L be a finite first-order language and let M be a finite L-structure.
Show that there is an L-sentence φM the models of which are precisely the L-structures
isomorphic to M . ⊣

Exercise ..
(i) Let L = {+, ×, 0} where + and × are binary function symbols and 0 is a con-
stant symbol. Consider the L-structure (R, +, ×, 0), where R is the set of real
numbers with the conventional interpretation of +, × and 0 as addition, mul-
tiplication and zero.
. Formalisations in first-order logic 

Show that the relation < (”less than”) is elementary definable in (R, +, ×, 0)—that
is, there is a formula φ(x, y) over L such that for all a, b in R, ((R, +, ×, 0), [x 7→
a, y 7→ b ]) ⊨ φ(x, y) iff a < b .

(ii) Let L = {+, 0}. Show that the relation < is not elementary definable in (R, +, 0).
(Hint: Work with a suitable automorphism of (R, +, 0)—that is, a suitable iso-
morphism of (R, +, 0) onto itself ). ⊣

Exercise .. Let L = {r }, where r is a binary relation. Formalize the following


notions using sentences over L.

(i) r is an equivalence relation with at least two equivalence classes.

(ii) r is an equivalence relation with an equivalence class containing more than one
element. ⊣

Exercise .. A set M of natural numbers is called a spectrum if there is a language L


and a sentence φ over L such that

M = {n | φ has a model containing exactly n element}

Show that:

(i) Every finite subset of {1, 2, 3, . . .} is a spectrum.

(ii) For every m ≥ 1, the set of numbers greater than  which are divisible by m is
a spectrum.

(iii) e set of squares greater than  is a spectrum.

(iv) e set of nonprime numbers greater than  is a spectrum.

(v) e set of prime numbers is a spectrum. ⊣


. Satisfiability: Henkin’s reduction to propositional logic 

. Satisfiability: Henkin’s reduction to propositional logic


When is a set X ⊆ ΦL of sentences in a first-order language L satisfiable—in other
words, when can we find an L-structure M such that for each φ ∈ X , M ⊨ φ?
Henkin proposed a solution to this question which essentially reduces the problem to
one of satisfiability in a propositional framework.
For the rest of this discussion, we assume that we are working with a fixed first-
order language L = (R, F , C ).
Let r be a binary relation symbol in L and t1 , t2 be a pair of terms. It is immediate
that the formula r (t1 , t2 ) ∧ ¬(r (t1 , t2 )) is not satisfiable—we can treat r (t1 , t2 ) as an
atomic proposition and recognize that this is an instance of an unsatisfiable proposi-
tional formula. How about a formula of the form ∀x r (x, t2 )∧¬∃x r (x, t2 )? Since we
assumed that all structures are non-empty, we can check that this, too, is not satisfiable.
However, there is no immediate way to represent this as an unsatisfiable propositional
formula.
Henkin’s approach is the following. Expand the language L by adding new con-
stants. Use these new constants to define a special set of formulas and, using this set,
blow up the given set X of formulas whose satisfiability we want to check into a larger
set X ′ such that X is satisfiable iff X ′ is satisfiable. Show that X ′ is such that its
satisfiability can be deduced from its “propositional structure”.
We begin by defining a notion of “atomic proposition” with respect to first-order
formulas.

Prime formulas A prime formula over L is an atomic formula or a formula which begins
with the quantifier ∃. Let PL be the set of prime formulas over L.

Example  In the formula ∃x r (x) ∨ t1 ≡ t2 , the prime formulas are ∃x r (x) and
t1 ≡ t2 . In the formula ∀x s (x) ⊃ ∃x s (x), after rewriting ∀ in terms of ∃, we have
two prime formulas—∃x ¬s (x) and ∃x s (x).

Observe that every formula in ΦL can be constructed from prime formulas using
the propositional connectives ¬ and ∨. e idea is to treat each distinct prime formula
as an independent atomic proposition and deduce the satisfiability of a set X ⊆ ΦL
from the propositional structure of its prime formulas.

Propositional satisfiability We say a formula φ ∈ ΦL is propositionally satisfiable if there


is a valuation v : PL → {⊤, ⊥} such that the prime formula structure of φ evaluates
to ⊤ under v. An L-tautology is a formula in ΦL which evaluates to ⊤ for every
propositional valuation to the prime formulas PL .
. Satisfiability: Henkin’s reduction to propositional logic 

Example  e formula ∃x r (x)∨ t1 ≡ t2 is propositionally satisfiable. We can assign


either prime formula (or both) independently to ⊤ to satisfy this formula. On the other
hand, the formula ∃x r (x) ∧ ¬∃x r (x) is not propositionally satisfiable—the formula
is built up from a single prime formula and has the structure p ∧ ¬ p.
e formulas ∃x r (x) ∨ ¬(∃x r (x)) is a tautology over L—the formula is built
up from a single prime formula and has the structure p ∨ ¬ p. Another example of a
tautology over L is the formula, ∃x r1 (y) ⊃ ∀y r2 (y) ∨ ∃x r1 (x), which is of the form
p ⊃ q ∨ p.

Proposition .. Let I be an L-interpretation. ere exists a valuation v of PL such


that for each formula φ, I ⊨ φ iff v ⊨ φ.

P For each prime formula ψ, define v(ψ) = ⊤ if I ⊨ ψ and v(ψ) = ⊥ oth-
erwise. Since each first-order formula can be built up from prime formulas using the
connectives ¬ and ∨, the result follows. ⊣

Corollary .. Let X ⊆ ΦL be a set of formulas. If X is first-order satisfible, then X is


propositionally satisfiable.

e converse of the preceding Corollary is false. Consider the following examples.

Example  e set {c ≡ d , d ≡ e, ¬(c ≡ e)} is propositionally satisfiable—we can fix


a valuation which maps the prime formulas c ≡ d and d ≡ e to ⊤ and c ≡ e to ⊥.
However, it is clearly not first-order satifiable.
e set of formulas {∀x (r (x) ⊃ s (x)), ∀x r (x), ∃x ¬s (x)}, is propositionally sat-
isfiable—once again, the three formulas in the set are made up of different prime for-
mulas whose truth value can be assigned independently to make the whole set propo-
sitionally true. However, this set is not first-order satisfiable.

e preceding examples show that the prime formula structure of ΦL does not
accurately capture the effect of the equality relation and the role played by quantifiers
in the semantics of first-order logic. Henkin’s solution is to add extra formulas which
“tie together” formulas connected by the equality relation and quantifiers so that the
truth of one formula is linked to the truth of the other.
For instance, if we augment the set {c ≡ d , d ≡ e, ¬(c ≡ e)} with the formula
{(c ≡ d ) ∧ (d ≡ e) ⊃ (c ≡ e)}, the set is no longer propositionally satisfiable. e
new formula links the truth value of the prime formulas c ≡ d and d ≡ e to that of
c ≡ e. Clearly the formula we have added is true in any structure, so it has not altered
the first-order satisfiability of the original set.
. Satisfiability: Henkin’s reduction to propositional logic 

Similarly, consider the second example {∀x (r (x) ⊃ s (x)), ∀x r (x), ∃x ¬s (x)},
which may be rewritten as {¬∃x (r (x) ∧ ¬s (x)), ¬∃x ¬r (x), ∃x ¬s (x)}.
If a sentence of the form ∃y φ(y) is satisfied in a structure, we can use a term t to
denote the “witnessing” element where φ holds. With this intended interpretation of
t , we can append the sentence ∃y φ(y) ⊃ φ(t ) to the set containing ∃y φ(y) without
affecting its satisfiability.
Similarly, a sentence of the form ¬∃y φ(y) is satisfiable just in case ¬φ(t ) holds for
every term t . us, we can expand a set of formulas containing ¬∃y φ(y) by a sentence
¬∃y φ(y) ⊃ ¬φ(t ), where t is an arbitrary term, without affecting satisfiability.
If we apply this reasoning to the set {¬∃x (r (x)∧¬s (x)), ¬∃x ¬r (x), ∃x ¬s (x)},
we first identify a term t to witness the formula ∃x ¬s (x) and add the formula ∃x ¬s (x) ⊃
¬s (t ) to the set. Applying the rule for ¬∃y φ(y) to the other two formulas, we can then
add ¬∃x (r (x)∧¬s (x)) ⊃ ¬(r (t )∧¬s (t )) and ¬∃x ¬r (x) ⊃ ¬¬r (t ) to the set. A val-
uation which satisfies the three original formulas in the set must now also make the set
{¬(r (t ) ∧ ¬s (t )), ¬¬r (t ), ¬s (t )} true. is simplifies to {¬r (t ) ∨ s(t ), r (t ), ¬s (t )},
which is not propositionally satisfiable. In other words, the expanded set is not propo-
sitionally satisfiable, which reflects the fact that the original set of three formulas was
not first-order satisfiable.
Adding equality formulas, as we did in the first example, is not a problem. How-
ever, in the second case, we need to have a term to denote the witnessing element for
each sentence ∃y φ(y) in our set. It may be the case that the original language L does
not have enough terms to cover all existential sentences of this form! In general, we
have to expand the language in order to ensure that we do not run out of terms.

e Witnessing Expansion of L
Let L = (R, F , C ) be the original language, with X ⊆ ΦL the set of sentences whose
satisfiability we want to establish. We shall systematically add new constants to L in
order to ensure that we have enough terms in the language to “name” all witnessing
elements for existential sentences. Formally, we inductively define new sets of constants
C0 , C1 , . . . as follows:
• Let C0 = ; and let L0 = L.
• Assume we have defined Cn . Let Ln = (R, F , C ∪ C1 ∪ C2 ∪ · · · ∪ Cn ). For each
formula φ(x) of ΦLn \ ΦLn−1 , with exactly one free variable x, let cφ(x) be a new
constant, called the witnessing constant of the sentence ∃x φ(x).
Let Cn+1 be the set of such constants generated by ΦLn \ ΦLn−1 .

Let CH = i ≥0 Ci and let LH = (R, F , C ∪ CH ).
. Satisfiability: Henkin’s reduction to propositional logic 

Henkin and quantifier axioms

• e Henkin axioms are sentences over LH of the form ∃x φ(x) ⊃ φ(cφ(x) ).

• e quantifier axioms are sentences over LH of the form φ(t ) ⊃ ∃x φ(x), where
t is a closed term over LH .

It is clear that the quantifier axioms are true in any structure and are hence first-
order valid. On the other hand, the Henkin axioms are not automatically true—we
need to ensure that the witnessing constants are interpreted properly in the structure
in order for the axioms to be true.
Let ΦH denote the set of all instances of the Henkin axiom and ΦQ denote the set
of all instances of the quantifier axiom over the language LH .

e equality axioms
Adding the equality axioms is easier. Let LH be the witnessing expansion of L. To
ensure that our propositional valuations respect the notion of equality, we define the
following set of axioms capturing properties of equality. e equality axioms are all
instances of the following, where t , u, v with or without subscripts are uniformly sub-
stituted by arbitrary terms over LH , f is an arbitrary n-ary function symbol in L and
r is an arbitrary n-ary relation symbol in L.

t≡t
t≡u ⊃ u≡t
(t ≡ u ∧ u ≡ v) ⊃ t ≡ v
(t1 ≡ u1 ∧ t2 ≡ u2 ∧ · · · ∧ tn ≡ un ) ⊃ ( f (t1 , t2 , . . . , tn ) ≡ f (u1 , u2 , . . . , un ))
(t1 ≡ u1 ∧ t2 ≡ u2 ∧ · · · ∧ tn ≡ un ) ⊃ (r (t1 , t2 , . . . , tn ) ⊃ r (u1 , u2 , . . . , un ))

Let ΦE q denote all instances of the equality axioms over LH . Notice that though
these axioms are not, in general, sentences, each formula in ΦE q is satisfied in every
interpretation of LH ,
We now have the following lemma, which shows that satisifiability in first-order
logic can be reduced to a similar question in propositional logic.
Lemma .. (First-order satisfiability) Let L be a first-order language and let LH be
the witnessing expansion of L. For any set X of formulas over L, the following are equiva-
lent:
. Satisfiability: Henkin’s reduction to propositional logic 

(i) ere is an L-interpretation I = (M , σ) which is a model for X .

(ii) ere is an LH -interpretation (M , σ) which is a model for X .

(iii) X ∪ ΦH ∪ ΦQ ∪ ΦE q is propositionally satisfiable.

P e fact that (i) implies (iii) is easily proved. Let I = (M , σ) be an L-


interpretation which is a model for X , where M = (S, ι). Define an LH -interpretation
I ′ = ((S, ι′ ), σ), where ι′ is defined on constants as follows: ι′ (c) = ι(c) for c ∈ C ;
for cφ(x) ∈ CH , ι(cφ(x) ) = o ∈ S such that I ⊨ φ(o) if M ⊨ ∃x φ(x) and o is
arbitrary otherwise. It is clear that I ′ ⊨ X ∪ ΦH ∪ ΦQ ∪ ΦE q . It follows now from
Proposition .. that X ∪ ΦH ∪ ΦQ ∪ ΦE q is propositionally satisfiable.
at (ii) implies (i) is immediate. e details are left as an exercise.
So, what remains is to establish that (iii) implies (ii). In other words, if there is a
valuation v of the prime formulas over LH such that v ⊨ X ∪ ΦH ∪ ΦQ ∪ ΦE q , we
must be able to construct an interpretation I = (M , σ) where M = (S, ι), such that
I ⊨ X . We will in fact show that I has the property that for every formula φ over
LH , I ⊨ φ iff v ⊨ φ.
e main function of ΦH is to ensure that if v ⊨ ∃x φ(x), then v ⊨ φ(cφ(x) ) for
every existential sentence ∃x φ(x) over LH .
To define I , we must

(a) Define the underlying set S.

(b) Fix an interpretation r M ⊆ S n for each n-ary relation symbol r in LH .

(c) Fix an interpetation f M : S n → S for each n-ary function symbol f in LH .

(d) Fix an interpretation c M ∈ S for each constant symbol c in LH .

(e) Fix an assignment σ.

e construction of M is as follows.

(a) Let LH = (R, F , C ∪ CH ). To fix S, we define an equivalence relation ≃ on


terms over LH by
t ≃ u iff v ⊨ t ≡ u

e equality axioms guarantee that ≃ is in fact an equivalence relation. For


instance, let us show that ≃ is transitive. Suppose that t ≃ u and u ≃ w. As
an instance of the third equality axiom we have t ≡ u ∧ u ≡ w ⊃ t ≡ w. Since
. Satisfiability: Henkin’s reduction to propositional logic 

v ⊨ ΦE q , it must be the case that v ⊨ t ≡ u ∧ u ≡ w ⊃ t ≡ w. Since t ≃ u


and u ≃ w, v ⊨ t ≡ u and v ⊨ u ≡ w. Hence v ⊨ t ≡ w as well, which
means that t ≃ w as required. For each constant symbol t , let [t ] denote the
equivalence class containing t . We define S to be the set {[t ] | t is a term over
LH }.

(b) Let r be an n-ary relation symbol. Fix r M = {〈[t1 ], [t2 ], . . . , [tn ]〉 | v ⊨


r (t1 , t2 , . . . , tn )}. To check that this is well-defined, we must verify that when-
ever t1 ≃ u1 , t2 ≃ u2 , …, tn ≃ un and v ⊨ r (t1 , t2 . . . , tn ) then v ⊨ r (u1 , u2 . . . , un )
as well. As an instance of the last equality axiom, we have

t1 ≡ u1 ∧ t2 ≡ u2 ∧ · · · ∧ tn ≡ un ⊃ (r (t1 , t2 . . . , tn ) ⊃ r (u1 , u2 . . . , un ))

Since ti ≃ ui for i ∈ {1, 2, . . . , n}, we have v ⊨ ti ≡ ui for i ∈ {1, 2, . . . , n}.


We also know that v ⊨ r (t1 , t2 , . . . , tn ). Since v ⊨ ΦE q , it must then be the case
that v ⊨ r (u1 , u2 , . . . , un ) as required.

(c) Let t1 , . . . , tn be terms over LH and f an n-ary function symbol in F . We define


f M ([t1 ], . . . , [tn ]) to be [ f (t1 , . . . , tn )].
To check that f M is well-defined, we have to show that f (t1 , t2 , . . . , tn ) ≃
f (u1 , u2 , . . . , un ) whenever ti ≃ ui for i ∈ {1, 2, . . . , n}. Let ti ≃ ui for
i ∈ {1, 2, . . . , n}. is implies that v ⊨ ti ≡ ui for each i . From the fourth
equality axiom, it then follows that v ⊨ f (t1 , t2 , . . . , tn ) ≡ f (u1 , u2 , . . . , un ), so
f (t1 , t2 , . . . , tn ) ≃ f (u1 , u2 , . . . , un )

(d) For c ∈ C ∪ CH , let c M = [c].

(e) For x ∈ Var, let σ(x) = [x].

is completes the construction of M and, at the same time, establishes that for
atomic sentences φ, M ⊨ φ iff v ⊨ φ. Indeed, I ⊨ r (t1 , . . . , tn ) iff (by semantics)
〈[t1 ], . . . , [tn ]〉 ∈ r M iff (by definition) v ⊨ r (t1 , . . . , tn ). On the other hand, I ⊨
t1 ≡ t2 iff (by semantics) t1I = t2I iff (by definition) t1 ≃ t2 iff (by definition, again)
v ⊨ t1 ≡ t2 .
To extend this argument to all sentences φ, we proceed by induction on the struc-
ture of φ. e cases where φ = ¬ψ and φ = ψ1 ∨ ψ2 are straightforward, so suppose
that φ = ∃x ψ(x).
If (M , σ) ⊨ φ then there is an element s in the underlying set S such that
(M , σ[x 7→ s ]) ⊨ ψ(x). Since every element in S corresponds to an equivalence
class [t ] for some term t over LH , we can find a constant t s ∈ LH such that t sM = s .
. Compactness and the Löwenheim-Skolem Theorem 

Clearly, (M , σ) ⊨ ψ(t s ). By the induction hypothesis, v ⊨ ψ(t s ). Since ψ(t s ) ⊃


∃x ψ(x) is a quantifier axiom, we must have v ⊨ ψ(t s ) ⊃ ∃x ψ(x) and hence v ⊨
∃x ψ(x), as required.
Conversely, suppose that v ⊨ ∃x ψ(x). en, since ∃x ψ(x) ⊃ ψ(cψ(x) ) is a
Henkin axiom, we must have v ⊨ ψ(cψ(x) ) as well. By the induction hypothesis, it
then follows that I ⊨ ψ(cψ(x) ). From the semantics of the quantifier ∃, we must then
have I ⊨ ∃x ψ(x). ⊣

Exercise .. Let L be a first-order language and let LH be the witnessing expansion
of L. Prove that for any set X of formulas over L, if there is an LH -interpretation
which is a model for X , there is also an L-interpretation which is a model for X . ⊣

. Compactness and the Löwenheim-Skolem eorem


Using the First-Order Satisfiability Lemma (Lemma ..), we can immediately derive
some powerful and important results.

eorem .. (Compactness) Let X be any set of First-Order formulas and let φ be a
formula. en, X ⊨ φ iff there is a finite subset Y ⊆fin X such that Y ⊨ φ.

As we saw in the case of Propositional Logic (Page ), this follows directly once
we establish the following finite satisfiability result.

Lemma .. (Finite Satisfiability) Let L be a First-Order language and let X be a set
of formulas over L. en, X is satisfiable iff every Y ⊆fin X is satisfiable.

P e non-trivial half of the statement is to show that if every Y ⊆fin X is satis-
fiable then X is satisfiable. From the First-Order Satisfiability Lemma, it is sufficient
to establish that (X ∪ ΦH ∪ ΦQ ∪ ΦE q ) is propositionally satisfiable. From the Finite
Satisfiability Lemma for propositional logic (Lemma ..), it suffices to show that
every finite subset (X ∪ ΦH ∪ ΦQ ∪ ΦE q ) is propositionally satisfiable. By assumption,
each finite subset Y ⊆fin X is satisfiable. From the First-Order Satisfiability Lemma,
we can then conclude that for each Y ⊆fin X , (Y ∪ ΦH ∪ ΦQ ∪ ΦE q ) is proposi-
tionally satisfiable. Since each finite subset of (X ∪ ΦH ∪ ΦQ ∪ ΦE q ) is contained in
(Y ∪ ΦH ∪ ΦQ ∪ ΦE q ) for some Y ⊆fin X , it then follows that each finite subset of
(X ∪ ΦH ∪ ΦQ ∪ ΦE q ) is propositionally satisfiable. us, (X ∪ ΦH ∪ ΦQ ∪ ΦE q ) is
propositionally satisfiable, or, in other words, X is First-Order satisfiable. ⊣
. A Complete Axiomatisation 

To derive the Compactness eorem from the Finite Satisfiability eorem, we


use the same argument as in propositional logic (Page ).
e next result we derive from the First-Order Satisfiability Lemma has no coun-
terpart in propositional logic.

eorem .. (Löwenheim-Skolem) Let L be a first-order language and let X be a


set of formulas over L.

(i) If L is finite or countable, then if X is satisfiable, X is satisfiable in a structure whose


underlying set is countable.

(ii) If L is not countable, then if X is satisfiable, X is satisfiable in a structure whose


underlying set has a cardinality bounded by the cardinality of L.

P Let us look at the first case in detail. If L is finite or countable, then ΦL is
countable. If X is satisifiable, then it is satisfiable in the structure constructed in the
proof of Lemma ... e underlying set in that structure is bounded by the number
of constants in L together with the number of constants in the witnessing expansion of
L. Recall the construction of CH , the set of set of witnessing constants for L. Initially,
C1 contains a constant cφ(x) for each formula φ(x) ∈ ΦL . Since ΦL is countable, so
is C1 and, thus, L1 is countable. Inductively, assuming that Ln is countable, the same
argument establishes that the next set of witnessing constants Cn+1 is countable. us,
CH is the countable union of countable sets and is thus countable.
A similar argument applies in the second case. We omit the details. ⊣

In particular, the Löwenheim-Skolem eorem says that if L is a countable first-


order language, then no set of axioms over L can completely capture the properties of
real numbers. Any attempt to describe a theory of real numbers over L will have to
admit a countable model.

. A Complete Axiomatisation


Before exploring the semantic consequences of the Compactness and Löwenheim-
Skolem eorems, let us look at an axiomatisation of first-order logic.

Axiom System FOL-AX e axiom system FOL-AX consists of three categories ax-
ioms and two inference rules.
. A Complete Axiomatisation 

(A) All tautologies of propositional logic.


(Aa) x≡x
(Ab) t ≡ u ⊃ (φ(t ) ≡ φ(u)), where φ is an atomic formula
(A) φ(t ) ⊃ ∃x φ(x)
φ, φ ⊃ ψ
(MP: Modus Ponens)
ψ
φ(x) ⊃ ψ
(G: Generalisation) , where x ∈
/ FV(ψ)
∃x φ(x) ⊃ ψ

As usual, if X is a set of formulas over L, we write X ⊢ φ to indicate that there is a finite


sequence of formulas φ1 , φ2 , . . . , φn such that φn = φ and for each i ∈ {1, 2, . . . , n},
φi is either a member of X , an instance of the axioms (A)–(A) or is derived from
earlier formulas in the sequence using one of the two inference rules.
e following is an interesting lemma:

Lemma .. All the equality axioms over L can be derived using the above axioms and
rules.

P Consider the equality axiom t ≡ t for some term t . Here is a derivation of it:
1. x≡x Aa.
2. y≡y Aa.
3. ¬(x ≡ x) ⊃ ¬(y ≡ y) , PL.
4. ∃ x¬(x ≡ x) ⊃ ¬(y ≡ y) , rule (G).
5. ¬(t ≡ t ) ⊃ ∃ x¬(x ≡ x) A.
6. ¬(t ≡ t ) ⊃ ¬(y ≡ y) ,,PL.
7. t≡t ,,PL.
Now consider the equality axiom t ≡ u ⊃ u ≡ t . is is easily derivable as follows,
where we let α(x) be x ≡ t (note that α(t ) is t ≡ t and α(u) is u ≡ t ):
1. t ≡ t by the earlier derivation.
2. t ≡ u ⊃ (α(t ) ≡ α(u)) Ab.
3. t ≡ u ⊃ u ≡ t ,,PL.
Consider (t ≡ u ∧ u ≡ v) ⊃ t ≡ v. Again the following is an easy derivation,
letting α(x) be t ≡ x (note that α(u) is t ≡ u and α(v) is t ≡ v):
1. u ≡ v ⊃ (α(u) ≡ α(v)) Ab.
2. (t ≡ u ∧ u ≡ v) ⊃ t ≡ v , PL.
. A Complete Axiomatisation 

Now consider,without loss of generality, a ternary function symbol f and the


equality axiom (t1 ≡ u1 ∧ t2 ≡ u2 ∧ t3 ≡ u3 ) ⊃ ( f (t1 , t2 , t3 ) ≡ f (u1 , u2 , u3 )). We
provide a derivation below. We define α(x1 , x2 , x3 ) to be f (t1 , t2 , t3 ) ≡ f (x1 , x2 , x3 ),
where x1 , x2 , x3 do not occur in t1 , t2 , t3 , u1 , u2 , u3 . Also define α1 (x1 ) to be f (t1 , t2 , t3 ) ≡
f (x1 , t2 , t3 ), α2 (x2 ) to be f (t1 , t2 , t3 ) ≡ f (u1 , x2 , t3 ), and α3 (x3 ) to be f (t1 , t2 , t3 ) ≡
f (u1 , u2 , x3 ). Notice that α1 (t1 ) is the same as f (t1 , t2 , t3 ) ≡ f (t1 , t2 , t3 ). Also notice
that α3 (u3 ) is just f (t1 , t2 , t3 ) ≡ f (u1 , u2 , u3 ), so line 6 in the derivation below con-
tains the desired formula. Further note that α1 (u1 ) is the same as α2 (t2 ) and α2 (u2 ) is
the same as α3 (t3 ).
1. α1 (t1 ) instance of t ≡ t .
2. (t1 ≡ u1 ∧ t2 ≡ u2 ∧ t3 ≡ u3 ) ⊃ α1 (u1 ) Ab, , PL.
3. (t1 ≡ u1 ∧ t2 ≡ u2 ∧ t3 ≡ u3 ) ⊃ α2 (t2 ) , α1 (u1 ) ≡ α2 (t2 ), PL.
4. (t1 ≡ u1 ∧ t2 ≡ u2 ∧ t3 ≡ u3 ) ⊃ α2 (u2 ) , Ab, PL.
5. (t1 ≡ u1 ∧ t2 ≡ u2 ∧ t3 ≡ u3 ) ⊃ α3 (t3 ) , α2 (u2 ) ≡ α3 (t3 ), PL.
6. (t1 ≡ u1 ∧ t2 ≡ u2 ∧ t3 ≡ u3 ) ⊃ α3 (u3 ) , Ab, PL.
Consider, without loss of generality, a binary relation symbol r and the equality ax-
iom (t1 ≡ u1 ∧ t2 ≡ u2 ) ⊃ (r (t1 , t2 ) ⊃ r (u1 , u2 )). Let x1 , x2 not occur in t1 , t2 , u1 , u2 .
Now consider the following derivation:
1. t1 ≡ u1 ⊃ (r (t1 , x2 ) ⊃ r (u1 , x2 )) Ab.
2. (t1 ≡ u1 ∧ t2 ≡ u2 ) ⊃ (r (t1 , t2 ) ⊃ r (u1 , u2 )) 1,Ab,PL.
us we see that all the equality axioms are derivable in our axiom system. ⊣

e theorem we are after is the following:

eorem .. Let X be a set of formulas over L and φ a sentence over L. en X ⊢ L
iff X ⊨ L.

As usual, the proof of this theorem is in two parts, soundness and completeness.

Lemma .. (Soundness) If X ⊢ φ then X ⊨ φ.

P As usual, this is proved by the length of the derivation X ⊢ φ. It suffices to


argue that the axioms (A)–(A) are valid and that the rules (MP) and (G) preserve
validity.
e validity of (A) is obvious. We have already observed that (A) and (A) are
valid when discussing the witnessing expansion used in the proof of Lemma ...
As for the rules, when discussing propositional logic, we have already verified that
(MP) preserves validity. us, we just need to argue that (G) preserves validity.
. A Complete Axiomatisation 

Suppose that the formula φ(x) ⊃ ψ is valid, where x ∈ / FV(ψ). In other words,
for any interpretation (M , σ), (M , σ) ⊨ φ(x) ⊃ ψ.
Consider an arbitrary interpretation (M ′ , σ ′ ), where M ′ = (S ′ , ι′ ). We must
show that if (M ′ , σ ′ ) ⊨ ∃x φ(x) then (M ′ , σ ′ ) ⊨ ψ as well.
Suppose that (M ′ , σ ′ ) ⊨ ∃x φ(x). From the semantics of the quantifier ∃, (M ′ , σ ′ ) ⊨
∃x φ(x) iff for some s ∈ S ′ , (M ′ , σ ′ [x 7→ s ]) ⊨ φ(x). From the validity of φ(x) ⊃ ψ,
we can conclude that (M ′ , σ ′ [x 7→ s ]) ⊨ ψ. But, x ∈ / FV(ψ), so σ ′ [x 7→ s ] and σ ′
agree on FV(ψ). From Proposition .., it follows that (M ′ , σ ′ ) ⊨ ψ as well, as re-
quired. ⊣

To establish that the axiomatisation AX-FOL is complete, we need the following


lemma.
Lemma .. Let X be a set of formulas.

(i) If X ⊢ (φ ⊃ ψ) and X ⊢ (¬φ ⊃ ψ) then X ⊢ ψ.

(ii) If X ⊢ (φ ⊃ θ) ⊃ ψ, then X ⊢ (¬φ ⊃ ψ) and X ⊢ (θ ⊃ ψ).

(iii) If x ∈
/ FV(ψ) and X ⊢ [(∃y φ(y) ⊃ φ(x)) ⊃ ψ], then X ⊢ ψ.

P (i) Since [(φ ⊃ ψ) ⊃ ((¬φ ⊃ ψ) ⊃ ψ)] is a tautology, X ⊢ [(φ ⊃ ψ) ⊃


((¬φ ⊃ ψ) ⊃ ψ)]. Given X ⊢ (φ ⊃ ψ) and X ⊢ (¬φ ⊃ ψ), we can apply (MP)
twice to obtain X ⊢ ψ.

(ii) is follows from the fact that [((φ ⊃ θ) ⊃ ψ) ⊃ (¬φ ⊃ ψ)] and [((φ ⊃ θ) ⊃
ψ) ⊃ (θ ⊃ ψ)] are tautologies.

(iii) Suppose X ⊢ [(∃y φ(y) ⊃ φ(x)) ⊃ ψ], where x ∈ / FV(ψ). By (ii), X ⊢


(¬∃y φ(y) ⊃ ψ) and X ⊢ φ(x) ⊃ ψ. We can apply rule (G) to the second
formula and rename bound variables to obtain X ⊢ ∃y φ(y) ⊃ ψ. From (i), it
then follows that X ⊢ ψ.

Lemma .. (Completeness) If X ⊨ φ then X ⊢ φ.

P Suppose that X ⊨ φ. en, X ∪{¬φ} is not first-order satisfiable. By Lemma ..,
X ∪ ¬φ ∪ ΦH ∪ ΦQ ∪ ΦE q is not propositionally satisfiable. From the Compact-
ness eorem for propositional logic, it follows that there is a finite subset Y ⊆fin
X ∪ ΦH ∪ ΦQ ∪ ΦE q such that Y ∪ {¬φ} is not propositionally satisfiable.
Let the formulas in Y be listed in the order α1 , α2 , . . . , αn , β1 , β2 , . . . , β m , such
that:
. A Complete Axiomatisation 

• e sequence α1 , α2 , . . . , αn consists of those members of Y which belong to


X ∪ ΦQ ∪ ΦE q , listed in any order.

• e sequence β1 , β2 , . . . , β m are those members of Y which belong to ΦH .


ese sentences must be listed more carefully.
Recall that LH , the witnessing expansion of L, was constructed as the limit of a
sequence of languages L0 ⊊ L1 ⊊ · · · . For each formula ψ over LH , let the rank
of ψ be the least k such that ψ is a formula over Lk .
e list β1 , β2 , . . . , β m is arranged in such a way that the rank of βi is greater
than or equal to the rank of βi +1 for each i ∈ {1, 2, . . . , m−1}. Recall that each
βi is of the form ∃x ψ(x) ⊃ ψ(cψ(x) )—let us call cψ(x) the witnessing constant
for βi . By arranging the list β1 , β2 , . . . , β m in decreasing rank order, we ensure
that the witnessing constant for βi does not appear in βi+1 , βi +2 , . . . , β m for
each i ∈ {1, 2, . . . , m−1}.

Since Y ∪ {¬φ} is not propositionally satisfiable, we have

α1 ⊃ (α2 ⊃ · · · ⊃ (αn ⊃ (β1 ⊃ (β2 ⊃ · · · ⊃ (β m ⊃ φ) · · · )

to be a tautology, so we can derive

X ⊢ α1 ⊃ (α2 ⊃ · · · ⊃ (αn ⊃ (β1 ⊃ (β2 ⊃ · · · ⊃ (β m ⊃ φ) · · · )

If we replace each witnessing constant in this formula by a distinct variable, the


result
(α1′ ⊃ (α2′ ⊃ · · · ⊃ (α′n ⊃ (β′1 ⊃ (β′2 ⊃ · · · ⊃ (β′m ⊃ φ ′ ) · · · )
is still a tautology. Note however, that φ ′ is the same as φ, since φ is a formula over L
and does not contain any witnessing constants.
Each formula in α1′ , α2′ , . . . , α′n is either a member of X or a logical axiom, so we
may apply (MP) n times to obtain

X ⊢ β′1 ⊃ (β′2 ⊃ · · · ⊃ (β′m ⊃ φ) · · · )

Recall that each formula β′i is of the form ∃x ψ(x) ⊃ ψ(y), where the variable
y does not appear in β′i +1 , β′i +2 , . . . , β′m , φ. We can thus apply Lemma .. (iii) n
times to obtain X ⊢ φ. ⊣
. Variants of the Löwenheim-Skolem Theorem 

. Variants of the Löwenheim-Skolem eorem


e Löwenheim-Skolem and Compactness eorems play a dominant role in the se-
mantics of first-order languages and in applying them to mathematical structures. Here
we use the Compactness eorem to obtain variants of the Löwenheim-Skolem e-
orem.
Let us first prove the following easy consequence of the Compactness eorem.

eorem .. Let X be a set of formulas which has arbitrarily large finite models (i.e.
for every n ∈ N there is a model for X whose cardinality is at least n). en X also has a
countable model.
def
P Let Y = X ∪ {φ≥n | 2 ≤ n} (φ≥n was presented in Section . under the
head Questions of cardinality). Every model of Y is also a model of X and is infinite
in size. erefore we only need to prove that Y is satisfiable. By the Compactness
eorem it suffices to show that every finite subset Y0 of Y is satisfiable. Each such
def
Y0 is a subset of Xn0 = X ∪ {φ≥n | 2 ≤ n ≤ n0 } for an appropriate n0 ∈ N. But
according to hypothesis there is a model for X whose size is at least n0 . is is also a
model for Xn0 and hence Y0 . us we are done. ⊣

We next prove that if a set of formulas has a model of a certain cardinality, it has
models of every larger cardinality.

eorem .. (“Upward” Löwenheim-Skolem eorem) Let X be a set of formulas


which has an infinite model. en for every set A there is a model for X which has at least
as many elements as A (what we mean is that there is an injective map from A into the
underlying set).

P Let L be the language of X and let C be the set of constants in L. For each
/ C ) such that ca ̸= c b for distinct a, b ∈ A. Let L′
a ∈ A let ca be a new constant (ca ∈
be the language L augmented with the set of constants {ca | a ∈ A}. Suppose we show
def
that the set Y = X ∪ {¬(ca ≡ c b ) | a, b ∈ A, a ̸= b } of L′ -formulas is satisfiable.
Consider any model I of Y . Since I ⊨ ¬(ca ≡ c b ) for all distinct a, b ∈ A, it is clear
that I (ca ) ̸= I (c b ) for distinct a, b ∈ A. us {(a, I (ca )) | a ∈ A} is an injective
map from A into the underlying set of I , and the theorem would be proved.
We now turn our attention to proving that Y is indeed satisfiable. By Compactness
it suffices to show that all finite subsets Y0 of Y are satisfiable. But that is very easy
to see. Every such Y0 is a subset of Z = X ∪ {¬(cai ≡ ca j ) | 1 ≤ i , j ≤ n, i ̸= j }
for some appropriate subset {a1 , . . . an } of A. Now let I be some infinite model for
. Elementary Classes 

X . Clearly we can choose n distinct elements b1 , . . . , bn from the underlying set of


I . We can now extend I to L′ by setting I (ai ) = bi for i ≤ n, and giving I (ca )
an arbitrary value, for the other elements a occurring in A. It is easily checked that I ,
extended as above, is a model for Z—and hence for Y0 . We are done. ⊣

e above theorem can be put to good use in the study of algebraic theories. For
instance, let X be the set of group axioms. Since there exist infinite groups, the above
theorem says that there exist arbitrarily large groups. Similarly, there are arbitrarily
large orderings and arbitrarily large fields. While each of these facts can be derived
using algebraic methods specific to the theory, first-order logic provides us with the
framework and with methods to state and prove such results in a general form.

. Elementary Classes


For a set of L-formulas we call
def
ModL X = {I | I is an L-structure and I ⊨ X }

the class of models of X . We drop the superscript when there is no scope for confusion.
We also write Mod φ instead of Mod {φ}.

Definition .. Let C be a class of L-structures.

(i) C is called elementary if there is an L-formula φ such that C = Mod φ.

(ii) C is called ∆-elementary if there is as set X of L-formulas such that C = Mod X .

Every elementary class is ∆-elementary. Conversely, every ∆-elementary class is


∩ of elementary classes. is is because, for any set X of sentences,
the intersection
Mod X = Mod φ.
φ∈X
In the rest of this section we will see some examples of elementary and ∆-elementary
classes of structures. We will also see some examples of classes of structures that are not
elementary, and some which are not ∆-elementary. This gives us an indication of the
expressive power of first-order logic.
For example, the class of fields is elementary since it consists of precisely those
models which satisfy the conjunction of the (finitely many) field axioms. The class
of ordered fields is also elementary since order can also be characterised using a finite
number of axioms. Similarly the class of groups, the class of equivalence relations, the
class of partial orderings, the class of directed graphs, etc. are all elementary.

Fields of prime characteristic and of characteristic 0 Let p be a prime. A field F has char-
acteristic p if 1 + 1 + · · · + 1 = 0 (with p summands). If there is no prime p for which F
has characteristic p, F is said to have characteristic 0. For every prime p the field Z/( p)
of the integers modulo p has characteristic p. The field R of real numbers has characteristic 0.
Let φF be the conjunction of all the field axioms, and let χ p be the formula 1 + 1 + · · · + 1 ≡ 0
(with p summands; we use 0 and 1 both as constant symbols of the language of fields and as
names of the additive and multiplicative identities of fields). Then the class of fields
of characteristic p is exactly Mod (φF ∧ χ p ). Hence this class is elemen-
tary. The class of fields of characteristic 0 is ∆-elementary — it is easily seen to be the
same as Mod ({φF } ∪ {¬χ p | p is prime}). In what follows, we show that it is not
elementary.
Let φ be a sentence in the language of fields which is valid in all fields of charac-
teristic 0, that is
{φF } ∪ {¬χ p | p is prime} ⊨ φ.
By the Compactness eorem there is an n0 such that

{φF } ∪ {¬χ p | p is prime, p < n0 } ⊨ φ.

Hence φ is valid in all fields of characteristic ≥ n0 . Thus we have proved the following
theorem.
eorem .. A sentence (in the language of fields) which is valid in all fields of char-
acteristic 0 is also valid in all fields whose characteristic is sufficiently large.

From this we conclude that the class of fields of characteristic 0 is not elementary:
otherwise there would be a sentence φ (characterising the class) which is valid precisely
in the fields of characteristic 0, whereas by the above theorem φ would also be valid in all
fields of sufficiently large characteristic.

The class of finite structures and the class of infinite structures It is easily seen that the class
of finite L-structures (for a fixed L), the class of finite groups, and the class of finite fields
are not ∆-elementary. The proof is simple: if, for example, the class of finite groups
were of the form Mod X , then X would be a set of formulas having arbitrarily large
finite models (groups of the form Z/( p)) but no infinite model. That would contradict
Theorem ...
On the other hand, the corresponding classes of infinite structures are ∆-elementary.
In fact, let C be any ∆-elementary class of structures, characterised by the set of for-
mulas X . Then the class C ∞ of infinite structures in C is characterised by X ∪
{φ≥n | n ≥ 2}.

Torsion groups A group G is called a torsion group if every element is of finite order,
i.e. if for every a ∈ G there is an n ≥ 1 such that a + a + · · · + a = 0 (with n summands).
An ad-hoc formalisation of this property would be

∀x(x ≡ 0 ∨ x + x ≡ 0 ∨ x + x + x ≡ 0 ∨ · · · ).

However, we may not form infinitely long disjunctions in first-order logic. Indeed, the
class of torsion groups is not even ∆-elementary.
Suppose, for a contradiction, that X is a set of formulas that characterises the class
of torsion groups. Let
Y = X ∪ {¬(x + x + · · · + x ≡ 0) | n ≥ 1},

where the n-th formula has n summands of x.

Every finite subset Y0 of Y has a model. Choose an n0 such that Y0 ⊆ X ∪ {¬(x + x +
· · · + x ≡ 0) | 1 ≤ n < n0 } (again with n summands). Then every cyclic group of order n0
is a model of Y0 if x is interpreted as a generating element. Since every finite subset of Y is
satisfiable, Y is also satisfiable. Let I be a model of Y . Then I (x) does not have finite
order, showing that I is a model of X but not a torsion group, a contradiction.

The class of connected graphs A graph G = (V , E) is said to be connected if, for arbitrary


a, b ∈ V with a ̸= b , there are n ≥ 2 and a1 , . . . , an ∈ V with

a1 = a, an = b , ai E ai +1 for i = 1, . . . , n − 1

(i.e., if for any two distinct elements in V there is a path connecting them). For
n ∈ N, the regular (n + 1)-gon Gn with vertices 0, . . . , n is a connected graph.
More precisely, Gn is the structure (Vn , En ) with Vn = {0, . . . , n} and

En = {(i , i + 1) | i < n} ∪ {(i , i − 1) | 1 ≤ i ≤ n} ∪ {(0, n), (n, 0)}.

We now prove that the class of connected graphs is not ∆-elementary. Assume,
towards a contradiction, that a set X of formulas characterises the class of connected
graphs. For n ≥ 2 we set
ψn = ¬(x ≡ y) ∧ ¬∃x1 . . . ∃xn (x1 ≡ x ∧ xn ≡ y ∧ x1 E x2 ∧ · · · ∧ xn−1 E xn )

and

Y = X ∪ {ψn | n ≥ 2}.

Then every finite subset Y0 of Y has a model: for Y0 choose an n0 such that Y0 ⊆ X ∪
{ψn | 2 ≤ n < n0 }; then G2·n0 is a model of Y0 if x is interpreted by 0 and y by n0 .
Since every finite subset of Y has a model, Y is also satisfiable. Let I be a model of
Y . Then there is no path connecting I (x) and I (y). Therefore I is a model of X
but not a connected graph. This contradicts the assumption on X .
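The formulas ψn used in this argument can also be generated mechanically. The following Python sketch prints ψn as a string in the notation of the text; it is only meant to make the shape of the formula explicit.

    def psi(n):
        # psi_n from the text: x and y are distinct and are not connected by
        # any path that visits n vertices (i.e. a path of length n - 1).
        assert n >= 2
        xs = [f"x{i}" for i in range(1, n + 1)]
        quants = " ".join(f"∃{x}" for x in xs)
        chain = [f"{xs[0]} ≡ x", f"{xs[-1]} ≡ y"]
        chain += [f"{xs[i]} E {xs[i + 1]}" for i in range(n - 1)]
        return f"¬(x ≡ y) ∧ ¬({quants} ({' ∧ '.join(chain)}))"

    print(psi(3))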

. Elementarily Equivalent Structures


In this section, we look at a new notion of equivalence between structures based on the
set of formulas they satisfy. is offers another interesting means of studying the power
of first-order formulas. While in the previous section we were concerned with the
expressive power of first-order logic (i.e., what classes of structures can be charaterised
by first-order formulas?), in this section we look at the distinguishing power of first-
order logic (i.e., when can a first-order formula tell two structures apart?). We can also
sometimes prove facts about expressibility using facts about distinguishability. We will
see some examples of this later. But first we begin by introducing two new notions.

Definition ..

(i) Two structures (for the same language) M and M ′ are called elementarily equiv-
alent (written: M ≡ M ′ ) if for every sentence φ (in the appropriate language) we
have M ⊨ φ iff M ′ ⊨ φ.
def
(ii) For an interpretation M let Th(M ) = {φ | M ⊨ φ}. Th(M ) is called the
(first-order) theory of M .

Lemma .. For two structures M and M ′ ,

M ≡ M ′ iff M ′ ⊨ Th(M ).

P If M ≡ M ′ then, since M ⊨ Th(M ), also M ′ ⊨ Th(M ). Conversely,


suppose M ′ ⊨ Th(M ). Consider a sentence φ. If M ⊨ φ then φ ∈ Th(M ) and
hence M ′ ⊨ φ. If, on the other hand, M ⊭ φ then M ⊨ ¬φ and thus ¬φ ∈ Th(M ).
Hence M ′ ⊨ ¬φ and therefore M ′ ⊭ φ. Thus M ≡ M ′ . ⊣

It can be easily seen by a simple (but probably tedious) induction that any two
isomorphic structures satisfy the same first-order formulas. In other words, they are
elementarily equivalent. e converse is not immediately clear though: Are any two
elementarily equivalent structures isomorphic to each other?

eorem .. For every structure M , the class C = {M ′ | M ′ ≡ M } is ∆-


elementary; in fact C = Mod Th(M ). Moreover, C is the smallest ∆-elementary class
which contains M .

P From Lemma .. it is clear that M ′ ∈ Mod Th(M ) iff M ′ ≡ M . Now


if Mod X is another ∆-elementary class containing M , then M ⊨ X and therefore
M ′ ⊨ X for every M ′ elementarily equivalent to M . Hence {M ′ | M ′ ≡ M } ⊆
Mod X . ⊣

eorem .. If M is infinite then the class of all structures isomorphic to M is not
∆-elementary; in other words, no infinite structure can be characterised up to isomorphism
by a set of first-order formulas.

P Let M = (S, ι) be an infinite structure. Suppose, towards a contradiction, that


X is a set of first-order formulas whose models are exactly the structures isomorphic to
M . X has an infinite model, and hence by the upward Löwenheim-Skolem theorem,
X has a model M ′ with at least as many elements as the power set of S. By Cantor's
theorem there is no injection from the power set of S into S, so M ′ is a model of X but
not isomorphic to M , contrary to what we supposed. This proves the theorem. ⊣

If we choose X = Th(M ) in the above proof, then M and M ′ are elemen-


tarily equivalent but not isomorphic. is shows that not all elementarily equivalent
structures are isomorphic to each other.
eorem .. tells us that a ∆-elementary class contains, together with any given
structure, all elementarily equivalent ones. In certain cases, one can use this to show
that a class C is not ∆-elementary. We simply specify two elementarily equivalent
structures, of which one belongs to C , and the other does not. We illustrate this
method in the case of archimedean fields.
An ordered field F is called archimedean if for every a ∈ F there is an n ∈ N such
that a < 1 + 1 + · · · + 1 (with n summands). For example, the ordered field of rational
numbers and the ordered field R< of reals are archimedean. We show that there is a
non-archimedean ordered field elementarily equivalent to the ordered field of real numbers.
This will prove the following.

eorem .. e class of archimedean fields is not ∆-elementary.

P Let
def
X = Th(R< ) ∪ {0 < x, 1 < x, 2 < x, . . .},

where 0, 1, 2, . . . stand for the terms 0, 1, 1 + 1, . . . in the language of ordered fields. Every


finite subset of X is satisfiable, for instance, by an interpretation of the form (R< , σ),
where σ(x) is a sufficiently large natural number. By the Compactness Theorem there
is a model (M ′ , σ ′ ) of X . Since M ′ ⊨ Th(R< ), M ′ is an ordered field elementarily
equivalent to R< , but (as shown by the element σ ′ (x)) is not archimedean. ⊣

e use of the Compactness eorem in the above is typical. We provide further


examples of its use below, when we turn our attention to the structure N of natural
numbers, and the structure N< of ordered natural numbers. (Note that the signatures
of interest here are {0, s, +, ·} and {0, s , +, ·, <}.) These structures can be characterised
up to isomorphism by a finite set of axioms, usually called Peano’s axioms, which in-
cludes a second-order induction axiom. But in what follows, we show that no system
of first-order axioms can characterise the structures N and N< up to isomorphism.
From this it follows that the induction axiom cannot be formulated as a set of first-order
formulas.
A theory is said to be categorical if all its models are isomorphic to one another.
The (second-order) Peano axiom system is an example of a categorical theory. No
first-order theory can be categorical; this is a consequence of the Löwenheim-Skolem
theorems (both “upward” and “downward”). It is more interesting to study if a theory is
ℵ-categorical for a given cardinal number ℵ. In particular, we are interested in seeing
whether arithmetic is ℵ0 -categorical, i.e. if all the countable models of Th(N) are
isomorphic to one another. e following two theorems say that Th(N) and Th(N< )
are not ℵ0 -categorical.
Let us introduce another bit of terminology before stating the results. A structure
which is elementarily equivalent, but not isomorphic to N is called a nonstandard model
of arithmetic.

eorem .. (Skolem’s eorem) ere is a countable nonstandard model of arith-


metic.

P Let
def
X = Th(N) ∪ {¬(x ≡ 0), ¬(x ≡ 1), ¬(x ≡ 2), . . .},
where 0, 1, 2, . . . stand for the terms 0, s (0), s (s (0)), . . .. Every finite subset of X has
a model of the form (N, σ), where σ(x) is a sufficiently large natural number. By the
Compactness Theorem there is a model (M ′ , σ ′ ) of X , which by the countability of
the language of arithmetic and the Löwenheim-Skolem theorem we may assume to be
at most countable. M ′ is a structure elementarily equivalent to N. Since for m ̸= n
the sentence ¬(m ≡ n) belongs to Th(N), M ′ is infinite and hence countable. M ′

and N are not isomorphic, since an isomorphism from N onto M ′ would have to map
the interpretation of n in the structure N (this turns out to be the number n) to the
interpretation of n in the structure M ′ , and thus σ ′ (x) would not belong to the range
of the isomorphism at all. ⊣

Considering the set Th(N< ) ∪ {¬(x ≡ 0), ¬(x ≡ 1), ¬(x ≡ 2), . . .}, we obtain the
following theorem.

eorem .. ere is a countable nonstandard model of Th(N< ).

What do nonstandard models of Th(N) or Th(N< ) look like? In the following we


gain some insight into the order structure of a nonstandard model M of Th(N< ).
In N< the sentences

∀x(0 ≡ x ∨ 0 < x),


0 < 1 ∧ ∀x(0 < x ⊃ (1 ≡ x ∨ 1 < x)),
1 < 2 ∧ ∀x(1 < x ⊃ (2 ≡ x ∨ 2 < x)), …

hold. ey say that 0 is the smallest element, 1 is the next smallest element after 0, 2
is the next smallest element after 1, and so on. Since these sentences also hold in M ,
the “initial segment” of M looks as follows:

0M  1M  2M  3M  · · ·

In addition, S (the underlying set of M ) contains a further element, say a, since


otherwise M and N< would be isomorphic. Furthermore, N< satisfies a sentence φ which
says that for every element there is an immediate successor and for every element other
than 0 there is an immediate predecessor. From this it follows easily that S contains,
in addition to a, infinitely many other elements which together with a are ordered like
the integers in M :

0M  1M  2M  3M  · · ·    · · ·  a  · · ·

If we consider the element a + a we are led to further elements of S:



0M  1M  2M  3M  · · ·    · · ·  a  · · ·    · · ·  a +M a  · · ·

It is clear that a + a lies in a different copy of Z than a. If they belonged to the same
copy, then a + a = a + n for some natural number n. By the cancellation law for
addition, a = n, which is a contradiction. We can also show that between every two
copies of Z< in M there lies another. is is because N< satisfies a sentence φ which
says that for any two elements m and n, if m < n there exists a “midpoint” p (i.e.
m + n = 2 · p or m + n = 2 · p + 1). The same statement is satisfied by M as well.
If we now consider two elements a and b which lie in different copies of Z< in M ,
they have a midpoint c which has to lie in between a and b but cannot lie in either
of their copies of Z< (since that would imply that a and b lie in the same copy). Thus
any countable nonstandard model of Th(N< ) looks like the rational line (to the right of and
including the point 0) with the point 0 replaced by a copy of N< and every other point
replaced by a copy of Z< .

. An Algebraic Characterisation of Elementary Equivalence


In the previous sections we saw that the notion of elementary equivalence was weaker
than the notion of isomorphism. is leads us to ask whether there is a purely algebraic
notion (not referring to first-order formulas and the like) which is equivalent to elemen-
tary equivalence. is would be of much use, since we can now prove two structures
elementarily equivalent through means other than showing that the two structures sat-
isfy the same formulas. In this section, we provide such an algebraic characterisation
(due to Fraisse) and give examples of its use.

Fraisse’s theorem
In the following, we provide a simple proof of Fraisse’s theorem. We assume that we are
working with the signature of graphs, consisting of a single binary relation symbol R. It
is easy to see that what we prove here can be generalised to all signatures containing only
relational symbols. Later we will show how to extend the result to arbitrary signatures.
We introduce the following notation to simplify the presentation. We use a to
denote a tuple of elements; |a| denotes the number of elements in the tuple. We also
write φ(x) (where x = x1 , . . . , x r ) to indicate the fact that FV(φ) ⊆ {x1 , . . . , x r }. For

a structure M , a tuple a of elements from M , and a formula φ(x) with |x| = |a|, we
write (M , a) ⊨ φ(x) to mean that (M , σ) ⊨ φ, with σ(xi ) = ai for all i ≤ |x|.

Definition .. Let G = (V , E) and H = (W , F ) be two graphs. Let a and b be


finite tuples of elements from V and W respectively, such that |a| = |b |. We say that (G, a)
and (H , b ) are m-equivalent—in symbols (G, a) ≡ m (H , b )—if for every formula φ(x)
whose quantifier rank (the maximum nesting depth of quantifiers in the formula) is not
more than m, (G, a) ⊨ φ(x) if and only if (H , b ) ⊨ φ(x).
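Quantifier rank is a purely syntactic measure, and it may help to see it computed. The following Python sketch assumes formulas are represented as nested tuples (a hypothetical representation chosen here, not one fixed by the text) and returns the maximum nesting depth of quantifiers.

    def qr(phi):
        # Quantifier rank of a formula given as a nested tuple, e.g.
        # ('E', 'x', 'y'), ('not', f), ('or', f, g), ('exists', 'y', f).
        op = phi[0]
        if op in ('exists', 'forall'):
            return 1 + qr(phi[2])
        if op == 'not':
            return qr(phi[1])
        if op in ('or', 'and', 'implies'):
            return max(qr(phi[1]), qr(phi[2]))
        return 0  # atomic formulas have quantifier rank 0

    # qr of  ∃y (x E y ∧ ∀z ¬(z E y))  is 2:
    f = ('exists', 'y', ('and', ('E', 'x', 'y'),
                         ('forall', 'z', ('not', ('E', 'z', 'y')))))
    assert qr(f) == 2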

Note that for the above definition to make sense, |x| should be equal to |a|. But
we will not crib about such minor details here and in what follows.
We now motivate the notion of m-isomorphism. e least we require is that any
two m-isomorphic graphs are m-equivalent. Consider two graphs G = (V , E) and
H = (W , F ), a from V and b from W . Suppose that (G, a) ̸≡ m (H , b ). Let us
say that there is a formula φ(x, y) with quantifier rank ≤ m − 1 such that (G, a) ⊨
∃yφ(x, y) and (H , b ) ⊭ ∃yφ(x, y). is means that for some c ∈ V and for all
d ∈ W , (G, ac) ⊨ φ(x, y) and (H , b d ) ⊭ φ(x, y). us there is c ∈ V such that
for all d ∈ W , (G, ac) ̸≡ m−1 (H , b d ). In the symmetric case involving the universal
quantifier, we infer that there is d ∈ W such that for all c ∈ V , (G, ac) ̸≡ m−1 (H , b d ).
We have proved the following
Lemma .. Suppose that for every c ∈ V there is a d ∈ W such that (G, ac) ≡ m−1
(H , b d ) and that for every d ∈ W there is a c ∈ V such that (G, ac) ≡ m−1 (H , b d ).
Then (G, a) ≡ m (H , b ).

This lemma leads to the following definition.

Definition .. Let G = (V , E) and H = (W , F ) be two graphs, and let a and b


be tuples of elements from V and W respectively. We say that (G, a) is 0-isomorphic to
(H , b )—in symbols, (G, a) ≅0 (H , b )—if a 7−→ b is a partial isomorphism from G to
H (i.e. for any i , j ≤ |a|, ai = a j iff bi = b j , and E ai a j iff F bi b j ).
For m > 0, we say that (G, a) ≅m (H , b ) if and only if

• for all c ∈ V , there is a d ∈ W such that (G, ac) ≅m−1 (H , b d ), and

• for all d ∈ W , there is a c ∈ V such that (G, ac) ≅m−1 (H , b d ).

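For finite graphs the back-and-forth clauses of this definition can be checked directly by recursion. The following Python sketch does exactly that; graphs are pairs (V, E) with E a set of ordered pairs, and the tuples a, b are Python tuples. It is a naive, exponential-time rendering of the definition, intended only for small examples.

    from itertools import product

    def partial_iso(G, H, a, b):
        # Is a_i |-> b_i a partial isomorphism between graphs G = (V, E) and
        # H = (W, F)?  E and F are sets of ordered pairs of vertices.
        E, F = G[1], H[1]
        if len(a) != len(b):
            return False
        for i, j in product(range(len(a)), repeat=2):
            if (a[i] == a[j]) != (b[i] == b[j]):
                return False
            if ((a[i], a[j]) in E) != ((b[i], b[j]) in F):
                return False
        return True

    def m_isomorphic(G, H, a, b, m):
        # Decide (G, a) ≅_m (H, b) for finite graphs by unwinding the
        # back-and-forth clauses of the definition (exponential in m).
        if m == 0:
            return partial_iso(G, H, a, b)
        V, W = G[0], H[0]
        forth = all(any(m_isomorphic(G, H, a + (c,), b + (d,), m - 1)
                        for d in W) for c in V)
        back = all(any(m_isomorphic(G, H, a + (c,), b + (d,), m - 1)
                       for c in V) for d in W)
        return forth and back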
It is easy to see that (G, a) ≅0 (H , b ) iff (G, a) ≡0 (H , b ). This can be used
as the base case in a proof by induction that for any m, if (G, a) ≅m (H , b ) then

(G, a) ≡ m (H , b ). e induction step follows immediately from the above definition


and the previous lemma.
Fraisse’s theorem says that the other direction also holds. For proving that we need
the following lemma.

Lemma .. ere are only finitely many inequivalent formulas of quantifier depth
≤ m having at most k free variables.

P Let C (m, k) denote the number of formulas of quantifier depth ≤ m having
at most k free variables. (To be precise, C (m, k) is the size of a maximal set of pairwise
inequivalent formulas each of which is of quantifier depth ≤ m and has at most k free
variables.) We prove by induction on m that for all k, C (m, k) is finite.
For any k, there are exactly p = 2 · k 2 atomic formulas, xi ≡ x j and Rxi x j
p
where i, j ≤ k. us there are at most 22 inequivalent quantifier-free formulas. us
C (0, k) is finite.
For the case where m > 0, we know by the induction hypothesis that C (m − 1, k)
is finite for all k. Every formula of quantifier depth ≤ m is equivalent to a boolean combination
of formulas of quantifier depth ≤ m − 1 and formulas of the form ∀yφ(x, y) where φ is
of quantifier depth ≤ m − 1. Thus C (m, k) ≤ 2^(2^(2·C (m−1,k+1))) and is hence finite. ⊣

eorem .. If (G, a) ≡ m (H , b ) then (G, a) ∼


= m (H , b ).

P When m = 0, the theorem is immediate, as has already been noted.


Suppose m > 0 and that (G, a) ∼ ̸ m (H , b ). en one of the folowing two cases
=
holds and in both cases we prove that (G, a) ̸≡ m (H , b ).

• There is c ∈ V such that for all d ∈ W , (G, ac) ≇m−1 (H , b d ). By the induc-
tion hypothesis, (G, ac) ̸≡ m−1 (H , b d ) for all d ∈ W . Thus for each d ∈ W , there is a for-
mula φd (x, y) of quantifier depth ≤ m − 1 such that (G, ac) ⊨ φd (x, y) and
(H , b d ) ⊭ φd (x, y). Since there are only finitely many inequivalent φd ’s, their
conjunction is equivalent to a formula ψ(x, y) of quantifier depth ≤
m−1. Now (G, ac) ⊨ ψ(x, y) but for all d ∈ W , (H , b d ) ⊭ ψ(x, y). Therefore
(G, a) ⊨ ∃yψ(x, y) but (H , b ) ⊭ ∃yψ(x, y). This shows that (G, a) ̸≡ m (H , b ).

• There is d ∈ W such that for all c ∈ V , (G, ac) ≇m−1 (H , b d ). By the induc-
tion hypothesis, (G, ac) ̸≡ m−1 (H , b d ) for all c ∈ V . Thus for each c ∈ V , there is a for-
mula φc (x, y) of quantifier depth ≤ m − 1 such that (G, ac) ⊨ φc (x, y) and
(H , b d ) ⊭ φc (x, y). Since there are only finitely many inequivalent φc ’s, their
disjunction is equivalent to a formula ψ(x, y) of quantifier depth
≤ m − 1. Now for all c ∈ V , (G, ac) ⊨ ψ(x, y) but (H , b d ) ⊭ ψ(x, y). There-
fore (G, a) ⊨ ∀yψ(x, y) but (H , b ) ⊭ ∀yψ(x, y). This shows that (G, a) ̸≡ m
(H , b ). ⊣

We say that (G, a) is finitely isomorphic to (H , b ) – (G, a) ≅f (H , b ) in symbols
– iff (G, a) is m-isomorphic to (H , b ) for all m ≥ 0. From the definitions and the
previous theorem, the following immediately follows, giving us the required algebraic
characterisation of elementary equivalence.
eorem .. (Fraisse’s theorem) For any two graphs G and H , and tuples a and b
of the same length from G and H respectively,

(G, a) ∼
= f (H , b ) iff (G, a) ≡ (H , b ).

Extending the theorem to arbitrary signatures


It is clear that the definitions and proofs in the above section extend to arbitrary (fi-
nite) relational signatures (signatures containing only relation symbols) almost ver-
batim. e definition of partial isomorphism needs to be extended, but that is fairly
straightforward. Note also that for any relational signature, there are only finitely many
inequivalent formulas with k free variables and quantifier depth ≤ m. is property
does not hold for signatures containing function symbols.
Let L be an arbitrary (finite) signature. For every n-ary function symbol f occur-
ring in L, define a new (n + 1)-ary relation symbol F and, for each constant symbol c
occurring in L, define a new unary relation C . Let L r consist of the relation symbols
from L together with the new relation symbols. L r is relational. For an L-structure
M , let M r be the L r structure obtained from M , with the following interpretation
for the new relation symbols: F M r (a1 , . . . , an , a) iff f M (a1 , . . . , an ) = a, and C M r (a)
iff c M = a. One can systematically construct an L r -formula φ r for every L-formula
φ such that M ⊨ φ iff M r ⊨ φ r . For example, if φ is the formula f ( f (g (c, x))) ≡ y,
then φ r is the formula ∃z1 z2 z3 [C (z1 )∧G(z1 , x, z2 )∧ F (z2 , z3 )∧ F (z3 , y)]. We leave
it as an exercise to the reader to formally state and prove the result. From the above
considerations, it follows that (M , a) ≡ (M ′ , b ) iff (M r , a) ≡ ((M ′ ) r , b ). (But
note that it is not the case that (M , a) ≡ m (M ′ , b ) iff (M r , a) ≡ m ((M ′ ) r , b ) for
all m.)
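The translation of terms sketched in the example can be made mechanical: each proper subterm is named by a fresh existentially quantified variable. The following Python sketch (with a hypothetical tuple representation of terms) reproduces the example; it handles only equations of the form t ≡ y, which is enough to convey the idea.

    import itertools

    _fresh = itertools.count(1)

    def relationalise(term, result_var):
        # Translate the equation  term ≡ result_var  into conjuncts over the
        # relational signature L^r, naming each proper subterm by a fresh
        # variable.  A term is a variable (str) or a tuple (symbol, arg1, ...);
        # constants are 0-ary tuples such as ('c',).
        if isinstance(term, str):
            return [], [f"{term} ≡ {result_var}"]
        symbol, *args = term
        new_vars, conjuncts, arg_names = [], [], []
        for arg in args:
            if isinstance(arg, str):      # variables may be used directly
                arg_names.append(arg)
                continue
            z = f"z{next(_fresh)}"        # fresh name for the subterm's value
            nv, cs = relationalise(arg, z)
            new_vars += nv + [z]
            conjuncts += cs
            arg_names.append(z)
        if args:   # n-ary function symbol f becomes an (n+1)-ary relation F
            conjuncts.append(f"{symbol.upper()}({', '.join(arg_names + [result_var])})")
        else:      # constant symbol c becomes a unary relation C
            conjuncts.append(f"{symbol.upper()}({result_var})")
        return new_vars, conjuncts

    vs, cs = relationalise(('f', ('f', ('g', ('c',), 'x'))), 'y')
    print("∃" + " ∃".join(vs) + " [" + " ∧ ".join(cs) + "]")
    # prints: ∃z3 ∃z2 ∃z1 [C(z3) ∧ G(z3, x, z2) ∧ F(z2, z1) ∧ F(z1, y)]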
We also need to extend the definition of partial isomorphism to arbitrary signa-
tures. Here it is.

Definition .. Let M = (S, ι) and M ′ = (S ′ , ι′ ) be two L-structures and let p be


a partial function from S to S ′ . We call p a partial isomorphism iff:

• p is injective.

• p is a homomorphism in the following sense:

– For n-ary relation symbols P in L and a1 , . . . , an ∈ dom( p),

P M (a1 , . . . , an ) iff P M ′ ( p(a1 ), . . . , p(an )).

– For n-ary function symbols f in L and a1 , . . . , an , a ∈ dom( p),

f M (a1 , . . . , an ) = a iff f M ′ ( p(a1 ), . . . , p(an )) = p(a).

– For constant symbols c in L and a ∈ dom( p),

c M = a iff c M ′ = p(a).

From the above definition it is clear that a given p is a partial isomorphism from
M to M ′ iff it is a partial isomorphism from M r to (M ′ ) r . Thus it follows that
M ≅m M ′ iff M r ≅m (M ′ ) r , for any given m. We can now easily prove Fraisse's
theorem for arbitrary finite signatures: M ≅f M ′ iff M r ≅f (M ′ ) r iff M r ≡
(M ′ ) r iff M ≡ M ′ .

Examples
We give two examples in this section, which illustrate the use of the easier half of
Fraisse’s theorem.

Example  Suppose L = (s , 0) where s is a unary “successor” function symbol and 0


is a constant. Let X consist of the “successor axioms”:

• ∀x(¬(x ≡ 0) ≡ ∃y(s (y) ≡ x)),

• ∀x∀y((s (x) ≡ s(y)) ⊃ (x ≡ y)), and


• for every m ≥ 1 : ∀x¬(s m (x) ≡ x). (Here s 0 (x) = x and, for all m ≥ 0, s m+1 (x) =
s (s m (x)).)

The set of natural numbers with the usual successor function is a model of X . We want to


prove that any two models of X are elementarily equivalent. Towards that end we
prove that any two models of X are finitely isomorphic. If any two models of X are
elementarily equivalent, then for any sentence φ, either all models of X satisfy φ or
all models of X satisfy ¬φ. us for any sentence φ, X ⊨ φ or X ⊨ ¬φ. us X is
an example of a so-called complete theory, a theory which can decide any statement one
way or the other. A further point to note is that X is a recursive set of sentences, and
so forms the basis of a procedure to decide the truth or falsity of any L-sentence φ in
the structure N. Simply enumerate longer and longer proofs which use formulas from
X as additional axioms, apart from the standard axioms and rules. Since either X ⊢ φ
or X ⊢ ¬φ, eventually a proof of φ or ¬φ will turn up. Halt and announce the result
at that point.
Let us return to our present concern, which is that of proving any two models of
X finitely isomorphic. First, we fix the following notation: For a model M = (S, ι)
of X and a ∈ S we set a (m) = f m (a), where f = s M . For every n ∈ N we define a
“distance function” dn on S × S by

dn (a, a ′ ) = m if a (m) = a ′ and m ≤ 2^n ;
dn (a, a ′ ) = −m if (a ′ ) (m) = a and m ≤ 2^n ;
dn (a, a ′ ) = ∞ otherwise.

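In the standard model N with the usual successor function, the distance function dn is easy to compute; the following Python sketch (with None standing for ∞) may help fix the definition. The cutoff 2^n is the one used in the definition above.

    def d(a, b, n):
        # d_n in the standard model (N, successor): the signed successor
        # distance from a to b, provided it is at most 2**n; None stands for ∞.
        if a <= b and b - a <= 2 ** n:
            return b - a          # a^(m) = b with m = b - a
        if b < a and a - b <= 2 ** n:
            return -(a - b)       # b^(m) = a
        return None

    assert d(3, 7, 2) == 4        # 7 = s^4(3) and 4 <= 2^2
    assert d(7, 3, 2) == -4
    assert d(0, 9, 3) is None     # 9 > 2^3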
Now suppose M = (S, ι) and M ′ = (S ′ , ι′ ) are two models of X . For notational
simplicity we will assume that every tuple a we mention below contains 0M as the
first element and every tuple b contains 0M ′ as the first element. Let a and b be
tuples of elements from M and M ′ respectively, both having the same number of
elements. We say that (M , a) and (M ′ , b ) are “dn -equivalent” if (M , a) ≅0 (M ′ , b )
and for all i , j ≤ |a|, dn (ai , a j ) = dn (bi , b j ). We wish to prove that whenever (M , a)
and (M ′ , b ) are dn -equivalent, (M , a) ≅n (M ′ , b ). The base case is quite easy, since
whenever (M , a) and (M ′ , b ) are d0 -equivalent (recall that a begins with 0M and b
with 0M ′ ), we have (M , a) ≅0 (M ′ , b ) by definition.
Suppose (M , a) and (M ′ , b ) are dn+1 -equivalent. Consider
an arbitrary c ∈ S. Now it could be the case that for some i ≤ |a|, |dn (ai , c)| ≤ 2^n . If
that is so, choose d ∈ S ′ with dn (bi , d ) = dn (ai , c). It is easy to check that (M , ac)
and (M ′ , b d ) are dn -equivalent. If |dn (ai , c)| > 2^n for all i ≤ |a| then choose d ∈ S ′
such that |dn (bi , d )| > 2^n for all i (such an element d must exist since every model
of X is infinite!). Now again it is easy to see that (M , ac) and (M ′ , b d ) are dn -
equivalent. But by the induction hypothesis (on n) this means that for all c ∈ S there
exists a d ∈ S ′ such that (M , ac) ≅n (M ′ , b d ). By symmetric reasoning, we can
show that for any d ∈ S ′ , there exists a c ∈ S such that (M , ac) ≅n (M ′ , b d ). These two
facts imply, by definition, that (M , a) ≅n+1 (M ′ , b ).

In earlier sections, we showed that some classes of structures are not ∆-elementary.
The arguments involved the Compactness Theorem and used infinite structures. With
the techniques at our disposal now, we can show that certain properties cannot be
expressed by a first-order sentence, even if we restrict ourselves to finite structures. We
illustrate this approach by the following example.

eorem .. Let L be the language of graphs. ere is no L-sentence whose finite
models are the finite connected graphs. (Hence, in particular, the class of connected graphs
is not elementary.)

P For k ≥ 0 let Gk = (Vk , Ek ) be the graph corresponding to the regular (k +1)-
gon, where
Vk = {0, . . . , k}
and
Ek = {(i , i + 1) | i < k} ∪ {(i , i − 1) | 1 ≤ i ≤ k} ∪ {(0, k), (k, 0)},
and let Hk = (Wk , Fk ) consist of two disjoint copies of Gk , say,

Wk = {0, . . . , k} × {0, 1}

and
Fk = {((i , 0), ( j , 0)) | (i , j ) ∈ Ek } ∪ {((i , 1), ( j , 1)) | (i , j ) ∈ Ek }.
We claim that:

For all k ≥ 2^m : Gk ≅m Hk .

Then we are done. In fact, let φ be an L-sentence and m be the quantifier rank of
φ. Then Gk ≅m Hk for k = 2^m , i.e. Gk ≡ m Hk , and therefore Gk ⊨ φ iff
Hk ⊨ φ. Since Gk is connected but Hk is not, the class of finite models of φ
cannot be identical with the class of all finite connected graphs.
For proving that Gk ≅m Hk for all k ≥ 2^m , we proceed as follows. For fixed
m ≥ 0 and k ≥ 2^m , we define “distance functions” d on Vk × Vk and d ′ on Wk × Wk
as follows:

d (a, b ) = the length of the shortest path connecting a and b in Gk , if this length is ≤ 2^m , and ∞ otherwise;

d ′ ((a, i ), (b , j )) = d (a, b ) if i = j , and ∞ otherwise.

We say that (Gk , a) and (Hk , b ) are (d , d ′ )-equivalent iff for all i , j ≤ |a|, d (ai , a j ) =
d ′ (bi , b j ). Just as in the previous example, we can prove that whenever (Gk , a) and
(Hk , b ) are (d , d ′ )-equivalent, (Gk , a) ≅m (Hk , b ). ⊣
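The claim can be checked mechanically for small parameters using the m-isomorphism checker sketched earlier. The following Python sketch builds Gk and Hk and verifies, for instance, that G4 ≅2 H4 (here k = 4 = 2^2), even though G4 is connected and H4 is not.

    def cycle_graph(k):
        # G_k: the regular (k+1)-gon on vertices 0, ..., k, as in the proof.
        V = set(range(k + 1))
        E = ({(i, i + 1) for i in range(k)} | {(i + 1, i) for i in range(k)}
             | {(0, k), (k, 0)})
        return (V, E)

    def two_cycles(k):
        # H_k: two disjoint copies of G_k, vertices tagged with 0 or 1.
        V0, E0 = cycle_graph(k)
        V = {(v, c) for v in V0 for c in (0, 1)}
        E = {((u, c), (v, c)) for (u, v) in E0 for c in (0, 1)}
        return (V, E)

    G, H = cycle_graph(4), two_cycles(4)
    print(m_isomorphic(G, H, (), (), 2))   # expected: True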

Ehrenfeucht Games
The algebraic description of elementary equivalence is well-suited for many purposes.
However, it lacks the intuitive appeal of a game-theoretical characterisation due to
Ehrenfeucht, which we look at in the present section.
Let L be an arbitrary signature and let M = (S, ι) and M ′ = (S ′ , ι′ ) be L-
structures. To simplify the formulation we assume S ∩ S ′ = ∅. The Ehrenfeucht
game G (M , M ′ ) corresponding to M and M ′ is played by two players, Spoiler and
Duplicator, according to the following rules:
Each play of the game begins with Spoiler choosing a natural number r ≥ 1; r
is the number of subsequent moves each player has to make in the course of the play.
These subsequent moves are begun by Spoiler, and both players move alternately.
Each move consists of choosing an element from S ∪ S ′ . If Spoiler chooses an element
ai ∈ S in his i -th move, then Duplicator must choose an element bi ∈ S ′ in his i -
th move. If Spoiler chooses an element bi ∈ S ′ in his i -th move, then Duplicator
must choose an element ai ∈ S in his i -th move. After the r -th move of Duplicator
the play is completed. Altogether some number r ≥ 1, elements a1 , . . . , a r ∈ S and
b1 , . . . , b r ∈ S ′ have been chosen. Duplicator has won the play iff (M , a1 . . . a r ) ≅0 (M ′ , b1 . . . b r ).
We say that Duplicator has a winning strategy in G (M , M ′ ) and write “Duplicator
wins G (M , M ′ )” if it is possible for him to win each play. (Following Ebbinghaus,
Flum, and omas, we omit an exact definition of the notion of “winning strategy”.)

Lemma .. M ∼
= f M ′ iff Duplicator wins G (M , M ′ ).

P We prove a more general statement: (M , a) ∼


= f (M ′ , b ) iff Duplicator wins
G (M , a, M ′ , b ).
Suppose (M , a) ∼ = f (M ′ , b ). We describe a winning strategy for Duplicator:
If Spoiler chooses the number r at the beginning of a G (M , a, M ′ , b )-play, then
for i = 1, . . . , r Duplicator should choose the elements ci ∈ S (or respectively di ∈ S ′ )
so as to maintain (M , ac1 . . . ci ) ∼
= r −i (M ′ , b d1 · · · di ). at we can always do this

follows from the fact that (M , a) ≅ r (M ′ , b ). For i = r the invariant gives
(M , ac1 . . . c r ) ≅0 (M ′ , b d1 · · · d r ), i.e. Duplicator wins the play; so Duplicator
has a winning strategy for the game.
Conversely, suppose (M , a) ≇f (M ′ , b ). Then (M , a) ≇ r (M ′ , b ) for some r ≥ 0,
and we describe a winning strategy in r moves for Spoiler in G (M , a, M ′ , b ), by
induction on r . If (M , a) ≇0 (M ′ , b ) then it is immediate that
Spoiler wins all plays in G (M , a, M ′ , b ), even the play with no moves. Suppose
(M , a) ≇ r (M ′ , b ) with r > 0. Then Spoiler chooses r at the beginning of the game. Now it
is clear that either there is a c ∈ S such that for all d ∈ S ′ , (M , ac) ≇ r −1 (M ′ , b d ),
or there is a d ∈ S ′ such that for all c ∈ S, (M , ac) ≇ r −1 (M ′ , b d ). Suppose the
former. Then Spoiler chooses the element c such that for all d ∈ S ′ , (M , ac) ≇ r −1
(M ′ , b d ). From this and the induction hypothesis it follows that no matter what d
Duplicator plays, Spoiler has a winning strategy in r − 1 moves in G (M , ac, M ′ , b d ).
Similarly in the case where there is a d ∈ S ′ such that for all c ∈ S, (M , ac) ≇ r −1
(M ′ , b d ). Thus Spoiler has a winning strategy in r moves in the original game. ⊣

The above lemma and Fraisse's theorem together yield the following:


Theorem .. (Ehrenfeucht's Theorem) Let L be a finite signature. Then for any
L-structures M and M ′ :

M ≡ M ′ iff Duplicator wins G (M , M ′ ).

. Decidability
We consider in this section the satisfiability problem for first-order logic. This is the
problem of determining whether a given first-order formula is satisfiable. We saw
earlier that the corresponding problem for propositional logic, many modal logics,
and dynamic logic is decidable. In contrast, the problem is undecidable for first-order
logic. We present a particularly simple proof of this result here. Our undecidability
proof proceeds by reducing the reachability problem for two-counter machines to the
satisfiability problem.
A two-counter machine is a finite-state automaton equipped with two counters
which can contain arbitrary natural numbers. Formally it is a tuple M = (Q, q0 , ∆, F )
where:
• Q is a finite set of states,

• q0 ∈ Q is the initial state,

• F ⊆ Q is the set of final states, and



• ∆ ⊆ Q × {0, 1}2 × Q × {−1, 0, 1}2 is the transition relation satisfying the fol-
lowing condition:

– for all (q, z1 , z2 , q ′ , δ1 , δ2 ) ∈ ∆ and i ∈ {1, 2}, if δi = −1 then zi = 1.

In a transition (q, z1 , z2 , q ′ , δ1 , δ2 ), for i ∈ {1, 2}, zi = 0 denotes the fact that


the value of the i -th counter is zero, and zi = 1 denotes the fact that the value of the
i -th counter is nonzero. δi specifies the value to be added to the i -th counter. The
condition on transitions reflects the fact that we can only decrement positive counters.
A configuration of a two-counter machine M is a triple (q, m1 , m2 ) ∈ Q × N ×
N. For a transition t = (r, z1 , z2 , r ′ , δ1 , δ2 ) and configurations (q, m1 , m2 ) and
(q ′ , m1′ , m2′ ), we write (q, m1 , m2 ) −t→ (q ′ , m1′ , m2′ ) exactly when q = r , q ′ = r ′ , and for
i ∈ {1, 2}, (i) zi = 0 iff mi = 0 and (ii) mi′ = mi + δi .

We say that (q, m1 , m2 ) −→∗ (q ′ , m1′ , m2′ ) iff there is a sequence of transitions lead-
ing from (q, m1 , m2 ) to (q ′ , m1′ , m2′ ). The configuration (q0 , 0, 0) is called the initial
configuration. (q, m1 , m2 ) is called a final configuration if q ∈ F . The reachability
problem for two-counter machines is the problem of determining whether a final con-
figuration is reachable from the initial configuration. We assume that the reader is
familiar with the fact that this problem is undecidable.
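For experimentation, a two-counter machine is easy to simulate; only the general reachability question is undecidable. The following Python sketch performs a breadth-first search over configurations with the counters artificially capped (the cap and the representation are choices made here, not part of the text): a positive answer is conclusive, a negative one only means nothing was found within the bound.

    from collections import deque

    def reaches_final(Q, q0, delta, F, max_counter=50):
        # Bounded breadth-first search for a reachable final configuration of
        # a two-counter machine (Q, q0, delta, F), where delta contains tuples
        # (q, z1, z2, q', d1, d2) as in the text.  Counters are capped at
        # max_counter, so only a True answer is conclusive.
        start = (q0, 0, 0)
        seen, frontier = {start}, deque([start])
        while frontier:
            q, m1, m2 = frontier.popleft()
            if q in F:
                return True
            for (r, z1, z2, r2, d1, d2) in delta:
                if r != q or (z1 == 0) != (m1 == 0) or (z2 == 0) != (m2 == 0):
                    continue      # transition not enabled in this configuration
                nxt = (r2, m1 + d1, m2 + d2)
                if max(nxt[1], nxt[2]) <= max_counter and nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return False

    # A tiny example: increment counter 1 once and move to a final state.
    assert reaches_final({'q0', 'qf'}, 'q0', {('q0', 0, 0, 'qf', 1, 0)}, {'qf'})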
Given a two-counter machine M one can define a first-order language LM and a
first-order formula φM over LM such that a final configuration is reachable in M iff
φM is valid. It is easy to see now that the satisfiability problem for first-order logic
is undecidable. Suppose, on the contrary, that it is decidable. en we could decide
the reachability problem for two-counter machines as follows: given any two-counter
machine M , construct φM and declare a final configuration to be reachable exactly
when ¬φM is not satisfiable.

The reduction
Let M = (Q, q0 , ∆, F ) be a given two-counter machine. Then LM is defined to be
(CM , FM , RM ) where:

• CM = {q | q ∈ Q} ∪ {0},

• FM = {s} with #(s) = 1, and

• RM = {conf } with #(conf ) = 3.

For each t = (q, z1 , z2 , q ′ , δ1 , δ2 ) ∈ ∆ we define a formula φ t . Rather than giving


the most general definition, we show the construction for two representative examples:

Let t = (q, 0, 1, q ′ , 1, −1). Then φ t is the following formula:

∀x ∀y [(conf (q, 0, x) ∧ x ≡ s(y)) ⊃ conf (q′ , s(0), y)].

Let t = (q, 1, 1, q ′ , 0, 1). Then φ t is the following formula:

∀x ∀y [(conf (q, x, y) ∧ ∃x ′ ∃y ′ (x ≡ s(x ′ ) ∧ y ≡ s(y ′ ))) ⊃ conf (q′ , x, s(y))].

Now we define the following formulas:

• init = conf (q0 , 0, 0).

• final = ⋁q∈F ∃x ∃y conf (q, x, y).

• φ∆ = ⋀t ∈∆ φ t .

• φM = (φ∆ ∧ init) ⊃ final.

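The construction of φt can be made uniform. The following Python sketch generates φt as a string for an arbitrary transition; it uniformly introduces a universally quantified predecessor variable for every counter tested nonzero, which is a slight variation of (but equivalent to) the ∃-guard used in the second example above.

    def phi_t(t):
        # String rendering of phi_t for a transition t = (q, z1, z2, q', d1, d2).
        # A counter tested zero is written as the term 0; a counter tested
        # nonzero is written as a variable x_i with a universally quantified
        # predecessor y_i (x_i ≡ s(y_i)), which also provides the term for a
        # decrement.
        q, z1, z2, q2, d1, d2 = t
        univ, guards, pre, post = [], [], [], []
        for i, (z, d) in enumerate(((z1, d1), (z2, d2)), start=1):
            if z == 0:                          # counter i is zero here
                pre.append('0')
                post.append('s(0)' if d == 1 else '0')
            else:                               # counter i is nonzero here
                x, y = f"x{i}", f"y{i}"
                univ += [x, y]
                guards.append(f"{x} ≡ s({y})")
                pre.append(x)
                post.append({1: f"s({x})", 0: x, -1: y}[d])
        ante = " ∧ ".join([f"conf({q}, {pre[0]}, {pre[1]})"] + guards)
        body = f"({ante}) ⊃ conf({q2}, {post[0]}, {post[1]})"
        return f"∀{' ∀'.join(univ)} [{body}]" if univ else body

    print(phi_t(('q', 0, 1, "q'", 1, -1)))
    # ∀x2 ∀y2 [(conf(q, 0, x2) ∧ x2 ≡ s(y2)) ⊃ conf(q', s(0), y2)]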
The following two lemmas prove that the reduction is correct. (In what follows, we
use m as an abbreviation for the term s m (0).)

Lemma .. For every configuration (q, m1 , m2 ) of M ,



(q0 , 0, 0)−→(q, m1 , m2 ) =⇒⊨ (φ∆ ∧ init) ⊃ conf(q, m , m ).

In particular, if a final configuration is reachable in M then φM is valid.



P We prove that whenever (q0 , 0, 0)−→(q, m1 , m2 ), it is also the case that

⊨ (φ∆ ∧ init) ⊃ conf (q, m , m ).

We do this by induction on the number of steps it takes to reach (q, m1 , m2 ).


Basis: e only configuration reachable in zero steps is (q0 , 0, 0) itself, and sure enough,
⊨ (φ∆ ∧ init) ⊃ conf (q , , ).
∗ t
Induction step: Suppose (q0 , 0, 0)−→(q ′ , m1′ , m2′ )−→(q, m1 , m2 ) and ⊨ (φ∆ ∧init) ⊃
conf (q′ , m′ , m′ ). Now it is an easy exercise to check that ⊨ (conf (q′ , m′ , m′ )∧φ t ) ⊃
conf (q, m , m ). It follows that ⊨ (φ∆ ∧ init) ⊃ conf (q, m , m ), as desired. ⊣

Lemma .. If φM is valid, then a final configuration is reachable in M .



P We prove the desired statement in the contrapositive form. Suppose no fi-


nal state is reachable from the initial configuration. Now we define an LM structure
M = (S, ι) as follows: S = N; for each q ∈ Q, ι(q) is an arbitrary distinct natural
def
number; ι() = 0; ι(s) is the successor function on N; and ι(conf ) = {(ι(q′ ), m1′ , m2′ ) |

(q0 , 0, 0)−→(q ′ , m1′ , m2′ )}. It is again an easy exercise to check that M ⊨ φ t for all
t ∈ ∆, and of course, M ⊨ init ∧ ¬final. us we see that M ⊭ (φ∆ ∧ init) ⊃ final.
It follows that φM is not valid.
us we see that if φM is valid, then a final configuration is reachable in M . ⊣

The above two lemmas, in conjunction with the fact that the reachability problem
for two-counter machines is undecidable, immediately yield the following theorem.

eorem .. e satisfiability problem (as also the validity problem) for first-order
logic is undecidable.

The above reduction uses a language with a unary function symbol, a ternary relation
symbol, and some constants. Using some coding tricks, we can get by with using just
the ternary relation symbol and constants. Working out the minimal expressive power
which leads to undecidability is an interesting problem, which has generated a lot of
research over the years. In fact, there are books solely devoted to the study of the status
of decidability of various fragments of first-order logic.
