Fields of Logic and Computation: Essays Dedicated to Yuri Gurevich on the Occasion of His 70th Birthday
Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Andreas Blass · Nachum Dershowitz · Wolfgang Reisig (Eds.)
Fields of Logic
and Computation
Essays Dedicated to Yuri Gurevich
on the Occasion of His 70th Birthday
Volume Editors
Andreas Blass
University of Michigan, Mathematics Department
Ann Arbor, MI 48109-1043, USA
E-mail: [email protected]
Nachum Dershowitz
Tel Aviv University, School of Computer Science
Ramat Aviv, Tel Aviv 69978, Israel
E-mail: [email protected]
Wolfgang Reisig
Humboldt-Universität zu Berlin, Institut für Informatik
Unter den Linden 6, 10099 Berlin, Germany
E-mail: [email protected]
Credits
The frontispiece photograph was taken by Bertrand Meyer at the Eidgenössische
Technische Hochschule (ETH) in Zürich, Switzerland on May 16, 2004. Used with
permission.
ISSN 0302-9743
ISBN-10 3-642-15024-1 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-15024-1 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
springer.com
© Springer-Verlag Berlin Heidelberg 2010
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper 06/3180
Dedicated to
Yuri Gurevich
in honor of his 70th birthday,
with deep admiration and affection.
(Psalms 126:5)
Yuri Gurevich has played a major role in the discovery and development of ap-
plications of mathematical logic to theoretical and practical computer science.
His interests have spanned a broad spectrum of subjects, including decision pro-
cedures, the monadic theory of order, abstract state machines, formal methods,
foundations of computer science, security, and much more.
In May 2010, Yuri celebrated his 70th birthday. To mark that occasion, on
August 22, 2010, a symposium was held in Brno, the Czech Republic, as a satel-
lite event of the 35th International Symposium on Mathematical Foundations
of Computer Science (MFCS 2010) and of the 19th EACSL Annual Conference
on Computer Science Logic (CSL 2010). The meeting received generous support
from Microsoft Research.
In preparation for this 70th birthday event, we asked Yuri’s colleagues
(whether or not they were able to attend the symposium) to contribute to a
volume in his honor. This book is the result of that effort. The collection of
articles herein begins with an academic biography, an annotated list of Yuri’s
publications and reports, and a personal tribute by Jan Van den Bussche. These
are followed by 28 technical contributions. These articles – though they cover
a broad range of topics – represent only a fraction of Yuri’s multiple areas of
interest.
Each contribution was reviewed by one or two readers. In this regard, the
editors wish to thank several anonymous individuals for their assistance.
We offer this volume to Yuri in honor of his birthday and in recognition of
his grand contributions to the fields of logic and computation.
On Yuri Gurevich
Yuri, Logic, and Computer Science . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Andreas Blass, Nachum Dershowitz, and Wolfgang Reisig
Technical Papers
Tracking Evidence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Sergei Artemov
Yuri Gurevich was born on May 7, 1940, in Nikolayev, Ukraine, which was then part of the Soviet Union. A year later, World War II reached the Soviet Union,
and Yuri’s father was assigned to work in a tank body factory near Stalingrad. So
that’s where Yuri spent the second year of his life, until the battle of Stalingrad
forced the family, except for his father, to flee. Their home was destroyed by
bombing only hours after they left. But fleeing involved crossing the burning
Volga and then traveling in a vastly overcrowded train, in which many of the
refugees died; in fact, Yuri was told later that he was the only survivor among
children of his age. His mother decided that they had to leave the train, and the
family lived for two years in Uzbekistan. In May 1944, the family reunited in
Chelyabinsk, in the Ural Mountains, where the tank body factory had moved in
the meantime, and that is where Yuri attended elementary and high school.
An anecdote from his school days (recorded in [123]¹) can serve as a premonition of the attention to resources that later flowered in Yuri’s work on complexity
theory. To prove some theorem about triangles, the teacher began with “Take
another triangle such that . . . .” Yuri asked, “Where does another triangle come
from? What if there are no more triangles?” (commenting later that shortages
were common in those days). For the sake of completeness, we also record the
teacher’s answer, “Shut up.”
After graduating from high school, Yuri spent three semesters at the Chelyabinsk Polytechnik. Dissatisfied with the high ratio of memorization to knowledge in the engineering program, he left after a year and a half
and enrolled in Ural State University to study mathematics.
Yuri obtained four academic degrees associated with Ural State University: his
master’s degree in 1962, his candidate’s degree (equivalent to the Western Ph.D.)
in 1964, his doctorate (similar to habilitation, but essentially guaranteeing an
appointment as full professor) in 1968, and an honorary doctorate in 2005. At
Ural State University, Yuri ran a flourishing logic seminar, and he founded a
Mathematical Winter School that is still functioning today. It should also be
noted that the four-year interval between the candidate’s and doctor’s degrees
was unusually short.
¹ Numerical references are to the annotated bibliography in this volume; they match the numbering on Yuri’s web site.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 1–6, 2010.
© Springer-Verlag Berlin Heidelberg 2010
Gurevich taught us to think freely. It was helpful that his specialty was
logic – the science of proofs. He tried unobtrusively to impress upon us
that the final judgment is ours and not that of the Central Committee
of the Communist Party or that of Marx–Engels.
It all started with a seminar on axiomatic set theory. The idea of a
winter school was born there. The schedule of the Winter Math School
included not only studies but also mandatory daily skiing and various
entertainment activities. For example, Gurevich liked debates à la me-
dieval scholastic disputes. He would volunteer to argue any ridiculous
and obviously false thesis of our choice in order to demonstrate the art
of arguing. Therein lay his secret “counter-revolutionary Zionist” (in the
terminology of the time) plot: to teach us to argue, doubt, prove, refute.
In general to teach us to think independently.
⁴ Edited by G. Rozenberg, A. Salomaa, and (for the last two) G. Paun; published by World Scientific in 1993, 2001, and 2004.
⁵ Originally called “dynamic structures” and subsequently “evolving algebras.”
The following list of publications and annotations is derived from Yuri Gurevich’s
website,¹
http://research.microsoft.com/en-us/um/people/gurevich/annotated.htm .
Abbreviations:
0. Egon Börger, Erich Grädel, Yuri Gurevich: The Classical Decision Problem.
Springer Verlag, Perspectives in Mathematical Logic, 1997. Second printing,
Springer Verlag, 2001. Review in Journal of Logic, Language and Information
8:4 (1999), 478–481. Review in ACM SIGACT News 35:1 (March 2004), 4–7.
The classical decision problem is (in its modern meaning) the problem of
classifying fragments of first-order logic with respect to the decidability and com-
plexity of the satisfiability problem as well as the satisfiability problem over finite
domains. The results and methods employed are used in logic, computer science
and artificial intelligence.
The book gives the most complete and comprehensive treatment of the classical
decision problem to date, and includes an annotated bibliography of 549 items.
Much of the material is published for the first time in book form; this includes
the classifiability theory, the classification of the so-called standard fragments,
and the analysis of the reduction method. Many proofs have been simplified and
there are many new results and proofs.
1. Yuri Gurevich: Groups covered by proper characteristic subgroups. Trans. of Ural
University 4:1 (1963), 32–39 (Russian, Master’s thesis)
2. Yuri Gurevich, Ali I. Kokorin: Universal equivalence of ordered abelian groups.
Algebra and Logic 2:1 (1963), 37–39 (Russian)
We prove that no universal first-order property distinguishes between any two
ordered abelian groups.
¹ The editors thank Zoe Gurevich for her help.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 7–48, 2010.
© Springer-Verlag Berlin Heidelberg 2010
8 Annotated List of Publications of Yuri Gurevich
15. Yuri Gurevich: Minsky machines and the ∀∃∀&∃∗ case of the decision problem.
Trans. of Ural University 7:3 (1970), 77–83 (Russian)
An observation that Minsky machines may be more convenient than Turing
machines for reduction purposes is illustrated by simplifying the proof from [7]
that some [∀∃∀&∃∗,(k,1)] is a reduction class.
16. Yuri Gurevich, Igor O. Koriakov: A remark on Berger’s paper on the domino
problem. Siberian Mathematical Journal, 13 (1972), 459–463 (Russian)
Berger proved that the decision problem for the unrestricted tiling problem
(a.k.a. the unrestricted domino problem) is undecidable. We strengthen Berger’s
result. The following two collections of domino sets are recursively inseparable:
(1) those that can tile the plane periodically (equivalently, can tile a torus) and
(2) those that cannot tile the plane at all.
It follows that the collection of domino sets that can tile a torus is undecidable.
16a. Yuri Gurevich, Igor O. Koriakov: A remark on Berger’s paper on the domino
problem. Siberian Mathematical Journal 13 (1972), 319–321 (English)
This is an English translation of [16].
17. Yuri Gurevich, Tristan Turashvili: Strengthening a result of Suranyi. Bulletin of
the Georgian Academy of Sciences 70 (1973), 289–292 (Russian)
18. Yuri Gurevich: Formulas with one universal quantifier. In: Selected Questions of
Algebra and Logic, Volume dedicated to the memory of A.I. Malcev, Publishing
house Nauka – Siberian Branch, Novosibirsk, (1973), 97–110 (Russian)
The main result, announced in [9], is that the ∃∗ ∀∃∗ class of first-order logic
with functions but without equality has the finite model property (and therefore
is decidable for satisfiability and finite satisfiability). This result completes the
solution in [9] for the classical decision problem for first-order logic with functions
but without equality.
19. Yuri Gurevich: The decision problem for the expanded theory of ordered abelian
groups. Soviet Institute of Scientific and Technical Information (VINITI), 6708:73
(1974), 1–31 (Russian)
20. Yuri Gurevich: The decision problem for first-order logic. Manuscript (1971), 124
pages (Russian)
This was supposed to be a book (and eventually it became the core of the
book [0]), but the publication of the original Russian book was aborted when the
author left USSR. A German translation of the manuscript can be found in Uni-
versitätsbibliothek Dortmund (Ostsprachen-Übersetzungsdienst) and Technische
Informationsbibliothek und Universitätsbibliothek Hannover.
21. Yuri Gurevich: The decision problem for standard classes. JSL 41 (1976), 460–464
The classification of prefix-signature fragments of (first-order) predicate logic
with equality, completed in [7], is extended to first-order logic with equality and
functions. One case was solved (confirming a conjecture of this author) by Saharon
Shelah.
22. Yuri Gurevich: Semi-conservative reduction. Archiv für Math. Logik und Grund-
lagenforschung 18 (1976), 23–25
23. Ilya Gertsbakh, Yuri Gurevich: Constructing an optimal fleet for a transportation
schedule. Transportation Science 11 (1977), 20–36
A general method for constructing all optimal fleets is described.
24. Yuri Gurevich: Intuitionistic logic with strong negation. Studia Logica 36 (1977),
49–59
Classical logic is symmetric with respect to True and False but intuitionis-
tic logic is not. We introduce and study a conservative extension of first-order
intuitionistic logic that is symmetric with respect to True and False.
25. Yuri Gurevich: Expanded theory of ordered abelian groups. Annals of Mathemat-
ical Logic 12 (1977), 193–228
The first-order theory of ordered abelian groups was analyzed in [3]. How-
ever, algebraic results on ordered abelian groups in the literature usually cannot
be stated in first-order logic. Typically they involve so-called convex subgroups.
Here we introduce an expanded theory of ordered abelian groups that allows
quantification over convex subgroups and expresses almost all relevant algebra.
We classify ordered abelian groups by the properties expressible in the expanded
theory, and we prove that the expanded theory of ordered abelian groups is de-
cidable. Curiously, the decidability proof is simpler than that in [3]. Furthermore,
the decision algorithm is primitive recursive.
26. Yuri Gurevich: Monadic theory of order and topology, I. Israel Journal of Math-
ematics 27 (1977), 299–319
We disprove two of Shelah’s conjectures and prove some more results on the
monadic theory of linear orderings and topological spaces. In particular, if the
Continuum Hypothesis holds then there exist monadic formulæ expressing the
predicates “X is countable” and “X is meager” over the real line and over Cantor’s
Discontinuum.
27. Yuri Gurevich: Monadic theory of order and topology, II. Israel Journal of Math-
ematics 34 (1979), 45–71
Assuming the Continuum Hypothesis, we interpret the theory of (the cardinal
of) the continuum with quantification over constructible (monadic, dyadic, etc.)
predicates in the monadic (second-order) theory of real line, in the monadic the-
ory of any other short non-modest chain, in the monadic topology of Cantor’s
Discontinuum and some other monadic theories. We exhibit monadic sentences
defining the real line up to isomorphism under some set-theoretic assumptions.
There are some other results.
28. Yuri Gurevich: Modest theory of short chains, I. JSL 44 (1979), 481–490
The composition (or decomposition) method of Feferman-Vaught is generalized
and made much more applicable.
29. Yuri Gurevich, Saharon Shelah: Modest theory of short chains, II. JSL 44 (1979),
491–502
We analyze the monadic theory of the rational line and the theory of the real
line with quantification over “small” subsets. The results are in some sense the
best possible.
30. Yuri Gurevich: Two notes on formalized topology. Fundamenta Mathematicae 57
(1980), 145–148
31. Yuri Gurevich, W. Charles Holland: Recognizing the real line. Transactions of
American Math. Society 265 (1981), 527–534
We exhibit a first-order statement about the automorphism group of the real
line that characterizes the real line among all homogeneous chains.
32. Andrew M. W. Glass, Yuri Gurevich, W. Charles Holland, Saharon Shelah: Rigid
homogeneous chains. Math. Proceedings of Cambridge Phil. Society 89 (1981),
7–17
33. Andrew M. W. Glass, Yuri Gurevich, W. Charles Holland, Michèle Jambu-
Giraudet: Elementary theory of automorphism groups of doubly homogeneous
chains. Springer Lecture Notes in Mathematics 859 (1981), 67–82
34. Yuri Gurevich: Crumbly spaces. Sixth International Congress for Logic, Method-
ology and Philosophy of Science (1979) North-Holland (1982), 179–191
Answering a question of Henson, Jockush, Rubel and Takeuti, we prove that
the rationals, the irrationals and the Cantor set are all elementarily equivalent as
topological spaces.
35. Stal O. Aanderaa, Egon Börger, Yuri Gurevich: Prefix classes of Krom formulas
with identity. Archiv für Math. Logik und Grundlagenforschung 22 (1982), 43–49
36. Yuri Gurevich: Existential interpretation, II. Archiv für Math. Logik und Grund-
lagenforschung 22 (1982), 103–120
37. Yuri Gurevich, Saharon Shelah: Monadic theory of order and topology in ZFC.
Annals of Mathematical Logic 23 (1982), 179–198
In the 1975 Annals of Mathematics, Shelah interpreted true first-order arith-
metic in the monadic theory of order under the assumption of the continuum
hypothesis. The assumption is removed here.
38. Ilya Gertsbakh, Yuri Gurevich: Homogeneous optimal fleet. Transportation Re-
search 16B (1982), 459–470
39. Yuri Gurevich: A review of two books on the decision problem. Bulletin of the
American Mathematical Society 7 (1982), 273–277
40. Yuri Gurevich, Leo Harrington: Automata, trees, and games. 14th Annual Sym-
posium on Theory of Computing, ACM (1982), 60–65
We prove a forgetful determinacy theorem saying that, for a wide class of
infinitary games, one of the players has a winning strategy that is virtually mem-
oryless: the player has to remember only boundedly many bits of information. We
use forgetful determinacy to give a transparent proof of Rabin’s celebrated result
that the monadic second-order theory of the infinite tree is decidable.
41. Yuri Gurevich, Harry R. Lewis: The inference problem for template dependencies.
Information and Control 55 (1982), 69–79
Answering a question of Jeffrey Ullman, we prove that the problem in the title
is undecidable.
42. Andreas Blass, Yuri Gurevich: On the unique satisfiability problem. Information
and Control 55 (1982), 80–88
Papadimitriou and Yannakakis asked whether Unique Sat is hard for {L − L′ : L, L′ ∈ NP} when NP differs from co-NP (otherwise the answer is obvious). We show that this is true under one oracle and false under another.
43. Edmund M. Clarke, Nissim Francez, Yuri Gurevich, A. Prasad Sistla: Can message
buffers be characterized in linear temporal logic? Symposium on Principles of
Distributed Computing, ACM (1982), 148–156
In the case of unbounded buffers, the negative answer follows from a result
in [28].
44. Yuri Gurevich: Decision problem for separated distributive lattices. JSL 48 (1983),
193–196
It is well known that for all recursively enumerable sets X1, X2 there are disjoint recursively enumerable sets Y1, Y2 such that Yi ⊆ Xi and (Y1 ∪ Y2) = (X1 ∪ X2). Alistair Lachlan called distributive lattices satisfying this property
separated. He proved that the first-order theory of finite separated distributive
lattices is decidable. We prove here that the first-order theory of all separated
distributive lattices is undecidable.
45. Yuri Gurevich, Menachem Magidor, Saharon Shelah: The monadic theory of ω2 .
JSL 48 (1983), 387–398
In a series of papers, Büchi proved the decidability of the monadic (second-order) theory of ω0, of all countable ordinals, of ω1, and finally of all ordinals < ω2.
Here, assuming the consistency of a weakly compact cardinal, we prove that, in
different set-theoretic worlds, the monadic theory of ω2 may be arbitrarily difficult
(or easy).
46. Yuri Gurevich, Saharon Shelah: Interpreting second-order logic in the monadic
theory of order. JSL 48 (1983), 816–828
Under a weak set-theoretic assumption, we interpret full second-order logic in
the monadic theory of order.
47. Yuri Gurevich, Saharon Shelah: Rabin’s Uniformization Problem. JSL 48 (1983),
1105–1119
The negative solution is given.
48. Yuri Gurevich, Saharon Shelah: Random models and the Gödel case of the decision
problem. JSL 48 (1983), 1120–1124
We replace Gödel’s sophisticated combinatorial argument with a simple prob-
abilistic one.
49. Andrew M. W. Glass, Yuri Gurevich: The word problem for lattice-ordered groups.
Transactions of American Math. Society 280 (1983), 127–138
The problem is proven to be undecidable.
50. Yuri Gurevich: Critiquing a critique of Hoare’s programming logics. Communica-
tions of ACM (May 1983), 385 (Tech. communication)
51. Yuri Gurevich: Algebras of feasible functions. 24th Annual Symposium on Foun-
dations of Computer Science, IEEE Computer Society Press, 1983, 210–214
We prove that, under a natural interpretation over finite domains,
(i) a function is primitive recursive if and only if it is logspace computable, and
(ii) a function is general recursive if and only if it is polynomial time computable.
52. Yuri Gurevich, Peter H. Schmitt: The theory of ordered abelian groups does not
have the independence property. Trans. of American Math. Society 284 (1984),
171–182
53. Yuri Gurevich, Harry R. Lewis: The word problem for cancellation semigroups
with zero. JSL 49 (1984), 184–191
In 1947, Post showed the word problem for semigroups to be undecidable. In
1950, Turing strengthened this result to cancellation semigroups, i.e. semigroups
satisfying the cancellation property
(1) if xy = xz or yx = zx then y = z.
No semigroup with zero satisfies (1). The cancellation property for semigroups
with zero and identity is
(2) if xy = xz = 0 or yx = zx = 0 then y = z.
The cancellation property for semigroups with zero but without identity is the
conjunction of (2) and
(3) if xy = x or yx = x then x = 0.
Whether or not a semigroup with zero has an identity, we refer to it as a cancel-
lation semigroup with zero if it satisfies the appropriate cancellation property. It
is shown in [8] that the word problem for finite semigroups is undecidable. Here
we show that the word problem is undecidable for finite cancellation semigroups
with zero; this holds for semigroups with identity and also for semigroups without
identity. (In fact, we prove a stronger effective inseparability result.) This provides
the necessary mathematical foundation for [41].
54. Yuri Gurevich, Larry J. Stockmeyer, Uzi Vishkin: Solving NP-hard problems on
graphs that are almost trees, and an application to facility location problems.
Journal of the ACM 31 (1984), 459–473
Imagine that you need to put service stations (or McDonald’s restaurants) on roads in such a way that every resident is within, say, 10 miles of the nearest station. What is the minimal number of stations and how does one find an optimal placement? In general, the problem is NP-hard; however, in important special cases there are feasible solutions.
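The easiest such special case can be sketched concretely. The code below is our illustration, not from the paper: on a single straight road (residents as coordinates), a left-to-right greedy sweep that pushes each new station as far right as possible yields an optimal covering.

```python
def place_stations(residents, radius):
    """Greedily place a minimum number of stations on a straight road
    so that every resident is within `radius` of some station."""
    stations = []
    for r in sorted(residents):
        # If the rightmost station so far does not cover this resident,
        # open a new one as far to the right as this resident permits.
        if not stations or r - stations[-1] > radius:
            stations.append(r + radius)
    return stations
```

On general graphs no such greedy sweep works, which is why the paper restricts attention to graphs that are almost trees.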
55. Andreas Blass, Yuri Gurevich: Equivalence relations, invariants, and normal
forms. SIAM Journal on Computing 13 (1984), 682–689
For an equivalence relation E on the words in some finite alphabet, we consider
the following four problems.
Recognition. Decide whether two words are equivalent.
Invariant. Calculate a function constant on precisely the equivalence classes.
Normal form. Calculate a particular member of an equivalence class, given
an arbitrary member.
First member. Calculate the first member of an equivalence class, given an
arbitrary member.
A solution for any of these problems yields solutions for all earlier ones in the list.
We show that, for polynomial time recognizable E, the first member problem is always in the class Δ^P_2 (solvable in polynomial time with an oracle for an NP set) and can be complete for this class even when the normal form problem is solvable in polynomial time. To distinguish between the other problems in the list, we construct an E whose invariant problem is not solvable in polynomial time with an oracle for E (although the first member problem is in NP^E ∩ co-NP^E), and we construct an E whose normal form problem is not solvable in polynomial time with an oracle for a certain solution of its invariant problem.
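To fix intuitions, here is a toy instance (our illustration, not taken from the paper): for anagram equivalence, where two words are equivalent iff one is a permutation of the other, all four problems are solvable in polynomial time, and the first member problem coincides with a natural normal form. The paper's constructions show that such collapses fail for general polynomial-time E.

```python
def recognize(u, v):
    # Recognition: u and v are anagram-equivalent iff their sorted letters agree.
    return sorted(u) == sorted(v)

def invariant(w):
    # Invariant: a function constant exactly on equivalence classes.
    return tuple(sorted(w))

def normal_form(w):
    # Normal form: a canonical member of w's class.
    return ''.join(sorted(w))

def first_member(w):
    # First member: the lexicographically least member of the class;
    # for anagrams this happens to equal the normal form above.
    return ''.join(sorted(w))
```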
56. Andreas Blass, Yuri Gurevich: Equivalence relations, invariants, and normal
forms, II. Springer LNCS 171 (1984), 24–42
We consider the questions whether polynomial time solutions for the easier
problems of the list for [55] yield NP solutions for the harder ones, or vice versa.
We show that affirmative answers to several of these questions are equivalent to
natural principles like NP = co-NP, (NP ∩ co-NP) = P, and the shrinking principle
for NP sets. We supplement known oracles with enough new ones to show that
all questions considered have negative answers relative to some oracles. In other
words, these questions cannot be answered affirmatively by means of relativizable
polynomial-time Turing reductions. Finally, we show that the analogous questions
in the framework where Borel sets play the role of polynomial time decidable sets
have negative answers.
57. Yuri Gurevich, Saharon Shelah: The monadic theory and the ‘next world’. Israel
Journal of Mathematics 49 (1984), 55–68
Let r be a Cohen real over a model V of ZFC. Then the second-order V[r]-theory of the integers (even the reals if V satisfies CH) is interpretable in the
monadic V -theory of the real line. Contrast this with the result of [79].
58. Warren D. Goldfarb, Yuri Gurevich, Saharon Shelah: A decidable subclass of the
minimal Gödel case with identity. JSL 49 (1984), 1253–1261
59. Yuri Gurevich, Harry R. Lewis: A logic for constant depth circuits. Information
and Control 61 (1984), 65–74
We present an extension of first-order logic that captures precisely the compu-
tational complexity of (the uniform sequences of) constant-depth polynomial-time
circuits.
60. Yuri Gurevich: Toward logic tailored for computational complexity. In: M. Richter
et al. (eds.) Computation and Proof Theory, Springer Lecture Notes in Math. 1104
(1984), 175–216
The pathos of this paper is that classical logic, developed to confront the
infinite, is ill prepared to deal with finite structures, whereas finite structures,
e.g. databases, are of great importance in computer science. We show that
famous theorems about first-order logic fail in the finite case, and discuss various
alternatives to classical logic. The message has been heard.
60.5. Yuri Gurevich: Reconsidering Turing’s thesis (toward more realistic semantics of
programs). Technical report CRL-TR-36-84 University of Michigan, September
1984
The earliest publication on the abstract state machine project.
61. John P. Burgess, Yuri Gurevich: The decision problem for linear temporal logic.
Notre Dame JSL 26 (1985), 115–128
The main result is the decidability of the temporal theory of the real order.
62. Yuri Gurevich, Saharon Shelah: To the decision problem for branching time logic.
In: P. Weingartner and G. Dold (eds.) Foundations of Logic and Linguistics:
Problems and their Solutions, Plenum (1985), 181–198
63. Yuri Gurevich, Saharon Shelah: The decision problem for branching time logic.
JSL 50 (1985), 668–681
Define a tree to be any partial order satisfying the following requirement:
the predecessors of any element x are linearly ordered, i.e. if (y < x and z <
x) then (y < z or y = z or y > z). The main result of the two papers [62,63]
is the decidability of the theory of trees with additional unary predicates and
quantification over nodes and branches. This gives the richest decidable temporal
logic.
64. Yuri Gurevich: Monadic second-order theories. In: J. Barwise and S. Feferman
(eds.) Model-Theoretical Logics, Springer-Verlag, Perspectives in Mathematical
Logic (1985), 479–506
In this chapter we make a case for the monadic second-order logic (that is
to say, for the extension of first-order logic allowing quantification over monadic
predicates) as a good source of theories that are both expressive and manageable.
We illustrate two powerful decidability techniques here. One makes use of au-
tomata and games. The other is an offshoot of a composition theory where one
composes models as well as their theories. Monadic second-order logic appears to
be the most natural match for the composition theory.
Undecidability proofs must be thought out anew in this area; for, whereas true
first-order arithmetic is reducible to the monadic theory of the real line R, it
is nevertheless not interpretable in the monadic theory of R. A quite unusual
undecidability method is another subject of this chapter.
In the last section we briefly review the history of the methods thus far devel-
oped and mention numerous results obtained using the methods.
64.5. Yuri Gurevich: A new thesis. Abstracts, American Mathematical Society 6:4 (Au-
gust 1985), p. 317, abstract 85T-68-203
The first announcement of the “new thesis”, later known as the Abstract State
Machine thesis.
65. Andreas Blass, Yuri Gurevich, Dexter Kozen: A zero-one law for logic with a
fixed-point operator. Information and Control 67 (1985), 70–90
The zero-one law, known to hold for first-order logic but not for monadic
or even existential monadic second-order logic, is generalized to the extension
of first-order logic by the least (or iterative) fixed-point operator. We also show
that the problem of deciding, for a sentence π, whether it is almost surely true is complete for exponential time if we consider only π’s with a fixed finite vocabulary (or vocabularies of bounded arity), and complete for double-exponential time if π is unrestricted.
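The flavor of the zero-one law can be seen empirically. The sketch below is our illustration (the triangle sentence is an assumed example, not one from the paper): the first-order sentence "there exist three pairwise adjacent vertices" is almost surely true in a large random graph.

```python
import random
from itertools import combinations

def random_graph(n, rng):
    # Each of the C(n, 2) possible edges is present independently with prob. 1/2.
    return {frozenset(e) for e in combinations(range(n), 2) if rng.random() < 0.5}

def has_triangle(n, edges):
    # The first-order property: there exist a, b, c that are pairwise adjacent.
    return any(frozenset({a, b}) in edges and
               frozenset({b, c}) in edges and
               frozenset({a, c}) in edges
               for a, b, c in combinations(range(n), 3))
```

Already at n = 25 a triangle is present in virtually every sample, matching the limiting probability 1 that the zero-one law guarantees for this sentence.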
66. Andreas Blass, Yuri Gurevich: Henkin quantifiers and complete problems. Annals
of Pure and Applied Logic 32 (1986), 1–16
We show that almost any non-linear quantifier, applied to quantifier-free first-
order formulas, suffices to express an NP-complete predicate; the remaining non-
linear quantifiers express exactly co-NL predicates (NL is Nondeterministic Log-
space).
67. Larry Denenberg, Yuri Gurevich, Saharon Shelah: Definability by constant-depth
polynomial-size circuits. Information and Control 70 (1986), 216–240
We investigate the expressive power of constant-depth polynomial-size circuit
models. In particular, we construct a circuit model whose expressive power is
precisely that of first-order logic.
68. Amnon Barak, Zvi Drezner, Yuri Gurevich: On the number of active nodes in a
multicomputer system. Networks 16 (1986), 275–282
Simple probabilistic algorithms enable each active node to find estimates of the
fraction of active nodes in the system of n nodes (with a direct communication
link between any two nodes) in time o(n).
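As a rough illustration of such estimates (our sketch, not the paper's algorithm): since any two nodes can communicate directly, a node can probe a small uniform random sample of nodes and take the sample mean, giving an unbiased estimate whose error shrinks like O(1/√samples).

```python
import random

def estimate_active_fraction(is_active, n, samples, rng):
    """Probe `samples` uniformly random nodes out of n and return the
    observed fraction of active ones (an unbiased estimate)."""
    hits = sum(1 for _ in range(samples) if is_active(rng.randrange(n)))
    return hits / samples
```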
69. Yuri Gurevich: What does O(n) mean? SIGACT NEWS 17:4 (1986), 61–63
70. Yuri Gurevich, Saharon Shelah: Fixed-point extensions of first-order logic. Annals
of Pure and Applied Logic 32 (1986), 265–280
We prove that the three extensions of first-order logic by means of positive,
monotone and inflationary inductions have the same expressive power in the case
Annotated List of Publications of Yuri Gurevich 17
first part is based on lectures given at the 1984 Udine Summer School on Compu-
tation Theory and summarized in the technical report “Logic and the Challenge
of Computer Science”, CRL-TR-10-85, Sep. 1985, Computing Research Lab, Uni-
versity of Michigan, Ann Arbor, Michigan.
In the second part, we introduce a new computation model: evolving algebras
(later renamed abstract state machines). This new approach to semantics of com-
putations and, in particular, to semantics of programming languages emphasizes
dynamic and resource-bounded aspects of computation. It is illustrated on the
example of Pascal. The technical report mentioned above contained an earlier
version of part 2. The final version was written in 1986.
75. Yuri Gurevich: Algorithms in the world of bounded resources. In: R. Herken (ed.)
The universal Turing machine – a half-century story, Oxford University Press
(1988), 407–416
In the classical theory of algorithms, one addresses a computing agent with
unbounded resources. We argue in favor of a more realistic theory of multiple
addressees with limited resources.
76. Yuri Gurevich: Average case completeness. J. Computer and System Sciences 42:3
(June 1991), 346–398 (a special issue with selected papers of FOCS’87)
We explain and advance Levin’s theory of average case complexity. In particu-
lar, we exhibit the second natural average-case-complete problem and prove that
deterministic reductions are inadequate.
77. Yuri Gurevich, James M. Morris: Algebraic operational semantics and Modula-2.
CSL’87, 1st Workshop on Computer Science Logic, Springer LNCS 329 (1988),
81–101
Jim Morris was a PhD student of Yuri Gurevich at the Electrical Engineering
and Computer Science Department of the University of Michigan, the first PhD
student working on the abstract state machine project. This is an extended ab-
stract of Jim Morris’s 1988 PhD thesis (with the same title) and the first example
of the ASM semantics of a whole programming language.
78. Yuri Gurevich: On Kolmogorov machines and related issues. Originally in
BEATCS 35 (June 1988), 71–82. Reprinted in: Current Trends in Theoretical
Computer Science. World Scientific (1993), 225–234
One contribution of the article was to formulate the Kolmogorov-Uspensky
thesis. In “To the Definition of an Algorithm” [Uspekhi Mat. Nauk 13:4 (1958),
3–28 (Russian)], Kolmogorov and Uspensky wrote that they just wanted to com-
prehend the notions of computable functions and algorithms, and to convince
themselves that there is no way to extend the notion of computable function. In
fact, they did more than that. It seems that their thesis was this:
Every computation, performing only one restricted local action at a time,
can be viewed as (not only being simulated by, but actually being) the
computation of an appropriate KU machine (in the more general form).
Uspensky agreed [J. Symb. Logic 57 (1992), p. 396]. Another contribution of the
paper was a popularization of the following beautiful theorem of Leonid Levin.
Theorem. For every computable function F(w) = x from binary strings
to binary strings, there exists a KU algorithm A such that A conclusively
inverts F and (Time of A on x) = O(Time of B on x) for every KU
algorithm B that conclusively inverts F.
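To make the inversion task concrete, here is a naive inverter that searches binary strings in length-lexicographic order; it illustrates what inverting F means, but it is emphatically not Levin's asymptotically optimal algorithm (the function name and `max_len` cutoff are our own).

```python
from itertools import product

def brute_force_invert(F, x, max_len=16):
    """Return some binary string w with F(w) == x, searching in
    length-lexicographic order, or None if no preimage of length
    <= max_len exists.  Naive search, for illustration only."""
    for length in range(max_len + 1):
        for bits in product("01", repeat=length):
            w = "".join(bits)
            if F(w) == x:
                return w
    return None
```

For example, with F reversing its input, `brute_force_invert(lambda w: w[::-1], "011")` returns `"110"`.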
that there are no other properties expressible in first-order logic and in Datalog;
in other words, no unbounded Datalog query is expressible in first-order logic.
We prove the conjecture; that is our main theorem. It can be seen as a kind of
compactness theorem for finite structures. In addition, we give counterexamples
delimiting the main result.
84. Yuri Gurevich: Infinite games. Originally in BEATCS (June 1989), 93–100.
Reprinted in: Current Trends in Theoretical Computer Science. World Scientific
(1993), 235–244
Infinite games are widely used in mathematical logic. Recently, infinite
games have been used in connection with concurrent computational processes
that do not necessarily terminate. For example, an operating system may be
seen as playing a game “against” the disruptive forces of users. The classical
question of the existence of winning strategies turns out to be of practical
importance. We explain a relevant part of infinite game theory.
85. Yuri Gurevich: The challenger-solver game: Variations on the theme of P=?NP.
BEATCS (October 1989), 112–121. Reprinted in: Current Trends in Theoretical
Computer Science. World Scientific (1993), 245–253
The question P=?NP is the focal point of much research in theoretical
computer science. But is it the right question? We find it biased toward the
positive answer. It is conceivable that the negative answer is established
without providing much evidence for the difficulty of NP problems in practical
terms. We argue in favor of an alternative to P=?NP based on average-case
complexity.
86. Yuri Gurevich: Games people play. In: S. Mac Lane and D. Siefkes (eds.) Collected
Works of J. Richard Büchi, Springer-Verlag (1990), 517–524
87. Yuri Gurevich, Saharon Shelah: Nondeterministic linear-time tasks may require
substantially nonlinear deterministic time in the case of sublinear work space.
Journal of the ACM 37:3 (1990), 674–687
We develop a technique for proving time-space trade-offs and exhibit natural
search problems (e.g. the Log-size Clique Problem) that are solvable in linear
time on a polylog-space (and sometimes even log-space) nondeterministic Turing
machine, but that no deterministic machine (in a very general sense of the
term) with a sequential-access read-only input tape and work space n^σ solves
within time n^{1+τ} if σ + 2τ < 1/2.
88. Yuri Gurevich: Matrix decomposition problem is complete for the average case.
FOCS’90, 31st Annual Symposium on Foundations of Computer Science, IEEE
Computer Society Press (1990), 802–811
The first algebraic average-case complete problem is presented. See [97] in this
connection.
89. Yuri Gurevich, Lawrence S. Moss: Algebraic operational semantics and Occam.
CSL’89, 3rd Workshop on Computer Science Logic, Springer LNCS 440 (1990),
176–192
We give evolving algebra semantics to the Occam programming language,
generalizing in the process evolving algebras to the case of distributed concurrent
computations.
Later note: this was the first example of a distributed abstract state machine.
90. Yuri Gurevich: On finite model theory. In: S. R. Buss et al. (eds.) Feasible
Mathematics (1990), 211–219
This is a little essay on finite model theory. Section 1 gives some counterexam-
ples to classical theorems in the finite case. Section 2 gives a finite version of the
classical compactness theorem. Section 3 announces two Gurevich-Shelah results.
One is a new preservation theorem that implies that a first-order formula p pre-
served by any homomorphism from a finite structure into another finite structure
is equivalent to a positive existential formula q. The other result is a lower bound
result according to which a shortest q may be non-elementary longer than p.
A later note: Unfortunately, the proof of the preservation theorem fell
through (a unique such case in the history of the Gurevich-Shelah
collaboration); the theorem was later proved by Benjamin Rossman, see
Proceedings of LICS 2005. Rossman also provided details for our lower-bound
proof.
91. Yuri Gurevich: On the classical decision problem. Originally in BEATCS (October
1990), 140–150. Reprinted in: Current Trends in Theoretical Computer Science.
World Scientific (1993), 254–265
92. Yuri Gurevich: Evolving algebras: An introductory tutorial. Originally in
BEATCS 43 (February 1991), 264–284. This slightly revised version appeared
in: Current Trends in Theoretical Computer Science. World Scientific (1993),
266–292
Computation models and specification methods seem to be worlds apart. The
evolving algebra project is an attempt to bridge the gap by improving on
Turing's thesis. We seek more versatile machines able to simulate arbitrary
algorithms, on their natural abstraction levels, in a direct and essentially
coding-free
way. The evolving algebra thesis asserts that evolving algebras are such versatile
machines. Here sequential evolving algebras are defined and motivated. In addi-
tion, we sketch a speculative “proof” of the sequential evolving algebra thesis:
Every sequential algorithm can be lock-step simulated by an appropriate sequen-
tial evolving algebra on the natural abstraction level of the algorithm.
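The flavor of a sequential evolving algebra can be conveyed by a toy sketch: a state is a finite function table, and one step fires all enabled rules simultaneously, computing every update against the current state before applying any of them. The rule encoding below is our own illustration, not official ASM syntax.

```python
def asm_step(state, rules):
    """Perform one evolving-algebra (ASM) step.  Each rule is a pair
    (guard, updates): guard(state) -> bool, updates(state) -> list of
    (location, value) pairs.  All updates are computed against the
    current state and then applied at once."""
    pending = {}
    for guard, updates in rules:
        if guard(state):
            for loc, val in updates(state):
                pending[loc] = val  # consistency of updates assumed
    new_state = dict(state)
    new_state.update(pending)
    return new_state

# Example: Euclid's gcd as a one-rule sequential evolving algebra.
# Both updates fire in the same step, reading the pre-step values.
rules = [
    (lambda s: s["b"] != 0,
     lambda s: [("a", s["b"]), ("b", s["a"] % s["b"])]),
]

state = {"a": 48, "b": 18}
while state["b"] != 0:
    state = asm_step(state, rules)
# state["a"] is now gcd(48, 18) = 6
```

The simultaneous-update discipline is what lets the two locations a and b be swapped-and-reduced in a single step without a temporary variable.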
93. Andreas Blass, Yuri Gurevich: On the reduction theory for average-case
complexity. CSL’90, 4th Workshop on Computer Science Logic, Springer LNCS 533
(1991), 17–30
A function from instances of one problem to instances of another problem is
a reduction if together with any admissible algorithm for the second problem it
gives an admissible algorithm for the first problem. This is an example of a de-
scriptive definition of reductions. We slightly simplify Levin’s usable definition of
deterministic average-case reductions and thus make it equivalent to the appro-
priate descriptive definition. Then we generalize this to randomized average-case
reductions.
94. Yuri Gurevich: Average case complexity. ICALP’91, International Colloquium on
Automata, Languages and Programming, Madrid, Springer LNCS 510 (1991),
615–628
We motivate, justify and survey the average-case reduction theory.
95. Yuri Gurevich: Zero-one laws. Originally in BEATCS 51 (February 1991), 90–106.
Reprinted in: Current Trends in Theoretical Computer Science. World Scientific
(1993), 293–309
96. Andreas Blass, Yuri Gurevich: Randomizing reductions of search problems. SIAM
J. on Computing 22:5 (1993), 949–975
This is the journal version of an invited talk at FST&TCS’91, 11th Conference
on Foundations of Software Technology and Theoretical Computer Science, New
Delhi, India; see Springer LNCS 560 (1991), 10–24.
First, we clarify the notion of a (feasible) solution for a search problem and prove
its robustness. Second, we give a general and usable notion of many-one random-
izing reductions of search problems and prove that it has desirable properties.
All reductions of search problems to search problems in the literature on average
case complexity can be viewed as such many-one randomizing reductions. This
includes those reductions in the literature that use iterations and therefore do not
look many-one.
97. Andreas Blass, Yuri Gurevich: Matrix transformation is complete for the average
case. SIAM J. on Computing 24:1 (1995), 3–29
This is a full paper corresponding to the extended abstract [88] by the second
author. We present the first algebraic problem complete for the average case under
a natural probability distribution. The problem is this: Given a unimodular matrix
X of integers, a set S of linear transformations of such unimodular matrices and
a natural number n, decide if there is a product of at most n (not necessarily
different) members of S that takes X to the identity matrix.
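The decision problem can be read as a reachability search; the following brute-force sketch (our own, exponential in the worst case and unrelated to the paper's average-case analysis) makes the statement concrete. Matrices are encoded as tuples of tuples, and members of S as functions on matrices.

```python
def reaches_identity(X, S, n):
    """Decide whether some product of at most n (not necessarily
    distinct) transformations from S takes matrix X to the identity.
    Breadth-first search over reachable matrices; illustration only."""
    size = len(X)
    I = tuple(tuple(int(i == j) for j in range(size))
              for i in range(size))
    seen = {X}
    frontier = {X}
    for _ in range(n):
        if I in seen:
            return True
        # Apply every transformation to every frontier matrix.
        frontier = {t(M) for M in frontier for t in S} - seen
        seen |= frontier
        if not frontier:
            break
    return I in seen

# Example: one elementary row operation reduces ((1,1),(0,1)) to I.
sub_row = lambda M: ((M[0][0] - M[1][0], M[0][1] - M[1][1]), M[1])
```

Note that the empty product is allowed: with n = 0 the answer is yes exactly when X is already the identity.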
98. Yuri Gurevich, James K. Huggins: The semantics of the C programming language.
In E. Börger et al. (eds.) CSL’92 (Computer Science Logics), Springer LNCS 702
(1993), 274–308
The method of successive refinement is used. The observation that C expres-
sions do not contain statements gives rise to the first evolving algebra (ealgebra)
which captures the command part of C; expressions are evaluated by an oracle.
The second ealgebra implements the oracle under the assumptions that all the
necessary declarations have been provided and user-defined functions are eval-
uated by another oracle. The third ealgebra handles declarations. Finally, the
fourth ealgebra revises the combination of the first three by incorporating the
stack discipline; it reflects all of C. (A later note: evolving algebras are now called
abstract state machines.)
99. Thomas Eiter, Georg Gottlob, Yuri Gurevich: Curb your theory! A circumscriptive
approach for inclusive interpretation of disjunctive information. In: R. Bajcsy,
M. Kaufman (eds.) Proc. 13th Intern. Joint Conf. on AI (IJCAI’93) (1993), 634–
639
We introduce, study and analyze the complexity of a new nonmonotonic tech-
nique of common sense reasoning called curbing. Like circumscription, curbing
is based on model minimality, but, unlike circumscription, it treats disjunction
inclusively.
100. Yuri Gurevich: Feasible functions. London Mathematical Society Newsletter 206
(June 1993), 6–7
Some computer scientists, notably Steve Cook, identify feasibility with
polynomial-time computability. We argue against that point of view.
Polynomial-time computations may be infeasible, and feasible computations may
not be polynomial-time.
101. Yuri Gurevich: Logic in computer science. In: G. Rozenberg and A. Salomaa
(eds.) Current Trends in Theoretical Computer Science, World Scientific Series in
Computer Science 40 (1993), 223–394
102. Yuri Gurevich: The AMAST phenomenon. Originally in BEATCS 51 (October
1993), 295–299. Reprinted in: Current Trends in Theoretical Computer Science.
World Scientific (2001), 247–253
108. Yuri Gurevich, Neil Immerman, Saharon Shelah: McColm’s conjecture. LICS
1994, Symp. on Logic in Computer Science, IEEE Computer Society Press (1994),
10–19
Gregory McColm conjectured that, over any class K of finite structures, all pos-
itive elementary inductions are bounded if every FOL + LFP formula is equivalent
to a first-order formula over K. Here FOL + LFP is the extension of first-order
logic with the least fixed point operator. Our main results are two model-theoretic
constructions – one deterministic and one probabilistic – each of which refutes
McColm’s conjecture.
109. Erich Grädel, Yuri Gurevich: Metafinite model theory. Information and Compu-
tation 140:1 (1998), 26–81. Preliminary version in D. Leivant (ed.) Logic and
Computational Complexity, Selected Papers, Springer LNCS 960 (1995), 313–366
Earlier, the second author criticized database theorists for admitting arbitrary
structures as databases: databases are finite structures [60]. However, a closer
investigation reveals that databases are not necessarily finite. For example, a
query may manipulate numbers that do not even appear in the database, which
shows that a numerical structure is somehow involved. It is true nevertheless that
database structures are special. The phenomenon is not restricted to databases;
for example, think about the natural structure to formalize the traveling salesman
problem. To this end, we define metafinite structures. Typically such a structure
consists of (i) a primary part, which is a finite structure, (ii) a secondary part,
which is a (usually infinite) structure, e.g. arithmetic or the real line, and (iii) a
set of “weight” functions from the first part into the second. Our logics do not
allow quantification over the secondary part. We study definability issues and
their relation to complexity. We discuss model-theoretic properties of metafinite
structures, present results on descriptive complexity, and sketch some potential
applications.
110. Andreas Blass, Yuri Gurevich: Evolving algebras and linear time hierarchy. In:
B. Pehrson and I. Simon (eds.) IFIP 1994 World Computer Congress, Volume I:
Technology and Foundations, North-Holland, Amsterdam, 383–390
A precursor of [118].
111. Yuri Gurevich, James K. Huggins: Evolving algebras and partial evaluation. In:
B. Pehrson and I. Simon (eds.) IFIP 1994 World Computing Congress, Volume
1: Technology and Foundations, Elsevier, Amsterdam, 587–592
The authors present an automated (and implemented) partial evaluator for
sequential evolving algebras.
112. Yuri Gurevich: Evolving algebras. In: B. Pehrson and I. Simon (eds.) IFIP 1994
World Computer Congress, Volume I: Technology and Foundations, Elsevier, Am-
sterdam, 423–427
The opening talk at the first workshop on evolving algebras. Sections: Intro-
duction, The EA Thesis, Remarks, Future Work.
113. Yuri Gurevich, Saharon Shelah: On rigid structures. JSL 61:2 (June 1996), 549–
562
This is related to the problem of defining linear order on finite structures. If
a linear order is definable on a finite structure A, then A is rigid (which means
that its only automorphism is the identity). There had been a suspicion that if
K is the collection of all finite structures of a finitely axiomatizable class and if
harder and more important because of the greater flexibility of the ASM model.
One long-term goal of this line of research is to prove linear lower bounds for
linear time problems.
119. Yuri Gurevich, Marc Spielmann: Recursive abstract state machines. Springer J.
of Universal Computer Science 3:4 (April 1997), 233–246
The abstract state machine (ASM) thesis, supported by numerous applications,
asserts that ASMs express algorithms on their natural abstraction levels directly
and essentially coding-free. The only objection raised to date has been that ASMs
are iterative in their nature, whereas many algorithms are naturally recursive.
There seems to be an inherent contradiction between (i) the ASM idea of explicit
and comprehensive states, and (ii) higher level recursion with its hiding of the
stack.
But consider recursion more closely. When an algorithm A calls an algorithm
B, a clone of B is created and this clone becomes a slave of A. This raises the
idea of treating recursion as an implicitly multi-agent computation. Slave agents
come and go, and the master/slave hierarchy serves as the stack.
Building upon this idea, we suggest a definition of recursive ASMs. The implicit
use of distributed computing has an important side benefit: it leads naturally to
concurrent recursion. In addition, we reduce recursive ASMs to distributed ASMs.
If desired, one can view recursive notation as mere abbreviation.
120. Andreas Blass, Yuri Gurevich, Saharon Shelah: Choiceless polynomial time. An-
nals of Pure and Applied Logic 100 (1999), 141–187
The question “Is there a computation model whose machines do not distinguish
between isomorphic structures and compute exactly polynomial time properties?”
became a central question of finite model theory. One of us conjectured a negative
answer [74]. A related question is what portion of PTIME can be naturally cap-
tured by a computation model. (Notice that we speak about computation whose
inputs are arbitrary finite structures, e.g. graphs. In a special case of ordered
structures, the desired computation model is that of PTIME-bounded Turing
machines.) Our idea is to capture the portion of PTIME where algorithms are
not allowed arbitrary choices, but parallelism is allowed and can, in some
cases, implement choice. Our computation model is a PTIME version of abstract
state
machines. Our machines are able to PTIME simulate all other PTIME machines
in the literature, and they are more programmer-friendly. A more difficult theorem
shows that the computation model does not capture all PTIME.
121. Scott Dexter, Patrick Doyle, Yuri Gurevich: Gurevich abstract state machines and
Schoenhage storage modification machines. Springer J. of Universal Computer
Science 3:4 (April 1997), 279–303
We show that, in a strong sense, Schoenhage’s storage modification machines
are equivalent to unary basic abstract state machines without external functions.
The unary restriction can be removed if the storage modification machines are
equipped with a pairing function in an appropriate way.
122. Charles Wallace, Yuri Gurevich, Nandit Soparkar: A formal approach to recov-
ery in transaction-oriented database systems. Springer J. of Universal Computer
Science 3:4 (April 1997), 320–340
Failure resilience is an essential requirement for transaction-oriented database
systems, yet there has been little effort to specify and verify techniques for failure
recovery formally. The desire to improve performance has resulted in algorithms of
The article (written in a popular form) explains that a number of different algo-
rithmic problems related to Herbrand’s theorem happen to be equivalent. Among
these problems are the intuitionistic provability problem for the existential fragment
of first-order logic with equality, the intuitionistic provability problem for the prenex
fragment of first-order logic with equality, and the simultaneous rigid
E-unification problem (SREU). The article explains an undecidability proof of
SREU and decidability
proofs for special cases. It contains an extensive bibliography on SREU.
126. Yuri Gurevich, Margus Veanes: Logic with equality: Partisan corroboration and
shifted pairing. Information and Computation 152:2 (August 1999), 205–235
Herbrand’s theorem plays a fundamental role in automated theorem proving
methods based on tableaux. The crucial step in procedures based on such meth-
ods can be described as the corroboration (or Herbrand skeleton) problem: given
a positive integer m and a quantifier-free formula, find a valid disjunction of m
instantiations of the formula. In the presence of equality (which is the case in
this paper), this problem was recently shown to be undecidable. The main con-
tributions of this paper are two theorems. The Partisan Corroboration Theorem
relates corroboration problems with different multiplicities. The Shifted Pairing
Theorem is a finite tree-automata formalization of a technique for proving unde-
cidability results through direct encodings of valid Turing machine computations.
The theorems are used to explain and sharpen several recent undecidability re-
sults related to the corroboration problem, the simultaneous rigid E-unification
problem and the prenex fragment of intuitionistic logic with equality.
127a. Anatoli Degtyarev, Yuri Gurevich, Paliath Narendran, Margus Veanes, Andrei
Voronkov: The decidability of simultaneous rigid E-unification with one variable.
RTA’98, 9th Conf. on Rewriting Techniques and Applications, Tsukuba, Japan,
March 30 – April 1, 1998
The title problem is proved decidable and in fact EXPTIME-complete. Fur-
thermore, the problem becomes PTIME-complete if the number of equations is
bounded by any (positive) constant. It follows that the ∀∗∃∀∗ fragment of
intuitionistic logic with equality is decidable, which contrasts with the
undecidability of the ∃∃ fragment [126]. Notice that simultaneous rigid
E-unification with two variables and only three rigid equations is undecidable
[126].
127b. Anatoli Degtyarev, Yuri Gurevich, Paliath Narendran, Margus Veanes, Andrei
Voronkov: Decidability and complexity of simultaneous rigid E-unification with
one variable and related results. Theoretical Computer Science 243:1–2 (August
2000), 167–184
The journal version of [127a] containing also a decidability proof for the case
of simultaneous rigid E-unification when each rigid equation either contains (at
most) one variable or else has a ground left-hand side and the right-hand side of
the form x = y where x and y are variables.
128a. Yuri Gurevich, Andrei Voronkov: Monadic simultaneous rigid E-unification and
related problems. ICALP’97, 24th Intern. Colloquium on Automata, Languages
and Programming, Springer LNCS 1256 (1997), 154–165
We study the monadic case of a decision problem known as simultaneous rigid
E-unification. We show its equivalence to an extension of word equations. We
prove decidability and complexity results for special cases of this problem.
128b. Yuri Gurevich, Andrei Voronkov: Monadic simultaneous rigid E-unification. The-
oretical Computer Science 222:1–2 (1999), 133–152
136. Yuri Gurevich: The sequential ASM thesis. Originally in BEATCS 67 (February
1999), 98–124. Reprinted in: Current Trends in Theoretical Computer Science,
World Scientific (2001), 363–392
The thesis is that every sequential algorithm, on any level of abstraction, can be
viewed as a sequential abstract state machine. (Abstract state machines, ASMs,
used to be called evolving algebras.) The sequential ASM thesis and its extensions
inspired diverse applications of ASMs. The early applications were driven, at
least partially, by the desire to test the thesis. Different programming languages
were the obvious challenges. (A programming language L can be viewed as an
algorithm that runs a given L program on given data.) From there, applications
of (not necessarily sequential) ASMs spread into many directions. So far, the
accumulated experimental evidence seems to support the sequential thesis. There
is also a speculative philosophical justification of the thesis. It was barely sketched
in the literature, but it was discussed at much greater length in numerous lectures
of mine. Here I attempt to write down some of those explanations. This article
does not presuppose any familiarity with ASMs.
A later note: [141] is a much revised and polished journal version.
137. Giuseppe Del Castillo, Yuri Gurevich, Karl Stroetmann: Typed abstract state
machines. Unfinished manuscript (1998)
This manuscript was never published. The work, done sporadically in 1996–
98, was driven by the enthusiasm of Karl Stroetmann of Siemens. Eventually he
was reassigned away from ASM applications, and the work stopped. The item
wasn’t removed from the list because some of its explorations may be useful. (An
additional minor reason was to avoid changing the numbers of the subsequent
items.)
138. Yuri Gurevich, Dean Rosenzweig: Partially ordered runs: A case study. In: Ab-
stract State Machines: Theory and Applications, Springer LNCS 1912 (2000),
131–150
We look at some sources of insecurity and difficulty in reasoning about partially
ordered runs of distributed abstract state machines, and propose some techniques
to facilitate such reasoning. As a case study, we prove in detail correctness and
deadlock-freedom for general partially ordered runs of distributed ASM models
of Lamport’s Bakery Algorithm.
139. Andreas Blass, Yuri Gurevich, Jan Van den Bussche: Abstract state machines
and computationally complete query languages. Information and Computation
174:1 (2002), 20–36. An earlier version in: Abstract State Machines: Theory and
Applications, Springer LNCS 1912 (2000), 22–33
Abstract state machines (ASMs) form a relatively new computation model
holding the promise that they can simulate any computational system in lock-
step. In particular, an instance of the ASM model has recently been introduced
for computing queries to relational databases [120]. This model, to which we refer
as the BGS model, provides a powerful query language in which all computable
queries can be expressed. In this paper, we show that when one is only interested
in polynomial-time computations, BGS is strictly more powerful than both QL
and WHILE NEW, two well-known computationally complete query languages.
We then show that when a language such as WHILE NEW is extended with a
duplicate elimination mechanism, polynomial-time simulations between the lan-
guage and BGS become possible.
140. Yuri Gurevich, Wolfram Schulte, Charles Wallace: Investigating Java concurrency
using abstract state machines. In: Abstract State Machines: Theory and Applica-
tions, Springer LNCS 1912 (2000), 151–176
We present a mathematically precise, platform-independent model of Java con-
currency using the Abstract State Machine method. We cover all aspects of Java
threads and synchronization, gradually adding details to the model in a series
of steps. We motivate and explain each concurrency feature, and point out sub-
tleties, inconsistencies and ambiguities in the official, informal Java specification.
141. Yuri Gurevich: Sequential abstract state machines capture sequential algorithms.
ACM TOCL 1:1 (July 2000), 77–111
What are sequential algorithms exactly? Our claim, known as the sequential
ASM thesis, has been that, as far as behavior is concerned, sequential algorithms
are exactly sequential abstract state machines: For every sequential algorithm
A, there is a sequential abstract state machine B that is behaviorally identical
to A. In particular, B simulates A step for step. In this paper we prove the
sequential ASM thesis, so that it becomes a theorem. But how can one possibly
prove a thesis? Here is what we do. We formulate three postulates satisfied by all
sequential algorithms (and, in particular, by sequential abstract state machines).
This leads to the following definition: a sequential algorithm is any object that
satisfies the three postulates. At this point the thesis becomes a precise statement.
And we prove the statement.
This is a non-dialog version of the dialog [136]. An intermediate version was
published as MSR-TR-99-65.
141a. Yuri Gurevich: Sequential abstract state machines capture sequential algorithms.
Russian translation of [141], by P.G. Emelyanov. In: Marchuk A.G. (ed.) Formal
Methods and Models of Informatics, System Informatics 9 (2004), 7–50, Siberian
Branch of the Russian Academy of Sciences
142. Andreas Blass, Yuri Gurevich: The underlying logic of Hoare logic. Originally in
BEATCS 70 (February 2000), 82–110. Reprinted in: Current Trends in Theoretical
Computer Science, World Scientific (2001), 409–436
Formulas of Hoare logic are asserted programs ϕ P ψ where P is a program
and ϕ, ψ are assertions. The language of programs varies; in the 1980 survey by
Krzysztof Apt, one finds the language of while programs and various extensions
of it. But the assertions are traditionally expressed in first-order logic (or exten-
sions of it). In that sense, first-order logic is the underlying logic of Hoare logic.
We question the tradition and demonstrate, on the simple example of while pro-
grams, that alternative assertion logics have some advantages. For some natural
assertion logics, the expressivity hypothesis in Cook’s completeness theorem is
automatically satisfied.
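As a concrete instance of an asserted program, the textbook assignment axiom and while rule (standard Hoare logic, not notation taken from this paper) can be written as:

```latex
% Assignment axiom: to establish \psi after x := t,
% require \psi with t substituted for x beforehand.
\{\psi[t/x]\}\ x := t\ \{\psi\}
\qquad\text{e.g.}\qquad
\{x + 1 \ge 1\}\ x := x + 1\ \{x \ge 1\}

% While rule: an invariant I preserved by the body P
% yields I \wedge \neg b upon termination.
\frac{\{I \wedge b\}\ P\ \{I\}}
     {\{I\}\ \mathbf{while}\ b\ \mathbf{do}\ P\ \{I \wedge \neg b\}}
```

Cook's completeness theorem additionally requires the assertion language to be expressive enough to state loop invariants such as I; this is the expressivity hypothesis mentioned above.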
143. Andreas Blass, Yuri Gurevich: Background, reserve, and Gandy machines. In: P.
Clote and H. Schwichtenberg (eds.) CSL’2000, Springer LNCS 1862 (2000), 1–17
Algorithms often need to increase their working space, and it may be conve-
nient to pretend that the additional space was really there all along but was not
previously used. In particular, abstract state machines have, by definition [103],
an infinite reserve. Although the reserve is a naked set, it is often desirable to have
some external structure over it. For example, in [120] every state was required to
include all finite sets of its atoms, all finite sets of these, etc. In this connection,
we define the notion of a background class of structures. Such a class specifies the
constructions (like finite sets or lists) available as “background” for algorithms.
The importation of reserve elements must be non-deterministic, since an algo-
rithm has no way to distinguish one reserve element from another. But this sort
of non-determinism is much more benign than general non-determinism. We cap-
ture this intuition with the notion of inessential non-determinism. Alternatively,
one could insist on specifying a particular one of the available reserve elements
to be imported. This is the approach used in [Robin Gandy, “Church’s thesis and
principles for mechanisms”. In: J. Barwise et al. (eds.) The Kleene Symposium,
North-Holland, 1980, 123–148]. The price of this insistence is that the specifica-
tion cannot be algorithmic. We show how to turn a Gandy-style deterministic,
non-algorithmic process into a non-deterministic algorithm of the sort described
above, and we prove that Gandy’s notion of “structural” for his processes corre-
sponds to our notion of “inessential non-determinism.”
144. Andreas Blass, Yuri Gurevich: Choiceless polynomial time computation and the
Zero-One Law. In: P. Clote and H. Schwichtenberg (eds.) CSL’2000, Springer
LNCS 1862 (2000), 18–40
This paper is a sequel to [120], a commentary on [Saharon Shelah (#634)
“Choiceless polynomial time logic: inability to express”, same proceedings], and
an abridged version of [149] that contains complete proofs of all the results pre-
sented here. The BGS model of computation was defined in [120] with the inten-
tion of modeling computation with arbitrary finite relational structures as inputs,
with essentially arbitrary data structures, with parallelism, but without arbitrary
choices. It was shown that choiceless polynomial time, the complexity class de-
fined by BGS programs subject to a polynomial time bound, does not contain
the parity problem. Subsequently, Shelah proved a zero-one law for choiceless-
polynomial-time properties. A crucial difference from the earlier results is this:
Almost all finite structures have no non-trivial automorphisms, so symmetry con-
siderations cannot be applied to them. Shelah’s proof therefore depends on a more
subtle concept of partial symmetry.
After struggling for a while with Shelah’s proof, we worked out a presentation
which we hope will be helpful for others interested in Shelah’s ideas. We also
added some related results, indicating the need for certain aspects of the proof
and clarifying some of the concepts involved in it. Unfortunately, this material
is not yet fully written up. The part already written, however, exceeds the space
available to us in the present volume. We therefore present here an abridged
version of that paper and promise to make the complete version available soon.
145. Mike Barnett, Egon Börger, Yuri Gurevich, Wolfram Schulte, Margus Veanes:
Using abstract state machines at Microsoft: A case study. In: P. Clote and H.
Schwichtenberg (eds.) CSL’2000, Springer LNCS 1862 (2000), 367–379
Our goal is to provide a rigorous method, clear notation and convenient tool
support for high-level system design and analysis. For this purpose we use abstract
state machines (ASMs). Here we describe a particular case study: modeling a
debugger of a stack-based runtime environment. The study provides evidence for
ASMs being a suitable tool for building executable models of software systems
on various abstraction levels, with precise refinement relationships connecting the
models. High level ASM models of proposed or existing programs can be used
throughout the software development cycle. In particular, ASMs can be used
to model inter-component behavior on any desired level of detail. This allows
x = y = z = 0 {P} false
is partially correct on N but any loop invariant I(x, y, z) for this asserted program
is undecidable.
147. Yuri Gurevich, Alexander Rabinovich: Definability in rationals with real order in
the background. Journal of Logic and Computation 12:1 (2002), 1–11
The paper deals with logically definable families of sets of rational numbers. In
particular, we are interested in whether the families definable over the real line with
a unary predicate for the rationals are definable over the rational order alone.
Let ϕ(X, Y) and ψ(Y) range over formulas in the first-order monadic language of
order. Let Q be the set of rationals and F be the family of subsets J of Q such that
ϕ(Q, J) holds over the real line. The question arises whether, for every formula
ϕ, the family F can be defined by means of a formula ψ(Y) interpreted over the
rational order. We answer the question negatively. The answer remains negative
if the first-order logic is strengthened to weak monadic second-order logic. The
answer is positive for the restricted version of monadic second-order logic where
set quantifiers range over open sets. The case of full monadic second-order logic
remains open.
148. Andreas Blass, Yuri Gurevich: A new zero-one law and strong extension axioms.
Originally in BEATCS 72 (October 2000), 103–122. Reprinted in: Current Trends
in Theoretical Computer Science, World Scientific (2004), 99–118
This article is a part of the continuing column on Logic in Computer Science.
One of the previous articles in the column was devoted to the zero-one laws for
a number of logics playing a prominent role in finite model theory: first-order logic
FO, the extension FO+LFP of first-order logic with the least fixed-point operator,
and the infinitary logic where every formula uses finitely many variables [95].
Recently Shelah proved a new, powerful, and surprising zero-one law. His proof
uses so-called strong extension axioms. Here we formulate Shelah’s zero-one law
and prove a few facts about these axioms. In the process we give a simple proof
for a “large deviation” inequality à la Chernoff.
149. Andreas Blass, Yuri Gurevich: Strong extension axioms and Shelah’s zero-one law
for choiceless polynomial time. JSL 68:1 (2003), 65–131
This paper developed from Shelah’s proof of a zero-one law for the complexity
class “choiceless polynomial time,” defined by Shelah and the authors. We present
a detailed proof of Shelah’s result for graphs, and describe the extent of its gen-
eralizability to other sorts of structures. The extension axioms, which form the
basis for earlier zero-one laws (for first-order logic, fixed-point logic, and finite-
variable infinitary logic) are inadequate in the case of choiceless polynomial time;
they must be replaced by what we call the strong extension axioms. We present
an extensive discussion of these axioms and their role both in the zero-one law
and in general. ([144] is an abridged version of this paper, and [148] is a popular
version of this paper.)
150. Andreas Blass, Yuri Gurevich, Saharon Shelah: On polynomial time computation
over unordered structures. JSL 67:3 (2002), 1093–1125
This paper is motivated by the question whether there exists a logic captur-
ing polynomial time computation over unordered structures. We consider several
algorithmic problems near the border of the known, logically defined complexity
classes contained in polynomial time. We show that fixpoint logic plus count-
ing is stronger than might be expected, in that it can express the existence of
a complete matching in a bipartite graph. We revisit the known examples that
separate polynomial time from fixpoint plus counting. We show that the examples
in a paper of Cai, Fürer, and Immerman, when suitably padded, are in choice-
less polynomial time yet not in fixpoint plus counting. Without padding, they
remain in polynomial time but appear not to be in choiceless polynomial time
plus counting. Similar results hold for the multipede examples of Gurevich and
Shelah, except that their final version of multipedes is, in a sense, already suitably
padded. Finally, we describe another plausible candidate, involving determinants,
for the task of separating polynomial time from choiceless polynomial time plus
counting.
150a. Andreas Blass, Yuri Gurevich: A quick update on the open problems in Arti-
cle [150] (December 2005).
151. Yuri Gurevich: Logician in the Land of OS: Abstract state machines at Microsoft.
LICS 2001, IEEE Symp. on Logic in Computer Science, IEEE Computer Society
(2001), 129–136
157-1. Andreas Blass, Yuri Gurevich: Abstract state machines capture parallel algo-
rithms. ACM TOCL 4:4 (October 2003), 578–651
We give an axiomatic description of parallel, synchronous algorithms. Our main
result is that every such algorithm can be simulated, step for step, by an abstract
state machine with a background that provides for multisets. See also [157-2].
157-2. Andreas Blass, Yuri Gurevich: Abstract state machines capture parallel algo-
rithms: Correction and extension. ACM TOCL 9:3 (June 2008), Article 19
We consider parallel algorithms working in sequential global time, for example
circuits or parallel random access machines (PRAMs). Parallel abstract state
machines (parallel ASMs) are such parallel algorithms, and the parallel ASM
thesis asserts that every parallel algorithm is behaviorally equivalent to a parallel
ASM. In an earlier paper [157-1], we axiomatized parallel algorithms, proved the
ASM thesis and proved that every parallel ASM satisfies the axioms. It turned out
that we were too timid in formulating the axioms; they did not allow a parallel
algorithm to create components on the fly. This restriction did not hinder us from
proving that the usual parallel models, like circuits or PRAMs or even alternating
Turing machines, satisfy the postulates. But it resulted in an error in our attempt
to prove that parallel ASMs always satisfy the postulates. To correct the error,
we liberalize our axioms and allow on-the-fly creation of new parallel components.
We believe that the improved axioms accurately express what parallel algorithms
ought to be. We prove the parallel thesis for the new, corrected notion of parallel
algorithms, and we check that parallel ASMs satisfy the new axioms.
158. Andreas Blass, Yuri Gurevich: Algorithms vs. machines. Originally in BEATCS
77 (June 2002), 96–118. Reprinted in: Current Trends in Theoretical Computer
Science, World Scientific (2004), 215–236
In a recent paper, the logician Yiannis Moschovakis argues that no state ma-
chine describes mergesort on its natural level of abstraction. We do just that. Our
state machine is a recursive ASM.
159. Uwe Glässer, Yuri Gurevich, Margus Veanes: Abstract communication model for
distributed systems. IEEE Transactions on Software Engineering 30:7 (July 2004),
458–472
In some distributed and mobile communication models, a message disappears
in one place and miraculously appears in another. In reality, of course, there are no
miracles. A message goes from one network to another; it can be lost or corrupted
in the process. Here we present a realistic but high-level communication model
where abstract communicators represent various nets and subnets. The model was
originally developed in the process of specifying a particular network architecture,
namely the Universal Plug and Play architecture. But it is general. Our contention
is that every message-based distributed system, properly abstracted, gives rise
to a specialization of our abstract communication model. The purpose of the
abstract communication model is not to design a new kind of network; rather it
is to discover the common part of all message-based communication networks.
The generality of the model has been confirmed by its successful reuse for very
different distributed architectures. The model is based on distributed abstract
state machines. It is implemented in the specification language AsmL and is being
used for testing distributed systems.
160. Andreas Blass, Yuri Gurevich: Pairwise testing. Originally in BEATCS 78 (Oc-
tober 2002), 100–132. Reprinted in: Current Trends in Theoretical Computer
Science, World Scientific (2004), 237–266
We discuss the following problem, which arises in software testing. Given some
independent parameters (of a program to be tested), each having a certain finite
set of possible values, we intend to test the program by running it several times.
For each test, we give the parameters some (intelligently chosen) values. We want
to ensure that for each pair of distinct parameters, every pair of possible values
is used in at least one of the tests. And we want to do this with as few tests as
possible.
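The coverage requirement can be checked mechanically. The following sketch (function and variable names are ours, not from the paper) verifies that a candidate test suite covers every pair of values for every pair of distinct parameters:

```python
from itertools import combinations, product

def covers_all_pairs(domains, tests):
    """Check pairwise coverage: for each pair of distinct parameters,
    every pair of their possible values must occur together in some test.
    domains: list of value sets, one per parameter.
    tests:   list of tuples, each assigning one value per parameter."""
    for i, j in combinations(range(len(domains)), 2):
        needed = set(product(domains[i], domains[j]))
        seen = {(t[i], t[j]) for t in tests}
        if needed - seen:
            return False
    return True

# Three binary parameters: four tests already achieve pairwise
# coverage, versus eight for exhaustive testing.
domains = [{0, 1}] * 3
suite = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(covers_all_pairs(domains, suite))  # True
```

The point of the paper is how small such suites can be made in general; the checker above only certifies a given suite.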
161. Yuri Gurevich, Nikolai Tillmann: Partial updates. Theoretical Computer Science
336:2–3 (26 May 2005), 311–342. (A preliminary version in: Abstract State Ma-
chines 2003, Springer LNCS 2589 (2003), 57–86)
A data structure instance, e.g. a set, file, or record, may be modified indepen-
dently by different parts of a computer system. The modifications may be nested.
Such hierarchies of modifications need to be efficiently checked for consistency
and integrated. This is the problem of partial updates in a nutshell. In our first
paper on the subject [156], we developed an algebraic framework which allowed
us to solve the partial update problem for some useful data structures, including
counters, sets and maps. These solutions are used for the efficient implementation
of concurrent data modifications in the specification language AsmL. The two
main contributions of this paper are (i) a more general algebraic framework for
partial updates and (ii) a solution of the partial update problem for sequences
and labeled ordered trees.
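As a toy illustration of the consistency check (our sketch of the general idea, not the algebraic framework of the paper), concurrent partial updates to a set can be integrated so long as no element is both inserted and removed:

```python
def integrate_set_updates(base, updates):
    """Integrate concurrent partial updates to a set.  Each update is
    ('insert', x) or ('remove', x).  The updates are consistent iff no
    element is both inserted and removed; duplicate insertions by
    different parts of the system are compatible."""
    inserted = {x for op, x in updates if op == 'insert'}
    removed = {x for op, x in updates if op == 'remove'}
    clash = inserted & removed
    if clash:
        raise ValueError(f"inconsistent updates on {sorted(clash)}")
    return (base - removed) | inserted

print(sorted(integrate_set_updates({1, 2, 3},
                                   [('insert', 4), ('insert', 4), ('remove', 2)])))  # [1, 3, 4]
```

Counters are handled similarly, with increments summed; sequences and labeled ordered trees, the subject of this paper, need a subtler notion of when updates are compatible.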
162. Yuri Gurevich, Saharon Shelah: Spectra of monadic second-order formulas with
one unary function. LICS 2003, 18th Annual IEEE Symp. on Logic in Computer
Science, IEEE Computer Society (2003), 291–300
We prove that the spectrum of any monadic second-order formula F with one
unary function symbol (and no other function symbols) is eventually periodic, so
that there exist natural numbers p > 0 (a period) and t (a p-threshold) such that
if F has a model of cardinality n > t then it has a model of cardinality n + p.
(In the web version, some additional proof details are provided because some
readers asked for them.)
163. Mike Barnett, Wolfgang Grieskamp, Yuri Gurevich, Wolfram Schulte, Nikolai Till-
mann, Margus Veanes: Scenario-oriented modeling in AsmL and its instrumenta-
tion for testing. In: 2nd International Workshop on Scenarios and State Machines:
Models, Algorithms, and Tools, (2003) 8–14, held at ICSE 2003, International
Conference on Software Engineering 2003
We present an approach for modeling use cases and scenarios in the Abstract
state machine Language and discuss how to use such models for validation and
verification purposes.
164. Andreas Blass, Yuri Gurevich: Algorithms: A quest for absolute definitions. Orig-
inally in BEATCS 81 (October 2003), 195–225. Reprinted in: Current Trends in
Theoretical Computer Science, World Scientific (2004), 283–311. Reprinted in:
A. Olszewski et al. (eds.) Church’s Thesis After 70 Years, Ontos Verlag (2006),
24–57
What is an algorithm? The interest in this foundational problem is not only
theoretical; applications include specification, validation and verification of soft-
ware and hardware systems. We describe the quest to understand and define the
notion of algorithm. We start with the Church-Turing thesis and contrast Church’s
and Turing’s approaches, and we finish with some recent investigations.
165. Yuri Gurevich: Abstract state machines: An overview of the project. In: D. Seipel
and J. M. Turull-Torres (eds.) Foundations of Information and Knowledge Sys-
tems, Springer LNCS 2942 (2004), 6–13
We quickly survey the ASM project, from its foundational roots to industrial
applications.
166. Andreas Blass, Yuri Gurevich: Ordinary interactive small-step algorithms, I. ACM
TOCL 7:2 (April 2006), 363–419. A preliminary version was published as MSR-
TR-2004-16
This is the first in a series of papers extending the Abstract State Machine
Thesis – that arbitrary algorithms are behaviorally equivalent to abstract state
machines – to algorithms that can interact with their environments during a step,
rather than only between steps. In the present paper, we describe, by means
of suitable postulates, those interactive algorithms that (1) proceed in discrete,
global steps, (2) perform only a bounded amount of work in each step, (3) use only
such information from the environment as can be regarded as answers to queries,
and (4) never complete a step until all queries from that step have been answered.
We indicate how a great many sorts of interaction meet these requirements. We
also discuss in detail the structure of queries and replies and the appropriate
definition of equivalence of algorithms. Finally, motivated by our considerations
concerning queries, we discuss a generalization of first-order logic in which the
arguments of function and relation symbols are not merely tuples of elements but
orbits of such tuples under groups of permutations of the argument places.
167. Yuri Gurevich: Intra-step interaction. In: W. Zimmerman and B. Thalheim (eds.)
Abstract State Machines 2004, Springer LNCS 3052 (2004), 1–5
For a while it seemed possible to pretend that all interaction between an al-
gorithm and its environment occurs inter-step, but not anymore. Andreas Blass,
Benjamin Rossman and the speaker are extending the Small-Step Characteri-
zation Theorem (that asserts the validity of the sequential version of the ASM
thesis) and the Wide-Step Characterization Theorem (that asserts the validity of
the parallel version of the ASM thesis) to intra-step interacting algorithms.
A later comment: This was my first talk on intra-step interactive algorithms.
The intended audience was the ASM community. [174] is a later talk on this topic,
and it is addressed to a general computer science audience.
168. Yuri Gurevich, Rostislav Yavorskiy: Observations on the decidability of transi-
tions. In: W. Zimmerman and B. Thalheim (eds.) Abstract State Machines 2004,
Springer LNCS 3052 (2004), 161–168
Consider a multiple-agent transition system such that, for some basic types
T1, ..., Tn, the state of any agent can be represented as an element of the Carte-
sian product T1 × ··· × Tn. The system evolves by means of global steps. During
such a step, new agents may be created and some existing agents may be up-
dated or removed, but the total number of created, updated and removed agents
is uniformly bounded. We show that, under appropriate conditions, there is an al-
gorithm for deciding assume-guarantee properties of one-step computations. The
result can be used for automatic invariant verification as well as for finite state
approximation of the system in the context of test-case generation from AsmL
specifications.
169. Yuri Gurevich, Benjamin Rossman, Wolfram Schulte: Semantic essence of AsmL.
Theoretical Computer Science 343:3 (17 October 2005), 370–412. Originally pub-
lished as MSR-TR-2004-27
182. Andreas Blass, Yuri Gurevich, Dean Rosenzweig, Benjamin Rossman: Interactive
small-step algorithms II: Abstract state machines and the Characterization The-
orem. Logical Methods in Computer Science 3:4 (2007), paper 4. A preliminary
version appeared as MSR-TR-2006-171
In earlier work, the Abstract State Machine Thesis – that arbitrary algorithms
are behaviorally equivalent to abstract state machines – was established for several
classes of algorithms, including ordinary, interactive, small-step algorithms. This
was accomplished on the basis of axiomatizations of these classes of algorithms.
In a companion paper [176] the axiomatization was extended to cover interac-
tive small-step algorithms that are not necessarily ordinary. This means that the
algorithms (1) can complete a step without necessarily waiting for replies to all
queries from that step and (2) can use not only the environment’s replies but
also the order in which the replies were received. In order to prove the thesis for
algorithms of this generality, we extend here the definition of abstract state ma-
chines to incorporate explicit attention to the relative timing of replies and to the
possible absence of replies. We prove the characterization theorem for extended
ASMs with respect to general algorithms as axiomatized in [176].
183. Dan Teodosiu, Nikolaj Bjørner, Yuri Gurevich, Mark Manasse, Joe Porkka: Opti-
mizing file replication over limited-bandwidth networks using remote differential
compression. MSR-TR-2006-157
Remote Differential Compression (RDC) protocols can efficiently update files
over a limited-bandwidth network when two sites have roughly similar files; no
site needs to know the content of another’s files a priori. We present a heuristic
approach to identify and transfer the file differences that is based on finding similar
files, subdividing the files into chunks, and comparing chunk signatures. Our work
significantly improves upon previous protocols such as LBFS and RSYNC in three
ways. Firstly, we present a novel algorithm to efficiently find the client files that
are the most similar to a given server file. Our algorithm requires 96 bits of meta-
data per file, independent of file size, and thus allows us to keep the metadata in
memory and eliminate the need for expensive disk seeks. Secondly, we show that
RDC can be applied recursively to signatures to reduce the transfer cost for large
files. Thirdly, we describe new ways to subdivide files into chunks that identify
file differences more accurately. We have implemented our approach in DFSR, a
state-based multimaster file replication service shipping as part of Windows Server
2003 R2. Our experimental results show that similarity detection produces results
comparable to LBFS while incurring a much smaller overhead for maintaining the
metadata. Recursive signature transfer further increases replication efficiency by
up to several orders of magnitude.
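The chunk-signature comparison at the heart of RDC can be sketched as follows (the function names and the 64-bit signature truncation are illustrative, not the DFSR implementation, and real RDC also uses content-defined chunk boundaries):

```python
import hashlib

def chunk_signatures(chunks):
    """A short signature per chunk; 8 bytes of SHA-256 as an illustration."""
    return [hashlib.sha256(c).digest()[:8] for c in chunks]

def chunks_to_fetch(server_chunks, client_chunks):
    """Server chunks the client must actually download: those whose
    signature the client cannot match against any local chunk."""
    have = set(chunk_signatures(client_chunks))
    return [c for c, s in zip(server_chunks, chunk_signatures(server_chunks))
            if s not in have]

old = [b"alpha", b"beta", b"gamma"]
new = [b"alpha", b"BETA!", b"gamma"]
print(chunks_to_fetch(new, old))  # [b'BETA!']
```

Only the signatures need to cross the network; chunks with matching signatures are reused from the client's local copy.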
184. Martin Grohe, Yuri Gurevich, Dirk Leinders, Nicole Schweikardt, Jerzy
Tyszkiewicz, Jan Van den Bussche: Database query processing using finite cursor
machines. Theory of Computing Systems 44:4 (April 2009), 533–560. An earlier
version appeared in: ICDT 2007, International Conference on Database Theory,
Springer LNCS 4353 (2007), 284–298
We introduce a new abstract model of database query processing, finite cursor
machines, that incorporates certain data streaming aspects. The model describes
quite faithfully what happens in so-called “one-pass” and “two-pass query pro-
cessing”. Technically, the model is described in the framework of abstract state
machines. Our main results are upper and lower bounds for processing relational
algebra queries in this model, specifically, queries of the semijoin fragment of the
relational algebra.
185. Andreas Blass, Yuri Gurevich: Zero-one laws: Thesauri and parametric conditions.
BEATCS 91 (February 2007), 125–144. Reprinted in: A. Gupta et al. (eds.) Logic
at the Crossroads: An Interdisciplinary View, Allied Publishers Pvt. Ltd., New
Delhi (2007), 187–206
The zero-one law for first-order properties of finite structures and its proof via
extension axioms were first obtained in the context of arbitrary finite structures
for a fixed finite vocabulary. But it was soon observed that the result and the
proof continue to work for structures subject to certain restrictions. Examples
include undirected graphs, tournaments, and pure simplicial complexes. We dis-
cuss two ways of formalizing these extensions, Oberschelp’s parametric conditions
(Springer Lecture Notes in Mathematics 969, 1982) and our thesauri of [149]. We
show that, if we restrict thesauri by requiring their probability distributions to be
uniform, then they and parametric conditions are equivalent. Nevertheless, some
situations admit more natural descriptions in terms of thesauri, and the thesaurus
point of view suggests some possible extensions of the theory.
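For graphs, the extension axioms discussed above take a concrete form: the k-extension axiom says that for all disjoint vertex sets A and B with |A| + |B| = k, some vertex outside A ∪ B is adjacent to every vertex of A and to no vertex of B. A brute-force check (our illustration; the random-graph parameters are chosen only to make the almost-sure behavior visible):

```python
import random
from itertools import combinations

def satisfies_extension_axiom(adj, k):
    """Brute-force check of the k-extension axiom in a finite graph
    given by a boolean adjacency matrix."""
    n = len(adj)
    for S in combinations(range(n), k):
        for mask in range(1 << k):
            A = {S[i] for i in range(k) if mask >> i & 1}
            B = set(S) - A
            if not any(all(adj[v][a] for a in A) and not any(adj[v][b] for b in B)
                       for v in range(n) if v not in S):
                return False
    return True

# A random graph G(n, 1/2) satisfies the k-extension axiom almost surely
# as n grows; this is the engine behind the first-order zero-one law.
random.seed(1)
n = 60
adj = [[False] * n for _ in range(n)]
for i, j in combinations(range(n), 2):
    adj[i][j] = adj[j][i] = random.random() < 0.5
print(satisfies_extension_axiom(adj, 2))
```

Restricted structure classes such as undirected graphs or tournaments change which extension statements are even expressible, which is what the parametric conditions and thesauri formalize.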
186. Andreas Blass, Yuri Gurevich: Background of computation. BEATCS 92 (June
2007)
In a computational process, certain entities (for example, sets or arrays) and
operations on them may be automatically available, for example by being pro-
vided by the programming language. We define background classes to formalize
this idea, and we study some of their basic properties. The present notion of back-
ground class is more general than the one we introduced in an earlier paper [143],
and it thereby corrects one of the examples in that paper. The greater general-
ity requires a non-trivial notion of equivalence of background classes, which we
explain and use. Roughly speaking, a background class assigns to each set (of
atoms) a structure (for example, of sets or arrays or combinations of these and
similar entities), and it assigns to each embedding of one set of atoms into an-
other a standard embedding between the associated background structures. We
discuss several, frequently useful, properties that background classes may have,
for example that each element of a background structure depends (in some sense)
on only finitely many atoms, or that there are explicit operations by which all
elements of background structures can be produced from atoms.
187. Robert H. Gilman, Yuri Gurevich, Alexei Miasnikov: A geometric zero-one law.
JSL 74:3 (September 2009)
Each relational structure X has an associated Gaifman graph, which endows
X with the properties of a graph. If x is an element of X, let Bn(x) be the ball of
radius n around x. Suppose that X is infinite, connected and of bounded degree.
A first-order sentence s in the language of X is almost surely true (resp. a.s. false)
for finite substructures of X if for every x in X, the fraction of substructures of
Bn(x) satisfying s approaches 1 (resp. 0) as n approaches infinity. Suppose further
that, for every finite substructure, X has a disjoint isomorphic substructure. Then
every s is a.s. true or a.s. false for finite substructures of X. This is one form of the
geometric zero-one law. We formulate it also in a form that does not mention the
ambient infinite structure. In addition, we investigate various questions related to
the geometric zero-one law.
algorithms. One of our algorithms, the local maximum chunking method, has
been implemented and found to work better in practice than previously used
algorithms.
Theoretical comparisons between the various algorithms can be based on several
criteria, most of which seek to formalize the idea that chunks should be neither
too small (so that hashing and sending hash values become inefficient) nor too
large (so that agreements of entire chunks become unlikely). We propose a new
criterion, called the slack of a chunking method, which seeks to measure how much
of an interval of agreement between two files is wasted because it lies in chunks
that don’t agree.
Finally, we show how to efficiently find the cutpoints for local maximum chunk-
ing.
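The local maximum chunking method places a cutpoint wherever the hash value is strictly larger than every hash value within a fixed horizon h on either side. A minimal sketch (in practice `hashes` would come from a rolling hash of the file's bytes, and the horizon controls the expected chunk size):

```python
def local_max_cutpoints(hashes, h):
    """Positions whose hash value strictly exceeds every hash value
    within h positions on each side; these are the chunk boundaries."""
    cuts = []
    for i in range(h, len(hashes) - h):
        window = hashes[i - h:i] + hashes[i + 1:i + h + 1]
        if all(hashes[i] > v for v in window):
            cuts.append(i)
    return cuts

hs = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
print(local_max_cutpoints(hs, 2))  # [5]
```

Because each cutpoint depends only on the 2h + 1 surrounding hash values, a local edit moves at most the nearby boundaries; that stability is exactly what chunk-based differencing needs.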
191. Yuri Gurevich, Itay Neeman: DKAL: Distributed-Knowledge Authorization Lan-
guage. MSR-TR-2008-09. First appeared as MSR-TR-2007-116
DKAL is an expressive declarative authorization language based on existential
fixed-point logic. It is considerably more expressive than existing languages in
the literature, and yet feasible. Our query algorithm is within the same bounds
of computational complexity as, e.g., that of SecPAL. DKAL’s distinguishing
features include
– explicit handling of knowledge and information,
– targeted communication that is beneficial with respect to confidentiality, se-
curity, and liability protection,
– the flexible use and nesting of functions, which in particular allows principals
to quote (to other principals) whatever has been said to them,
– flexible built-in rules for expressing and delegating trust,
– information order that contributes to succinctness.
191a. Yuri Gurevich, Itay Neeman: DKAL: Distributed-Knowledge Authorization Lan-
guage. CSF 2008, 21st IEEE Computer Security Foundations Symposium, 149–
162
This is an extended abstract of [191]. DKAL is a new declarative authoriza-
tion language for distributed systems. It is based on existential fixed-point logic
and is considerably more expressive than existing authorization languages in the
literature. Yet its query algorithm is within the same bounds of computational
complexity as, e.g., that of SecPAL. DKAL’s communication is targeted, which
is beneficial for security and for liability protection. DKAL enables flexible use of
functions; in particular, principals can quote (to other principals) whatever has
been said to them. DKAL strengthens the trust delegation mechanism of Sec-
PAL. A novel information order contributes to succinctness. DKAL introduces a
semantic safety condition that guarantees the termination of the query algorithm.
192. Andreas Blass, Nachum Dershowitz, Yuri Gurevich: When are two algorithms the
same? Bulletin of Symbolic Logic 15:2 (2009), 145–168. An earlier version was
published as MSR-TR-2008-20
People usually regard algorithms as more abstract than the programs that
implement them. The natural way to formalize this idea is that algorithms are
equivalence classes of programs with respect to a suitable equivalence relation.
We argue that no such equivalence relation exists.
193. Andreas Blass, Yuri Gurevich: Two forms of one useful logic: Existential fixed
point logic and liberal Datalog. BEATCS 95 (June 2008), 164–182
A natural liberalization of Datalog is used in the Distributed Knowledge Au-
thorization Language (DKAL). We show that the expressive power of this liberal
Datalog is that of existential fixed-point logic. The exposition is self-contained.
194. Andreas Blass, Yuri Gurevich: One useful logic that defines its own truth. MFCS
2008, 33rd International Symposium on Mathematical Foundations of Computer
Science, Springer LNCS 5162 (2008), 1–15
Existential fixed point logic (EFPL) is a natural fit for some applications,
and the purpose of this talk is to attract attention to EFPL. The logic is also
interesting in its own right as it has attractive properties. One of those properties
is rather unusual: truth of formulas can be defined (given appropriate syntactic
apparatus) in the logic. We mentioned that property elsewhere, and we use this
opportunity to provide the proof.
195. Nikolaj Bjørner, Andreas Blass, Yuri Gurevich, Madan Musuvathi: Modular dif-
ference logic is hard. MSR-TR-2008-140
In connection with machine arithmetic, we are interested in systems of con-
straints of the form x + k ≤ y + l. Over the integers, the satisfiability problem for
such systems is solvable in polynomial time. The problem becomes NP-complete if
we restrict attention to the residues for a fixed modulus N.
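The polynomial-time case is classical: a constraint x + k ≤ y + l is just x − y ≤ l − k, and a system of such constraints is satisfiable over the integers iff its constraint graph has no negative-weight cycle. A sketch via Bellman-Ford (our illustration of the standard reduction, not taken from the report):

```python
def satisfiable(constraints, variables):
    """Decide a system of constraints x + k <= y + l over the integers.
    Each constraint (x, k, y, l) means x - y <= l - k: an edge y -> x
    of weight l - k in the constraint graph.  The system is satisfiable
    iff this graph has no negative-weight cycle (Bellman-Ford)."""
    edges = [(y, x, l - k) for (x, k, y, l) in constraints]
    dist = {v: 0 for v in variables}          # virtual source at distance 0
    for _ in range(len(variables) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # One more relaxation round: any further improvement means a negative cycle.
    return all(dist[u] + w >= dist[v] for u, v, w in edges)

# x + 1 <= y  and  y + 1 <= x  force x < y < x: unsatisfiable.
print(satisfiable([('x', 1, 'y', 0), ('y', 1, 'x', 0)], ['x', 'y']))  # False
print(satisfiable([('x', 0, 'y', 0)], ['x', 'y']))                    # True
```

Over the residues modulo a fixed N, this shortest-path argument no longer applies, because wrap-around lets the constraints encode choices; that is where the hardness enters.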
196. Andreas Blass, Yuri Gurevich: Persistent queries in the behavioral theory of algo-
rithms. ACM TOCL, to appear. An earlier version appeared as MSR-TR-2008-150
We propose a syntax and semantics for interactive abstract state machines to
deal with the following situation. A query is issued during a certain step, but the
step ends before any reply is received. Later, a reply arrives, and later yet the
algorithm makes use of this reply. By a persistent query, we mean a query for which
a late reply might be used. Syntactically, our proposal involves issuing, along with
a persistent query, a location where a late reply is to be stored. Semantically, it
involves only a minor modification of the existing theory of interactive small-step
abstract state machines.
197. Yuri Gurevich, Arnab Roy: Operational semantics for DKAL: Application and
analysis. TrustBus 2009, 6th International Conference on Trust, Privacy and Se-
curity in Digital Business, Springer LNCS 5695 (2009), 149–158
DKAL is a new authorization language based on existential fixed-point logic
and more expressive than existing authorization languages in the literature. We
present some lessons learned during the first practical application of DKAL and
some improvements that we made to DKAL as a result. We develop operational
semantics for DKAL and present some complexity results related to the opera-
tional semantics.
198. Yuri Gurevich, Itay Neeman: Infon logic: The propositional case. ACM TOCL,
to appear. The TOCL version is a correction and slight extension of the version
called “The infon logic” published in BEATCS 98 (June 2009), 150–178
Infons are statements viewed as containers of information (rather than repre-
sentations of truth values). In the context of access control, the logic of infons is a
conservative extension of the logic known as constructive or intuitionistic. Distributed
Knowledge Authorization Language uses additional unary connectives “p said”
and “p implied” where p ranges over principals. Here we investigate infon logic
and a narrow but useful primal fragment of it. In both cases, we develop the model
theory and analyze the derivability problem: Does the given query follow from the
given hypotheses? Our more involved technical results are on primal infon logic.
We construct an algorithm for the multiple derivability problem: Which of the
given queries follow from the given hypotheses? Given a bound on the quotation
depth of the hypotheses, the algorithm works in linear time. We quickly discuss
the significance of this result for access control.
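The derivability question for the quotation-free primal fragment can be sketched as a naive fixpoint computation over subformulas. The Python sketch below is quadratic and only illustrates the rules (conjunction introduction and elimination, modus ponens, and the primal implication introduction "from y infer x → y"); the algorithm of the paper achieves linear time with careful data structures.

```python
# A naive fixpoint decision procedure for the quotation-free fragment of
# primal infon logic (connectives: 'and', primal 'imp'). Formulas are
# tuples: ('atom', name), ('and', a, b), ('imp', a, b).

def subformulas(f):
    yield f
    if f[0] in ('and', 'imp'):
        yield from subformulas(f[1])
        yield from subformulas(f[2])

def derivable(hypotheses, queries):
    closure = set()
    for f in list(hypotheses) + list(queries):
        closure.update(subformulas(f))
    known = set(hypotheses)
    changed = True
    while changed:
        changed = False
        for f in closure:                       # introduction rules
            if f in known:
                continue
            ok = (f[0] == 'and' and f[1] in known and f[2] in known) \
              or (f[0] == 'imp' and f[2] in known)   # primal intro: y gives x -> y
            if ok:
                known.add(f); changed = True
        for f in list(known):                   # elimination rules
            for g in ((f[1], f[2]) if f[0] == 'and' else ()):
                if g not in known:
                    known.add(g); changed = True
            if f[0] == 'imp' and f[1] in known and f[2] not in known:
                known.add(f[2]); changed = True
    return [q in known for q in queries]

a, b = ('atom', 'a'), ('atom', 'b')
print(derivable([a, ('imp', a, b)], [b, ('and', a, b)]))   # [True, True]
```

Note that without hypotheses even a → a is not derivable here, which is exactly what distinguishes primal from full intuitionistic implication.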
199. Nikolaj Bjørner, Yuri Gurevich, Wolfram Schulte, Margus Veanes: Symbolic
bounded model checking of abstract state machines. International Journal of Soft-
ware and Informatics 3:2–3 (June/September 2009), 149–170
Abstract State Machines (ASMs) allow us to model system behaviors at any
desired level of abstraction, including levels with rich data types, such as sets or
sequences. The availability of high-level data types allows us to represent state
elements abstractly and faithfully at the same time. AsmL is a rich ASM-based
specification and programming language. In this paper we look at symbolic analy-
sis of model programs written in AsmL with a background T of linear arithmetic,
sets, tuples, and maps. We first provide a rigorous account of the update seman-
tics of AsmL in terms of background T, and we formulate the problem of bounded
path exploration of model programs, or the problem of Bounded Model Program
Checking (BMPC), as a satisfiability modulo T problem. Then we investigate the
boundaries of decidable and undecidable cases for BMPC. In a general setting,
BMPC is shown to be highly undecidable (Σ¹₁-complete); restricted to finite sets,
the problem remains RE-hard (Σ⁰₁-hard). On the other hand, BMPC is shown
to be decidable for a class of basic model programs that are common in prac-
tice. We apply Satisfiability Modulo Theories (SMT) tools to BMPC. The recent
SMT advances allow us to directly analyze specifications using sets and maps with
specialized decision procedures for expressive fragments of these theories. Our ap-
proach is extensible; background theories need in fact only be partially solved by
the SMT solver; we use simulation of ASMs to support additional theories that
are beyond the scope of available decision procedures.
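What "bounded" means here can be illustrated concretely. The paper itself works symbolically, encoding bounded path exploration as a satisfiability-modulo-T problem for an SMT solver; the enumerative Python sketch below, with an invented model program, only shows the shape of the question.

```python
# Bounded path exploration of a tiny model program, done concretely
# rather than symbolically. The update rule and the bad-state predicate
# are invented for illustration.

def step(state):
    # Update rule of the model program: nondeterministically add one
    # element from a finite range to a set-valued state variable.
    return [frozenset(state | {x}) for x in range(3) if x not in state]

def bmpc(initial, bad, bound):
    """Is a state satisfying `bad` reachable in at most `bound` steps?"""
    frontier = {initial}
    for _ in range(bound + 1):
        if any(bad(s) for s in frontier):
            return True
        frontier = {t for s in frontier for t in step(s)}
    return False

print(bmpc(frozenset(), lambda s: len(s) == 2, bound=2))  # True
print(bmpc(frozenset(), lambda s: len(s) == 3, bound=2))  # False
```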
200. Yuri Gurevich, Itay Neeman: DKAL 2 – A simplified and improved authorization
language. MSR-TR-2009-11
Knowledge and information are central notions in DKAL, a logic-based authorization language for decentralized systems, the most expressive among such
languages in the literature. Pieces of information are called infons. Here we present
DKAL 2, a surprisingly simpler version of the language that expresses new im-
portant scenarios (in addition to the old ones) and that is built around a natural
logic of infons. Trust became definable, and its properties, postulated earlier as
DKAL house rules, are now proved. In fact, none of the house rules postulated
earlier is now needed. We also identify a most practical fragment of DKAL where
the query derivation problem is solved in linear time.
201. Andreas Blass, Nachum Dershowitz, Yuri Gurevich: Exact exploration and hang-
ing algorithms. CSL 2010, 19th EACSL Annual Conference on Computer Science
Logic (August 2010), to appear
Recent analysis of sequential algorithms resulted in their axiomatization and
in a representation theorem stating that, for any sequential algorithm, there is
an abstract state machine (ASM) with the same states, initial states and state
transitions. That analysis, however, abstracted from details of intra-step compu-
tation, and the ASM, produced in the proof of the representation theorem, may
and often does explore parts of the state unexplored by the algorithm. We refine
the analysis, the axiomatization and the representation theorem. Emulating a
step of the given algorithm, the ASM, produced in the proof of the new represen-
tation theorem, explores exactly the part of the state explored by the algorithm.
That frugality pays off when state exploration is costly. The algorithm may be
a high-level specification, and a simple function call on the abstraction level of
the algorithm may hide expensive interaction with the environment. Furthermore,
the original analysis presumed that state functions are total. Now we allow state
functions, including equality, to be partial so that a function call may cause the
algorithm as well as the ASM to hang. Since the emulating ASM does not make
any superfluous function calls, it hangs only if the algorithm does.
202. Andreas Blass, Yuri Gurevich, Efim Hudis: The Tower-of-Babel problem, and
security assessment sharing. MSR-TR-2010-57. BEATCS 101 (June 2010), to ap-
pear
The Tower-of-Babel problem is rather general: How to enable a collaboration
among experts speaking different languages? A computer security version of the
Tower-of-Babel problem is rather important. A recent Microsoft solution for that
security problem, called Security Assessment Sharing, is based on this idea: A tiny
common language goes a long way. We construct simple mathematical models
showing that the idea is sound.
Database Theory, Yuri, and Me
Dedicated to Yuri Gurevich, the “man with a plan”, on his 70th birthday.
1 Database Theory
The theory of database systems is a very broad field of theoretical computer science, concerned with the design and analysis of all aspects of data management. One can get a good idea of the current research in
this field by looking at the proceedings of the two main conferences in the area:
the International Conference on Database Theory, and the ACM Symposium on
Principles of Database Systems. As data management research in general follows
the rapid changes in computing and software technology, database theory can
appear quite trendy to the outsider. Nevertheless there are also timeless top-
ics such as the theory of database queries, to which Yuri Gurevich has made a
number of fundamental contributions.
An in-depth treatment of database theory until the early 1990s can be found
in the book of Abiteboul, Hull and Vianu [1]; Yuri Gurevich appears nine times
in the bibliography.
A relational database schema is a finite relational vocabulary, i.e., a finite set
of relation names with associated arities. Instead of numbering the columns of a relation, as is usual in mathematical logic, in database theory it is customary to name the columns with attributes. In that case the arity
is replaced by a finite set of attributes (called a relation scheme). A database
instance over some schema is a finite relational structure over that schema, i.e.,
an assignment of a concrete, finite, relation content to each of the relation names.
So, if R is a relation name of arity k and D is a database, then D(R) is a finite
subset of Uᵏ, where U is some universe of data elements. The idea is that the
contents of a database can be updated frequently, hence the term “instance”.
We will often drop this term, however, and simply talk about a “database”.
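These definitions translate directly into code. A minimal Python sketch follows; the Hobby relation and its attributes (person, hobby, location) are invented for illustration.

```python
# A relational schema as names with arities, and an instance assigning a
# finite relation to each name, following the definitions above.

schema = {'Hobby': 3}   # Hobby(person, hobby, location) -- illustrative

D = {
    'Hobby': {
        ('An',  'chess',  'club'),
        ('Bea', 'chess',  'club'),
        ('Cor', 'rowing', 'lake'),
    }
}

def is_instance(D, schema):
    """Check that D assigns to each relation name a finite set of
    tuples of the right arity."""
    return set(D) == set(schema) and all(
        len(t) == schema[R] for R in D for t in D[R])

print(is_instance(D, schema))   # True
```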
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 49–60, 2010.
c Springer-Verlag Berlin Heidelberg 2010
50 J. Van den Bussche
The semantics is that of implication, where all variables are assumed universally
quantified.
The implication problem for template dependencies then is: given a finite
set Σ of template dependencies and another template dependency σ, decide
whether each database instance satisfying all dependencies of Σ also satisfies σ.
The typed version, which is easier but was still shown undecidable by Gurevich and Lewis, restricts attention to relations with pairwise disjoint columns.
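While the implication problem between dependencies is undecidable, the satisfaction half (does a given instance satisfy a given dependency?) is decidable by brute force. A sketch, assuming a dependency consists of body atoms and one head atom over a single relation, with body variables universal and head-only variables existential:

```python
from itertools import product

# Brute-force check whether a relation satisfies a template dependency.
# Atoms are tuples of variable names; this naive check is exponential,
# whereas the implication problem between such dependencies is
# undecidable, as discussed above.

def satisfies(rel, body, head):
    bvars = sorted({v for atom in body for v in atom})
    dom = sorted({c for t in rel for c in t})
    for vals in product(dom, repeat=len(bvars)):
        m = dict(zip(bvars, vals))
        if all(tuple(m[v] for v in atom) in rel for atom in body):
            # extend m to the head's existential variables
            evars = sorted({v for v in head if v not in m})
            if not any(tuple({**m, **dict(zip(evars, w))}[v] for v in head) in rel
                       for w in product(dom, repeat=len(evars))):
                return False
    return True

R = {('a', 'b'), ('b', 'c'), ('a', 'c')}
# "if R(x,y) and R(y,z) then R(x,z)": transitivity as an example
print(satisfies(R, [('x', 'y'), ('y', 'z')], ('x', 'z')))   # True
```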
Interestingly, Gurevich and Lewis were working on this result concurrently
with Moshe Vardi [36,37]. To obtain his sharpest results, Vardi could apply
work by Gurevich and by Lewis on the word problem for semigroups [22,29].
3 Database Queries
One of the main purposes of a database is to query it. For many good reasons
into which we cannot go here, the answer to a query on a relational database again takes the form of a relation. For example, on our Hobby relation, we could
pose the query “list all pairs (n, h) such that n performs hobby h in some location
where nobody else performs any hobby in that location”.
So, in the most general terms, one could define a query simply as a mapping
from database instances to relations. But to get a good theory, we need to
impose some criteria on this mapping. First, it is convenient that all answer relations of a given query have the same arity. Second, the basic theory restricts
attention to answer relations containing only values that already appear in the
input database. We call such queries “domain-preserving”. Third, a domain-
preserving query should be “logical” in the sense of Tarski [34], i.e., it should
commute with permutations of data elements.¹ This captures the intuition that
the query need not distinguish between isomorphic databases; all the information
required to answer the query should already be present in the database [3]. Thus,
formally, a query of arity k over some database schema S is a domain-preserving
mapping from database instances over S to k-ary relations on U , such that for
every permutation ρ of U , and for every database instance D over S, we have
Q(ρ(D)) = ρ(Q(D)).
For example, if the database schema consists of a single unary relation name,
so that an instance is just a naked set, a function that picks an arbitrary element
out of each instance is not a query, because it is not logical. The intuition is that
the database does not provide any information that would substantiate favoring
one of the elements above the others.
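The genericity condition Q(ρ(D)) = ρ(Q(D)) is easy to test for any given permutation. In the Python sketch below, both example queries are invented; the second one, which picks an element out of the instance, fails the test, exactly as the discussion above predicts.

```python
# Genericity test: a query Q must commute with permutations rho of the
# data elements, Q(rho(D)) == rho(Q(D)).

def rename(rel, rho):
    return {tuple(rho.get(c, c) for c in t) for t in rel}

def commutes(Q, db, rho):
    return Q(rename(db, rho)) == rename(Q(db), rho)

db = {('a',), ('b',), ('c',)}
rho = {'a': 'b', 'b': 'a'}            # a permutation swapping a and b

identity = lambda rel: rel            # a generic query
pick_min = lambda rel: {min(rel)}     # picks one element: not generic

print(commutes(identity, db, rho))    # True
print(commutes(pick_min, db, rho))    # False
```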
The definition of query as recalled above was formulated by Chandra and Harel
[11,12]. This notion of query comes very naturally to a logician; indeed, Gurevich
independently introduced the very same notion under the name “global relation”
or “global predicate” in his two seminal papers on finite model theory [23,24].
These papers also widely publicized one of the most fundamental open problems
in database theory, the QPTIME problem [13,14,30,31,35]: is there a reasonable
programming language in which only, and all, queries can be expressed that are
computable in polynomial time? Gurevich’s conjecture is that the answer is neg-
ative. The QPTIME problem has been actively investigated since its inception,
as can be learned from two surveys, one by Kolaitis from 1995 [32] and one by
Grohe from 2008 [19]. We will get back to it in Section 8. The problem also nicely illustrates how database theory lies at the basis of the areas of finite model theory and descriptive complexity, which grew afterwards.
In the 1980s, much attention was devoted to the query language Datalog. One
of the toughest nuts in this research was cracked by Ajtai and Gurevich [4,5].
A Datalog program is a set of implications, called rules, that are much like the
template dependencies we have seen above. But an essential difference is that
the relation name in the head of a Datalog rule is not from the database schema;
it is a so-called “intensional” relation name. Thus a Datalog program defines
¹ We mean here a permutation not just of the data elements that appear in some input database, but of the entire global universe of possible data elements.
a number of new relations from the existing ones in the database instance; the
semantics is that we take the smallest expanded database instance that satisfies
the rules. For example, on our Hobby relation, consider the following program:
This program computes, in relation T , the transitive closure of the binary relation
that relates a person x to a person y if they have some common hobby.
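The least-fixpoint semantics of such a program can be sketched in Python, assuming a ternary relation Hobby(person, hobby, location); the Datalog rules shown in the comments are an illustrative reconstruction, not the original program.

```python
# Least-fixpoint semantics of the transitive-closure program described
# above. In Datalog syntax the rules would read roughly:
#   T(x,y) :- Hobby(x,h,l1), Hobby(y,h,l2)    (some common hobby h)
#   T(x,y) :- T(x,z), T(z,y)                  (transitivity)
# The semantics takes the smallest T closed under both rules.

def transitive_closure_program(hobby):
    T = {(x, y) for (x, h1, l1) in hobby for (y, h2, l2) in hobby
         if h1 == h2}                       # rule 1: a common hobby
    while True:                             # rule 2: fire until fixpoint
        new = {(x, y) for (x, z1) in T for (z2, y) in T if z1 == z2} - T
        if not new:
            return T
        T |= new

hobby = {('An', 'chess', 'club'), ('Bea', 'chess', 'club'),
         ('Bea', 'rowing', 'lake'), ('Cor', 'rowing', 'lake')}
print(('An', 'Cor') in transitive_closure_program(hobby))   # True
```

The while loop makes the unboundedness visible: the number of iterations needed grows with the instance, which is exactly why no constant unfolding of the rules suffices.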
Since the transitive closure is not first-order definable [3,17], the above Datalog
program is not equivalent to a first-order formula. Neither is it “bounded”: there
is no fixed constant so that, on any database instance, the rules have to be
fired only so many times until we reach a fixpoint. As a matter of fact, a non-
first-order Datalog program cannot be bounded, as bounded Datalog programs
are obviously first-order; an equivalent first-order formula can be obtained by unfolding the recursive rules a constant number of times.
The converse is much less obvious, however: every first-order Datalog program
must in fact be bounded, and this is the above-mentioned celebrated result by
Ajtai and Gurevich.
6 Metafinite Structures
7 Honorary Doctorate
All the work described up to now happened before I had ever personally met
Yuri. That would happen in May 1996, on the occasion of an AMS Benelux
meeting at the University of Antwerp, Belgium, where I was working as a postdoc
at the time. I had noticed that the famous Yuri Gurevich was scheduled to give
an invited talk at the logic session, and since I had some ideas related to the
QPTIME problem, I approached him and asked if he would be interested in
having dinner in Antwerp together and talking mathematics. To my most pleasant
surprise he readily accepted. It was my first personal encounter with Yuri and
we spent an agreeable evening. He patiently listened to my ideas and made
suggestions. Being a native of Antwerp, I could give him a flash tour of the
city and also knew a typical restaurant, things he could certainly appreciate.
Shortly afterwards, I would join the faculty of what was then known as the
Limburgs Universitair Centrum in Diepenbeek, Belgium; nowadays it is called
Hasselt University. The university was just preparing for its 25th anniversary
in the year 1998, and there was an internal call for nominations for honorary
doctorates to be awarded during the Dies Natalis ceremony. Given his fame
in finite model theory and database theory, and given the pleasant experience I
had had with Yuri, I nominated him before the Faculty of Sciences. Obviously he
was such a strong candidate that my nomination was enthusiastically accepted.
Thus in May 1998, Yuri received the honorary doctorate and became a friend of
Hasselt University, and of me personally as well.
Appendix A contains a transcript, translated into English, of the nomina-
tion speech I gave (in Dutch) for the honorary degree. In the course of preparing
that speech, I collected information from many people around Yuri. These people
gave me so much information that much of it could not be used for the short and
formal speech. On the occasion of Yuri’s 60th birthday, however, Egon Börger
organised a special session at the CSL 2000 conference in Fischbachau, Ger-
many, and in a speech given there I could use that material, consisting mainly
of anecdotes. Appendix B contains a transcript of that speech.
Upon getting to know him better, I started to collaborate with Yuri on a scientific
level. I remember a meeting on finite model theory in Oberwolfach in 1998, just
a few months before the honorary doctorate ceremony, where he gave a talk
on Choiceless Polynomial Time [8], a very expressive database query language
in which only polynomial-time queries can be expressed. The language is nice
because it borrows its high expressivity from set theory in a natural way, and
also because it is based on Gurevich’s Abstract State Machines (ASMs [25]). It
is nice to see how Yuri’s work on ASMs, originally disjoint from the QPTIME
problem, is applied to that problem.
Yuri wondered about the precise relationship between choiceless polynomial
time and the work that was going on in database theory, e.g., by Abiteboul and
Vianu on “generic machines” [2]. We collaborated on that question and that led
to our joint paper (with Andreas Blass) on ASMs and computationally complete
query languages [6,7]. In short, it turns out that choiceless polynomial time is
the same as the polynomial-time fragment of a natural complete query language
based on first-order logic, object creation [10], and iteration. We also showed that
the “non-flat” character of choiceless polynomial time, be it through arbitrarily
deeply nested sets, or through object creation, is essential to its high expressive
power.
We note that extensions of choiceless polynomial time are still being actively
researched in connection with the QPTIME problem [9,16,15].
A small personal recollection I have on this joint research is that, when working
on the proof, I visited Yuri in Paris, where he liked to spend his summers. We
worked at the stuffy apartment where Yuri rented a room, and in the evening,
Yuri felt like going to the movies and asked me to suggest a good movie. I
remembered my father telling me the week before that he had been impressed by
the then-running movie “Seven years in Tibet” (with Brad Pitt). So I suggested
we go to that movie, and indeed, the movie impressed Yuri and me greatly.
10 Conclusion
My life has been enriched in many ways through my encounters with such a
great person as Yuri Gurevich. Yuri, I wish you much happiness on the occasion
of your 70th birthday!
References
1. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley,
Reading (1995)
2. Abiteboul, S., Vianu, V.: Generic computation and its complexity. In: Proceedings
23rd ACM Symposium on the Theory of Computing, pp. 209–219 (1991)
3. Aho, A., Ullman, J.: Universality of data retrieval languages. In: Conference
Record, 6th ACM Symposium on Principles of Programming Languages, pp. 110–
120 (1979)
4. Ajtai, M., Gurevich, Y.: Datalog vs first-order logic. In: Proceedings 30th IEEE
Symposium on Foundations of Computer Science, pp. 142–147 (1989)
5. Ajtai, M., Gurevich, Y.: Datalog vs first-order logic. J. Comput. Syst. Sci. 49(3),
562–588 (1994)
6. Blass, A., Gurevich, Y., Van den Bussche, J.: Abstract state machines and com-
putationally complete query languages (extended abstract). In: Gurevich, Y.,
Kutter, P.W., Odersky, M., Thiele, L. (eds.) ASM 2000. LNCS, vol. 1912, pp.
22–33. Springer, Heidelberg (2000)
7. Blass, A., Gurevich, Y., Van den Bussche, J.: Abstract state machines and compu-
tationally complete query languages. Information and Computation 174(1), 20–36
(2002)
8. Blass, A., Gurevich, Y., Shelah, S.: Choiceless polynomial time. Annals of Pure
and Applied Logic 100, 141–187 (1999)
9. Blass, A., Gurevich, Y., Shelah, S.: On polynomial time computation over un-
ordered structures. Journal of Symbolic Logic 67(3), 1093–1125 (2002)
10. Van den Bussche, J., Van Gucht, D., Andries, M., Gyssens, M.: On the complete-
ness of object-creating database transformation languages. J. ACM 44(2), 272–319
(1997)
11. Chandra, A., Harel, D.: Computable queries for relational data bases. In: Proceed-
ings 11th ACM Symposium on Theory of Computing, pp. 309–318 (1979)
12. Chandra, A., Harel, D.: Computable queries for relational data bases. J. Comput.
Syst. Sci. 21(2), 156–178 (1980)
13. Chandra, A., Harel, D.: Structure and complexity of relational queries. In: Pro-
ceedings 21st IEEE Symposium on Foundations of Computer Science, pp. 333–347
(1980)
14. Chandra, A., Harel, D.: Structure and complexity of relational queries. J. Comput.
Syst. Sci. 25, 99–128 (1982)
15. Dawar, A.: On the descriptive complexity of linear algebra. In: Hodges, W., de
Queiroz, R. (eds.) Logic, Language, Information and Computation. LNCS (LNAI),
vol. 5110, pp. 17–25. Springer, Heidelberg (2008)
16. Dawar, A., Richerby, D., Rossman, B.: Choiceless polynomial time, counting and
the Cai-Fürer-Immerman graphs. Annals of Pure and Applied Logic 152(1-3),
31–50 (2008)
17. Gaifman, H., Vardi, M.: A simple proof that connectivity is not first-order definable.
Bulletin of the EATCS 26, 43–45 (1985)
18. Grädel, E., Gurevich, Y.: Metafinite model theory. Information and Computa-
tion 140(1), 26–81 (1998)
19. Grohe, M.: The quest for a logic capturing PTIME. In: Proceedings 23rd Annual
IEEE Symposium on Logic in Computer Science, pp. 267–271 (2008)
20. Grohe, M., Gurevich, Y., Leinders, D., Schweikardt, N., Tyszkiewicz, J., Van den
Bussche, J.: Database query processing using finite cursor machines. In: Schwentick,
T., Suciu, D. (eds.) ICDT 2007. LNCS, vol. 4353, pp. 284–298. Springer, Heidelberg
(2006)
21. Grohe, M., Gurevich, Y., Leinders, D., Schweikardt, N., Tyszkiewicz, J., Van den
Bussche, J.: Database query processing using finite cursor machines. Theory of
Computing Systems 44(4), 533–560 (2009)
22. Gurevich, Y.: The word problem for some classes of semigroups (Russian). Algebra
and Logic 5(2), 25–35 (1966)
23. Gurevich, Y.: Toward logic tailored for computational complexity. In: Richter,
M., et al. (eds.) Computation and Proof Theory. Lecture Notes in Mathematics,
vol. 1104, pp. 175–216. Springer, Heidelberg (1984)
24. Gurevich, Y.: Logic and the challenge of computer science. In: Börger, E. (ed.)
Current Trends in Theoretical Computer Science, pp. 1–57. Computer Science
Press, Rockville (1988)
25. Gurevich, Y.: Evolving algebra 1993: Lipari guide. In: Börger, E. (ed.) Specification
and Validation Methods, pp. 9–36. Oxford University Press, Oxford (1995)
26. Gurevich, Y., Leinders, D., Van den Bussche, J.: A theory of stream queries. In:
Arenas, M., Schwartzbach, M.I. (eds.) DBPL 2007. LNCS, vol. 4797, pp. 153–168.
Springer, Heidelberg (2007)
27. Gurevich, Y., Lewis, H.: The inference problem for template dependencies. In:
Proceedings 1st ACM Symposium on Principles of Database Systems, pp. 221–229
(1982)
28. Gurevich, Y., Lewis, H.: The inference problem for template dependencies. Infor-
mation and Control 55(1-3), 69–79 (1982)
29. Gurevich, Y., Lewis, H.: The word problem for cancellation semigroups with zero.
Journal of Symbolic Logic 49(1), 184–191 (1984)
30. Immerman, N.: Relational queries computable in polynomial time. In: Proceedings
14th ACM Symposium on Theory of Computing, pp. 147–152 (1982)
31. Immerman, N.: Relational queries computable in polynomial time. Information and
Control 68, 86–104 (1986)
32. Kolaitis, P.: Languages for polynomial-time queries: An ongoing quest. In: Gottlob,
G., Vardi, M. (eds.) ICDT 1995. LNCS, vol. 893, pp. 38–39. Springer, Heidelberg
(1995)
33. Kuper, G., Libkin, L., Paredaens, J. (eds.): Constraint Databases. Springer,
Heidelberg (2000)
34. Tarski, A.: What are logical notions? In: Corcoran, J. (ed.) History and Philosophy
of Logic, vol. 7, pp. 143–154 (1986)
35. Vardi, M.: The complexity of relational query languages. In: Proceedings 14th ACM
Symposium on the Theory of Computing, pp. 137–146 (1982)
36. Vardi, M.: The implication and finite implication problems for typed template
dependencies. In: Proceedings 1st ACM Symposium on Principles of Database
Systems, pp. 230–238 (1982)
37. Vardi, M.: The implication and finite implication problems for typed template
dependencies. J. Comput. Syst. Sci. 28, 3–28 (1984)
Dear Guests:
Dear Professor Gurevich:
It is my honor and my pleasure to give a brief exposition of your life, your
work, and your achievements.
In these days, the field of information technology (IT) receives plenty of
attention. It is hard to imagine our present-day information society without
IT: in our daily lives we live and work thanks to products and services that
would have been either unaffordable, or simply impossible, were it not for IT.
Computer science, as an academic discipline, profits from this success, but
at the same time runs some risk because of it. Indeed, the danger is that
those less familiar with computer science view IT as an obvious technology that we just use when and where we need it. This purely technological view
of computer science is too limited. Computer science is just as well an exact
science, a relatively young one at that, and still in full growth, that investigates
the possibilities and limitations of one of the hardest tasks for us humans:
the design and programming of computer systems, in a correct and efficient
manner. Logical and abstract reasoning are essential skills in this endeavor.
Now if there is one who is a champion in logical reasoning, it is our honored
guest, professor Yuri Gurevich. Yuri was born in 1940 in Russia, and studied
mathematics at Ural University. Already at the age of 24 he earned his doctorate, and four years later the Soviet state doctorate, which allowed him access
to a professor position at the highest level. Thus he found himself at the age of
29 as head of the mathematics division of the national institute for economics
in Sverdlovsk. Russian colleagues have told me that such a steep career was almost unthinkable in the communist Russia of the 1960s, especially because Gurevich had refused to become a member of the party.
But it was indeed impossible to ignore the scientific results he had obtained
during his doctorate in mathematical logic. As a young graduate Yuri Gurevich
had been directed towards a subdiscipline of mathematics, called the theory
of ordered abelian groups. Fortunately I do not have to explain this theory
in order to show the depth of the results that Gurevich obtained. Normally
one expects of a mathematician that he or she finds some answers to specific
mathematical questions posed within the discipline in which he or she is active.
Gurevich, however, developed an automated procedure—think of a computer
program—by which every question about the theory of ordered abelian groups
could be answered! One might say that he replaced an entire community of
mathematicians by a single computer program. At that time, as well as in the
present time, it was highly unusual that an entire subdiscipline of mathematics
is solved in such manner.
In the early 1970s it becomes increasingly hard for Yuri Gurevich to struggle against the discrimination against Jews in the anti-Semitic climate in
Russia in those years. When he hears that the KGB has a file on him, he
plans to emigrate. Unfortunately this happens under difficult circumstances,
so that his scientific activities are suspended for two years. Traveling via the
Republic of Georgia, from where it was easier to emigrate, he finally settles in
1974 in Israel, where he becomes a professor at the Ben-Gurion University. Yuri
amazed everyone by expressing himself in Hebrew at departmental meetings within a few months of his arrival.
During his Israeli period, Yuri Gurevich develops into an absolute em-
inence, a world leader in research in logic. We cannot go further into the
deep investigations he made, nor into the long and productive collaboration
he developed with that other phenomenal logician, Saharon Shelah. The clear
computer science aspect of his earlier work is less present during this period,
although some of the fundamental results that he obtains here will find un-
expected applications later in the area of automated verification of computer
systems. The latter is not so accidental: having unexpected applications is one
of the hallmarks of pure fundamental research.
Along the way, however, the computer scientist in Yuri Gurevich resur-
faces. Computer science in the late 1970s was in full growth as a new, young
academic discipline, and Gurevich saw the importance of a solid logical foun-
dation for this new science. In a new orientation of his career, he accepts in
1982 an offer from the University of Michigan as professor of computer science.
Since then professor Gurevich, as a leadership figure, serves as an important
bridge between logic and computer science. Partly through his influence, these
two disciplines have become strongly interwoven. Of the many common topics where the two disciplines interact, and where Gurevich played a decisive
role, we mention finite model theory, where a logical foundation is being de-
veloped for computerised databases, an important topic in our own theoretical
computer science group here at LUC; the complexity of computation, to which he gave logical foundations; and software engineering, for which he designed a
B Fischbachau Speech
This speech was given by me after the dinner at the end of the symposium in
honour of Yuri Gurevich’s 60th birthday, held in the charming Bavarian village
of Fischbachau, on 24 August 2000, co-located with the CSL conference. I thank
the many people who contributed the material for this speech. Their names are
mentioned explicitly.
generosity, his accessibility. These very qualities of Yuri have been mentioned
to me independently by many people.
One of these people is Joe Weisburd, who was a student of Yuri in the old
Russian days back in the sixties. Joe also told me the following:
Yuri always looked very undaunted, very powerful and very confident,
which was unusual at those times, unless you were a ranked Commu-
nist. And Yuri Shlomovich—wasn’t. You also didn’t dare to look pow-
erful and confident if you were Jewish. And Yuri Shlomovich—was,
even blatantly so in a city where public Jewish life was completely sup-
pressed. His patronymic, Shlomovich, was a challenge, indicating that
he wasn’t adjusting to sounding more Russian, as was pretty popular
then. The Biblical names of his newly born daughters were another challenge. And Yuri did something absolutely unprecedented with the Jewish folk song that he suggested we sing together at the banquet after an annual mathematical Winter School of our Math department.
Yuri became a full professor at the age of 29, which again was unheard of,
let alone him being Jewish. Nevertheless, it became increasingly clear that he
had to emigrate. Cunningly, he discovered that for some peculiar reason, it
was easier to get permission to emigrate out of the Republic of Georgia, so he
requested and finally received permission to transfer there, and eventually was
able to emigrate. These were severe times; they even had to go on a hunger
strike. Vladik Kreinovich, who met Yuri at a seminar in St. Petersburg during
this period, told me the following:
Yuri seemed to be undisturbed by the complexity of the outside life. He
radiated strength and optimism, and described very interesting results
which he clearly managed to produce lately, during the extremely
severe period of his life. His demeanor looked absolutely fantastic.
Once free, Yuri devoted considerable energy to help other Soviet Jews with
their emigration. This happened in Israel, and also later in the United States.
Vladik, himself an emigrant, continues:
Together with a local rabbi, Yuri formed a committee which became
one of the grassroots activities that finally managed to convince the
American public opinion that the Soviet Jews needed help, in a period
when the political climate was on the left side. History books have
been written about that period, and Yuri is not described there as
one of the prominent and visible leaders of the American support
campaign. However, this is only because he preferred not to be in the
spotlights.
Arriving in Israel, Yuri’s talent for languages was very useful. Baruch
Cahlon, who shared an office with Yuri in Beer-Sheva, recounts:
Yuri hardly spoke any Hebrew, but he had to communicate with all
of us in this orient-modern language, which he seemed to love, but
was yet unable to utter. My wife, who is a Hebrew teacher, used to
tell me how Yuri would try to answer the phone when she would
call. Believe me, it was funny. It didn’t take long, however, and one
day, during a department meeting, Yuri stood up and spoke fluent,
almost flawless Hebrew. We were totally astonished. Just couldn’t
believe it. At the time, of course, we had not known about his love
for languages. His interest in these disciplines far surpasses that of an
average mathematician. But then, of course, Yuri isn’t that either!
In a similar vein, Jim Huggins, one of Yuri’s later students in Michigan, told
me that Yuri would ask him questions about English such as “what is the
difference in pronunciation of the words ‘morning’ and ‘mourning’ ?”, and they
would try to find English equivalents of Russian proverbs. During a number
of summers spent in Paris—Yuri likes the French way of life very much—he
learnt to speak more than a mouthful of French as well. And last April, when
we were together at a nice restaurant in Ascona, Yuri had fun translating the
Italian menu for us!
In Israel, Yuri also gave new meaning to religious symbols. Saharon Shelah
told me that Yuri once turned up at a logic seminar wearing a kipah, the
Jewish religious skullcap. When asked about it, he replied, “Why, it is very
convenient: it covers exactly my bald part!” Yuri, I am sorry, but I guess that
by now even an extra large won’t do anymore!
Finally, I must mention the constant source of support in Yuri’s life provided
by his wife, Zoe. Not for nothing does Yuri himself describe the period before he
met Zoe as his “protozoan” period! By the way, Zoe is also a mathematician
by education, and she is a hell of a computer programmer.
Dear Yuri, although you are now a professor emeritus, you are definitely
not yet retired, and thanks to you we will continue to see a lot of exciting
things coming out of Microsoft. I congratulate you on your sixtieth birthday,
and am looking forward to the next sixty years!
Tracking Evidence
Sergei Artemov
CUNY Graduate Center, 365 Fifth Ave., New York, NY 10016, USA
[email protected]
1 Introduction
Since the seminal works [14,21], the following analysis of basic epistemic
notions has been adopted: for a given agent,
F is known ∼ F holds in all epistemically possible situations. (1)
The notion of justification, an essential component of epistemic studies, was in-
troduced into the mathematical models of knowledge within the framework of
Justification Logic in [1,2,3,5,6,8,13,16,18,19,22] and other papers; a comprehen-
sive account of this approach is given in [4]. At the foundational level, Justifica-
tion Logic furnishes a new, evidence-based semantics for the logic of knowledge,
according to which
F is known ∼ F has an adequate justification. (2)
Within Justification Logic, we can reason about justifications, simple and com-
pound, and track different pieces of evidence pertaining to the same fact.
In this paper we develop a sufficiently general mechanism of evidence tracking
which is crucial for distinguishing between factive and nonfactive justifications.
Some preliminary observations leading to this mechanism have been discussed
in [4].
This work was supported by NSF grant 0830450.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 61–74, 2010.
© Springer-Verlag Berlin Heidelberg 2010
F = p | F ∧ F | F ∨ F | F → F | ¬F | t:F.
J4 = J40 + R4
A Kripke-Fitting model [13] M = (W, R, E, ⊩) is a Kripke model (W, R, ⊩) with a
transitive accessibility relation R (for J4-style systems), augmented by an admissible
evidence function E which, for any evidence term t and formula F, specifies the set
of possible worlds where t is considered admissible evidence for F: E(t, F) ⊆ W. The
admissible evidence function E must satisfy natural closure conditions with respect
to the operations ·, +, and !.
Theorem 2 (cf. [4]). For any Constant Specification CS, J4CS is sound and
complete for the corresponding class of Kripke-Fitting models respecting CS.
If a man believes that the late Prime Minister’s last name began with a
‘B,’ he believes what is true, since the late Prime Minister was Sir Henry
Campbell Bannerman1 . But if he believes that Mr. Balfour was the late
Prime Minister2 , he will still believe that the late Prime Minister’s last
name began with a ‘B,’ yet this belief, though true, would not be thought
to constitute knowledge.
Here we have to deal with two justifications for a true statement, one which is
correct and one which is not. Let B be a sentence (propositional atom), w be
a designated evidence variable for the wrong reason for B and r a designated
evidence variable for the right (hence factive) reason for B. Then, Russell’s
example prompts the following set of assumptions3 :
F = p | F ∧ F | F ∨ F | F → F | ¬F | t:F | [[s]]F.
Γ ⊩ F iff F ∈ Γ.
Proof. The proof is also rather standard and proceeds by induction on F. The
base case holds by the definition of the forcing relation ⊩; Boolean connectives
are straightforward. Let F be t:G. If t:G ∈ Γ, then Γ ∈ EA(t, G); moreover,
by the definition of RA, G ∈ Δ for each Δ such that Γ RA Δ. By the Induction
Hypothesis, Δ ⊩ G; therefore, Γ ⊩ t:G. If t:G ∉ Γ, then Γ ∉ EA(t, G) and
Γ ⊮ t:G.
The Induction step in case F = [[s]]G is considered in a similar way.
Corollary 2. TCSA, TCSO, and TCSOA hold at each node.
Indeed, TCSA ∪ TCSO ∪ TCSOA ⊆ Γ since Γ contains all postulates of J4(J4).
By the Truth Lemma, Γ ⊩ TCSA ∪ TCSO ∪ TCSOA.
To complete the proof of Theorem 6, consider F which is not derivable in
J4(J4). The set {¬F} is therefore consistent. By the standard Henkin construc-
tion, {¬F} can be extended to a maximal consistent set Γ. Since F ∉ Γ, by the
Truth Lemma, Γ ⊮ F.
Here B, r, and w are as in R, and x, y, z are designated proof variables for the
observer.
First, we check that the observer knows the factivity of w for B, i.e., that
J4(J4) + R + IR ⊢ [[s]](w:B → B)
for some proof term s. Here is the derivation, which is merely the internalization
of the corresponding derivation from Sect. 2:
1. [[x]]r:B - an assumption;
2. [[y]](r:B → B) - an assumption;
3. [[y ·x]]B - from 1 and 2, by application;
4. [[a]][B → (w:B → B)] - by TCSO for a propositional axiom;
5. [[a·(y ·x)]](w:B → B) - from 3 and 4, by application.
Finally, let us establish that the observer cannot conclude w:B → B other
than by using the factivity of r. In our formal setting, this amounts to proving
the following theorem.
Theorem 7. If
J4(J4) + R + IR ⊢ [[s̃]](w:B → B),
then term s̃ contains both proof variables x and y.
then
J4(J4) + R + IR ⊢ [[s′]]F.
Indeed, all axioms of ∗[J4(J4) + R + IR] are provable in J4(J4) + R + IR; the
rules of the former correspond to axioms of the latter.
In order to establish the converse, suppose that ∗[J4(J4) + R + IR] ⊬ [[s′]]F, and consider the following model:
– W = {1};
– RA = ∅, RO = {(1, 1)};
– EA(t, G) = W for each t, G;
– EO(s, G) holds at 1 iff ∗[J4(J4) + R + IR] ⊢ [[s]]G;
– 1 ⊩ p for all propositional variables, including B.
Note that RA and RO are transitive. Let us check the closure properties of
the evidence functions. EA is universal and hence closed. EO is closed under
application, sum, and verifier since the calculus ∗[J4(J4) + R + IR] is.
Monotonicity of EA and EO holds vacuously since W is a singleton.
Furthermore, TCSA,O,OA hold in M. To check this, we first note that since
RA = ∅, a formula t:G holds at 1 if and only if 1 ∈ EA(t, G). Therefore, all formulas
t:G hold at 1; in particular, 1 ⊩ TCSA. Hence all axioms A1–A4 of J4(J4) hold
at 1. This yields that 1 ⊩ TCSOA. Indeed, for each [[c]]A ∈ TCSOA, 1 ⊩ A (just
established) and EO(c, A) holds, since [[c]]A is an axiom of ∗[J4(J4) + R + IR]. By the
same reasons, 1 ⊩ R.
Since
∗[J4(J4) + R + IR] ⊢ TCSO,
EO(c, A) holds for all [[c]]A ∈ TCSO. In addition, each such A is an axiom O1–O3,
hence 1 ⊩ A. Therefore, 1 ⊩ TCSO. By similar reasons, 1 ⊩ IR.
We have just established that M is a model for J4(J4) + R + IR.
We claim that M ⊮ [[s′]]F, which follows immediately from the assumption
that
∗[J4(J4) + R + IR] ⊬ [[s′]]F,
since then EO(s′, F) does not hold at 1. Therefore,
Proof. Obvious, from the fact that all rules of ∗[J4(J4) + R + IR] have such a
subterm property.
Proof. Suppose the opposite, i.e., that s̃ does not contain x. Then, by the sub-
term property, the proof of [[s̃]](w:B → B) in ∗[J4(J4) + R + IR] does not use
axiom [[x]]r:B. Moreover, since ∗[J4(J4) + R + IR] does not really depend on
R, [[s̃]](w:B → B) is derivable without R and [[x]]r:B. Since such a proof can be
replicated in J4(J4) + IR without [[x]]r:B, it should be the case that
M = (W, RA, RO, EA, EO, ⊩)
in which [[y]](r:B → B) and [[z]]w:B hold, but [[s̃]](w:B → B) does not. Here is the
model:
– W = {1};
– RA = ∅, RO = {(1, 1)};
– EA (r, B) = ∅ and EA (t, F ) = W for all other pairs t, F ;
– EO (s, G) = W for all s, G;
– 1 ⊮ p for all propositional variables, including B.
1 TCSA .
Since RO = {(1, 1)} and EO (s, G) = W , for any observer evidence term s,
1 [[s]]G if and only if 1 G. All observer axioms hold at 1 and hence
1 TCSO .
1 TCSOA .
1 [[y]](r:B → B) ,
M = (W, RA, RO, EA, EO, ⊩)
in which [[x]]r:B and [[z]]w:B hold, but [[s̃]](w:B → B) does not hold. Here is this
model:
– W = {1};
– RA = ∅, RO = {(1, 1)};
– EA (t, F ) = W for all t, F ;
– EO (s, G) = W for all s, G;
– 1 ⊮ p for all propositional variables, including B.
Conditions on RA, RO, EA, EO are obviously met. Let us check the constant
specifications of J4(J4). Since RA = ∅ and EA(t, F) holds at 1 for all t, F, t:F
holds at 1 for all t, F. In particular,
1 TCSA .
For the same reasons, 1 r:B and 1 w:B.
Furthermore, 1 [[s]]F if and only if 1 F , because EO (s, F ) = {1} and
RO = {(1, 1)}. Therefore,
1 TCSOA .
Since all axioms O1–O3 are true at 1,
1 TCSO .
Another natural candidate for the observer logic is the Logic of Proofs LP (cf.
[2,4,13]), which is J4 augmented by the Factivity Axiom
[[s]]F → F,
6 Conclusions
is obviously inconsistent. However, any derivation from A which does not use
all n + 1 assumptions of A is contradiction-free. This argument can be naturally
formalized in Justification Logic.
We wish to think that this approach to evidence tracking could be also useful
in distributed knowledge systems (cf. [10,11,12]).
Acknowledgements
The author is very grateful to Mel Fitting, Vladimir Krupski, Roman Kuznets,
and Elena Nogina, whose advice helped with this paper. Many thanks to Karen
Kletter for editing this text.
References
1. Artemov, S.: Operational modal logic. Technical Report MSI 95-29, Cornell Uni-
versity (1995)
2. Artemov, S.: Explicit provability and constructive semantics. Bulletin of Symbolic
Logic 7(1), 1–36 (2001)
5 Here p1, p2, . . . , pn are propositional letters.
Strict Canonical Constructive Systems
A. Avron and O. Lahav
1 Introduction
The standard intuitionistic connectives (⊃, ∧, ∨, and ⊥) are of great importance
in theoretical computer science, especially in type theory, where they correspond
to basic operations on types (via the formulas-as-types principle and Curry-
Howard isomorphism). Now a natural question is: what is so special about these
connectives? The standard answer is that they are all constructive connectives.
But then what exactly is a constructive connective, and can we define other basic
constructive connectives beyond the four intuitionistic ones? And what does the
last question mean anyway: how do we “define” new (or old) connectives?
Concerning the last question there is a long tradition starting from [12] (see e.g.
[16] for discussions and references) according to which the meaning of a connec-
tive is determined by the introduction and elimination rules which are associated
with it. Here one usually has in mind natural deduction systems of an ideal type,
where each connective has its own introduction and elimination rules, and these
rules should meet the following conditions: in a rule for some connective this con-
nective should be mentioned exactly once, and no other connective should be in-
volved. The rule should also be pure in the sense of [1] (i.e. there should be no side
conditions limiting its application), and its active formulas should be immediate
This research was supported by The Israel Science Foundation (grant no. 809-06).
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 75–94, 2010.
© Springer-Verlag Berlin Heidelberg 2010
76 A. Avron and O. Lahav
subformulas of its principal formula. Now an n-ary connective ⋄ that can be de-
fined using such rules may be taken as constructive if in order to prove the logical
validity of a sentence of the form ⋄(ϕ1, . . . , ϕn), it is necessary to prove first the
premises of one of its possible introduction rules (see [9]).
Unfortunately, already the handling of negation requires rules which are not
ideal in the sense described above. For intuitionistic logic this problem is usually
solved by not taking negation as a basic constructive connective, but defining it
instead in terms of more basic connectives that can be characterized by “ideal”
rules (¬ϕ is defined as ϕ ⊃ ⊥). In contrast, for classical logic the problem was
solved by Gentzen himself by moving to what is now known as Gentzen-type
systems or sequential calculi. These calculi employ single-conclusion sequents
in their intuitionistic version, and multiple-conclusion sequents in their classical
version. Instead of introduction and elimination rules they use left introduction
rules and right introduction rules. The intuitive notion of an “ideal rule” can be
adapted to such systems in a straightforward way, and it is well known that the
usual classical connectives and the basic intuitionistic connectives can indeed be
fully characterized by “ideal” Gentzen-type rules. Moreover: although this can
be done in several ways, in all of them the cut-elimination theorem obtains. This
immediately implies that the connectives of intuitionistic logic are constructive
in the sense explained above, because without using cuts the only way to derive
⇒ ⋄(ϕ1, . . . , ϕn) in single-conclusion systems of this sort is to prove first the
premises of one of its introduction rules (and then apply that introduction rule).
Note that the only formulas that can occur in such premises are ϕ1 , . . . , ϕn .
For the multiple-conclusion framework the above-mentioned facts about the
classical connectives were considerably generalized in [6,7] by defining “multiple-
conclusion canonical propositional Gentzen-type rules and systems” in precise
terms. A constructive necessary and sufficient coherence criterion for the non-
triviality of such systems was then provided, and it was shown that a system of
this kind admits cut-elimination iff it is coherent. It was further proved that the
semantics of such systems is provided by two-valued non-deterministic matrices
(two-valued Nmatrices) – a natural generalization of the classical truth-tables.
In fact, a characteristic two-valued Nmatrix was constructed for every coherent
canonical propositional system. That work shows that there is a large family of
what may be called semi-classical connectives (which includes all the classical
connectives), each of which has both a proof-theoretical characterization in terms
of a coherent set of canonical (= “ideal”) rules, and a semantic characterization
using two-valued Nmatrices.
In this paper we develop a similar theory for the constructive propositional
framework. We define the notions of a canonical rule and a canonical system in the
framework of strict single-conclusion Gentzen-type systems (or, equivalently, nat-
ural deduction systems). We prove that here too a canonical system is non-trivial
iff it is coherent (where coherence is a constructive condition, defined as in the
multiple-conclusion case). We develop a general non-deterministic Kripke-style se-
mantics for such systems, and show that every constructive canonical system (i.e.
coherent canonical single-conclusion system) induces a class of non-deterministic
Strict Canonical Constructive Systems 77
Kripke-style frames for which it is strongly sound and complete. We use this non-
deterministic semantics to show that all constructive canonical systems admit a
strong form of the cut-elimination theorem. We also use it for providing decision
procedures for all such systems. These results again identify a large family of basic
constructive connectives, each having both a proof-theoretical characterization in
terms of a coherent set of canonical rules, and a semantic characterization using
non-deterministic frames. The family includes the standard intuitionistic connec-
tives (⊃, ∧, ∨, and ⊥), as well as many other independent connectives, like the
semi-implication which has been introduced and used by Gurevich and Neeman
in [13].1
It is easy to see (see [7]) that there are exactly two inconsistent structural con-
sequence relations in any given language.2 These consequence relations are ob-
viously trivial, so we exclude them from our definition of a logic:
The following definitions formulate in exact terms the idea of an “ideal rule”
which was described in the introduction. We first formulate these definitions
in terms of Gentzen-type systems. We consider natural deduction systems in a
separate subsection.
Definition 6.
1. A strict canonical introduction rule for a connective ⋄ of arity n is an expres-
sion constructed from a set of premises and a conclusion sequent, in which ⋄
appears in the right side. Formally, it takes the form:
{Πi ⇒ Σi}1≤i≤m / ⇒ ⋄(p1, p2, . . . , pn)
where m can be 0, and for all 1 ≤ i ≤ m, Πi ⇒ Σi is a definite Horn clause
such that Πi ∪ Σi ⊆ {p1, p2, . . . , pn}.
2. A strict canonical elimination3 rule for a connective ⋄ of arity n is an expres-
sion constructed from a set of premises and a conclusion sequent, in which ⋄
appears in the left side. Formally, it takes the form:
{Πi ⇒ Σi}1≤i≤m / ⋄(p1, p2, . . . , pn) ⇒
where m can be 0, and for all 1 ≤ i ≤ m, Πi ⇒ Σi is a Horn clause (either
definite or negative) such that Πi ∪ Σi ⊆ {p1, p2, . . . , pn}.
3. An application of the rule {Πi ⇒ Σi}1≤i≤m / ⇒ ⋄(p1, p2, . . . , pn) is any
inference step of the form:
{Γ, σ(Πi) ⇒ σ(Σi)}1≤i≤m
Γ ⇒ ⋄(σ(p1), . . . , σ(pn))
where Γ is a finite set of formulas and σ is a substitution in L.
3 The introduction/elimination terminology comes from the natural deduction context.
For the Gentzen-type context the names “right introduction rule” and “left
introduction rule” might be more appropriate, but we prefer to use a uniform
terminology.
{p1 , p2 ⇒ } / p1 ∧ p2 ⇒ and { ⇒ p1 , ⇒ p2 } / ⇒ p1 ∧ p2 .
 Γ, ψ, ϕ ⇒ θ               Γ ⇒ ψ    Γ ⇒ ϕ
─────────────────         ──────────────────
 Γ, ψ ∧ ϕ ⇒ θ                 Γ ⇒ ψ ∧ ϕ
The above elimination rule can easily be shown to be equivalent to the combi-
nation of the two, more usual, elimination rules for conjunction.
Example 2 (Disjunction). The two usual introduction rules for disjunction are:
{ ⇒ p1 } / ⇒ p1 ∨ p2 and { ⇒ p2 } / ⇒ p1 ∨ p2 .
   Γ ⇒ ψ             Γ ⇒ ϕ
─────────────     ─────────────
 Γ ⇒ ψ ∨ ϕ         Γ ⇒ ψ ∨ ϕ
{p1 ⇒ , p2 ⇒} / p1 ∨ p2 ⇒ .
 Γ, ψ ⇒ θ    Γ, ϕ ⇒ θ
──────────────────────
    Γ, ψ ∨ ϕ ⇒ θ
{⇒ p1 , p2 ⇒} / p1 ⊃ p2 ⇒ and {p1 ⇒ p2 } / ⇒ p1 ⊃ p2 .
 Γ ⇒ ψ    Γ, ϕ ⇒ θ            Γ, ψ ⇒ ϕ
────────────────────        ─────────────
   Γ, ψ ⊃ ϕ ⇒ θ              Γ ⇒ ψ ⊃ ϕ
{⇒ p1 , p2 ⇒} / p1 ; p2 ⇒ and {⇒ p2 } / ⇒ p1 ; p2 .
 Γ ⇒ ψ    Γ, ϕ ⇒ θ             Γ ⇒ ϕ
────────────────────        ─────────────
   Γ, ψ ; ϕ ⇒ θ              Γ ⇒ ψ ; ϕ
Proposition 1. T ⊢G ϕ iff {⇒ ψ | ψ ∈ T} ⊢seq_G ⇒ ϕ.
The last proposition does not guarantee that every strict canonical system in-
duces a logic (see Definition 4). For this the system should satisfy one more
condition:
Example 6. All the sets of rules for the connectives ∧, ∨, ⊃, ⊥, and ; which
were introduced in the examples above are coherent. For example, for the two
rules for conjunction we have S1 = {p1 , p2 ⇒ }, S2 = { ⇒ p1 , ⇒ p2 }, and
S1 ∪ S2 is the classically inconsistent set {p1 , p2 ⇒ , ⇒ p1 , ⇒ p2 } (from which
the empty sequent can be derived using two cuts).
Example 7. In [15] Prior introduced a “connective” T (which he called “Tonk”)
with the following rules: {p1 ⇒ } / p1 T p2 ⇒ and { ⇒ p2 } / ⇒ p1 T p2 . Prior
then used “Tonk” to infer everything from everything (trying to show by this
that a set of rules might not define any connective). Now the union of the sets of
premises of these two rules is {p1 ⇒ , ⇒ p2 }, and this is a classically consistent
set of clauses. It follows that Prior’s set of rules for Tonk is incoherent.
Definition 10. A strict canonical single-conclusion Gentzen-type system G is
called coherent if every primitive connective of the language of G has a coherent
set of rules in G.
Theorem 1. Let G be a strict canonical Gentzen-type system. ⟨L, ⊢G⟩ is a logic
(i.e. G is structural, finitary and consistent) iff G is coherent.
The last theorem implies that coherence is a minimal demand from any accept-
able strict canonical Gentzen-type system G. It follows that not every set of such
rules is legitimate for defining constructive connectives – only coherent ones are
(and this is what is wrong with “Tonk”). Accordingly we define:
 Γ ⇒ ψ    Γ, ϕ ⇒ θ    Γ ⇒ ψ ⊃ ϕ
──────────────────────────────────
            Γ ⇒ θ
This form of the rule is obviously equivalent to the more usual one (from Γ ⇒ ψ
and Γ ⇒ ψ ⊃ ϕ infer Γ ⇒ ϕ).
The most useful semantics for propositional intuitionistic logic (the paradigmatic
constructive logic) is that of Kripke frames. In this section we generalize this
semantics to arbitrary strict canonical constructive systems. For this we should
introduce non-deterministic Kripke frames.5
Note 3. Because of the persistence condition, a definite Horn clause of the form
⇒ q is satisfied in a by σ iff v(a, σ(q)) = t.
Definition 17. Let W = ⟨W, ≤, v⟩ be a generalized L-frame, and let ⋄ be an
n-ary connective of L.
1. The frame W respects an introduction rule r for ⋄ if v(a, ⋄(ψ1, . . . , ψn)) = t
whenever all the premises of r are satisfied in a by a substitution σ such
that σ(pi) = ψi for 1 ≤ i ≤ n (the values of σ(q) for q ∉ {p1, . . . , pn} are
immaterial here).
2. The frame W respects an elimination rule r for ⋄ if v(a, ⋄(ψ1, . . . , ψn)) = f
whenever all the premises of r are satisfied in a by a substitution σ such that
σ(pi) = ψi (1 ≤ i ≤ n).
3. Let G be a strict canonical Gentzen-type system for L. The generalized
L-frame W is G-legal if it respects all the rules of G.
Example 12. By definition, a generalized L-frame W = ⟨W, ≤, v⟩ respects the
rule (⊃⇒) iff for every a ∈ W , v(a, ϕ ⊃ ψ) = f whenever v(b, ϕ) = t for every
b ≥ a and v(a, ψ) = f . Because of the persistence condition, this is equivalent to
v(a, ϕ ⊃ ψ) = f whenever v(a, ϕ) = t and v(a, ψ) = f . Again by the persistence
condition, v(a, ϕ ⊃ ψ) = f iff v(b, ϕ ⊃ ψ) = f for some b ≥ a. Hence, we
get: v(a, ϕ ⊃ ψ) = f whenever there exists b ≥ a such that v(b, ϕ) = t and
v(b, ψ) = f . The frame W respects (⇒⊃) iff for every a ∈ W , v(a, ϕ ⊃ ψ) = t
whenever for every b ≥ a, either v(b, ϕ) = f or v(b, ψ) = t. Hence the two
rules together impose exactly the well-known Kripke semantics for intuitionistic
implication ([14]).
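The derived clause for ⊃ can be checked mechanically on finite frames. Below is a small brute-force evaluator for the standard Kripke clause; the two-world frame shows the expected intuitionistic behaviour: with p false at a but true at b ≥ a, neither p nor ¬p (encoded as p ⊃ ⊥) holds at a. The encoding is illustrative only.

```python
# Brute-force check of the Kripke clause for ⊃ on a finite frame (encoding ours).
def holds(w, formula, worlds, leq, v):
    kind = formula[0]
    if kind == 'atom':
        return v[(w, formula[1])]
    if kind == 'bot':
        return False
    if kind == 'implies':
        # w forces ϕ ⊃ ψ iff every b ≥ w refutes ϕ or forces ψ
        return all((not holds(b, formula[1], worlds, leq, v)) or
                   holds(b, formula[2], worlds, leq, v)
                   for b in worlds if (w, b) in leq)
    raise ValueError(kind)

worlds = {'a', 'b'}
leq = {('a', 'a'), ('a', 'b'), ('b', 'b')}   # a ≤ b, reflexive and transitive
v = {('a', 'p'): False, ('b', 'p'): True}    # persistent: p "becomes" true at b
p, bot = ('atom', 'p'), ('bot',)
neg_p = ('implies', p, bot)                  # ¬p as p ⊃ ⊥
print(holds('a', p, worlds, leq, v))      # False
print(holds('a', neg_p, worlds, leq, v))  # False: p holds at b ≥ a
```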
Example 13. A generalized L-frame W = ⟨W, ≤, v⟩ respects the rule (;⇒) under
the same conditions under which it respects (⊃⇒). The frame W respects
(⇒;) iff for every a ∈ W , v(a, ϕ ; ψ) = t whenever v(a, ψ) = t (recall that
this is equivalent to v(b, ψ) = t for every b ≥ a). Note that in this case the two
rules for ; do not always determine the value assigned to ϕ ; ψ: if v(a, ψ) = f ,
and there is no b ≥ a such that v(b, ϕ) = t and v(b, ψ) = f , then v(a, ϕ ; ψ) is
free to be either t or f . So the semantics of this connective is non-deterministic.
Example 14. A generalized L-frame W = ⟨W, ≤, v⟩ respects the rule (T ⇒)
(see Example 7) if v(a, ϕT ψ) = f whenever v(a, ϕ) = f . It respects (⇒ T ) if
v(a, ϕT ψ) = t whenever v(a, ψ) = t. The two constraints contradict each other
in case both v(a, ϕ) = f and v(a, ψ) = t. This is a semantic explanation why
Prior’s “connective” T (“Tonk”) is meaningless.
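This clash can be checked exhaustively: at a single world, the two Tonk constraints leave no admissible truth value exactly when v(a, ϕ) = f and v(a, ψ) = t. A toy enumeration (our own formulation of the two constraints as Boolean conditions):

```python
# The two Tonk rules constrain v(a, ϕTψ) in opposite directions:
#   (T⇒) forces v(a, ϕTψ) = f whenever v(a, ϕ) = f
#   (⇒T) forces v(a, ϕTψ) = t whenever v(a, ψ) = t

def admissible_tonk_values(v_phi, v_psi):
    """Truth values of ϕTψ at a world that respect both Tonk rules."""
    vals = []
    for v_tonk in (True, False):
        ok_elim = v_phi or not v_tonk      # (T⇒): ϕ false forces Tonk false
        ok_intro = (not v_psi) or v_tonk   # (⇒T): ψ true forces Tonk true
        if ok_elim and ok_intro:
            vals.append(v_tonk)
    return vals

for v_phi in (True, False):
    for v_psi in (True, False):
        print(v_phi, v_psi, admissible_tonk_values(v_phi, v_psi))
# Only the case v(ϕ) = f, v(ψ) = t admits no value at all.
```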
Definition 18. Let G be a strict canonical constructive system.
1. We denote S |=seq_G s (where S is a set of sequents and s is a sequent) iff every
G-legal model of S is also a model of s.
2. The semantic consequence relation |=G between formulas which is induced by
G is defined by: T |=G ϕ if every G-legal model of T is also a model of ϕ.
Again we have:
Proposition 3. T |=G ϕ iff {⇒ ψ | ψ ∈ T} |=seq_G ⇒ ϕ.
Strict Canonical Constructive Systems 85
In this section we show that the two logics induced by a strict canonical con-
structive system G (⊢G and |=G) are identical. Half of this identity is given in
the following theorem:
Theorem 2. Every strict canonical constructive system G is strongly sound
with respect to the semantics of G-legal generalized frames. In other words:
1. If T ⊢G ϕ then T |=G ϕ.
2. If S ⊢seq_G s then S |=seq_G s.
Proof. We prove the second part first. Assume that S ⊢seq_G s, and let W = ⟨W, ≤, v⟩
be a G-legal model of S. We show that s is locally true in every a ∈ W. Since
is a G-legal model of S. We show that s is locally true in every a ∈ W . Since
the axioms of G and the premises of S trivially have this property, and the cut
and weakening rules obviously preserve it, it suffices to show that the property
of being locally true is preserved also by applications of the logical rules of G.
– Suppose Γ ⇒ ⋄(ψ1, . . . , ψn) is derived from {Γ, σ(Πi) ⇒ σ(qi)}1≤i≤m us-
ing the introduction rule r = {Πi ⇒ Σi}1≤i≤m / ⇒ ⋄(p1, p2, . . . , pn) (σ is
a substitution such that σ(pj) = ψj for 1 ≤ j ≤ n). Assume that all the
premises of this application have the required property. We show that so
does its conclusion. Let a ∈ W . If v(a, ψ) = f for some ψ ∈ Γ , then obviously
Γ ⇒ ⋄(ψ1, . . . , ψn) is locally true in a. Assume otherwise. Then the persis-
tence condition implies that v(b, ψ) = t for every ψ ∈ Γ and b ≥ a. Hence our
assumption concerning {Γ, σ(Πi ) ⇒ σ(qi )}1≤i≤m entails that for every b ≥ a
and 1 ≤ i ≤ m, either v(b, ψ) = f for some ψ ∈ σ(Πi ), or v(b, σ(qi )) = t. It
follows that for 1 ≤ i ≤ m, Πi ⇒ qi is satisfied in a by σ. Since W respects
r, it follows that v(a, ⋄(ψ1, . . . , ψn)) = t, as required.
– Now we deal with the elimination rules of G. Suppose Γ, ⋄(ψ1, . . . , ψn) ⇒ θ is
derived from {Γ, σ(Πi) ⇒ σ(Σi)}1≤i≤m1 and {Γ, σ(Πi) ⇒ θ}m1+1≤i≤m, us-
ing the elimination rule r = {Πi ⇒ Σi}1≤i≤m / ⋄(p1, p2, . . . , pn) ⇒ (where
Σi is empty for m1 + 1 ≤ i ≤ m, and σ is a substitution such that σ(pj) = ψj
for 1 ≤ j ≤ n). Assume that all the premises of this application have the re-
quired property. Let a ∈ W . If v(a, ψ) = f for some ψ ∈ Γ or v(a, θ) = t,
then we are done. Assume otherwise. Then v(a, θ) = f , and (by the persis-
tence condition) v(b, ψ) = t for every ψ ∈ Γ and b ≥ a. Hence our assump-
tion concerning {Γ, σ(Πi ) ⇒ σ(Σi )}1≤i≤m1 entails that for every b ≥ a and
1 ≤ i ≤ m1 , either v(b, ψ) = f for some ψ ∈ σ(Πi ), or v(b, σ(Σi )) = t. This
immediately implies that every definite premise of the rule is satisfied in a
by σ. Since v(a, θ) = f , our assumption concerning {Γ, σ(Πi ) ⇒ θ}m1 +1≤i≤m
entails that for every m1 + 1 ≤ i ≤ m, v(a, ψ) = f for some ψ ∈ σ(Πi ). Hence
the negative premises of the rule are also satisfied in a by σ. Since W respects
r, it follows that v(a, ⋄(ψ1, . . . , ψn)) = f, as required.
The first part follows from the second by Propositions 1 and 3.
It remains to prove that W is a model of S but not of s. For this we first prove
that the following hold for every T ∈ W and every formula ψ ∈ F :
Proof. The first part follows from Theorems 4 and 3. The second part is a special
case of the first, where the set S of premises is empty.
Corollary 3. The four following conditions are equivalent for a strict canonical
single-conclusion Gentzen-type system G:
1. ⟨L, ⊢G⟩ is a logic (by Proposition 2, this means that G is consistent).
2. G is coherent.
3. G admits strong cut-elimination.
4. G admits cut-elimination.
The following two theorems are now easy consequences of Theorem 6 and the
soundness and completeness theorems of the previous section:7
Proof. Let F be the set of subformulas of the formulas in S ∪ {s}. From Theorem
6 and the proof of Theorem 3 it easily follows that in order to decide whether
S ⊢seq_G s it suffices to check all triples of the form ⟨W, ⊆, v⟩ where W ⊆ 2^F and
v : F → (W → {t, f}), and see if any of them is a G-legal semiframe which is
a model of S but not a model of s.
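The decision procedure is thus a finite (if exponential) search. The skeleton below enumerates all candidate triples ⟨W, ⊆, v⟩ with W ⊆ 2^F; the G-legality and counter-model tests that would be run on each candidate are left abstract. Names and encodings are ours, for illustration only.

```python
# Skeleton of the brute-force search: enumerate every candidate <W, ⊆, v> with
# W ⊆ 2^F and v : F → (W → {t, f}); each candidate would then be tested for
# G-legality and for being a model of S but not of s (tests omitted here).
from itertools import chain, combinations, product

def powerset(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def candidate_frames(F):
    """Yield all nonempty W ⊆ 2^F together with every valuation v on F × W."""
    for W in powerset(powerset(F)):
        W = [frozenset(w) for w in W]
        if not W:
            continue
        for values in product([True, False], repeat=len(F) * len(W)):
            v = {}
            i = 0
            for f in F:
                for w in W:
                    v[(f, w)] = values[i]
                    i += 1
            yield W, v

# Even for a single subformula the search space is tiny and, crucially, finite:
print(sum(1 for _ in candidate_frames(['p'])))  # 8
```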
Note 4. Prior’s “connective” Tonk ([15]) has made it clear that not every com-
bination of “ideal” introduction and elimination rules can be used for defining
a connective. Some constraints should be imposed on the set of rules. Such a
constraint was indeed suggested by Belnap in his famous [8]: the rules for a con-
nective should be conservative, in the sense that if T ϕ is derivable using
them, and does not occur in T ∪ ϕ, then T ϕ can also be derived without
using the rules for . This solution to the problem has two problematic aspects:
1. Belnap did not provide any effective necessary and sufficient criterion for
checking whether a given set of rules is conservative in the above sense.
Without such criterion every connective defined by inference rules (without
an independent denotational semantics) is suspected of being a Tonk-like
connective, and should not be used until a proof is given that it is “innocent”.
2. Belnap formulated the condition of conservativity only with respect to the
basic deduction framework, in which no connectives are assumed. But noth-
ing in what he wrote excludes the possibility of a system G having two
connectives, each of them “defined” by a set of rules which is conservative
7 The two theorems can also be proved directly from the cut-elimination theorem for
strict canonical constructive systems.
over the basic system B, while G itself is not conservative over B. If this
happens then it will follow from Belnap’s thesis that each of the two connec-
tives is well-defined and meaningful, but they cannot exist together. Such a
situation is almost as paradoxical as that described by Prior.
Now the first of these two objections is met, of course, by our coherence criterion
for strict canonical systems, since coherence of a finite set of strict canonical
rules can effectively be checked. The second is met by Theorem 8. That theorem
shows that a very strong form of Belnap’s conservativity criterion is valid for
strict canonical constructive systems, and so what a set of strict canonical rules
defines in such systems is independent of the system in which it is included.
Applications of these rules have the following form (where E is either empty
or a singleton):
 Γ, ϕ ⇒ E              Γ, ϕ ⇒
────────────         ───────────
 Γ, ◦ϕ ⇒ E             Γ ⇒ ◦ϕ
Obviously, G is not coherent. However, in G there is no way to derive a
negative sequent from no assumptions (this is proved by simple induction).
Hence, the introduction rule for ◦ can never be used in proofs without as-
sumptions. For this trivial reason, G is consistent. Hence, in this framework
coherence is no longer equivalent to consistency.
– For the same reason, G from the previous example admits cut-elimination
but does not admit strong cut-elimination. Hence, strong cut-elimination
and cut-elimination are also no longer equivalent.
– Consider the well-known rules for intuitionistic negation:
  Γ ⇒ ϕ                Γ, ϕ ⇒
────────────         ───────────
 Γ, ¬ϕ ⇒               Γ ⇒ ¬ϕ
References
Abstract. Let M = (A, <, P) where (A, <) is a linear ordering and P
denotes a finite sequence of monadic predicates on A. We show that if A
contains an interval of order type ω or −ω, and the monadic second-order
theory of M is decidable, then there exists a non-trivial expansion M′ of
M by a monadic predicate such that the monadic second-order theory
of M′ is still decidable.
1 Introduction
In this paper we address definability and decidability issues for monadic second-
order (MSO, for short) theories of labelled linear orderings. Elgot and Rabin ask
in [9] whether there exist maximal decidable structures, i.e., structures M with
a decidable first-order (FO, for short) theory and such that the FO theory of any
expansion of M by a non-definable predicate is undecidable. This question is
still open. Let us mention some partial results:
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 95–107, 2010.
© Springer-Verlag Berlin Heidelberg 2010
96 A. Bès and A. Rabinovich
– The paper [1] gives a sufficient condition in terms of the Gaifman graph of
M which ensures that M is not maximal. The condition is the following: for
every natural number r and every finite set X of elements of the base set
|M | of M there exists an element x ∈ |M | such that the Gaifman distance
between x and every element of X is greater than r.
We investigate the Elgot–Rabin problem for the class of labelled linear orderings,
i.e., infinite structures M = (A, <, P1, . . . , Pn) where < is a linear ordering over
A and the Pi's denote unary predicates. This class is interesting with respect
to the above results, since on the one hand no regular ordering seems to be FO-
interpretable in such structures, and on the other hand their associated Gaifman
distance is trivial, so they do not satisfy the criterion given in [1].
In this paper we focus on MSO logic rather than FO. The main result of the
paper is that for every labelled linear ordering M such that (A, <) contains an
interval of order type ω or −ω and the MSO theory of M is decidable, there
exists an expansion M′ of M by a monadic predicate which is not MSO-definable
in M, and such that the MSO theory of M′ is still decidable. Hence, M is not
maximal. The result holds in particular when (A, <) is order-isomorphic to the
order of the naturals ω = (N, <), or to the order ζ = (Z, <) of the integers, or
to any infinite ordinal, or more generally to any infinite scattered ordering (recall
that an ordering is scattered if it does not contain any dense sub-ordering).
The structure of the proof is the following: we first show that the result
holds for ω and ζ. For the general case, starting from M, we use some definable
equivalence relation on A to cut A into intervals whose order type is either finite,
or of the form −ω, ω, or ζ. We then define the new predicate on each interval
(using the constructions given for ω and ζ), from which we get the definition
of M′. The reduction from MSO(M′) to MSO(M) uses Shelah's composition
theorem, which allows one to reduce the MSO theory of an ordered sum of
structures to the MSO theories of the summands.
The main reason to consider MSO logic rather than FO is that it actually
simplifies the task. Nevertheless we discuss some partial results and perspectives
for FO logic in the conclusion of the paper.
Let us recall some important decidability results for MSO theories of linear
orderings (the case of labelled linear orderings will be discussed later for ω and
ζ). In his seminal paper [4], Büchi proved that the languages of ω-words recogniz-
able by automata coincide with the languages definable in the MSO theory of ω,
from which he deduced decidability of the theory. The result (and the automata
method) was then extended to the MSO theory of any countable ordinal [5], to
ω1, and to any ordinal less than ω2 [6]. Gurevich, Magidor and Shelah proved [13]
that decidability of the MSO theory of ω2 is independent of ZFC. Let us mention
results for linear orderings beyond ordinals. Using automata, Rabin [19] proved
decidability of the MSO theory of the binary tree, from which he deduced decid-
ability of the MSO theory of Q, which in turn implies decidability of the MSO
theory of the class of countable linear orderings. Shelah [26] developed model-
theoretic techniques that allowed him to reprove almost all known decidability
results about MSO theories, as well as to obtain new decidability results for the case of
Decidable Expansions of Labelled Linear Orderings 97
2.2 Logic
Let us briefly recall useful elements of monadic second-order logic, and settle
some notation. For more details about MSO logic see e.g. [12,31]. Monadic
second-order logic is an extension of first-order logic that allows quantification over
elements as well as over subsets of the domain of the structure. Given a signature
L, one can define the set of (MSO) formulas over L as well-formed formulas
that can use first-order variable symbols x, y, . . . interpreted as elements of the
domain of the structure, monadic second-order variable symbols X, Y, . . . inter-
preted as subsets of the domain, symbols from L, and a new binary predicate
x ∈ X interpreted as "x belongs to X". A sentence is a formula without free
variables. As usual, we often confuse logical symbols with their interpretations.
Given a signature L and an L-structure M with domain D, we say that a re-
lation R ⊆ D^m × (2^D)^n is (MSO) definable in M if and only if there exists a
formula over L, say ϕ(x1, . . . , xm, X1, . . . , Xn), which is true in M if and only if
(x1, . . . , xm, X1, . . . , Xn) is interpreted by an (m + n)-tuple of R. Given a struc-
ture M we denote by MSO(M) (respectively FO(M)) the monadic second-order
(respectively first-order) theory of M. We say that M is maximal if MSO(M)
is decidable and MSO(M′) is undecidable for every expansion M′ of M by a
predicate which is not definable in M.
We can identify labelled linear orderings with structures of the form M =
(A, <, P1 , . . . , Pn ) where < is a binary relation interpreted as a linear ordering
over A, and the Pi ’s denote unary predicates. We use the notation P as a shortcut
for the n-tuple (P1 , . . . Pn ). The structure M can be seen as a word indexed by
A and over the alphabet Σn = {0, 1}n ; this word will be denoted by w(M ). For
every interval I of A we denote by MI the sub-structure of M with domain I.
We shall use the notation M = Σi∈I Mi for the ordered sum of the family
(Mi)i∈I. If the domains of the Mi are not disjoint, replace them with isomorphic
chains that have disjoint domains, and proceed as before. If I = {1, 2} has two
elements, we denote Σi∈I Mi by M1 + M2.
3 The Case of N
In this section we prove that there is no maximal structure of the form (N, <, P)
with respect to MSO logic. The proof is based upon results from [20]. Let us first
briefly review results related to the decidability of the MSO theory of expansions
of (N, <). Büchi [4] proved decidability of MSO(N, <) using automata. On the
other hand it is known that MSO(N, +), and even MSO(N, <, x → 2x), are un-
decidable [22]. Elgot and Rabin studied in [9] the MSO theory of structures of the
form (N, <, P), where P is some unary predicate. They gave a sufficient condition
on P which ensures decidability of the MSO theory of (N, <, P). In particular
the condition holds when P denotes the set of factorials, or the set of powers of
any fixed integer. The frontier between decidability and undecidability of related
theories was explored in numerous later papers [7,10,25,24,21,20,27,29]. Let us
also mention that [25] proves the existence of unary predicates P and Q such that
both MSO(N, <, P) and MSO(N, <, Q) are decidable while MSO(N, <, P, Q)
is undecidable.
Most decidability proofs for MSO(N, <, P) are related somehow to the pos-
sibility of cutting N into segments whose k-type is ultimately constant, from
which one can compute the k-type of the whole structure (using Theorem 3).
This connection was made precise in [20] (see also [21]) using the notion of homoge-
neous sets.
The following result [20] establishes a tight connection between MSO(N, <, P) and
uniformly homogeneous sets.
One can use this theorem to show that no structure M = (N, <, P) is maximal.
Let us give the main ideas. Starting from M such that MSO(M) is decidable,
Theorem 6 implies the existence of a recursive uniformly homogeneous set H =
{h0 < h1 < . . .} for M.
Let M′ be the expansion of M by a monadic predicate Pn+1 defined as Pn+1 =
{hn! | n ∈ N}.
By definition of H, the structures M[hk!, h(k+j)![ have the same k-type for all
j, k ≥ 0. If we combine this with the fact that successive elements of Pn+1 are
far away from each other, we can prove that Pn+1 is not definable in M. For all
i, k ≥ 0 let us define the interval I(i, k) = [h(k+i)!, h(k+i+1)![. In order to prove
that MSO(M′) is decidable, we exploit the fact that all structures MI(i,k) have
the same k-type for all i, k ≥ 0, and that only the first element of each interval
I(i, k) belongs to Pn+1. This allows us to compute easily the k-type of the
structures M′I(i,k) from that of MI(i,k), and then the k-type of the whole structure
M′. This provides a reduction from MSO(M′) to MSO(M).
The above construction, which we described for a fixed structure M, can
actually be defined uniformly in M. This leads to the following result.
Let us discuss item (2). In the proof of the general result (see Sect. 5), we start
from a labelled linear ordering M = (A, <, P) with a decidable MSO theory and
try to expand it while keeping decidability. In some cases the (decidable) expan-
sion M′ of M will be defined by applying the above construction to infinitely
many intervals of A of order type ω. In order to get a reduction from MSO(M′)
to MSO(M), we need the reduction algorithm for such intervals to be uniform,
which is what item (2) expresses.
4 The Case of Z
Remark 10. Let us discuss uniformity issues related to Proposition 7 and Propo-
sition 9. Proposition 7 implies that there is an algorithm which reduces MSO(M′)
to MSO(M). This reduction algorithm is independent of M; it only uses an or-
acle for MSO(M). Proposition 9 implies a weaker property. Namely, it implies
that for every non-recurrent M there is an algorithm which reduces MSO(M′) to
MSO(M). However, this reduction algorithm depends on M.
The proof of Lemma 11 is similar to the proof of Proposition 2.8 in [1], which
roughly shows how to deal with the case when w(M ) is rich.
Now w(M′) has a finite factor in some regular language X ⊆ Σ*n+1 iff w(M)
has a finite factor in π(X) ⊆ Σ*n. The set π(X) is regular, and a sentence which
defines π(X) is computable from a sentence that defines X; thus we obtain, by
Theorem 8(2), that if MSO(M) is decidable then MSO(M′) is decidable.
One can show that if M′ is any expansion of M which has property (*), then
Pn+1 is not definable in M. This implies that no recurrent structure is maximal.
From a more detailed analysis of the proof of Theorem 8(2) we can derive the
following proposition.
5 Main Result
The next theorem is our main result.
Theorem 15. Let M = (A, <, P1, . . . , Pn) where (A, <) contains an interval
of type ω or −ω. There exists an expansion M′ of M by a relation Pn+1 such
that Pn+1 is not definable in M, and MSO(M′) is recursive in MSO(M). In
particular, if MSO(M) is decidable, then MSO(M′) is decidable.
1. if some Ai has order type ω or −ω, then we apply to each substructure MAi
of order type ω the construction given in Proposition 7, and add no element
of Pn+1 elsewhere. If there is no Ai of order type ω, we proceed in a similar
way with each substructure MAi of order type −ω, but using the dual of
Proposition 7 for −ω.
2. if no Ai has order type ω or −ω, then at least one ≈-equivalence class Ai
has order type ζ. We consider two subcases:
(a) if all ≈-equivalence classes Ai with order type ζ are such that w(MAi)
is recurrent, then we apply to each substructure MAi of order type ζ the
construction given in Proposition 12. For the other ≈-equivalence classes
Ai we set Pn+1 ∩ Ai = ∅.
(b) otherwise there exist ≈-equivalence classes Ai with order type ζ such
that w(MAi) is not recurrent. Let ϕ(x) be a formula of minimal
quantifier depth such that ϕ(x) defines an element in some MAi where
Ai has order type ζ. For every MAi such that Ai has order type ζ and
ϕ(x) defines an element ci in MAi, we apply the construction Eci from
Proposition 9 to MAi. For the other ≈-equivalence classes Ai we set Pn+1 ∩
Ai = ∅.
The fact that the set Pn+1 is not definable in M follows rather easily from the
construction, which ensures that there exists some Ai such that the restriction
of Pn+1 to Ai is not definable in the substructure MAi .
Let M′ be the expansion of M by the predicate Pn+1. In order to prove that
MSO(M′) is recursive in MSO(M), we use Shelah's composition method [26,
Theorem 2.4] (see also [12,30]), which allows one to reduce the MSO theory of a
sum of structures to the MSO theories of the components and the MSO theory
of the index structure.
Qj = {i ∈ I : T^k(Mi) = τj},  j = 1, . . . , p,
and τ1, . . . , τp is the list of all formally possible k-types for the signature L.
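In this notation, the composition theorem invoked here can be summarized schematically as follows (a paraphrase of the statement above, not Shelah's exact formulation, and with the rank on the index side possibly adjusted depending on the formulation):

```latex
T^{k}\Bigl(\sum_{i\in I} M_i\Bigr)
\;\text{ is computable from }\;
T^{k}\bigl((I,<,Q_1,\dots,Q_p)\bigr),
\qquad
Q_j=\{\, i\in I \;:\; T^{k}(M_i)=\tau_j \,\}.
```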
Let us explain the reduction from MSO(M′) to MSO(M). We can apply The-
orem 17 to M′ = Σi∈I M′Ai, which allows us to show that for every k, the k-type
References
1. Bès, A., Cégielski, P.: Weakly maximal decidable structures. RAIRO-Theor. Inf.
Appl. 42(1), 137–145 (2008)
2. Bès, A., Cégielski, P.: Nonmaximal decidable structures. Journal of Mathematical
Sciences 158, 615–622 (2009)
3. Blumensath, A., Colcombet, T., Löding, C.: Logical theories and compatible oper-
ations. In: Flum, J., Grädel, E., Wilke, T. (eds.) Logic and automata: History and
Perspectives, pp. 72–106. Amsterdam University Press (2007)
4. Büchi, J.R.: On a decision method in restricted second order arithmetic. In:
Proc. Int. Congress Logic, Methodology and Philosophy of science, Berkeley 1960,
pp. 1–11. Stanford University Press, Stanford (1962)
5. Büchi, J.R.: Transfinite automata recursions and weak second order theory of or-
dinals. In: Proc. Int. Congress Logic, Methodology, and Philosophy of Science,
Jerusalem 1964, pp. 2–23. North-Holland, Amsterdam (1965)
6. Büchi, J.R., Zaiontz, C.: Deterministic automata and the monadic theory of ordi-
nals < ω2. Z. Math. Logik Grundlagen Math. 29, 313–336 (1983)
7. Carton, O., Thomas, W.: The monadic theory of morphic infinite words and gen-
eralizations. Inform. Comput. 176, 51–76 (2002)
8. Compton, K.J.: On rich words. In: Lothaire, M. (ed.) Combinatorics on words.
Progress and perspectives, Proc. Int. Meet., Waterloo/Can. 1982. Encyclopedia of
Mathematics, vol. 17, pp. 39–61. Addison-Wesley, Reading (1983)
9. Elgot, C.C., Rabin, M.O.: Decidability and undecidability of extensions of second
(first) order theory of (generalized) successor. J. Symb. Log. 31(2), 169–181 (1966)
10. Fratani, S.: The theory of successor extended with several predicates (2009)
(preprint)
11. Gurevich, Y.: Modest theory of short chains. I. J. Symb. Log. 44(4), 481–490 (1979)
12. Gurevich, Y.: Monadic second-order theories. In: Barwise, J., Feferman, S.
(eds.) Model-Theoretic Logics, Perspectives in Mathematical Logic, pp. 479–506.
Springer, Heidelberg (1985)
13. Gurevich, Y., Magidor, M., Shelah, S.: The monadic theory of ω2 . J. Symb.
Log. 48(2), 387–398 (1983)
14. Gurevich, Y., Shelah, S.: Modest theory of short chains. II. J. Symb. Log. 44(4),
491–502 (1979)
15. Gurevich, Y., Shelah, S.: Interpreting second-order logic in the monadic theory of
order. J. Symb. Log. 48(3), 816–828 (1983)
16. Makowsky, J.A.: Algorithmic uses of the Feferman-Vaught theorem. Annals of Pure
and Applied Logic 126(1-3), 159–213 (2004)
17. Perrin, D., Pin, J.-E.: Infinite Words. Pure and Applied Mathematics, vol. 141.
Elsevier, Amsterdam (2004), ISBN 0-12-532111-2
18. Perrin, D., Schupp, P.E.: Automata on the integers, recurrence distinguishability,
and the equivalence and decidability of monadic theories. In: Symposium on Logic
in Computer Science (LICS 1986), Washington, D.C., USA, June 1986, pp. 301–
305. IEEE Computer Society Press, Los Alamitos (1986)
19. Rabin, M.O.: Decidability of second-order theories and automata on infinite trees.
Transactions of the American Mathematical Society 141, 1–35 (1969)
20. Rabinovich, A.: On decidability of monadic logic of order over the naturals ex-
tended by monadic predicates. Inf. Comput. 205(6), 870–889 (2007)
21. Rabinovich, A., Thomas, W.: Decidable theories of the ordering of natural numbers
with unary predicates. In: Ésik, Z. (ed.) CSL 2006. LNCS, vol. 4207, pp. 562–574.
Springer, Heidelberg (2006)
22. Robinson, R.M.: Restricted set-theoretical definitions in arithmetic. Proc. Am.
Math. Soc. 9, 238–242 (1958)
23. Rosenstein, J.G.: Linear Orderings. Academic Press, New York (1982)
24. Semenov, A.L.: Decidability of monadic theories. In: Chytil, M.P., Koubek, V.
(eds.) MFCS 1984. LNCS, vol. 176, pp. 162–175. Springer, Heidelberg (1984)
25. Semenov, A.L.: Logical theories of one-place functions on the set of natural num-
bers. Mathematics of the USSR - Izvestia 22, 587–618 (1984)
26. Shelah, S.: The monadic theory of order. Annals of Mathematics 102, 379–419
(1975)
27. Siefkes, D.: Decidable extensions of monadic second order successor arithmetic. In:
Automatentheorie und Formale Sprachen, Tagung, Math. Forschungsinst, Ober-
wolfach (1969); Bibliograph. Inst., Mannheim, pp. 441–472 (1970)
28. Soprunov, S.: Decidable expansions of structures. Vopr. Kibern. 134, 175–179
(1988) (in Russian)
29. Thomas, W.: A note on undecidable extensions of monadic second order successor
arithmetic. Arch. Math. Logik Grundlagenforsch. 17, 43–44 (1975)
30. Thomas, W.: Ehrenfeucht games, the composition method, and the monadic theory
of ordinal words. In: Mycielski, J., Rozenberg, G., Salomaa, A. (eds.) Structures
in Logic and Computer Science, A Selection of Essays in Honor of A. Ehrenfeucht.
LNCS, vol. 1261, pp. 118–143. Springer, Heidelberg (1997)
31. Thomas, W.: Languages, automata, and logic. In: Rozenberg, G., Salomaa, A.
(eds.) Handbook of Formal Languages, vol. III, pp. 389–455. Springer, Heidelberg
(1997)
32. Thomas, W.: Model transformations in decidability proofs for monadic theories. In:
Kaminski, M., Martini, S. (eds.) CSL 2008. LNCS, vol. 5213, pp. 23–31. Springer,
Heidelberg (2008)
Existential Fixed-Point Logic,
Universal Quantifiers, and Topoi
Andreas Blass
1 Introduction to EFPL
Existential fixed-point logic can be roughly described as the result of modifying
first-order logic by
The author is partially supported by NSF grant DMS-0653696 and by a grant from
Microsoft Corporation.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 108–134, 2010.
c Springer-Verlag Berlin Heidelberg 2010
The language also has variables (infinitely many of each sort), equality, negation,
conjunction, disjunction, existential quantification, and the least-fixed-point op-
erator which we write as “Let · · · ← · · · then · · · .” Terms and atomic formulas
are defined as usual in multi-sorted first-order logic, with equality for any sort
allowed between any two terms of that sort. The definition of formulas is given
by recursion, for all vocabularies simultaneously, as follows (under traditional
conventions for omitting parentheses).
is an Υ -formula.
Free and bound variables are defined as usual in first-order logic with the added
convention that, in Let P1(x1) ← δ1, . . . , Pk(xk) ← δk then ϕ, the occurrences
of the variables of xi in Pi(xi) and in δi are bound. Because the Pi's, though
in the vocabulary of the δi's and ϕ, are not in the vocabulary of Let P1(x1) ←
δ1, . . . , Pk(xk) ← δk then ϕ, it is reasonable to regard them as bound second-
order variables in the latter formula.
The semantics of EFPL is defined as in multi-sorted first-order logic (where
we write As for the base set of sort s in structure A) with the following additional
clause for Let P1 (x1 ) ← δ1 , . . . , Pk (xk ) ← δk then ϕ. Suppose we are given an
Υ -structure A and values in its base sets for all the free variables of Let P1 (x1 ) ←
δ1 , . . . , Pk (xk ) ← δk then ϕ, i.e., for all the variables that either occur free in ϕ
or occur free in some δi but are not in the corresponding xi . Then the formulas
δi collectively define an operator on k-tuples of relations of the arities of the Pi ’s
as follows. Given such a k-tuple of relations, use them as the interpretations of
the Pi ’s to expand A to an Υ ∪ {P1 , . . . , Pk }-structure; interpret each δi in that
structure (with the given, fixed values of the variables other than xi ) to obtain
new relations of the same arities; the tuple of these is the output of the operator.
Now form the least fixed point² of that operator. Use the k components of that

² The use of least fixed points here presupposes that the operator is monotone; that
can be proved by induction on formulas, simultaneously with the definition of seman-
tics. Alternatively, one can define the semantics using the inflationary fixed point
construction, and then afterward prove monotonicity and infer that the inflationary
fixed point is in fact the least fixed point. This alternative was used in [6]. Yet an-
other equivalent alternative is to use as a definition the second-order formulation in
the next paragraph.
Remark 3. The requirement that the Pi ’s be positive predicates means not only
that they occur positively in the δi ’s, to ensure monotonicity of the inductive
operator, but also that they occur positively in ϕ. Is this additional requirement
a penalty for building positivity into vocabularies rather than treating it locally,
in each formula, as is traditional? No, the additional requirement is needed. If
it were waived, we could use the formula Let P (x) ← ∃y ¬Q(x, y) then ¬P (x)
(with negatable Q) to express ∀y Q(x, y), an unwanted universal quantifier.
Existential fixed-point logic has many pleasant properties, including the follow-
ing, which we first list briefly and then comment on more extensively.
A famous theorem of Immerman [11] and Vardi [13] asserts that first-order logic
plus the least-fixed-point operator (FO+LFP) captures polynomial time on or-
dered structures. In more detail, consider structures whose base set has the
form {1, 2, . . . , n} and whose vocabulary includes a symbol < interpreted as the
standard ordering of natural numbers. Such structures can easily be coded as
strings over a finite (2-element, if desired) alphabet, so it makes sense to talk
about a collection C of such structures being computable in polynomial time;
there should be a PTime Turing machine accepting exactly (the strings that en-
code) the structures in C. The Immerman-Vardi theorem says that C is PTime
decidable if and only if it is the collection of models of some sentence in first-
order-plus-least-fixed-point logic.
It is shown in [6] that the same is true for EFPL provided the collection
of structures is modified in the following way. Instead of having a symbol for
the ordering, the structures should have a symbol for the immediate successor
function S. This modification would make no difference in the case considered
by Immerman and Vardi, because S is definable from < in first-order logic and
< is definable from S using the least-fixed-point operator. In EFPL, only the
second of these definitions is available. Since the notion of immediate successor
is needed in describing the operation of Turing machines, we must assume that
S is available.
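The second of these definitions, recovering < from the successor function S with the least-fixed-point operator, can be sketched as a naive fixpoint iteration over a finite structure (a toy model only, since EFPL semantics is of course defined over arbitrary structures):

```python
def less_than_from_successor(domain, succ):
    """Compute the least fixed point of the monotone operator
       R(x, y) <- y = S(x)  or  exists z (R(x, z) and y = S(z)),
    which defines the strict order < from the immediate successor S."""
    relation = set()
    while True:
        new = {(x, succ[x]) for x in domain if x in succ}
        new |= {(x, succ[z]) for (x, z) in relation if z in succ}
        if new <= relation:
            return relation  # fixed point reached
        relation |= new

domain = [1, 2, 3, 4]
succ = {1: 2, 2: 3, 3: 4}  # immediate successor on {1, ..., 4}
print(sorted(less_than_from_successor(domain, succ)))
# -> [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```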
It is not difficult to extend the Immerman-Vardi theorem and its EFPL analog
to the case of multi-sorted structures, provided one has appropriate successor
functions on all of the sorts.
We record for future use the trivial observation that these theorems imply that
PTime is also captured, on structures with successor, by any logic intermediate
between EFPL and FO+LFP and indeed by stronger logics as long as these
admit PTime model-checking.
positive predicate symbols), then the expressivity hypothesis needed for Cook’s
theorem is automatically satisfied. It is in this sense that EFPL works well – in
particular works better than first-order logic – with Hoare logic.
2.5 Homomorphisms
hs : As −→ Bs ,
and
– preserving the interpretations of negatable predicate symbols in both direc-
tions, i.e.,
That is, a homomorphism must preserve truth of atomic formulas and their
negations insofar as the negations are permitted, i.e., insofar as the predicate
symbols involved are negatable.
Theorem 4 of [6] shows that homomorphisms preserve truth of EFPL formulas.
(The set-up there was one-sorted, but the same proof works in the many-sorted
case.)
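For finite one-sorted structures, the two preservation conditions can be checked directly. The following sketch (names hypothetical, function symbols omitted) tests forward preservation for every predicate and, additionally, backward preservation for the negatable ones:

```python
from itertools import product

def is_homomorphism(h, domain_A, preds_A, preds_B, arity, negatable):
    """Check that h (a dict mapping elements of A to elements of B) is a
    homomorphism: every predicate is preserved forward, and negatable
    predicates are also preserved backward."""
    for name, tuples_A in preds_A.items():
        for t in tuples_A:                       # forward: P^A(a) => P^B(h(a))
            if tuple(h[x] for x in t) not in preds_B[name]:
                return False
    for name in negatable:                       # backward: P^B(h(a)) => P^A(a)
        for t in product(domain_A, repeat=arity[name]):
            if tuple(h[x] for x in t) in preds_B[name] and t not in preds_A[name]:
                return False
    return True

# Identity map on {0, 1}; P holds of 0 in A but of both 0 and 1 in B.
h = {0: 0, 1: 1}
preds_A = {"P": {(0,)}}
preds_B = {"P": {(0,), (1,)}}
print(is_homomorphism(h, [0, 1], preds_A, preds_B, {"P": 1}, []))     # True
print(is_homomorphism(h, [0, 1], preds_A, preds_B, {"P": 1}, ["P"]))  # False
```

The second call fails precisely because a negatable P must also be preserved backward: h maps 1 to an element satisfying P in B although 1 does not satisfy P in A.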
– for each sort symbol s, an object As (to play the role of the base set of
sort s),
Among the constructions available in topoi are all those needed to produce
interpretations of terms and formulas of second-order logic over the vocabulary
Υ , once an Υ -structure A is given. Specifically, a term t of sort s with (first-order)
variables among x1 , . . . , xn of sorts s1 , . . . , sn is interpreted as a morphism
tA : As1 × · · · × Asn −→ As ,
of topoi. To explain what that means, we first discuss the two sorts of morphisms
commonly used in connection with topoi.
Since topoi are defined as categories with a certain amount of structure (finite
limits and power objects), it is natural to define homomorphisms of topoi to be
functors that preserve this structure. Such homomorphisms are called logical
morphisms because they preserve the interpretations of all formulas of higher-
order logic. That is, if f : E −→ F is a logical morphism and if A is an Υ -
structure in E, then one obtains an Υ -structure f (A) in F by applying f to all
the ingredients of A (base sets As , interpretations PA of predicate symbols, and
interpretations FA of function symbols), and one has
f (ϕA ) = ϕf (A)
for all formulas ϕ of higher-order logic.
A different notion of morphism, however, was natural in the earlier, more
geometric theory of topoi developed by Grothendieck (see [2]). Grothendieck
had observed that much of the algebraic topology of a topological space X can
be expressed in terms of the category of sheaves over X and he defined topoi
as generalized sheaf-categories. Further, he defined morphisms between topoi
so that, in particular, the morphisms from the topos of sheaves on X to the
topos of sheaves on Y correspond (as long as X and Y are somewhat reasonable
spaces) to continuous functions from X to Y . Nowadays, topoi in Grothendieck’s
sense are called Grothendieck topoi; they are a proper subclass of the topoi de-
fined above, often called elementary topoi. Morphisms in Grothendieck’s sense
are called geometric morphisms because of their origin in topological consider-
ations. It turns out that geometric morphisms can be defined not only between
Grothendieck topoi but between arbitrary topoi.
Definition 5. A geometric morphism from a topos E to another topos F is a
pair of functors f∗ : E −→ F and f ∗ : F −→ E such that f∗ is right adjoint
to f ∗ and f ∗ preserves finite limits (equivalently, preserves finite products and
equalizers). f∗ is called the direct-image part of the geometric morphism, and
f ∗ is called the inverse-image part.
Unlike logical morphisms, the constituents f∗ and f ∗ of geometric morphisms do
not in general preserve the interpretations of higher-order (or even first-order)
formulas. Nevertheless, the inverse image parts f ∗ of geometric morphisms have
some good properties with respect to logic. They preserve the interpretation of
existential positive first-order formulas. (In Grothendieck topoi, the same re-
mains true if one allows infinite disjunctions; in general elementary topoi infinite
disjunctions cannot be interpreted because the category may lack the infinite
unions of subobjects that one needs.) Under our convention that negatable pred-
icate symbols must be interpreted by complemented subobjects, f ∗ will preserve
the interpretations of all existential first-order formulas. (The point here is that
f ∗ need not preserve negations of general subobjects, but in the case of com-
plemented subobjects f ∗ will preserve the complement.) Better yet, this preser-
vation property remains correct when the least-fixed-point operator is added to
the logic.
For this purpose, we consider the Sierpiński topos, the functor category S =
Sets^2, where 2 is the category with two objects and one non-identity morphism.
This means that an object in S amounts to a diagram f : A −→ B consisting of a
single function between two sets. A morphism in S from one such object f : A −→ B
to another f′ : A′ −→ B′ is a commutative diagram

A  −−f−→  B
↓         ↓
A′ −−f′−→ B′.
– h(X) ⊆ Y, because h : X −→ Y is an object of S.
– X = D∗(ϕH) = ϕD∗(H) = ϕA, because D∗ preserves the interpretation of ϕ.
– Y = C∗(ϕH) = ϕC∗(H) = ϕB, because C∗ preserves the interpretation of ϕ.
Thus, we get that h(ϕA) ⊆ ϕB, as required.
Remark 7. With a little additional work, essentially the same argument shows
that the interpretation of ϕ is also preserved forward along homomorphisms
between Υ -structures in arbitrary topoi. The additional work arises because we
have used the fact that, in Sets, all subobjects are complemented. In general, if S
is the Sierpiński topos over some other topos E, a subobject X −→ Y of an object
f : A −→ B is complemented if and only if X is complemented in A, Y is
complemented in B, and f maps the complement of X into the complement of Y
(and of course maps X into Y, since X −→ Y is an object in S).
The second form of axioms, saying that every tuple (of the appropriate sorts)
satisfies one of P and P̄, is not a Horn sentence, because of the disjunction in
the consequent of the implication. In this situation, the standard technique for
constructing classifying topoi would not produce a functor category as above
but rather a subcategory of sheaves. For this discussion, we must therefore pre-
suppose some information about sheaf topoi; at the end, it will turn out that we
can, after all, use a functor category, but justifying this assertion requires some
discussion of sheaves. The reader unfamiliar with sheaves could either consult
[12] for the necessary information or skip the following discussion of sheaves,
rejoining us at Proposition 8.
Each of the axioms
be exactly like A except that this one tuple a now satisfies P; similarly, let A− be
exactly like A except that a now satisfies P̄. Then the pair of homomorphisms
given by identity maps
A −→ A+ and A −→ A−
generates a sieve on A that is the pullback (in Mop ) of SP along the homomor-
phism x −→ A that sends ẋ to a. So this pair covers A.
This argument can be iterated, i.e., it can be applied to other tuples that
satisfy neither P nor P̄ in A (and therefore also in A±) as well as to other
negatable predicate symbols. Because, in a Grothendieck topology, covers of
covers are covers (and because both Υ and A are finite), we find that A is covered
(in the topology J on M^op) by homomorphisms (in M) from A to objects of
M in which, for every negatable predicate symbol P and for every tuple a of
the appropriate arity, either P(a) or P̄(a) holds. In other words, every object
is covered by homomorphisms to genuine Υ-structures.
In this situation, Grothendieck’s “Lemme de comparaison” [2, III.4.1] applies
and tells us that the topos of J-sheaves over Mop is equivalent to the topos of
sheaves on the full subcategory M∗op of genuine finite Υ -structures, with the
topology induced by J. Furthermore, it is easy to see that this induced topology
is trivial; any object is covered only by the sieve of all morphisms to it. Thus,
the category of sheaves reduces to the category of presheaves on M∗op, i.e., the
functor category Sets^{M∗}.
Notice also that, among objects in M∗, the homomorphisms as defined in M
are, in fact, homomorphisms of Υ-structures. That is, they preserve negatable
predicates P not only forward but also backward. This is simply because they
preserve P̄ forward.
We summarize the result of this sheaf discussion, adding some easily checked
information about the generic model.
Proposition 8. The classifying topos for Υ-structures is the functor category
Sets^{M∗}, where M∗ is the category of finite Υ-structures and homomorphisms.
The generic Υ -structure is the one whose constituents (interpretations of sorts
and predicates) are the functors that send each finite Υ -structure A to its corre-
sponding constituents in Sets.
With this description of the classifying topos for Υ -structures and the generic
structure, we are ready to prove the main result of this subsection.
The conclusion (1) of this theorem asserts that, whenever ϕ(x) is satisfied by
some elements of an Υ -structure, it is “because” those elements and finitely
many others satisfy some quantifier-free information α(x, y) that guarantees the
truth of ϕ(x). That is, we have finite determination as described in Sect. 2. And
the theorem says that this will happen for any ϕ(x) that is preserved by the
inverse-image parts of geometric morphisms.
In connection with the definition of Φ, note that α is required to be an Υ -
formula, so negation will be applied only to atomic formulas whose predicate
symbol is negatable.
Proof. The proof proceeds in three phases, each establishing the equivalence (1)
in certain circumstances.
In phase 1, we observe that (1) holds in all finite (ordinary, set-based) Υ -
structures. The right-to-left implication is immediate from the definition of Φ.
As for the left-to-right implication, consider any finite Υ -structure A and any
tuple a of elements in it such that A satisfies ϕ(a). Let b be a list (without
repetitions) of all the elements of the base sets of A. Note that this is a finite
list because, in phase 1, we are dealing with a finite structure A. Let α(x, y) be
the conjunction of all the atomic formulas and negated atomic Υ -formulas that
are satisfied in A by the tuple (a, b). There are only finitely many conjuncts
here, because Υ has no function symbols and only finitely many predicate and
constant symbols.
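The construction of α just described is itself a small algorithm: list the tuple (a, b), then record every atomic fact and, for negatable predicates only, every failed atomic fact. A minimal Python sketch (the encoding of structures as dictionaries and of formulas as strings is invented here purely for illustration):

```python
from itertools import product

def diagram(elems, predicates, negatable):
    """Conjuncts of the formula alpha for the listed tuple elems = (a, b).

    elems: the listed elements (position i is named by variable vi);
    predicates: dict mapping a predicate name to (arity, set of tuples
        of elements at which the predicate holds);
    negatable: names of predicates whose negations may appear.
    """
    n = len(elems)
    conjuncts = []
    # Equations recording repetitions among the listed elements.
    for i in range(n):
        for j in range(i + 1, n):
            if elems[i] == elems[j]:
                conjuncts.append(f"v{i} = v{j}")
    # Atomic facts; negated atomic facts for negatable predicates only.
    for name, (arity, holds) in predicates.items():
        for pos in product(range(n), repeat=arity):
            atom = f"{name}({', '.join('v%d' % p for p in pos)})"
            if tuple(elems[p] for p in pos) in holds:
                conjuncts.append(atom)
            elif name in negatable:
                conjuncts.append(f"~{atom}")
    return conjuncts
```

As the text notes, only finitely many conjuncts arise, because there are no function symbols and only finitely many predicate and constant symbols.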
I claim that α(x, y) ∈ Φ. Once this claim is proved, we will know that the
elements a satisfy in A the disjunct ∃y α(x, y) on the right side of (1), so the
proof for phase 1 will be complete.
To verify the claim, suppose our α(x, y) is satisfied by some tuple (a′, b′) in some Υ -structure B. It is easy to check that, by sending each element in the list b (i.e., each element of any of the base sets of A) to the corresponding element in b′, we define a homomorphism h : A −→ B that satisfies h(a) = a′. Indeed, the fact that (a′, b′) satisfies the equations in α(x, y) ensures that h(a) = a′, while satisfaction of the other conjuncts in α(x, y) is exactly what is needed to ensure that h is a homomorphism. Having already shown, in Subsection 3.1, that ϕ must be preserved by homomorphisms, we know that B satisfies ϕ(a′). This completes the verification of the claim and thus phase 1 of the proof.
In phase 2, we establish that the equivalence (1) is valid in the generic Υ -structure G in the classifying topos Sets^{M∗}. Note that, because (1) has no
free variables, its interpretation is a subobject of the empty product 1 (i.e., the
interpretation is a truth value); we shall show that this interpretation is all of
124 A. Blass
1 (i.e., the truth-value true). For this purpose, we shall apply the assumption
that ϕ(x) is preserved by the inverse-image functors of geometric morphisms.
The relevant inverse-image functors for this phase of the proof are the evaluation functors E_A^∗ , one for each object A of M∗ . These functors from Sets^{M∗} to Sets are defined by

E_A^∗ (X) = X(A)

for all objects X of Sets^{M∗}, i.e., all functors X : M∗ −→ Sets. It is well known [12] and easy to check that E_A^∗ is the inverse-image part of a geometric morphism; its right adjoint is the functor E_{A∗} : Sets −→ Sets^{M∗} defined by

E_{A∗}(S)(B) = set of functions to S from the set of homomorphisms B −→ A .

Applying the definition of E_A^∗ with the (tautological) constituents of the generic G in the role of X, we find that

E_A^∗ (G) = A

for all objects A of M∗ , i.e., for all finite Υ -structures A. We know that, like the inverse-image part of any geometric morphism, E_A^∗ preserves the interpretation
of ϕ(x). It also preserves the interpretation of all the α’s that occur in (1). In-
deed, interpretations of atomic formulas and their complements are preserved,
according to the definition of how inverse-image functors act on Υ -structures,
and conjunctions are preserved because inverse-image functors preserve finite
limits. Furthermore, the existential quantifiers and the (in general infinite) dis-
junction in (1) are preserved because inverse-image functors, having right ad-
joints, preserve all colimits, and so preserve images and joins (even infinite joins)
of subobjects.
Consider the interpretations in G (in the topos Sets^{M∗}) of the two sides of the biconditional in (1), ϕ(x) and the big disjunction. On the left we have ϕ(x)^G and on the right we have another subobject D of the relevant product ∏ Gs . For any finite (ordinary, set-based) Υ -structure A, the functor E_A^∗ sends these two subobjects to ϕ^A and the interpretation in A of the right side of the biconditional in (1). We have shown in phase 1 that these two subobjects are the same. Since this holds for every object A in M∗ , we have that ϕ(x)^G and D are two subobjects of ∏ Gs in the functor category Sets^{M∗} that have the same values at every A in M∗ . That makes them the same functor and thereby shows that (1) holds (i.e., has interpretation 1) in G. This completes phase 2 of the proof.
Finally, in phase 3, we prove the full conclusion of the theorem. Let A be an arbitrary Υ -structure in an arbitrary Grothendieck topos E. Because G is the generic Υ -structure, there is a geometric morphism f : E −→ Sets^{M∗} such that A = f^∗(G). As in phase 2, we have that f^∗ preserves the interpretations of both sides of the biconditional in (1). Since the two sides have, according to phase 2, the same interpretation in G (in Sets^{M∗}), it follows immediately that
they have the same interpretation in A (in E). That completes the proof of the
theorem.
Existential Fixed-Point Logic, Universal Quantifiers, and Topoi 125
Since these axioms are universal Horn sentences, the classifying topos is, ac-
cording to a result from [8] already used above, the topos of functors from the
category M of models (in the new sense, with partial functions) to Sets.
With this classifying topos, we can proceed as above to incorporate the re-
quirement that negatable predicate symbols be interpreted by complemented
objects, and the rest of the results of this subsection work as before.
of those tuples were already present; such a change risks invalidating earlier
positive answers to EFPL queries. Intuitively, this means that the database’s
information about these negatable predicates is complete, at least insofar as
the elements present in the database are concerned. In other words, we have a
closed-world assumption for these predicates: If the tuple a is available in the
database but doesn’t satisfy P there, then this means that a doesn’t satisfy P
in reality. (Contrast this with the situation for positive predicates, where P (a)
could fail in the database while it holds in reality, if the database simply lacked
this bit of information.)
Thus, our distinction between negatable and positive predicate symbols for-
malizes the distinction between predicates to which such a closed-world assump-
tion applies and others to which it does not apply.
Given this idea, it is natural to also consider another sort of closed-world
assumption, one which says that the database is aware of all the elements of
a certain sort; no additional elements can be added. (We formulated EFPL
in a multi-sorted framework in order to be able to impose this closed-world
assumption on only some sorts rather than on the entire database.) Such a
closed-world assumption is not formalized in EFPL; homomorphisms can lead
to new elements in any sorts. A closed-world assumption for a sort s should be
reflected formally in a requirement that homomorphisms be surjective on the
base sets of sort s. This restriction on the allowed homomorphisms would be
reflected in a liberalization of the language; with fewer homomorphisms, we can
expect them to preserve more formulas.
In fact, there is a very familiar way to extend the language so as to retain
preservation properties for surjective homomorphisms but not for others: Al-
low universal quantification. This leads to the following proposal for extending
EFPL.
A vocabulary should say which (if any) of its sorts are closed ; the others are
then called open. The syntax is extended by allowing universal quantification of
variables of closed sorts. The semantics is the obvious one, familiar from first-
order logic, in the case of set-based structures. The semantics in structures in
topoi is perhaps not obvious but it is well-known. As indicated earlier, higher-
order intuitionistic logic is interpreted in topoi [3,12], and that certainly includes
first-order universal quantification.
Of course, it is easy to propose a new logic, especially such a slight variation
of a known one. But does this extension preserve any of the nice properties of
EFPL? That is the topic of the next section.
5 Geometric Preservation
In this section, we consider the pleasant properties of EFPL listed in Sect. 2 and
analyze what happens to them when we introduce universal quantification over
some sorts, the closed ones.
Of course there is no problem when the pleasant property is one that says
the logic is rich enough for some purpose; we have only made it richer. Thus,
for example, we certainly retain the “richness” half of capturing PTime; every
PTime-computable property of structures with successor was expressible in EFPL
and therefore is expressible in the extension by ∀ on closed sorts. The other
half of capturing PTime, namely the availability of PTime model-checking for
each formula, could be lost by enlarging the logic, but it is not lost in the
present enlargement. The reason is that, even with ∀ adjoined, our logic is still
a fragment of first-order-plus-least-fixed-point logic, which captures PTime by
the Immerman-Vardi theorem.
The good behavior of EFPL in relation to Hoare logic is also quite safe under
the present extension. Inspection of the relevant arguments in [6,7] shows that
they depend only on the availability of the least-fixed-point construction, exis-
tential quantification, and some connectives, not on the unavailability of other
things like universal quantification.
The remaining four properties of EFPL listed in Sect. 2 can, however, be
lost when we introduce ∀ on closed sorts. In one case, preservation by homo-
morphisms, this loss is intentional. We introduced ∀ in order to match a more
restrictive notion of homomorphism, surjective on the closed sorts. With this
modified notion of homomorphism, this preservation property revives.
Finite determination is lost even for the simplest case, the formula ∀x P (x), if
the base set of the sort of x is infinite. The fact that all infinitely many elements
of this base set satisfy P is obviously not a consequence of information about any
finitely many of them. To revive finite determination, we would have to require
that closed sorts be interpreted by finite base sets.
If universal quantification is allowed over infinite sets then the iterations lead-
ing to least fixed points can continue for any ordinal number of steps. An example
was given at the end of [6] where, in an arbitrary wellordering, the elements are
added in order, one at a time in the iteration.
Finally, we consider preservation by inverse-image parts of geometric mor-
phisms of topoi. Here again, preservation can fail in general but will hold if the
interpretations of the closed sorts satisfy an appropriate restriction, related to
finiteness but considerably weaker.
Remark 11. How can it be weaker? We saw in Subsection 3.2 that geometric
preservation implies finite determination, and a moment ago we saw that fi-
nite determination trivially fails unless closed sorts are finite. Therefore mustn’t
geometric preservation also fail unless the closed sorts are finite?
The fallacy in this argument is that the proof in Subsection 3.2 used geometric preservation for topoi like the classifying topos Sets^{M∗} when proving finite
determination in other (arbitrary) topoi. It is entirely possible that a weaker
condition than finiteness, applied to the generic model in the classifying topos,
may imply finiteness elsewhere, for example in Sets.
In order to discuss the conditions under which universal quantification over a set
(the interpretation As of a sort s in an Υ -structure A) is preserved by the inverse-
image parts of geometric morphisms, we must first recall the topos-theoretic
interpretation of universal quantification. Consider a formula ϕ(x) of the form
∀y ψ(x, y). Let s be the sort of the universally quantified variable y, and let
construction of the classifying topos H and the generic object G. We shall use a
standard construction of classifying topoi, as in [12], with E as the base topos.
That is, we shall work in the internal logic of E, just as one would ordinarily
work in the “real world” of Sets.
Working in E requires some caution, because the internal logic of a topos
is intuitionistic. In our situation, one manifestation of intuitionistic logic will
be that we must be careful with the concept of finiteness. There are various
equivalent ways to define “finite” in ordinary set theory, but the proofs of their
equivalence use classical logic (and in some cases even the axiom of choice).
So the definitions are inequivalent intuitionistically. It turns out that the right
definition for our purposes, i.e., the definition that makes the usual construction
of classifying topoi work, is what is usually called K-finiteness, but we shall call it
finiteness because we have no need for any other version of finiteness. According
to this definition, a subset F of a set S is finite if it belongs to every family 𝒳 of subsets of S such that
– ∅ ∈ 𝒳 and
– X ∪ {s} ∈ 𝒳 for all X ∈ 𝒳 and all s ∈ S.
In more anthropomorphic terms, F is finite if it can be obtained by starting with
the empty set and repeatedly adjoining single elements of S.
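In a classical metatheory, this closure description can be made literal. The following Python sketch (assuming the ambient set S is itself given as a finite set, so the closure computation terminates) computes the least family containing ∅ and closed under adjoining single elements, and tests membership in it. Classically this just recovers the ordinary finite subsets of S; the distinctions the definition is designed for show up only intuitionistically.

```python
def k_finite(F, S):
    """Is F in the least family of subsets of S that contains the empty
    set and is closed under adjoining a single element of S?"""
    family = {frozenset()}
    while True:
        # Adjoin one element of S to each set already in the family.
        new = {X | {s} for X in family for s in S} - family
        if not new:
            return frozenset(F) in family
        family |= new
```

For example, subsets of S are recognized as finite, while a set containing an element outside S is never generated.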
Until further notice, the following discussion takes place in the internal logic
of E. Here L and M are sets, Y ⊆ L and X ⊆ M are subsets, and p : L −→ M is a function.
To begin the study of the classifying topos H, we first write, as a geometric
theory, what it should classify, namely subsets of M whose pre-image along p
is included in Y . Subsets of M amount to models of the theory consisting of
propositional variables m for all m ∈ M ; the truth value assigned by a model to
m tells to what extent m is in the corresponding subset of M . The requirement
that the pre-image of this subset be included in Y amounts to a geometric theory
that the corresponding model must satisfy, namely the theory whose axioms are,
for each l ∈ L,
p(l) =⇒ ⋁ { true : l ∈ Y } .
The (rather peculiar-looking) disjunction on the right is a disjunction of at most
one formula, namely the formula true; this formula is present if and only if l ∈ Y .
(If the logic were classical, there would be one disjunct, true, when l ∈ Y and none, so that the disjunction is false, when l ∉ Y. Intuitionistically, though,
those two cases need not be exhaustive.) This disjunction has, regardless of the
truth values assigned to the propositional variables, the same truth value as the
statement l ∈ Y . We have written it as a disjunction to fit the general format
of geometric theories and thus to enable us to apply the standard method for
building classifying topoi of geometric theories.
That general method produces a topos of sheaves as follows. Begin with the
partially ordered set Fin(M ) of finite subsets of M , ordered by reverse inclusion,
and make it into a category, still called Fin(M ), in the usual way: Objects are the
elements of Fin(M ) and there is a single morphism c −→ d if and only if c ⊇ d.
(This is the dual of the category of finitely presented models and homomorphisms
for the geometric theory consisting of all our propositional variables m but
none of our axioms.) Each element l of L determines a sieve Sl on the object
{p(l)} as follows. The sieve contains every morphism into {p(l)} if and only if l ∈ Y . (Classically, the sieve would be the trivial sieve of all morphisms into {p(l)} or the empty sieve, according to whether l ∈ Y or l ∉ Y ; but again,
intuitionistically, we do not know that these alternatives are exhaustive.) Note
the following unusual property of the sieves Sl : if Sl contains some morphism
into {p(l)}, then it contains all morphisms into {p(l)}. (It is tempting to say that Sl contains all or none of the morphisms into {p(l)}, but this formulation
presupposes classical logic and is intuitionistically too strong.)
Let J be the smallest Grothendieck topology on Fin(M ) that contains all these
sieves Sl . Then the classifying topos H is the topos of J-sheaves on Fin(M ). We
shall need the following explicit description of the topology J, or at least of the
sieves that cover a singleton {m} ∈ Fin(M ). Of course there are the sieves Sl
described above, for all l ∈ p−1 {m}, and all sieves that are supersets of these. But
there are more, because of the closure conditions on Grothendieck topologies.
Closure under pullbacks doesn’t yield any new covers for singletons (though it
does yield covers for larger finite subsets of M ). But new covers of {m} do arise
from closure under iteration, i.e., from the requirement that, if S covers {m} and
if T is a sieve on {m} whose pullback along every d −→ {m} in S covers d, then
T covers {m}. Starting with the sieves Sl and repeatedly using this iteration
closure, we find that J contains, for each m ∈ M , all the sieves described in the
following definition.
It may be helpful to write out explicitly the first two steps of this induction. S has rank ≤ 1 if and only if

∃l1 ∈ p−1 {m} (l1 ∈ Y =⇒ 1{m} ∈ S) .

That is, S includes a sieve of the form Sl . S has rank ≤ 2 if and only if

∃l1 ∈ p−1 {m} (l1 ∈ Y =⇒ ∃l2 ∈ p−1 {m} (l2 ∈ Y =⇒ 1{m} ∈ S)) .
Note that, if S has any rank n, then, just as in the case of the original Sl ’s, if S
contains some morphism into {m}, then it contains all such morphisms.
It is routine to verify that the J-covering sieves of {m} are just those that
have some rank.
With this description of J, we can begin to characterize the circumstances
under which the generic subset-of-M -with-preimage-in-Y G in H is included in
u∗ (X). The requirement is that, for each m ∈ M , the truth value of m ∈ G is
below the truth value of m ∈ u∗ (X). These truth values are the J-closures of the
corresponding truth values in the presheaf topos Sets^{Fin(M)^op}. Since J-closure is
an idempotent and monotone operation, this is the same as requiring the truth
value of m ∈ G in the presheaf topos to be below the J-closure of the truth
value, in the presheaf topos, of m ∈ u∗ (X).
In the presheaf topos, the truth value of m ∈ G is the sieve (on the terminal
object ∅) generated by the object {m} (or, more precisely, generated by the
morphism {m} −→ ∅). The truth value of m ∈ u∗ (X) is the sieve T (again on
∅) that contains each morphism to ∅ if and only if m ∈ X. So the requirement
that we want to analyze is that the morphism {m} −→ ∅ is in the J-closure
of T . That is equivalent to requiring 1{m} to be in the pullback to {m} of the
J-closure of T , and the latter is the J-closure of the pullback of T . The pullback
of T to {m} contains each morphism with codomain {m} if and only if m ∈ X.
Using this information and the description above of J-covering sieves, we can
express the requirement that we want to analyze as the disjunction of an infinite
sequence of statements ρn defined as follows.
For fixed m ∈ M let ρ0 be the statement m ∈ X and let ρn+1 be the statement
∃l ∈ p−1 {m} (l ∈ Y =⇒ ρn ) .
Again, it seems useful to exhibit explicitly ρ1 ,

∃l1 ∈ p−1 {m} (l1 ∈ Y =⇒ m ∈ X) ,

and ρ2 ,

∃l1 ∈ p−1 {m} (l1 ∈ Y =⇒ ∃l2 ∈ p−1 {m} (l2 ∈ Y =⇒ m ∈ X)) .
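In a classical metatheory, where the implication l ∈ Y =⇒ ρn can be unwound by cases, the statements ρn are straightforward to evaluate. The sketch below encodes m, X, Y, p, L as finite Python data (an illustration only; the intuitionistic content is exactly what such a truth-table evaluation discards):

```python
def rho(n, m, X, Y, p, L):
    """Classical evaluation of rho_n: rho_0 is 'm in X', and
    rho_{n+1} is 'there exists l in p^{-1}{m} with (l in Y => rho_n)'."""
    if n == 0:
        return m in X
    # Classically, (l in Y => rho_{n-1}) is (l not in Y) or rho_{n-1}.
    return any((l not in Y) or rho(n - 1, m, X, Y, p, L)
               for l in L if p[l] == m)
```

Note that, classically, as soon as some l ∈ p−1 {m} lies outside Y, already ρ1 is true regardless of X.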
Several comments are in order about this result. First, the formula (2) for n = 1
is logically valid in classical logic. The proof is to instantiate l1 as an element
of A − Y if one exists, and as an arbitrary element of A if Y = A. (This uses
that, in classical logic, one takes domains of discourse, like A, to be nonempty.
In an empty A, it is not the n = 1 case of (2) but the n = 0 case that is
valid.) We conclude that, if E satisfies classical logic, i.e., if it is a Boolean topos,
then universal quantification is geometrically preserved. This is, of course, no
news, because in classical logic one can express universal quantification using
existential quantification and complementation, both of which are geometrically
preserved. (Negation is not in general geometrically preserved, but when a com-
plement exists, that will be preserved.)
It may be worth noting that validity of the n = 1 case of (2) (for inhabited A)
embodies the full strength of classical logic, i.e., it implies the law of the excluded
middle. To see this, let ϕ be an arbitrary statement, and consider the set A that
contains 0 (definitely) and contains 1 if and only if ϕ. So A is inhabited (by 0).
Let Y be the subset of A that contains 0 if and only if ϕ (and definitely does
not contain 1). Then ∀x ∈ A x ∈ Y is false. (It implies 0 ∈ Y , hence ϕ, hence (as
now 1 ∈ A) 1 ∈ Y , and hence a contradiction.) So the n = 1 case of (2) implies
that ∃l ∈ A ¬l ∈ Y . By definition of A, such an l must be 0 or 1. If it is 0 then
the definition of Y gives ¬ϕ. If it is 1, then the definition of A gives ϕ. So in
both cases, we have ϕ ∨ ¬ϕ.
It is tempting to rewrite the nested quantifications and implications in (2) as a single quantification over all n of the li ’s at once, i.e.,

∃l1 , . . . , ln ∈ A ( (l1 ∈ Y ∧ · · · ∧ ln ∈ Y ) =⇒ ∀x ∈ A x ∈ Y ) .
Unfortunately, this simplification works only in classical logic. The point is that,
in the correct, nested formulation (2), l2 need only exist (in A) to the extent that
l1 ∈ Y , whereas in the proposed simplification l2 must exist outright. Classically,
this doesn’t matter; if we have a good value for l2 when l1 ∈ Y , then we can
simply give l2 the same value as l1 when l1 ∉ Y . But intuitionistically we don’t know that l1 ∈ Y ∨ l1 ∉ Y , so this is not an adequate specification of a value for
l2 . To see the problem in a concrete case, consider the same A = {0} ∪ {1 : ϕ}
and Y = {0 : ϕ} as above. Notice that the n = 2 case of (2) is satisfied. (Take
l1 = 0 and, if l1 ∈ Y , i.e., if ϕ, then take l2 = 1, which is legal because when
ϕ then 1 ∈ A.) But the proposed simplification can hold (for any n) only if the
logic is classical. (Proof: Each li has to be 0 or 1. If at least one of them is 1,
then 1 ∈ A and so ϕ holds. If all of them are 0, then the implications say that
the consequent (∀x ∈ A) x ∈ Y , which is false, follows from (n repetitions of)
0 ∈ Y , i.e., from ϕ. So we get ϕ ∨ ¬ϕ.)
Let us revisit Subsection 3.1, where we showed that geometric preservation
implies preservation along homomorphisms, and let us try to apply the same
argument in the new context where universal quantification is allowed over cer-
tain sorts, the closed sorts. The argument used geometric preservation in the
case of geometric morphisms to the Sierpiński topos S, so it applies in the new
References
1. Apt, K.: Ten Years of Hoare’s Logic: A Survey – Part I. ACM Trans. Prog. Lang.
and Systems 3, 431–483 (1981)
2. Artin, M., Grothendieck, A., Verdier, J.-L.: Théorie des Topos et Cohomologie Étale des Schémas. Séminaire de Géométrie Algébrique du Bois Marie 1963–64 (SGA 4), vol. 1. Lecture Notes in Mathematics, vol. 269. Springer, Heidelberg (1972)
3. Bell, J.: Toposes and Local Set Theories. Oxford Logic Guides, vol. 14. Oxford
University Press, Oxford (1988)
4. Blass, A.: Topoi and Computation. Bull. European Assoc. Theoret. Comp. Sci. 36,
57–65 (1988)
5. Blass, A.: Geometric Invariance of Existential Fixed-Point Logic. In: Gray, J., Sce-
drov, A. (eds.) Categories in Computer Science and Logic. Contemp. Math., vol. 92,
pp. 9–22. Amer. Math. Soc., Providence (1989)
6. Blass, A., Gurevich, Y.: Existential Fixed-Point Logic. In: Börger, E. (ed.) Compu-
tation Theory and Logic. LNCS, vol. 270, pp. 20–36. Springer, Heidelberg (1987)
7. Blass, A., Gurevich, Y.: The Underlying Logic of Hoare Logic. Bull. European
Assoc. Theoret. Comp. Sci. 70, 82–110 (2000); Reprinted in Paun, G., Rozenberg,
G., Salomaa, A.: Current Trends in Theoretical Computer Science: Entering the
21st Century, pp. 409–436. World Scientific, Singapore (2001)
8. Blass, A., Ščedrov, A. (later simplified to Scedrov): Classifying Topoi and Finite
Forcing. J. Pure Appl. Algebra 28, 111–140 (1983)
9. Chandra, A., Harel, D.: Horn Clause Queries and Generalizations. J. Logic Pro-
gramming 2, 1–15 (1985)
10. Cook, S.: Soundness and Completeness of an Axiom System for Program Verifica-
tion. SIAM J. Computing 7, 70–90 (1978)
11. Immerman, N.: Relational Queries Computable in Polynomial Time. Information
and Control 68, 86–104 (1986); Preliminary version in 14th ACM Symp. on Theory
of Computation (STOC), pp. 147–152 (1982)
12. Johnstone, P.: Topos Theory. London Mathematical Society Monographs, vol. 10.
Academic Press, London (1977)
13. Vardi, M.: Complexity of Relational Query Languages. In: 14th ACM Symp. on
Theory of Computation (STOC), pp. 137–146 (1982)
Three Paths to Effectiveness

Udi Boker and Nachum Dershowitz
Abstract. Over the past two decades, Gurevich and his colleagues
have developed axiomatic foundations for the notion of algorithm,
be it classical, interactive, or parallel, and formalized them in a new
framework of abstract state machines. Recently, this approach was
extended to suggest axiomatic foundations for the notion of effective
computation over arbitrary countable domains. This was accomplished
in three different ways, leading to three, seemingly disparate, notions of
effectiveness. We show that, though having taken different routes, they
all actually lead to precisely the same concept. With this concept of
effectiveness, we establish that there is – up to isomorphism – exactly
one maximal effective model across all countable domains.
1 Introduction
Church’s Thesis asserts that the recursive functions are the only numeric func-
tions that can be effectively computed. Similarly, Turing’s Thesis stakes the
claim that any function on strings that can be mechanically computed can be
computed, in particular, by a Turing machine. For models of computation that
operate over arbitrary data structures, however, these two standard notions of
what constitutes effectiveness may not be directly applicable; as Richard Mon-
tague asserts [9, pp. 430–431]:
Now Turing’s notion of computability applies directly only to functions
on and to the set of natural numbers. Even its extension to functions
defined on (and with values in) another denumerable set S cannot be ac-
complished in a completely unobjectionable way. One would be inclined
to choose a one-to-one correspondence between S and the set of natural
numbers, and to call a function f on S computable if the function of
Supported in part by a Lady Davis postdoctoral fellowship.
Supported in part by the Israel Science Foundation (grant no. 250/05).
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 135–146, 2010.
© Springer-Verlag Berlin Heidelberg 2010
136 U. Boker and N. Dershowitz
One may ask, for example: What are the computable functions over the alge-
braic numbers? Does one obtain different sets of computable functions depending
on which representation (“correspondence”) one chooses for them?
Before we can answer such questions, we need a most-general notion of al-
gorithm. Sequential algorithms – that is, deterministic algorithms without un-
bounded parallelism or (intra-step) interaction with the outside world – have
been analyzed and formalized by Gurevich in [6]. There it was proved that any
algorithm satisfying three natural formal postulates (given below) can be emu-
lated, step by step, by a program in a very general model of computation, called
“abstract state machines” (ASMs). This formalization was recently extended in
[1] to handle partial functions. But an algorithm, or abstract state machine pro-
gram, need not yield an effective function. Gaussian elimination, for example, is
a perfectly well-defined algorithm over the real numbers, even though the reals
cannot all be effectively represented and manipulated.
We adopt the necessary point of view that effectiveness is a notion applicable
to collections of functions, rather than to single functions (cf. [10]). A single
function over an arbitrary domain cannot be classified as effective or ineffective
[9,14], since its effectiveness depends on the context. A detailed discussion of
this issue can be found in [3].
To capture what it is that makes a sequential algorithm mechanically com-
putable, three different generic formalizations of effectiveness have recently been
suggested:
2 Algorithms
Here, x =T y, for a set of terms T , means that every term t ∈ T has the same value in x as in y.
Whenever we refer to an “algorithm” below, we mean an object satisfying the
above three postulates, what we like to call a “classical algorithm”.
4. There is a term t (in the vocabulary of the algorithm) such that for all a1 , . . . , ak , c ∈ D, if f (a1 , . . . , ak ) = c, then there is some initial state x0 ∈ I, with the j-th input of x0 equal to aj (j = 1, . . . , k), initiating a terminating computation x0 ;τ · · · ;τ xn , where xn ∈ O and such that the value of t in xn is c.
5. Whenever f (a1 , . . . , ak ) is ⊥, there is an initial state x0 ∈ I, with the j-th input of x0 equal to aj (j = 1, . . . , k), initiating an infinite computation x0 ;τ x1 ;τ · · ·.
A (finite or infinite) set of algorithms, all with the same domain, will be called
a model (of computation).
3 Effective Models
We turn now to examine the three different approaches to understanding effec-
tiveness. Informally, they each add a postulate along the following lines:
Postulate IV (Effective Initial State). The initial states S0 of an effective
algorithm are finitely representable.
s ≈x t ⇔ s and t have the same value in the state x ,
where f (cn ) is the term obtained by enclosing the term cn with the symbol f .
These numerical f̂ track their original counterparts f over D, as follows:

f̂(ρ(x)) = f̂(minj {cj = x})
        = mini {ci ≈ f (ck )} , where k = minj {cj = x}
        = mini {ci = f (ck )}
        = mini {ci = f (x)}
        = ρ(f (x)) .
Similarly for operators of other arities.
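The calculation above can be replayed concretely: fix an enumeration c0, c1, . . . of the domain, let ρ(x) be the least index whose entry is x, and define the numerical counterpart of f by f̂(i) = ρ(f(ci)). A small Python sketch (finite enumeration and example data invented for illustration):

```python
def make_tracking(c, f):
    """c: a list enumerating the domain (repetitions allowed);
    f: a unary operation on the domain.
    Returns rho (element -> least index) and the numerical f_hat."""
    def rho(x):
        # Least index j with c_j = x, i.e. min_j {c_j = x}.
        return min(j for j, cj in enumerate(c) if cj == x)
    def f_hat(i):
        # Apply f to the i-th entry, then renumber via rho.
        return rho(f(c[i]))
    return rho, f_hat
```

One can then check that f_hat(rho(x)) == rho(f(x)) for every domain element x, which is exactly the displayed chain of equalities.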
Three Paths to Effectiveness 143
These numerical ĝ track their original counterparts g in B, as follows:

ĝ(ρ(x)) = ĝ(minj {cj = x})
        = mini {ci ≈ g(ck )} , where k = minj {cj = x}
        = mini {ci = g(ck )}
        = mini {ci = g(x)}
        = ρ(g(x)) .
Proof. For any computable model M over domain D, there is, by Lemma 1,
a bijection π : D ↔ N such that every function f in the initial states of
M ’s algorithms is tracked under π by some partial recursive function g. By
Theorem 1, there is a constructive model that computes all the partial recur-
sive functions P over N. Since algorithms (according to Postulate II) are closed
under isomorphism, so are constructive models. Hence, there is a constructive model P′ over π −1 (N), with some set of constructors, that computes all functions π ◦ g ◦ π −1 that are tracked by functions g ∈ P, and – in particular – computes all initial functions of M . Since all of M ’s initial functions are thus constructive, M itself is constructive.
5 Conclusions
Thanks to Theorem 3, it seems reasonable to just speak of “effectiveness”, with-
out distinguishing between the three equivalent notions discussed in the previous
sections. Having shown that three prima facie distinct definitions of effectiveness
over arbitrary domains comprise exactly the same functions strengthens the im-
pression that the essence of the underlying notion of computability has in fact
been captured.
Fixing the concept of an effective model of computation, the question natu-
rally arises as to whether there are “maximal” effective models, and if so, whether
they are really different or basically one and the same. Formally, we consider an
effective computational model M (consisting of a set of functions) over domain
D to be maximal if adding any function f ∉ M over D to M gives an ineffective model M ∪ {f }. It turns out that there is exactly one maximal effective model (regardless of which of the three definitions one prefers), up to isomorphism.
Theorem 4. The set of partial recursive functions is the unique maximal effec-
tive model, up to isomorphism, over any countable domain.
Proof. We first note that the partial recursive functions are a maximal effective
model. Their effectiveness was established in Theorem 1. As for their maximality,
the partial recursive functions are “interpretation-complete”, in the sense that
they cannot simulate a more inclusive model, as shown in [2,4]. By Theorem 2,
they can simulate every effective model, leading to the conclusion that there is
no effective model more inclusive than the partial recursive functions.
Next, we show that the partial recursive functions are the unique maximal effective model, up to isomorphism. Consider some maximal effective model M over domain D. By Theorem 2, the partial recursive functions can simulate M via a bijection π. Since effectiveness is closed under isomorphism, it follows that there is an effective model M′ over D isomorphic to the partial recursive functions via π −1 . Hence, M′ contains M , and by the maximality of M we get that M = M′ . Therefore, M is isomorphic to the partial recursive functions, as claimed.
Three Paths to Effectiveness 145
References
1. Blass, A., Dershowitz, N., Gurevich, Y.: Exact exploration. Technical
Report MSR-TR-2009-99, Microsoft Research, Redmond, WA (2010),
http://research.microsoft.com/pubs/101597/Partial.pdf; A short ver-
sion to appear as Algorithms in a world without full equality. In: The Proceedings
of the 19th EACSL Annual Conference on Computer Science Logic, Brno, Czech
Republic. LNCS. Springer, Heidelberg (August 2010)
2. Boker, U., Dershowitz, N.: Comparing computational power. Logic Journal of the
IGPL 14, 633–648 (2006)
3. Boker, U., Dershowitz, N.: The Church-Turing thesis over arbitrary domains. In:
Avron, A., Dershowitz, N., Rabinovich, A. (eds.) Pillars of Computer Science.
LNCS, vol. 4800, pp. 199–229. Springer, Heidelberg (2008)
4. Boker, U., Dershowitz, N.: The influence of domain interpretations on computational models. Applied Mathematics and Computation 215, 1323–1339 (2009)
5. Dershowitz, N., Gurevich, Y.: A natural axiomatization of computability and proof
of Church’s Thesis. Bulletin of Symbolic Logic 14, 299–350 (2008)
6. Gurevich, Y.: Sequential abstract state machines capture sequential algorithms.
ACM Transactions on Computational Logic 1, 77–111 (2000)
7. Lambert Jr., W.M.: A notion of effectiveness in arbitrary structures. The Journal
of Symbolic Logic 33, 577–602 (1968)
8. Mal’tsev, A.: Constructive algebras I. Russian Mathematical Surveys 16, 77–129
(1961)
146 U. Boker and N. Dershowitz
1 Introduction
Finite automata on infinite objects were first introduced in the 1960s, and were the key to the solution of several fundamental decision problems in mathematics and logic [5,15,20]. Today, automata on infinite objects are used for the specification, verification, and synthesis of nonterminating systems. The automata-theoretic
approach to verification views questions about systems and their specifications
as questions about languages, and reduces them to automata-theoretic problems
like containment and emptiness [13,26]. Industrial-strength property-specification languages such as Sugar, ForSpec, and the recent standard PSL 1.01
include regular expressions and/or automata, making specification and verifica-
tion tools that are based on automata even more essential and popular [1].
Early automata-based algorithms aimed at showing decidability. The appli-
cation of automata theory in practice has led to extensive research on the com-
plexity of problems and constructions involving automata [6,19,22,24,25,27]. For
many problems and constructions, our community was able to come up with
satisfactory solutions, in the sense that the upper bound (the complexity of the
best algorithm or the blow-up in the best known construction) coincides with the
lower bound (the complexity class in which the problem is hard, or the blow-
up that is known to be unavoidable). For some problems and constructions,
however, the gap between the upper bound and the lower bound is significant.
This situation is especially frustrating, as it implies that not only something is
Supported in part by a Lady Davis postdoctoral fellowship.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 147–164, 2010.
c Springer-Verlag Berlin Heidelberg 2010
148 U. Boker and O. Kupferman
1
In Büchi automata, some of the states are designated as accepting states, and a run
is accepting iff it visits states from the accepting set infinitely often [5]. Dually, in
co-Büchi automata, a run is accepting iff it visits the set of accepting states only
finitely often.
2
When applied to universal Büchi automata, the translation in [16] of alternating Büchi automata into NBW results in DBW. By dualizing it, one gets a translation of NCW to DCW.
The Quest for a Tight Translation of Büchi to co-Büchi Automata 149
The main challenge in proving a non-trivial lower bound for the translation of
NBW to NCW is the expressiveness superiority of NBW with respect to NCW.
Indeed, a family of languages that is a candidate for proving a lower bound for
this translation has to strike a delicate balance: the languages have to somehow
take advantage of the Büchi acceptance condition, and still be recognizable by a
co-Büchi automaton.3 In particular, it is not clear how to use the main feature of
the Büchi condition, namely its ability to easily track infinitely many occurrences
of an event, as a co-Büchi automaton cannot recognize languages that are based
on such a tracking.
Beyond the theoretical challenge in tightening the gaps, and the fact that they are related to other gaps in our knowledge [9], the translation of NBW to NCW has immediate important applications in formal methods. The premier example is symbolic LTL model checking. Evaluating specifications in the alternation-free μ-calculus (AFMC) can be done with linearly many symbolic steps. In contrast, direct LTL model checking reduces to a search for bad cycles, whose symbolic implementation involves nested fixed-points and is typically4 quadratic [21]. It is shown in [12] that, given an LTL formula ψ, there is an AFMC formula equivalent to ∀ψ iff ψ can be recognized by a DBW. Alternatively, an
NCW for ¬ψ can be linearly translated to an AFMC formula equivalent to ∃¬ψ,
which can be negated to a formula equivalent to ∀ψ. Thus, an improvement of
the translation of NBW to NCW would immediately imply an improvement of
the translation of LTL to AFMC.
We describe the quest for a 2Θ(n) tight bound for the translation. On the upper-
bound front, we describe the construction in [3], which translates an NBW B
to an NCW C whose underlying structure is the product of B with its subset
construction. Thus, given an NBW B with n states, the translation yields an
equivalent NCW with n2n states, and it has a simple symbolic implementation
[17]. On the lower-bound front, we first describe the counterexample given in [11]
to the NCW-typeness of NBW. We then describe the “circumventing counting”
idea, according to which the ability of NBWs to easily track infinitely many
occurrences of an event makes them more succinct than NCWs. The idea is to
consider a family of languages L1 , L2 , L3 , . . . in which an NCW for Lk has to
count to some bound that depends on k, whereas an NBW can count instead
to infinity. In the first application of the idea, the NBW for the language Lk
checks that an event P occurs infinitely often. The language Lk is still NCW-
recognizable as other components of Lk make it possible to check instead that
P has at least k occurrences. An NCW for Lk can then count occurrences of
P , but it needs O(k) more states for this [2]. In order to achieve a super-linear
3
A general technique for proving lower bounds on the size of automata on infinite
words is suggested in [28]. The technique is based on full automata, in which a word
accepted by the automaton induces a language. The fact that NCWs are less expressive
than NBWs is a killer for the technique, as full automata cannot be translated to
NCWs.
4
Better algorithms have been suggested [7,21], but it turns out that algorithms based
on nested fixed-points perform better in practice.
succinctness, we enhance the idea as follows. The NBW for the language Lk
still checks that an event P occurs infinitely often. Now, however, in order for
Lk to be NCW-recognizable, other components of Lk make it possible to check
instead that P repeats at least once in every interval of some bounded length
f (k). Thus, while the NBW can detect infinitely many occurrences of P with
2 states, the NCW has to devote O(f (k)) states for the counting. We first use
ideas from number theory in order to make f (k) quadratic in k, and then use
binary encoding in order to make f (k) exponential in k [3].
2 Preliminaries
better conveys the intuition that, as with the Büchi condition, a visit in α is
a “good event”. An automaton accepts a word iff it has an accepting run on
it. The language of an automaton A, denoted L(A), is the set of words that A
accepts. We also say that A recognizes the language L(A). For two automata A
and A′ , we say that A and A′ are equivalent if L(A) = L(A′ ).
We denote the different classes of automata by three letter acronyms in
{D, N} × {B, C} × {W}. The first letter stands for the branching mode of the
automaton (deterministic or nondeterministic); the second letter stands for the
acceptance-condition type (Büchi, or co-Büchi); the third letter indicates that
the automaton runs on words. We say that a language L is in a class γ if L is
γ-recognizable, that is, L can be recognized by an automaton in the class γ.
Different classes of automata have different expressive power. In particular,
while NBWs recognize all ω-regular languages [15], DBWs are strictly less expressive than NBWs, and so are DCWs [14]. In fact, a language L is in DBW
iff its complement is in DCW. Indeed, by viewing a DBW as a DCW, we get an automaton for the complementary language, and vice versa. The expressiveness
superiority of the nondeterministic model over the deterministic one does not ap-
ply to the co-Büchi acceptance condition. There, every NCW has an equivalent
DCW [16].
3 Upper Bound
In this section we present the upper-bound proof from [3] for the translation of
NBW to NCW (when possible).5 The proof is constructive: given an NBW B
with k states whose language is NCW-recognizable, we construct an equivalent
NCW C with at most k2k states. The underlying structure of C is very simple: it
runs B in parallel to its subset construction. We refer to the construction as the
augmented subset construction, and we describe the rationale behind it below.
Consider an NBW B with set αB of accepting states. The subset construction
of B maintains, in each state, all the possible states that B can be at. Thus, the
subset construction gives us full information about B’s potential to visit αB in
the future. However, the subset construction loses information about the past.
In particular, we cannot know whether fulfilling B’s potential requires us to give
up past visits in αB . For that reason, the subset construction is adequate for
determinizing automata on finite words, but not good enough for determinizing
ω-automata. A naive attempt to determinize B could be to build its subset construction and define the acceptance set as all the states for which B has the potential to be in αB . The problem is that a word might infinitely often gain this potential
via different runs. Were we only able to guarantee that the run of the subset
construction follows a single run of the original automaton, we would have en-
sured a correct construction. Well, this is exactly what the augmented subset
construction does!
5
For readers who skipped the preliminaries, let us mention that we work here with a
less standard definition of the co-Büchi condition, where a run r satisfies a co-Büchi
condition α iff inf (r) ⊆ α.
For automata on finite words, if two states of the automaton have the same
language, they can be merged without changing the language of the automaton.
While this is not the case for automata on infinite words, the lemma below
enables us to take advantage of such states.
Our next observation is the key to the definition of the acceptance condition
in the augmented subset construction. Intuitively, it shows that if an NCW
language L is indifferent to a prefix in (u + v)∗ , and L contains the language
(v ∗ · u+ )ω , then L must also contain the word v ω .
We can now present the construction together with its acceptance condition.
Theorem 1 ([3]). For every NBW B with k states that is co-Büchi recognizable there is an equivalent NCW C with at most k2k states.
– C = B × 2B . That is, the states of C are all the pairs ⟨b, E⟩ where b ∈ B and E ⊆ B.
– For all ⟨b, E⟩ ∈ C and σ ∈ Σ, we have δC (⟨b, E⟩, σ) = δB (b, σ) × {δB (E, σ)}. That is, C nondeterministically follows B on its B-component and deterministically follows the subset construction of B on its 2B -component.
– C0 = B0 × {B0 }.
– A state is a member of αC if it is reachable from itself along a path whose projection on B visits αB . Formally, ⟨b, E⟩ ∈ αC if there is a state ⟨b′ , E′ ⟩ ∈ αB × 2B and finite words y1 and y2 such that ⟨b′ , E′ ⟩ ∈ δC (⟨b, E⟩, y1 ) and ⟨b, E⟩ ∈ δC (⟨b′ , E′ ⟩, y2 ). We refer to y1 · y2 as the witness for ⟨b, E⟩. Note that all the states in αB × 2B are members of αC with an empty witness.
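The augmented subset construction is concrete enough to prototype. The sketch below is illustrative (the NBW encoding as transition dictionaries and all names are our assumptions, not notation from [3]): it builds the reachable part of B × 2B and marks a product state accepting iff it lies on a (possibly empty) cycle through αB × 2B, matching the witness condition above.

```python
from itertools import chain

def augmented_subset_construction(states, alphabet, delta, initial, accepting):
    """Translate an NBW (assumed to be NCW-recognizable) into an NCW whose
    underlying structure is the product of the NBW with its subset
    construction. `delta` maps (state, letter) to a set of successors."""
    def dsucc(E, a):  # deterministic successor of a subset of states
        return frozenset(chain.from_iterable(delta.get((q, a), ()) for q in E))

    # Build the reachable part of C = B x 2^B.
    init_E = frozenset(initial)
    prod_states, prod_delta = set(), {}
    frontier = {(q, init_E) for q in initial}
    while frontier:
        s = frontier.pop()
        if s in prod_states:
            continue
        prod_states.add(s)
        b, E = s
        for a in alphabet:
            succs = {(b2, dsucc(E, a)) for b2 in delta.get((b, a), ())}
            prod_delta[(s, a)] = succs
            frontier |= succs - prod_states

    def reach(s):  # product states reachable from s (reflexively)
        seen, stack = {s}, [s]
        while stack:
            t = stack.pop()
            for a in alphabet:
                for u in prod_delta.get((t, a), ()):
                    if u not in seen:
                        seen.add(u)
                        stack.append(u)
        return seen

    reach_from = {s: reach(s) for s in prod_states}
    # <b,E> is in alpha_C iff some state of alpha_B x 2^B lies on a
    # cycle through <b,E> (an empty witness is allowed).
    alpha_C = {s for s in prod_states
               if any(t[0] in accepting and s in reach_from[t]
                      for t in reach_from[s])}
    prod_init = {(q, init_E) for q in initial}
    return prod_states, prod_delta, prod_init, alpha_C
```

For the two-state NBW for a∗ · b · (a + b)∗, the construction yields just two reachable product states, of which only the one with an accepting B-component is in αC.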
on u that visits αB and goes back to b. Recall also that for the word p1 , we have that r′ (p1 ) = ⟨b, E⟩ and η(p1 ) = d0 . Hence, p1 · w ∈ L(B). Since L(B) = L(D), we have that p1 · w ∈ L(D). Therefore, w ∈ L(Dd0 ).
Thus, by Corollary 1, for all i ≥ 1 we have that tiω ∈ L(Dd0 ). Since δD (d0 , ti ) = d0 , it follows that all the states that D visits when it reads ti from d0 are in αD .
Note that w = p1 · t1 · t2 · · · . Hence, since δD (p1 ) = d0 , the run of D on w is
accepting, thus w ∈ L(D). Since L(D) = L(B), it follows that w ∈ L(B), and we
are done.
4 Lower Bound
In this section we describe the “circumventing counting” idea and how it has
led to a matching lower bound. In the deterministic setting, DBWs are co-Büchi
type. Thus, if a DBW A is DCW-recognizable, then there is a DCW equivalent to A that agrees with A on its structure (that is, one only has to modify the acceptance condition). The conjecture that NBWs are also co-Büchi type was refuted only in [11]:
Theorem 2 ([11]). NBWs are not co-Büchi type.
Proof. Consider the NBW A described in Fig. 1. The NBW recognizes the language a∗ · b · (a + b)∗ (at least one b). This language is in NCW, yet it is easy to see that there is no NCW recognizing it on the same structure.
[Fig. 1: the NBW A, over states q0 , q1 , q2 , q3 .]
The result in [11] shows that there are NBWs that are NCW-recognizable and
yet an NCW for them requires a structure that is different from the one of the
given NBW. It does not show, however, that the NCW needs to have more states.
In particular, the language of the NBW in Fig. 1 can be recognized by an NCW
with two states.
Since an automaton recognizing Lk must accept every word in which there are at least k a's and k b's, regardless of how the letters are ordered, it may appear as if the automaton must have two k-counters operating in parallel, which requires O(k 2 ) states. This would indeed be the case if a and b were not the only letters in Σ, or if the automaton were deterministic or ran on finite words.
However, since we are interested in nondeterministic automata on infinite words,
and a and b are the only letters in Σ, we can do much better. Since Σ contains
only the letters a and b, one of these letters must appear infinitely often in every
word in Σ ω . Hence, w ∈ Lk iff w has at least k b’s and infinitely many a’s,
or at least k a’s and infinitely many b’s. An NBW can guess which of the two
cases above holds, and proceed to validate its guess (if w has infinitely many
a’s as well as b’s, both guesses would succeed). The validation of each of these
guesses requires only one k-counter, and a gadget with two states for verifying
that there are infinitely many occurrences of the guessed letter. Implementing
this idea results in the NBW with 2k + 1 states appearing in Fig. 2.
The reason we were able to come up with a small NBW for Lk is that NBWs
can abstract precise counting by “counting to infinity” with two states. The fact
that NCWs do not share this ability [14] is what ultimately allows us to prove
that NBW are more succinct than NCW. As it turns out, however, even an NCW
for Lk can do much better than maintaining two k-counters with O(k 2 ) states.
To see how, note that a word w is in Lk iff w has at least k b’s after the first k a’s
(this characterizes words in Lk with infinitely many b’s), or a finite number of
b’s that is not smaller than k (this characterizes words in Lk with finitely many
b’s). Obviously the roles of a and b can also be reversed. Implementing this idea
results in the NCW with 3k + 1 states described in Fig. 3. As detailed in [2], up
to one state this is indeed the best one can do. Thus, the family of languages
L1 , L2 , . . . implies that translating an NBW with 2k + 1 states may result in
an NCW with at least 3k states, hence the non-trivial, but still linear, lower
bound.
[Figure: the automaton Bk , over states s0 , s1 , . . . , sk+2 .]
Assume first that w ∈ Lk . Then, w either has infinitely many a's, starts with br · a, or has a subword of the form a · br · a, for r ∈ Sk . In the first case, w is accepted by Bk , since the automaton's transition function is total and an a-transition always goes to an accepting state. Now, assume that w has a subword of the form a · br · a, starting at a position t, for r ∈ Sk . Then, as argued above,
6
In general, for a finite set of positive integers {n1 , n2 , . . . , nl }, all integers above max{n1 , n2 , . . . , nl }2 can be written as linear combinations of the ni 's iff the greatest common divisor of the ni 's is 1. For our purpose, it is sufficient to restrict attention to linear combinations of two consecutive integers.
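For two consecutive integers, the footnote's claim is easy to verify by brute force. The helper below is an illustrative sketch (not from the paper): it checks whether m is a nonnegative combination of n and n + 1.

```python
def representable(m, n):
    """True iff m = x*n + y*(n+1) for some nonnegative integers x, y
    (brute force over y; illustrates the footnote for consecutive n, n+1)."""
    return any((m - y * (n + 1)) >= 0 and (m - y * (n + 1)) % n == 0
               for y in range(m // (n + 1) + 1))

# Since gcd(n, n+1) = 1, every integer above max{n, n+1}^2 is representable;
# the largest non-representable integer is n*(n+1) - n - (n+1) = n*n - n - 1.
```

For n = 4, for instance, 11 = 4·5 − 4 − 5 is not representable, while every integer from 12 on is.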
Next, we show that while Lk can be recognized by an NCW, every NCW recognizing Lk cannot take advantage of its nondeterminism. Formally, we present a DCW (Fig. 5) for Lk that has k 2 − k + 2 states, and prove that an NCW recognizing Lk needs at least that many states. For simplicity, we show that the NCW must count up to th(k), resulting in at least k 2 − k states, and do not consider the two additional states of the DCW.
Theorem 5. For every k ≥ 1, the language Lk can be recognized by a DCW
with k 2 − k + 2 states, and cannot be recognized by an NCW with fewer than
k 2 − k states.
Proof. Consider the DCW Dk , appearing in Fig. 5. In the figure, a state si has an a-transition to the state sth(k)+2 if and only if i ∈ Sk . We leave to the reader the easy task of verifying that L(Dk ) = Lk .
We now turn to prove the lower bound. Assume by way of contradiction that there is an NCW Ck with at most k 2 − k − 1 states that recognizes Lk . The word w = (bk 2 −k−1 · a)ω belongs to Lk since it has infinitely many a's. Thus, there is an accepting run r of Ck on w. Let t be a position such that rt′ ∈ α for all t′ ≥ t.
[Fig. 5: the DCW Dk , over states s0 , s1 , . . . , sth(k)+2 .]
Lemma 4. For every k ≥ 1, the language Lk can be recognized by an NBW with
O(k) states and by an NCW with O(k) states.
Proof. We show that there is an NFW with O(k) states recognizing Sk ∪Ik ∪{1k }.
Completing the NFW to an NBW or an NCW for Lk is straightforward. It is
easy to construct NFWs with O(k) states for Sk and for {1k }. An NFW with
O(k) states for Ik is fairly standard too (see, for example, [10]). The idea is that if v′ is the successor of v in a binary k-bit cyclic counter, then v′ can be obtained from v by flipping the bits of the 0 · 1∗ suffix of v, and leaving all other bits unchanged (the only case in which v does not have a suffix in 0 · 1∗ is when v ∈ 1∗ , in which case all bits are flipped). For example, the successor of 1001 is obtained by flipping the bits of the suffix 01, which results in 1010. Accordingly, there is an improper-increase error in v · $ · v′ if there is at least one bit of v that does not respect the above rule. An NFW can guess the location of this bit and reveal the error by checking the bit located k + 1 positions after it, along with the bits read in the suffix of v that starts at this bit.
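The successor rule is easy to state in code. The helper below is an illustrative sketch (our own, not from [10] or the paper): it flips the 0 · 1∗ suffix and wraps 1 · · · 1 around to 0 · · · 0, which is exactly binary increment modulo 2k.

```python
def successor(v):
    """Successor of bit string v in a k-bit cyclic binary counter, obtained
    by flipping the bits of the 0·1* suffix (all bits flip when v is in 1*)."""
    flip = {'0': '1', '1': '0'}
    i = v.rfind('0')          # start of the 0·1* suffix; -1 when v is in 1*
    if i < 0:
        return '0' * len(v)   # 1...1 wraps around to 0...0
    return v[:i] + ''.join(flip[b] for b in v[i:])
```

For example, successor('1001') flips the suffix 01 and yields '1010', matching the text.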
r0 , r1 , . . . , rt1 , (rt1 +1 , . . . , rt2 )ω . Note that since rt′ ∈ α for all t′ ≥ t, the run r is indeed accepting. We get a contradiction by proving that w ∉ L′k . Since t2 ≤ t0 + k2k and k2k < d, we have that wt1 +1 · · · wt2 has no occurrence of #, thus w has no occurrences of # after position t0 . Recall that L′k = Lk ∪ (Σ ∗ · #)ω . By the above, w ∉ (Σ ∗ · #)ω . Furthermore, since Lk = Σ ∗ · (Sk ∪ Ik ∪ {1k }) · Σ ∗ · # · Σ ω , the fact that w has no occurrences of # after position t0
Combining the above lower bound with the upper bound in Theorem 1, we can
conclude with the following.7
Theorem 7 ([3]). The asymptotically tight bound for the state blow-up in the
translation, when possible, of an NBW to an equivalent NCW is 2Θ(n) .
5 Discussion
It is well known that nondeterministic automata are exponentially more succinct
than deterministic ones. The succinctness is robust and it applies to all known
classes of automata on finite or infinite objects. Restricting attention to nonde-
terministic automata makes the issue of succinctness more challenging, as now
all classes of automata may guess the future, and the question is whether cer-
tain acceptance conditions can use this feature better than others. For example,
7
Note that the lower and upper bounds are only asymptotically tight, leaving a gap
in the constants. This is because the NBW that recognizes Lk requires O(k) states
and not strictly k states.
References
1. Accellera: Accellera organization inc. (2006), http://www.accellera.org
2. Aminof, B., Kupferman, O., Lev, O.: On the relative succinctness of nondetermin-
istic Büchi and co-Büchi word automata. In: Cervesato, I., Veith, H., Voronkov,
A. (eds.) LPAR 2008. LNCS (LNAI), vol. 5330, pp. 183–197. Springer, Heidelberg
(2008)
3. Boker, U., Kupferman, O.: Co-ing Büchi made tight and useful. In: Proc. 24th
IEEE Symp. on Logic in Computer Science (2009)
4. Boker, U., Kupferman, O., Rosenberg, A.: Alternation removal in Büchi automata.
In: Proc. 37th Int. Colloq. on Automata, Languages, and Programming (2010)
5. Büchi, J.R.: On a decision method in restricted second order arithmetic. In: Proc.
Int. Congress on Logic, Method, and Philosophy of Science, pp. 1–12. Stanford
University Press, Stanford (1962)
6. Emerson, E., Jutla, C.: The complexity of tree automata and logics of programs. In:
Proc. 29th IEEE Symp. on Foundations of Computer Science, pp. 328–337 (1988)
7. Gentilini, R., Piazza, C., Policriti, A.: Computing strongly connected components
in a linear number of symbolic steps. In: 14th ACM-SIAM Symp. on Discrete
Algorithms, pp. 573–582 (2003)
8. Krishnan, S., Puri, A., Brayton, R.: Deterministic ω-automata vis-a-vis determinis-
tic Büchi automata. In: Du, D.-Z., Zhang, X.-S. (eds.) ISAAC 1994. LNCS, vol. 834,
pp. 378–386. Springer, Heidelberg (1994)
9. Kupferman, O.: Tightening the exchange rate between automata. In: Duparc, J.,
Henzinger, T.A. (eds.) CSL 2007. LNCS, vol. 4646, pp. 7–22. Springer, Heidelberg
(2007)
10. Kupferman, O., Lustig, Y., Vardi, M.: On locally checkable properties. In:
Hermann, M., Voronkov, A. (eds.) LPAR 2006. LNCS (LNAI), vol. 4246, pp. 302–
316. Springer, Heidelberg (2006)
11. Kupferman, O., Morgenstern, G., Murano, A.: Typeness for ω-regular automata.
International Journal on the Foundations of Computer Science 17, 869–884 (2006)
12. Kupferman, O., Vardi, M.: From linear time to branching time. ACM Transactions
on Computational Logic 6, 273–294 (2005)
13. Kurshan, R.: Computer Aided Verification of Coordinating Processes. Princeton
Univ. Press, Princeton (1994)
14. Landweber, L.: Decision problems for ω–automata. Mathematical Systems The-
ory 3, 376–384 (1969)
15. McNaughton, R.: Testing and generating infinite sequences by a finite automaton.
Information and Control 9, 521–530 (1966)
16. Miyano, S., Hayashi, T.: Alternating finite automata on ω-words. Theoretical Com-
puter Science 32, 321–330 (1984)
17. Morgenstern, A., Schneider, K.: From LTL to symbolically represented determinis-
tic automata. In: Logozzo, F., Peled, D.A., Zuck, L.D. (eds.) VMCAI 2008. LNCS,
vol. 4905, pp. 279–293. Springer, Heidelberg (2008)
18. Piterman, N.: From nondeterministic Büchi and Streett automata to deterministic
parity automata. In: Proc. 21st IEEE Symp. on Logic in Computer Science, pp.
255–264. IEEE press, Los Alamitos (2006)
19. Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: Proc. 16th ACM
Symp. on Principles of Programming Languages, pp. 179–190 (1989)
20. Rabin, M.: Decidability of second order theories and automata on infinite trees.
Transaction of the AMS 141, 1–35 (1969)
21. Ravi, K., Bloem, R., Somenzi, F.: A comparative study of symbolic algorithms for
the computation of fair cycles. In: Johnson, S.D., Hunt Jr., W.A. (eds.) FMCAD
2000. LNCS, vol. 1954, pp. 143–160. Springer, Heidelberg (2000)
22. Safra, S.: On the complexity of ω-automata. In: Proc. 29th IEEE Symp. on Foun-
dations of Computer Science, pp. 319–327 (1988)
23. Safra, S., Vardi, M.: On ω-automata and temporal logic. In: Proc. 21st ACM Symp.
on Theory of Computing, pp. 127–137 (1989)
24. Streett, R., Emerson, E.: An elementary decision procedure for the μ-calculus. In:
Paredaens, J. (ed.) ICALP 1984. LNCS, vol. 172, pp. 465–472. Springer, Heidelberg
(1984)
25. Vardi, M., Wolper, P.: Automata-theoretic techniques for modal logics of programs.
Journal of Computer and Systems Science 32, 182–221 (1986)
26. Vardi, M., Wolper, P.: Reasoning about infinite computations. Information and
Computation 115, 1–37 (1994)
27. Wolper, P., Vardi, M., Sistla, A.: Reasoning about infinite computation paths. In:
Proc. 24th IEEE Symp. on Foundations of Computer Science, pp. 185–194 (1983)
28. Yan, Q.: Lower bounds for complementation of ω-automata via the full automata
technique. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP
2006. LNCS, vol. 4052, pp. 589–600. Springer, Heidelberg (2006)
Normalization of Some Extended Abstract State
Machines
1 Introduction
Yuri Gurevich has given a schema of languages which is not only a Turing-complete language (a language allowing one to express at least one algorithm for each computable function), but which also allows one to express all algorithms for each computable function (it is an algorithmically complete language); this schema of languages was first called dynamic structures, then evolving algebras, and finally ASM (for Abstract State Machines) [2]. He proposed Gurevich's thesis (that the notion of algorithm is entirely captured by the model) in [3]. Yuri explained this thesis to us during his stay in Paris, and a fascinating Russian-style after-talk discussion between Yuri and Vladimir Uspensky, during a conference in Fontainebleau, convinced all those attending the talk of the truth of Yuri's thesis.
There exist several partial implementations of ASMs as a programming language. These implementations are partial by nature for two reasons: (i) ASMs allow one to program functions computable by Turing machines with oracles, but
Address correspondence to this author.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 165–180, 2010.
c Springer-Verlag Berlin Heidelberg 2010
166 P. Cégielski and I. Guessarian
obviously not the implementations; (ii) in ASMs, arbitrary (computable) first-order structures may be defined, which is not true of the current implementations.
Yuri Gurevich has directed a group at Microsoft Laboratories (Redmond) which implemented such a programming language, called AsmL (for ASM Language), written first in C++ and then in C# as a language of Microsoft's .NET framework. To invite programmers to use the language, this group extended the control structures used in ASMs. Pure ASM control structures are represented by normal forms of AsmL programs.
The aim of this paper is to give a formal definition of AsmL (more precisely, of the part concerning control structures; we are not interested here in constructions of first-order structures), a formal definition of normal forms in AsmL, to show how to build a normal form from an AsmL program, and to compare the cost of the normal form with the cost of the original program.
cost of the normal form with the cost of the original program.
2 Definitions
We first recall the definition of ASMs and then make precise our point of view on ASMs, because different definitions exist.
2.2 Syntax
Signature L has three sorts: Data, Boolean and Null. Terms are defined by:
The terms defined above are usually called closed or ground terms; general terms with variables are disallowed. An n-ary function symbol of sort Boolean will be called an n-ary predicate.
f (t1 , . . . , tn ) := t0
2.3 Semantics
Definition 5. If L is an ASM signature, an ASM abstract state, or more pre-
cisely an L-state, is a synonym for a first-order structure A of signature L (an
L-structure).
The universe of A consists of the disjoint union of three sets: the base set A, the Boolean set B = {true, false}, and the singleton set {⊥}. The values of the Boolean constant symbols true and false and of null (or undef) in A will be denoted by true, false, and null (or ⊥).
Definition 6. Let L be an ASM signature and A a nonempty set. A set of modifications (more precisely, an (L, A)-modification set) is any finite set of triples
(f, ā, a),
where f is a function symbol of L, ā = (a1 , . . . , an ) is an n-tuple of elements of A (where n is the arity of f ), and a is an element of A.
Definition 7. Let L be an ASM signature, let A be an L-state and let Π be an
L-program. Let ΔΠ (A) denote the set defined by structural induction on Π as
follows:
1. If Π is an update rule:
f (t1 , . . . , tn ) := t0
then, denoting the value of t1 in A by a1 , . . . , the value of tn by an , and the value of t0 by a, the set ΔΠ (A) is the singleton:
{(f, (a1 , . . . , an ), a)}.
2. If Π is a block:
par R1 . . . Rn endpar
then the set ΔΠ (A) is the union:
ΔR1 (A) ∪ . . . ∪ ΔRn (A).
3. If Π is a test:
if ϕ then R
we first have to evaluate the expression ϕA . If it is false then the set ΔΠ (A)
is empty, otherwise it is equal to:
ΔR (A).
The semantics of the alternative rule is similar.
We may check that ΔΠ (A) is an (L, A)-modification set.
Definition 8. A set of modifications is incoherent if it contains two elements (f, ā, a) and (f, ā, b) with a ≠ b. It is coherent otherwise.
Definition 9. Let L be an ASM signature, Π an L-program, and A an L-state.
If ΔΠ (A) is coherent, the transform τΠ (A) of A by Π is the L-structure B
defined by:
– the base set of B is the base set A of A;
– for any function symbol f of L and any ā = (a1 , . . . , an ) ∈ An (where n is the arity of f ):
• if (f, ā, a) ∈ ΔΠ (A) for some a ∈ A, then f B (ā) = a ;
• otherwise, f B (ā) = f A (ā).
If ΔΠ (A) is incoherent then τΠ (A) = A (hence the state is a fixed point).
Definition 10. Let L be an ASM signature, Π an L-program, and A an L-state.
The computation is the sequence of L-states
(An )n∈N
defined by:
– A0 = A (called the initial algebra of the computation);
– An+1 = τΠ (An ) for n ∈ N.
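Definitions 6–9 translate almost verbatim into executable form. The following is a minimal sketch under an illustrative rule encoding (the nested-tuple rules, the `ev` term evaluator, and the representation of states as location dictionaries are our assumptions, not part of the paper): it computes the modification set of a rule, checks coherence, and applies the transform.

```python
# Rules are encoded as nested tuples (an illustrative encoding):
#   ('update', f, (t1, ..., tn), t0)   -- update rule  f(t1,...,tn) := t0
#   ('par', R1, ..., Rk)               -- block
#   ('if', phi, R)                     -- test
# `ev` evaluates a (closed) term in the current state A.

def modification_set(rule, ev):
    """Delta_Pi(A) of Definition 7."""
    kind = rule[0]
    if kind == 'update':
        _, f, ts, t0 = rule
        return {(f, tuple(ev(t) for t in ts), ev(t0))}
    if kind == 'par':
        return set().union(*(modification_set(r, ev) for r in rule[1:]))
    if kind == 'if':
        _, phi, r = rule
        return modification_set(r, ev) if ev(phi) else set()
    raise ValueError(kind)

def coherent(mods):
    """Definition 8: no location (f, args) receives two distinct values."""
    return len({(f, args) for f, args, _ in mods}) == len(mods)

def transform(state, mods):
    """tau_Pi(A) of Definition 9; `state` maps (f, args) to values.
    An incoherent modification set leaves the state fixed."""
    if not coherent(mods):
        return state
    new = dict(state)
    new.update({(f, args): val for f, args, val in mods})
    return new
```

A computation in the sense of Definition 10 is then just the iteration of `transform` from an initial state.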
Normalization of Some Extended Abstract State Machines 169
3.1 Syntax
Roughly speaking, AsmL provides more control structures than ASMs. We first define AsmL rules, then explain them.
Definition 12. Let L be an ASM signature.
– An ASM rule is also an extended rule of L.
– If R1 , . . . , Rk are extended rules of signature L, where k ≥ 1, then the
following expression is also an extended rule of L, called a step rule:
par
step R1
..
.
step Rk
endpar
– If R is an extended rule of signature L, then the following expressions are also extended rules of L, called iteration rules:
• step until ϕ R
• step while ϕ R
with ϕ a boolean term.
– If ϕ is a boolean term, and R1 and R2 are extended rules, then:
if ϕ then R1
else R2
endif
is an extended rule, called an alternative rule.
3.2 Semantics
We now explain the above rules. Following the semantics given above for ASMs,
the meaning of the par rule:
par
R1
..
.
Rk
endpar
170 P. Cégielski and I. Guessarian
is that the rules R1, ..., Rk run simultaneously, i.e. each rule runs
independently of the others (whether the observer views this as sequential
or parallel is immaterial); incoherences might occur among the updates; if
there is at least one incoherence the program stops, otherwise the updates
are applied.
The paradigm of simultaneity is unusual for programmers who are used to se-
quentiality. Hence sequentiality was introduced using step in AsmL. The mean-
ing of the extended rule
par
step R1
..
.
step Rk
endpar
is as follows: rule R1 runs first; then rule R2 runs (with the values of
closed terms depending on the result of running rule R1); and so on.
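The contrast between par and step can be seen on a tiny example. This is our own sketch, not AsmL: a state is a dict, and an update is a pair (location, rhs) where rhs reads a state and returns the new value.

```python
# Our sketch (not AsmL): par evaluates all right-hand sides in the OLD
# state and applies them at once; step threads the state through.

def par(state, updates):
    # every rhs reads `state` before any update is applied
    return {**state, **{loc: rhs(state) for loc, rhs in updates}}

def step(state, updates):
    # each update sees the effect of the previous ones
    for loc, rhs in updates:
        state = {**state, loc: rhs(state)}
    return state

swap = [("x", lambda s: s["y"]), ("y", lambda s: s["x"])]
# par({"x": 1, "y": 2}, swap) swaps the two values,
# while step({"x": 1, "y": 2}, swap) copies y into both locations.
```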
Finally classical iterations appear in AsmL via the iteration rules.
4 Normalization
In the present section, all L-states A have an infinite base set, which can
be assumed to contain N.
Running rules are implicit in ASMs and made explicit in AsmL. If R is an
ASM rule (not an extended one), then step until fixpoint R has the same
semantics as the ASM program Π consisting of the rule R.
In the sequel par and endpar will be omitted: when rules have the same inden-
tation, they will be assumed to be in a par . . . endpar block.
Giving a normal form for a given program is a classical issue in theoretical
computer science. For ASMs we have an extra, very important constraint:
when run, the normal form and the original program should execute
approximately the same number of updates and comparisons.
Example 1. Consider the following programming problem: to compute the aver-
age of an array of grades and to determine the number of grades whose value is
greater than this average. A natural program in extended ASMs is given below.
Note that in this program, avg, i, n, and nb are not variables, but constants: the
value of i will be changed in the next L-state of the computation.
step
avg := 0
i := 0
step until (i = n)
avg := avg + grade[i]
i := i+1
step
avg := avg/n
i := 0
nb := 0
step until (i = n)
if (grade[i] > avg) then nb := nb + 1
i := i + 1
if (mode = 0) then
avg := 0
i := 0
mode := 1
if (mode = 1) then
avg := avg + grade[i]
i := i + 1
if (i = n) then mode := 2
if (mode = 2) then
avg := avg/n
i := 0
nb := 0
mode := 3
if (mode = 3) then
if (grade[i] > avg) then nb := nb + 1
i := i + 1
if (i = n) then mode := 4
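The mode-driven normal form above can be simulated with an ordinary loop. The sketch below is ours, not the paper's; the loop test is folded into each loop mode so that array accesses stay in range. All updates of the firing guard are applied simultaneously, reading the old state, as in ASM semantics.

```python
# Our sketch: simulating the mode-automaton computing the grade average
# and the count of above-average grades.

def run(grades):
    n = len(grades)
    s = {"mode": 0, "avg": 0, "i": 0, "nb": 0}
    while s["mode"] != 4:
        t = dict(s)                      # updates applied simultaneously
        if s["mode"] == 0:               # initialize, then enter summing loop
            t["avg"], t["i"], t["mode"] = 0, 0, 1
        elif s["mode"] == 1:             # summing loop: test before body
            if s["i"] == n:
                t["mode"] = 2
            else:
                t["avg"] = s["avg"] + grades[s["i"]]
                t["i"] = s["i"] + 1
        elif s["mode"] == 2:             # divide, reset, enter counting loop
            t["avg"], t["i"], t["nb"], t["mode"] = s["avg"] / n, 0, 0, 3
        elif s["mode"] == 3:             # counting loop: test before body
            if s["i"] == n:
                t["mode"] = 4
            else:
                if grades[s["i"]] > s["avg"]:
                    t["nb"] = s["nb"] + 1
                t["i"] = s["i"] + 1
        s = t
    return s["avg"], s["nb"]
```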
Theorem 1. For every extended ASM program Π, there exists a normal form
ASM program Πn such that for every L-state A whose base set is infinite there
exists an Lm -state Am in which Πn computes the same function as Π, where
Lm is the signature L ∪ {mode}, with mode a new constant symbol, and Am is
the expansion of A to Lm .
if (mode = 0) then
R
mode := 1
par
step R1
..
.
step Rk
endpar
In the (abnormal) case k = 1, the core is simply R'1. To explain the case
k ≥ 2, it is sufficient to suppose k = 2. In this case the core of Πn is:
R'1
R'_{2+fin1}
with
1. R'1 the core of the normal form of R1.
2. R'_{2+fin1} defined as follows: let fin1 be the greatest value used for
mode in R'1. Rule R'_{2+fin1} is the core R'2 of R2 where fin1 is added
to each constant occurring on the right-hand side of the "=" sign of an
expression beginning with mode (a boolean expression mode = constant
or an update rule mode := constant).
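The constant-shifting operation behind R'_{2+fin1} is purely syntactic. On a textual rule representation, which is our own hypothetical encoding, it could be sketched as:

```python
import re

def shift_mode(rule_text, k):
    """Add k to every constant on the right-hand side of 'mode =',
    'mode >=' or 'mode :=' (a sketch of the R'_{i+k} renaming)."""
    return re.sub(r"(mode\s*(?::=|>=|=)\s*)(\d+)",
                  lambda m: m.group(1) + str(int(m.group(2)) + k),
                  rule_text)

shift_mode("if (mode = 0) then R1 mode := 1", 3)
# -> "if (mode = 3) then R1 mode := 4"
```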
– If Π is an iteration rule (step until ϕ R) and R' is the core of the normal
form of R, then the core Π' of the normal form Πn of Π is
if (mode = 0) then
if ¬ϕ then mode := 1
if ϕ then mode := fin + 1
R'_{+1}
where R'_{+1} is R' in which each constant occurring on the right-hand side
of the "=" or "≥" sign of an expression beginning with mode (a boolean
expression mode = constant, mode ≥ constant, or an update rule mode :=
constant) is incremented by 1, except for the greatest such value fin, which
is replaced by 0.
– If Π is an iteration rule (step while ϕ R), it is treated similarly:
if (mode = 0) then
if ϕ then mode := 1
if ¬ϕ then mode := fin + 1
R'_{+1}
– If Π is an alternative rule,
if ϕ then R1
else R2
endif
and R'1, R'2 are the respective cores of the normal forms of R1, R2, then the
core of the normal form Πn of Π is:
if (mode = 0) then
if ϕ then mode := 1 else mode := fin + 1
R'_{1+1}
R'_{2+fin+1}
where R'_{i+k} is R'i in which k is added to each constant occurring on the
right-hand side of the "=" (or "≥") sign of an expression beginning with mode.
References
1. Grieskamp, W., Tillmann, N.: AsmL Standard Library, Foundations of Software
Engineering – Microsoft Research (2002),
http://www.codeplex.com/AsmL//AsmLReference.doc,
http://research.microsoft.com/en-us/downloads/3444a9cb-47ce-4624-9e14-c2c3a2309a44/default.aspx
2. Gurevich, Y.: Reconsidering Turing’s Thesis: Toward More Realistic Semantics of
Programs, University of Michigan, Technical Report CRL–TR–38–84, EECS De-
partment (1984)
3. Gurevich, Y.: A New Thesis, Abstracts, p. 317. American Mathematical Society,
Providence (August 1985)
A Appendix
We prove here that an AsmL program Π and its normal form Πn have the same
semantics. To this end we first define formally the semantics of AsmL programs,
in the general case when step until fixpoint iterations are also allowed in
AsmL.
– If Π is an alternative rule,
if ϕ then R1
else R2
endif
its semantics is defined by:
[[(Π, A)]]^e = if ϕ^A then [[(R1, A)]]^e else [[(R2, A)]]^e.
– If Π is a step rule:
par
step R1
step R2
endpar
the core of its normal form Πn is:
R'1
R'_{2+fin1}
where R'1 is the core of R1, and R'_{2+fin1} is defined as follows: let fin1
be the greatest value used for mode in R'1. Rule R'_{2+fin1} is the core R'2
of R2 where fin1 is added to each constant occurring on the right-hand
side of the "=" (or "≥") sign of an expression beginning with mode (a
boolean expression mode = constant, mode ≥ constant, or an update rule
mode := constant).
In order to prove that (1) holds, note that
• Assuming by the induction hypothesis that (1) holds for R1 and R2, we
have that
[[(R'1, Am)]]|L = [[(R1, A)]]^e   (2)
[[(R'2, Bm)]]|L = [[(R2, B)]]^e   (3)
τ^l_{Πn+k}(B^k_m)|L = τ^l_{Πn}(Bm)|L   (4)
[[(Πn+k, B^k_m)]]|L = [[(Πn, Bm)]]|L.   (5)
• Rules in R'1 are executed only when the value of mode is < fin1, and
when execution of R'1 is finished the value of mode is fin1, hence
[[(R'1, Am)]] = ([[(R1, A)]]^e)^{fin1}_m.   (6)
• Rules in R'_{2+fin1} are executed only when the value of mode satisfies
fin1 ≤ mode < (fin1 + fin2), hence rules of R'1 can no longer be executed;
when execution of R'_{2+fin1} is finished, the value of mode is fin1 + fin2,
and no rule of Πn can be executed, hence the fixed point of τΠn(Am) is
reached. We can deduce by equation (5) that
Finally, we have:
[[(Πn, Am)]] = τ*_{R'_{2+fin1}} τ*_{R'1}(Am)
            = τ*_{R'_{2+fin1}} [[(R'1, Am)]]                    by definition
            = τ*_{R'_{2+fin1}} ([[(R1, A)]]^e)^{fin1}_m         by equation (6)
            = [[(R'_{2+fin1}, ([[(R1, A)]]^e)^{fin1}_m)]]        by definition.
if (mode = 0) then
if ¬ϕ then mode := 1
if ϕ then mode := fin + 1
R+1
It can be seen that there is no rule with guard "if mode = fin + 1 then ...",
and that always ϕ^{Am} = ϕ^A. We prove (1).
• Either eventually ϕ^B is true for some B = τ^k_{Πn}(Am), with k ∈ N. In
this case ϕ^{B|L} is also true and we have B|L = τ^{k'}_Π(A) for some
k' ∈ N, k' ≤ k; then [[(Πn, Am)]] = τ^k_{Πn}(Am) is τ^{k'}_Π(A) together
with the value of mode equal to fin + 1, and (1) holds.
• Or ϕ^B is always false; then [[(Πn, Am)]] is undefined (because mode takes
the values 0 and 1 infinitely often), and so is [[(Π, A)]], whence (1).
– the case of step while iterations is treated similarly.
– If Π is an iteration rule (step until fixpoint R), let R' be the core of
the normal form of R and M the largest value of mode occurring in R'; then
the core Π' of the normal form Πn of Π is:
if fi(t1, ..., tn) ≠ ti0 then
fi(t1, ..., tn) := ti0
c := 1
Let A0 = A and, for i ≥ 0, Ai+1 = [[(R, Ai)]]^e. Let also A'0 = A^{00}, and for
i ≥ 0, A'_{i+1} = [[(R', A'_i)]]^{00}. It can be seen that equation (8) implies, for all i
Now, on the one hand, [[(R, A)]]^e is equal to the first Ai such that Ai = Ai+1;
by equation (9), we also have that A'i = A'_{i+1}. On the other hand, let us
compute [[(Πn, Am)]].
If all updates other than updates on c and mode are trivial, we let
• In the latter case, c = 0 and mode > 0 in [[(R', A'_i)]], so no rule of Πn
can be executed, the fixed point [[(Πn, Am)]] is reached and is equal to
[[(R', A'_i)]], i.e.
moreover, A'_{i+1} = A'_i (only trivial updates are performed in the course
of the computation of R'), and by equation (9) this implies that also
Ai+1 = Ai, hence the fixed point of Π is reached and
if (mode = 0) then
if ϕ then mode := 1 else mode := fin + 1
R'_{1+1}
R'_{2+fin+1}
where R'_{i+k} is R'i in which k is added to each constant occurring on the
right-hand side of the "=" (or "≥") sign of an expression beginning with mode.
For A an L-state and k ∈ N, recall that A^k_m denotes the Lm-state where
all function symbols are interpreted as in A, and where mode has value k.
The semantics of Πn is:
[[(Πn, A^0_m)]] = if ϕ^A then [[(R'_{1+1}, A^1_m)]] else [[(R'_{2+fin+1}, A^{fin+1}_m)]].
Because:
[[(R'_{1+1}, A^1_m)]]|L = [[(R1, A)]]^e and [[(R'_{2+fin+1}, A^{fin+1}_m)]]|L = [[(R2, A)]]^e,
it can be deduced that [[(Πn, A^0_m)]]|L = [[(Π, A)]]^e.
par
step R1
step R2
endpar
cost(Π) = 1 + Σi (1 + cost(R[i])) ,
– If Π is an alternative rule,
if ϕ then R1
else R2
endif
then
The step until fixpoint rule has been excluded, even though we chose a way
to emulate it in normal form, because the cost of checking that the fixpoint is
reached is nonzero in AsmL, and a precise semantics for that cost in AsmL
should be fixed before any attempt to compare costs.
Finding Reductions Automatically
1 Introduction
Perhaps the most useful item in the complexity theorist’s toolkit is the reduc-
tion. Confronted with decision problems A, B, C, . . ., she will typically compare
them with well-known problems, e.g., REACH, CVP, SAT, QSAT, which are
complete for the complexity classes NL, P, NP, PSPACE, respectively. If she
finds, for example, that A is reducible to CVP (A ≤ CVP), and that SAT ≤ B,
C ≤ REACH, and REACH ≤ C, then she can conclude that A is in P, B is
NP hard, and C is NL complete.
When Cook proved that SAT is NP complete, he used polynomial-time Turing
reductions [4]. Shortly thereafter, when Karp showed that many important
combinatorial problems are also NP complete, he used the simpler
polynomial-time many-one reductions [14].
Since that time, many researchers have observed that natural problems remain
complete for natural complexity classes under surprisingly weak reductions in-
cluding logspace reductions [13], one-way logspace reductions [9], projections
[22], first-order projections, and even the astoundingly weak quantifier-free pro-
jections [11].
It is known that artificial non-complete problems can be constructed [15].
However, it is a matter of common experience that most natural problems are
complete for natural complexity classes. This phenomenon is receiving a great
deal of attention recently via the dichotomy conjecture of Feder and Vardi that
all constraint satisfaction problems are either NP complete, or in P [7,20,1].
The authors were partially supported by the National Science Foundation under
grants CCF-0830174 and CCF-0541018 (first two authors) and CCF-0953761 and
CCF-0540862 (third author). Any opinions, findings, conclusions, or recommenda-
tions expressed in this material are those of the authors and do not necessarily reflect
the views of the NSF.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 181–200, 2010.
© Springer-Verlag Berlin Heidelberg 2010
182 M. Crouch, N. Immerman, and J.E.B. Moss
whose universe is the nonempty set |A|. For each relation symbol Ri of arity ai
in τ, A has a relation R^A_i of arity ai defined on |A|, i.e. R^A_i ⊆ |A|^{ai}. For
each function symbol fi ∈ τ, f^A_i is a total function from |A|^{ri} to |A|.
Let STRUC[τ] be the set of finite structures of vocabulary τ. For example,
τg = ⟨E²⟩ is the vocabulary of (directed) graphs, and thus STRUC[τg] is the
set of finite graphs.
2.2 Ordering
It is often convenient to assume that structures are ordered. An ordered struc-
ture A has universe |A| = {0, 1, . . . , n − 1} and numeric relation and constant
symbols: ≤, Suc, min, max referring to the standard ordering, successor relation,
minimum, and maximum elements, respectively (we take Suc(max) = min). Re-
ductionFinder may be asked to find a reduction on ordered or unordered struc-
tures. In the former case it may use the above numeric symbols. Unless otherwise
noted, we from now on assume that all structures are ordered.
It is well known that REACH is complete for NL, and REACHd and REACHu
are complete for L [10,19]. A simpler way to express deterministic transitive clo-
sure is to syntactically require that the out-degree of our graph is at most one by
using a function symbol: denote the child of v as f (v), with f (v) = v if v has
no outgoing edges. In this notation, a problem equivalent to REACHd, and thus
complete for L, is REACHf = { G ∈ STRUC[τfst] | G |= RTC(f)(s, t) }.
If O is an operator such as TC, let FO(O) be the closure of first-order logic
using O. Then L = FO(DTC) = FO(RDTC) = FO(STC) = FO(RSTC) and
NL = FO(TC) = FO(RTC).
It is useful to define new relations by induction. For example, we can express the
transitive closure of the relation E inductively, and thus the property REACH,
via the following Datalog program:
E*(x, x) ←
E*(x, y) ← E(x, y)
E*(x, y) ← E*(x, z), E*(z, y)     (2)
REACH ← E*(s, t)
Define FO(IND) to be the closure of first-order logic using such positive induc-
tive definitions. The Immerman-Vardi Theorem states that P = FO(IND). In
this paper we will use stratified Datalog programs such as Equation 2 to ex-
press problems and then use ReductionFinder to automatically find reductions
between them. Thus ReductionFinder can handle any problem in P or below.
In the future we hope to handle problems in NP, but this will require us to go
beyond SAT solvers to QBF solvers.
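The Datalog program for E* can be evaluated bottom-up by iterating its rules to a fixed point. The following is our own naive sketch; real Datalog engines use semi-naive evaluation to avoid recomputing known facts.

```python
# Naive bottom-up evaluation of the E* program: start from the facts
# given by the first two rules, then close under transitivity.

def reach(nodes, edges, s, t):
    star = {(x, x) for x in nodes} | set(edges)     # reflexivity + E(x, y)
    while True:
        derived = {(x, y)
                   for (x, z) in star for (w, y) in star if z == w}
        new = star | derived                         # transitivity rule
        if new == star:
            return (s, t) in star                    # REACH <- E*(s, t)
        star = new
```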
2.6 Reductions
A∈S ⇔ f (A) ∈ T .
E'(x, y) ≡ y ≠ t ∧ f(y) = x
s' ≡ s        (3)
t' ≡ t
Note that the three formulas in Rfu ’s definition (Equation 3) have no quantifiers,
so Rfu is not only a first-order reduction, it is a quantifier-free reduction and we
write REACHf ≤qf REACHu .
More explicitly, for each structure A ∈ STRUC[σ], B = Rfu(A) =
⟨|A|, E^B, s^B, t^B⟩ is a structure in STRUC[τ] with the same universe as A, and
symbols given as follows:
E^B = { ⟨a, b⟩ | (A, a/x, b/y) |= y ≠ t ∧ f(y) = x }
s^B = s^A
t^B = t^A
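The reduction Rfu can be checked empirically on small random functional graphs. This harness is ours, not the authors' tool, and we read the defining formula as E'(x, y) ≡ y ≠ t ∧ f(y) = x, i.e. an undirected edge between y and f(y) for every y other than t.

```python
import random

def reach_f(f, s, t):
    """Does iterating f from s ever hit t? (REACHf on the input structure)"""
    x, seen = s, set()
    while x not in seen:
        if x == t:
            return True
        seen.add(x)
        x = f[x]
    return False

def reach_u(n, edges, s, t):
    """Undirected reachability on the image structure, by plain search."""
    comp, todo = {s}, [s]
    while todo:
        x = todo.pop()
        for (a, b) in edges:
            if x in (a, b):
                y = b if x == a else a
                if y not in comp:
                    comp.add(y)
                    todo.append(y)
    return t in comp

random.seed(1)
for _ in range(300):
    n = 6
    f = [random.randrange(n) for _ in range(n)]
    s, t = random.randrange(n), random.randrange(n)
    # image edges: {f(y), y} whenever y != t
    edges = {(f[y], y) for y in range(n) if y != t}
    assert reach_f(f, s, t) == reach_u(n, edges, s, t)
```

Intuitively, removing t's outgoing edge makes the undirected component of t exactly the set of nodes whose f-orbit hits t, which is why the equivalence holds.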
In this paper we restrict ourselves to quantifier-free reductions. In general, a
first-order reduction R has an arity which measures the blow-up of the size of
the reduction. In [10] a first-order reduction of arity k maps a structure with
universe |A| to a structure with universe
{ ⟨a1, ..., ak⟩ | (A, a1/x1, ..., ak/xk) |= ϕ0 }, i.e.,
a first-order definable subset of |A|^k. However, increasing the arity of a reduction
beyond two is rather excessive – arity two already squares the size of the instance.
In this paper, in order to keep our reductions as small and simple as possible, we
use a triple of natural numbers, ⟨k, k1, k2⟩, to describe the universe of the image
structure, namely
3 Strategy
We begin searching for reductions at a very small size (n = 3); for search
spaces without a correct reduction, even this small size is often enough to detect
irreducibility. When a reduction is found at a particular size n, we examine larger
structures for counterexamples; currently we look at structures of size at most
n + 2. If a counterexample is found, we add it to G, increment n and return to
step 1.
Search time increases very rapidly as n increases. Of the 10,422 successful
reductions found, 9,291 were found at size 3, 1,076 at size 4, 38 at size 5,
and 17 at sizes 6-8. See Section 5 for details of the results. See Section 6 for
more about the current limits on size and running time and our ideas for
improving them.
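The search loop just described is an instance of counterexample-guided synthesis. Schematically, in our own sketch, the SAT/ASP backend is hidden behind two hypothetical callables: find_candidate(examples, n) returns a reduction correct on all stored counterexamples (or None), and find_counterexample(cand, m) returns a size-m counterexample (or None).

```python
# Sketch of ReductionFinder's outer loop (our abstraction, not its code).

def search(find_candidate, find_counterexample, max_size=8):
    examples, n = [], 3                      # start at very small size
    while n <= max_size:
        cand = find_candidate(examples, n)   # must be correct on all examples
        if cand is None:
            return None                      # search space exhausted
        cex = None
        for m in range(n, n + 3):            # examine sizes n .. n+2
            cex = find_counterexample(cand, m)
            if cex is not None:
                break
        if cex is None:
            return cand                      # candidate survived: done
        examples.append(cex)                 # add to G, increment n, retry
        n += 1
    return None
```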
4 Implementation
Figure 1 shows a schematic view of ReductionFinder’s algorithm. The program
is written in Scala, an object-oriented functional programming language imple-
mented in the Java Virtual Machine1 . ReductionFinder maintains a database of
problems via a directed graph, G, whose vertices are problems. An edge (a, b)
indicates that a reduction has been found from problem a to problem b, and is
labelled by the parameters of a minimal such reduction that has been found so
far.
When a new problem, c, is entered, ReductionFinder systematically searches
for reductions to resolve the relationships between c and the problems already
categorized in G.
Given a pair of problems, c, d, specified in stratified Datalog, and a search
space Ra,p specifying the arity a and parameters p, ReductionFinder calls the
Cmodels 3.79 answer-set system2 to answer individual queries of the form of
1
http://www.scala-lang.org
2
http://www.cs.utexas.edu/users/tag/cmodels.html
Equations (6), (7). Cmodels in turn makes calls to SAT solvers. The SAT solvers
we currently use are MiniSAT and zChaff [6,18].
Once a search space and a pair of problems are fixed, ReductionFinder performs
the iterative sequence of search stages described in section 3.1. Within each stage,
ReductionFinder outputs a single lparse/cmodels program expressing Equations
(6) or (7), and calls the Cmodels tool. The find statements in these equations
are quantified explicitly using lparse’s choice rules. The majority of the program
is devoted to evaluation rules defining the structure R(G) in terms of the sets of
boolean variables R and G.
Figure 2 gives lparse code for a single counterexample-finding step (equation
(7)). This code attempts to find a counterexample to a previously-generated re-
duction candidate. The specific code listed is examining reductions from REACH
(Section 2.4) to its negation. The reduction candidate was E (x, y) ≡ (E(y, x) ∧
x = s) ∨ E(x, x), s ≡ t, t ≡ Suc(min) (lines 7-9).
The counterexample is found using lparse's choice rules as existential quantifiers,
directly guessing the relation in_E and the two constant symbols in_s and
in_t (lines 12-13). Since lparse does not contain function symbols, these constants
are implemented as degree-1 relations which are true at exactly one point. We
specify the constraint that we cannot have in_satisfied == out_satisfied
(line 16); these boolean variables will be defined later in the program, and this
constraint will ensure that our graph is a counterexample to the reduction candi-
date.
Defining in_satisfied and out_satisfied in terms of the input and output
predicates (respectively) is easy. We have already required the user to input
lparse code for the input and output queries. We do some minimal processing
on this code, disambiguating names and turning function symbols into relations.
The user’s input for directed-graph reachability, listed in Equation (8), is trans-
lated into the input query block of lines 19-22. Similarly, the output query is
translated into lines 25-28.
The remainder of the lparse code exists to define the output predicates (in this
case out_E, out_s, out_t) in terms of the input predicates and the reduction. In
building the output relation out_E(X, Y), we first build up a truth table for
each of the atomic formulas used; for example, line 31 states that term_e_y_x is
true at point (X, Y) exactly if E(Y, X) holds in the input structure. Each position
in the DNF definition is true at (X, Y) exactly if the atomic formula chosen for
that position is true (lines 36-37). The output relation out_E(X, Y) is then
defined via the terms in the DNF (lines 38-39). The code in lines 30-39 thus
defines the output relation out_E(X, Y) in terms of the input relations in_E,
in_s, in_t and the reduction candidate reduct_E.
Lines 41-47 similarly define the output constants out_s and out_t. Since lparse
does not provide function symbols, we define these constants as unary relations
out_s(X), making sure that these relations are true at exactly one point. We
are thus able to define the output constants in terms of the input symbols in_s,
in_t and the reduction candidate's definitions of s', t' (reduct_s, reduct_t).
The code for finding a reduction candidate (equation (6)) is very similar to the
counterexample-finding code in Figure 2. We import the list G of counterexample
Fig. 2. Lparse code for a single search stage. This code implements equation (7), search-
ing for a 4-node counterexample for a candidate reduction from REACH (Section 2.4)
to its negation. Variables X, Y, Z range over nodes.
graphs, and must guess a reduction. The input query, output vocabulary, and
output query are evaluated for each graph. Truth tables must be built for each
relation which might appear in the reduction, and for each graph.
4.4 Timing
ReductionFinder uses the Cmodels logic programming system to solve its search
problems. The Cmodels system solves answer-set programs, such as those in the
lparse language, by reducing them to repeated SAT solver calls. Direct transla-
tions from answer-set programming (ASP) to SAT exist [2,12], but introduce new
variables; Lifschitz and Razborov have shown that, assuming the widely believed
conjecture P ⊄ NC¹/poly, any translation from ASP must either introduce new
variables or produce a program of worst-case exponential length [17].
The Cmodels system first translates the lparse program to its Clark completion
[3], interpreting each rule a :- b as merely logical equivalence (a ⇔ b).
Models of this completion may fail to be answer sets if they contain loops, sets
of variables which are true only because they assume each other. If the model
found contains a loop, Cmodels adds a loop clause preventing this loop and
continues searching, keeping the SAT solver’s learned-clause database intact. A
model which contains no loops is an answer set, and all answer sets can be found
in this way.
The primary difficulty in finding large reductions with ReductionFinder has
been computation time. The time spent finding reductions dominates over the
Fig. 3. Timing data for a run reducing ¬RTC[f ](s, t) ≤ RTC[f ](s, t) at arity 2,
size 4. The solid line shows time to find each reduction candidate in seconds, on a
logarithmic scale. The dotted line shows the number of loop formulas generated by
Cmodels, and thus the number of SAT solver calls for each reduction candidate. This
run was successful in finding a reduction.
5 Results
5.1 Size and Timing Data
We have run ReductionFinder for approximately 5 months on an 8-core 2.3 GHz
Intel Xeon server with 16 GB of RAM. As of this writing, ReductionFinder has
performed 331,036 searches on a database of 87 problems. Of the 7,482 pairs
of distinct problems, we explicitly found reductions between 2,698; an additional
803 reductions could be concluded transitively. 23 pairs were manually marked as
irreducible, comprising provable theorems about first-order logic plus statements
that L ⊊ (co-)NL ⊊ P. From these 23, an additional 3,043 pairs were transitively
concluded to be irreducible. 915 pairs remained unfinished.
For many of the pairs which we reduced successfully, we found multiple suc-
cessful reductions. Sometimes this occurred when we first found the reduction in
a large search space, then tried smaller spaces to determine the minimal spaces
containing a reduction. More interestingly, some pairs contained multiple suc-
cessful reductions in distinct minimal search spaces, demonstrating trade-offs
between different measures of the reduction’s complexity. Some of these trade-
offs were uninteresting: a reduction which simply needs “some distinguished
constant” could use min, max, or c1 . Others, however, began to show non-trivial
trade-offs between the formula length required and the numerics or arity avail-
able. See Equations (9), (10) for an example. Of the 12,149 correct reductions
found between the 2698 explicitly-reduced pairs of problems, 5091 were in some
minimal search space.
Fig. 4. A map of reductions in the query database. Nodes without numbers represent
a single query. A node with number n represents n queries of the same complexity.
Some queries are elided for clarity.
includes ∃xy.E(x, y), ∃x.f (x) = s, ∃x.E(s, x). Below this, the structure of FO
under quantifier-free reductions is correctly represented up to two quantifier
alternations.
Beyond FO, ReductionFinder has made significant progress in describing the
complexity hierarchy. A class of 7 L-complete problems is visible at TC[f ](s, t)
(deterministic reachability), including its complement (¬TC[f ](s, t)) and de-
terministic reachability with a relational target (∃y.T (y) ∧ TC[f ](s, y)). Un-
fortunately, the L-complete problems of cycle-finding (∃x.TC[E](x, x)) and its
negation have not been placed in this class; nor has deterministic reachability
with relations as both source and target (∃xy.S(x) ∧ T (y) ∧ TC[E](x, y)).
Below this level, ReductionFinder had limited success. We succeeded in reducing
several problems to reachability (see Figure 5), including degree-2 reachability
(reduction described in Section 5.3). Not surprisingly, we did not discover a
proof of the Immerman-Szelepcsényi theorem (showing co-NL ≤ NL by providing
a reduction ¬TC[E](s, t) ≤ TC[E](s, t)). We similarly did not prove Reingold’s
theorem [19], showing SL ≤ L by reducing STC[E](s, t) ≤ TC[f ](s, t). These
two results were historically elusive, and may require reductions above arity 2,
or longer formulas than we were able to examine. Considering P-complete prob-
lems, we proved the equivalence of several variations of alternating transitive
closure (ATC); however, we did not show the problem equivalent to its nega-
tion, or to the monotone circuit value problem (MCVAL).
|R(A)| = {a1 , a2 , . . . , an , c1 }
|R(A)| = {a1 , a2 , . . . , an }
This reduction uses the traditional technique of using successor to iterate through
possible neighbors. Each node ⟨x, y⟩ of the output structure can be read as "we
are at node x, considering y as a possible next step". If there is an edge E(x, y),
we nondeterministically either follow this edge (moving along f to ⟨y, y⟩) or
move along g to the next possibility ⟨x, Suc(y)⟩. If there is no edge E(x, y), our
only nontrivial movement is along g, to ⟨x, Suc(y)⟩.
References
1. Allender, E., Bauland, M., Immerman, N., Schnoor, H., Vollmer, H.: The Com-
plexity of Satisfiability Problems: Refining Schaefer’s Theorem. J. Comput. Sys.
Sci. 75, 245–254 (2009)
2. Ben-Eliyahu, R., Dechter, R.: Propositional semantics for disjunctive logic pro-
grams. Annals of Mathematics and Artificial Intelligence 12, 53–87 (1996)
3. Clark, K.: Negation as Failure. In: Gallaire, H., Minker, J. (eds.) Logic and Data
Bases, pp. 293–322. Plenum Press, New York (1978)
4. Cook, S.: The Complexity of Theorem Proving Procedures. In: Proc. Third Annual
ACM STOC Symp., pp. 151–158 (1971)
5. Ebbinghaus, H.-D., Flum, J.: Finite Model Theory, 2nd edn. Springer, Heidelberg
(1999)
6. Eén, N., Sörensson, N.: An Extensible SAT-solver [extended version 1.2]. In:
Giunchiglia, E., Tacchella, A. (eds.) SAT 2003. LNCS, vol. 2919, pp. 502–518.
Springer, Heidelberg (2004)
7. Feder, T., Vardi, M.: The Computational Structure of Monotone Monadic SNP
and Constraint Satisfaction: A Study Through Datalog and Group Theory. SIAM
J. Comput. 28, 57–104 (1999)
8. Giunchiglia, E., Lierler, Y., Maratea, M.: SAT-Based Answer Set Programming.
In: Proc. AAAI, pp. 61–66 (2004)
9. Hartmanis, J., Immerman, N., Mahaney, S.: One-Way Log Tape Reductions. In:
IEEE Found. of Comp. Sci. Symp., pp. 65–72 (1978)
Anuj Dawar
For Yuri, on the occasion of your seventieth birthday. Thank you for always
asking the most stimulating questions.
1 Introduction
One of the main drivers of research in the area of finite model theory and descriptive
complexity over the last three decades has been the question of whether there is a logic
that expresses exactly the polynomial-time computable properties of finite structures.
In short form, we ask whether there is a logic capturing P. This question was first for-
mulated by Chandra and Harel [4] but given the precise form in which it is usually cited
by Yuri Gurevich [7]. In this form, the question is as follows. A logic L consists of
a function SEN associating a recursive set of sentences SEN(σ) to each finite
vocabulary σ, together with a function SAT that associates to each σ a recursive
satisfaction relation SAT(σ), relating finite σ-structures to sentences, that is also
isomorphism-invariant. That is, if A and B are isomorphic σ-structures and ϕ is
any sentence of SEN(σ), then (A, ϕ) ∈ SAT(σ) if, and
only if, (B, ϕ) ∈ SAT(σ). Now, a logic L captures P if there is a computable function
that takes each sentence of L to a polynomially-clocked Turing machine that recog-
nises the models of the sentence, and for every polynomial-time recognizable class K
of structures, there is a sentence of L whose models are exactly K.
The author carried out this work while supported by a Leverhulme Trust Study Abroad
Fellowship.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 201–207, 2010.
© Springer-Verlag Berlin Heidelberg 2010
202 A. Dawar
Gurevich conjectured that there is no logic capturing P in this sense. He also proved
in the same paper that there is such a logic if, and only if, there is a logic that captures
P on graphs. Thus, the conjecture can be reformulated as the following statement.
Conjecture 1 (Gurevich). There is no recursively enumerable set S of pairs (M, p)
where M is a deterministic Turing machine and p a polynomial such that:
1. for each (M, p) ∈ S, if G1 and G2 are isomorphic n-vertex graphs, then M accepts
input G1 in p(n) steps if, and only if, it accepts G2 in p(n) steps; and
2. for any polynomial-time decidable class K of graphs, there is a pair (M, p) ∈ S
such that an n-vertex graph G is accepted by M in p(n) steps if, and only if, G ∈ K.
The key difference between this and the question formulated by Chandra and Harel
is that in their formulation we would require S to be a set of machines which run in
polynomial-time but the polynomials need not be given explicitly. For a discussion of
the relationship between the two questions see [11].
While Conjecture 1 has become well-known as Gurevich’s conjecture and generated
a large amount of follow-up research, in [7], Gurevich stated similar conjectures for the
complexity classes NP ∩ co-NP and RP which have received somewhat less attention.
To be precise, the conjecture that there is no logic for NP ∩ co-NP can be stated in the
following equivalent form.
Conjecture 2 (Gurevich). There is no recursively enumerable set S of triples (M, N, p)
where M and N are nondeterministic Turing machines and p a polynomial such that:
1. for each (M, N, p) ∈ S, if G1 and G2 are isomorphic n-vertex graphs, then M
accepts input G1 in p(n) steps if, and only if, it accepts G2 in p(n) steps and the
same holds for N ;
2. for each (M, N, p) ∈ S and each n-vertex graph G, M accepts G in p(n) steps if,
and only if, N does not accept G in p(n) steps; and
3. for any class K of graphs that is in both NP and co-NP, there is a triple (M, N, p) ∈
S such that an n-vertex graph G is accepted by M in p(n) steps if, and only if,
G ∈ K.
Indeed, Gurevich provided evidence in support of Conjecture 2 in the following form.
He showed that if Conjecture 2 is false then there is a complete problem in NP ∩ co-NP
under polynomial-time reductions. However, it is known by a result of Sipser [14] that
there are oracles A such that the complexity class NPA ∩ co-NPA does not have com-
plete problems with respect to polynomial-time reductions. This implies, in particular,
that any refutation of Conjecture 2 would have to be non-relativizing.
On the other hand, it is not difficult to show that there is also an oracle A for which
there is a logic capturing NPA ∩ co-NPA (an argument is given in Sect. 3). This means
that any proof of Conjecture 2 would also have to be non-relativizing. In short, Conjec-
ture 2 runs up against the famous relativization barrier in complexity theory (see [6,1]).
What about Conjecture 1? Is it also subject to the same barrier? Since a proof of
the conjecture would imply that P = NP, it is subject to all the barriers that face that
question. Is a refutation of the conjecture also up against a relativization barrier? This
is a question we address in this paper, which takes us through a tour of the relation-
ship between logics capturing complexity classes, recursive enumerations and complete
problems.
On Complete Problems, Relativizations and Logics for Complexity Classes 203
set of witnesses whatsoever. Could we then construct an oracle A with respect to which
there is no logic capturing P?
Just as Gurevich showed that from the assumption that there is a logic for NP ∩
co-NP it follows that there is a complete problem in this class under polynomial-
time reductions, a prospect considered unlikely, so we could also conclude from the
assumption that there is a logic for P that this class contains complete problems. How-
ever, the latter prospect is not so unlikely. P certainly contains complete problems
under polynomial-time reductions and indeed under much weaker reductions such as
logarithmic-space or AC0 reductions. I was able to show in [5] that the existence of a
logic for P implies (and, indeed, is equivalent to) the existence in P of complete prob-
lems under reductions that are themselves syntactically isomorphism-invariant.
To make this precise, let us consider logical interpretations. Suppose we are given
two relational signatures σ and τ and a logic L. An m-ary L-interpretation of τ in σ is
a sequence of formulas of L in the signature σ consisting of:
– a formula υ(x);
– a formula η(x, y);
– for each relation symbol R in τ of arity a, a formula ρR(x1, . . . , xa); and
– for each constant symbol c in τ, a formula γc(x),
where each x, y or xi is an m-tuple of free variables. We call m the width of the inter-
pretation. We say that an interpretation Φ associates a τ -structure B to a σ-structure A,
if there is a surjective map h from the m-tuples {a ∈ Am | A |= υ[a]} to B such that:
FO-reductions while for L, like for P, the existence of a logic capturing it remains an
open question. On the other hand, there is also nothing special about the choice of FO
in Theorem 3. The important fact is that FO-reductions are themselves syntactically de-
fined, and so can be enumerated, and they are isomorphism-invariant. We could replace
FO in the theorem by virtually any logic that has an effective syntax, is isomorphism-
invariant (i.e. it is a logic in the sense defined in the introduction) and is contained in
P. The important fact is that the semantic condition of isomorphism-invariance that is
implied in the recursive enumeration of P is captured in the reductions themselves and
we obtain the usual construction of a complete problem from the syntactic presentation
of a class.
The complexity class NP ∩ co-NP does not meet the definition of a bounded class
as given in [5], but with a recursive presentation in the sense of Conjecture 2, it is
possible to carry through a construction analogous to the proof of Theorem 3 to obtain
the following.
Theorem 4. There is a logic capturing NP∩co-NP if, and only if, NP∩co-NP contains
a complete problem under FO-reductions.
This strengthens Theorem 1.17 of [7] by replacing polynomial-time reductions by FO-
reductions and, at the same time, providing a converse.
3 Relativization Barriers
Baker, Gill and Solovay [2] proved that there is an oracle A such that the complexity
classes PA and NPA are different and there is another oracle B so that the classes PB
and NPB are equal. This result forms what is called the relativization barrier to the
resolution of the question of whether or not P = NP. That is to say, it demonstrates
that any resolution of the question must use methods that do not relativize to machines
with oracles. In particular, methods based on diagonalization will not suffice. Since this
seminal result, methods have been found that circumvent this barrier in some cases (for
instance the celebrated result that IP = PSpace [13]) and other barriers have been
observed to the resolution of the relationship between P and NP (see [1]). The question
we address here is whether Conjectures 1 and 2 face similar barriers.
As noted in Sect. 1, Gurevich proved that if there is a logic for NP ∩ co-NP
then this class contains a complete problem with respect to polynomial-time reduc-
tions. Moreover Sipser [14] (see also [8]) showed that there are oracles with respect to
which this is not the case. On the other hand, it is easy to show that there is an ora-
cle A for which there is a logic capturing NP ∩ co-NP. To be precise, take A to be a
PSpace-complete problem such as satisfiability of quantified Boolean formulas. Then
NPA = co-NPA = PSpace. Moreover, it is known that there is a logic capturing
PSpace (see, for example [12]) and the result follows. This implies that the question
of whether there is a logic capturing NP ∩ co-NP is subject to the relativization barrier.
A resolution either way would require non-relativizing methods.
How about the question of whether there is a logic for P? Once again, it is easy to
construct an oracle with respect to which there is such a logic. Indeed, taking A once
again to be a PSpace-complete problem, we see that PA = PSpace and therefore
206 A. Dawar
there is a logic for PA . Can we also construct an oracle A so that there is no logic
capturing PA? Or, equivalently, an oracle A so that PA does not contain any complete
problems with respect to FO-reductions? We show next that to construct such an oracle,
we would have to separate P from NP.
Theorem 5. If P = NP then for every oracle A, there is a logic capturing PA .
Proof. A graph canonization function is a function c on strings that encode graphs with
the property that for any graph G, c(G) is the encoding of a graph isomorphic to G
and if G and G′ are isomorphic graphs then c(G) = c(G′). It is known that there are
graph canonization functions (such as the function that given a graph G yields the lex-
icographically minimal string representing a graph isomorphic to G) in the polynomial
hierarchy (see [3]). It follows that if P = NP then there is a graph canonization function
c that is computable in polynomial time. For any oracle machine M and polynomial p,
define CM,p to be the machine that takes an input x, computes c(x) and then simulates
M for p(n) steps on input c(x), where n is the length of c(x).
It is easy to see that the language accepted by CM,p is invariant under isomorphisms.
Moreover, if M with oracle A and running within bounds p accepts a class of graphs
K invariant under isomorphisms, then the language accepted by CM,p with oracle A
is exactly the strings encoding graphs in K. Thus, the collection of all machines CM,p
with oracle A is a recursive enumeration of PA .
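The wrapper CM,p in the proof can be sketched in Python. This is a minimal sketch: the brute-force canonizer shown runs in exponential time and merely stands in for the polynomial-time canonization function c whose existence follows from P = NP, and `machine` is a hypothetical Boolean function standing in for an oracle Turing machine.

```python
from itertools import permutations

def canonize(adj):
    """Graph canonization by brute force: return the lexicographically minimal
    adjacency-matrix string over all vertex relabelings, so isomorphic graphs
    get equal canonical strings.  (Exponential time; Theorem 5 needs a
    polynomial-time canonizer, available under the assumption P = NP.)"""
    n = len(adj)
    def encoding(perm):
        return ''.join(str(adj[perm[i]][perm[j]])
                       for i in range(n) for j in range(n))
    return min(encoding(list(p)) for p in permutations(range(n)))

def C(machine, adj):
    """Sketch of the machine C_{M,p}: canonize the input, then run `machine`
    on the canonical form.  (A real C_{M,p} would also cut the simulation
    off after p(n) steps, n being the length of c(x).)"""
    return machine(canonize(adj))
```

Since isomorphic inputs reach `machine` in identical form, the language of C is isomorphism-invariant by construction, which is exactly what makes the collection of all such machines a recursive enumeration of PA.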
Thus, it would seem that relativization is not itself a barrier to refuting Conjecture 1.
Proving the conjecture, on the other hand, would separate P from NP and this is subject
to all the barriers that complexity theory is familiar with.
4 Concluding Remarks
We noted that the conditions defining complexity classes come in two flavours: syn-
tactic and semantic. Syntactic conditions are restrictions on the accepting machines
(or on their resource bounding functions) that can be recognized from the form of the
machines or the presentations of the functions themselves. Semantic conditions on the
other hand are typically undecidable properties of the machines. Complexity classes
that are defined by purely syntactic criteria such as L, P, NP and PSpace (on strings)
admit recursive enumerations and from these one can construct complete problems un-
der quite weak reductions. On the other hand, complexity classes that are defined by
semantic restrictions on the witnessing machines, such as NP ∩ co-NP and RP, do not
admit obvious recursive presentations or complete problems and to prove that they do
would require fundamental new characterizations of these classes. Indeed establishing
whether or not they have complete problems is subject to the relativization barrier in
complexity theory.
The study of complexity classes on (unordered) graphs imposes a new semantic con-
dition on machines, namely that of isomorphism invariance. However, some complex-
ity classes (such as NP and PSpace) still admit recursive presentations even under
this semantic restriction. This is because the semantic restriction can be enforced by an
externally imposed pre-processing step (such as a canonization function) that does not
take us out of the class. But, it remains an open question whether classes such as P or
L admit recursive presentations under this restriction.
Considering NP ∩ co-NP or RP on graphs, one sees that we are dealing with two
distinct semantic restrictions: one that is inherent to the definition of the class and the
second arising from isomorphism invariance. This, as it were, makes it doubly unlikely
that we could find logics that capture these complexity classes. Moreover, it is the first
of these semantic restrictions that means that the problem of the existence of such logics
runs up against the relativization barrier.
References
1. Aaronson, S., Wigderson, A.: Algebrization: a new barrier in complexity theory. In: Proc.
40th ACM Symp. on Theory of Computing, pp. 731–740 (2008)
2. Baker, T., Gill, J., Solovay, R.: Relativizations of the P =? NP question. SIAM Journal on
Computing 4(4), 431–442 (1975)
3. Blass, A., Gurevich, Y.: Equivalence relations, invariants, and normal forms. SIAM Journal
on Computing 13(4), 682–689 (1984)
4. Chandra, A., Harel, D.: Structure and complexity of relational queries. Journal of Computer
and System Sciences 25, 99–128 (1982)
5. Dawar, A.: Generalized quantifiers and logical reducibilities. Journal of Logic and Compu-
tation 5(2), 213–226 (1995)
6. Fortnow, L.: The role of relativization in complexity theory. Bulletin of the EATCS 52, 229–
243 (1994)
7. Gurevich, Y.: Logic and the challenge of computer science. In: Börger, E. (ed.) Current
Trends in Theoretical Computer Science, pp. 1–57. Computer Science Press, Rockville
(1988)
8. Hartmanis, J., Li, M., Yesha, Y.: Containment, separation, complete sets, and immunity of
complexity classes. In: Kott, L. (ed.) ICALP 1986. LNCS, vol. 226, pp. 136–145. Springer,
Heidelberg (1986)
9. Kowalczyk, W.: Some connections between representability of complexity classes and the
power of formal systems of reasoning. In: Proc. 11th Intl. Symp. Mathematical Foundations
of Computer Science, pp. 364–369 (1984)
10. Landweber, L.H., Lipton, R.J., Robertson, E.L.: On the structure of sets in NP and other
complexity classes. Theor. Comput. Sci. 15, 181–200 (1981)
11. Nash, A., Remmel, J.B., Vianu, V.: PTIME queries revisited. In: Eiter, T., Libkin, L. (eds.)
ICDT 2005. LNCS, vol. 3363, pp. 274–288. Springer, Heidelberg (2004)
12. Richerby, D.: Logical characterizations of PSPACE. In: Computer Science Logic: Proc. 13th
Conf. of the EACSL, pp. 370–384 (2004)
13. Shamir, A.: IP = PSpace. Journal of the ACM 39(4), 869–877 (1992)
14. Sipser, M.: On relativization and the existence of complete sets. In: Nielsen, M., Schmidt,
E.M. (eds.) ICALP 1982. LNCS, vol. 140, pp. 523–531. Springer, Heidelberg (1982)
Effective Closed Subshifts in 1D Can Be
Implemented in 2D
It is with great pleasure that we have written this article for our friend and
colleague Yuri Gurevich. Indeed, the interest he has taken in our work for
many years has sustained us, his remarks have enlightened us, and his
questions have given us glimpses of new perspectives.
1 Introduction
Let A be a finite set (alphabet); its elements are called letters. By A-configuration
we mean a mapping C : Z2 → A. In geometric terms: a cell with coordinates (i, j)
contains letter C(i, j).
A local rule is defined by a positive integer M and a list of prohibited (M × M)-
patterns (M × M squares filled by letters). A configuration C satisfies a local
rule R if none of the patterns listed in R appears in C.
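A local rule is thus a purely finite object, and checking it on a finite patch is a direct scan of all M × M windows. A minimal Python sketch (the list-of-strings representation of patches and patterns is an assumption of this example):

```python
def satisfies_patch(patch, prohibited, M):
    """Check a finite rectangular patch against a local rule.  A patch is a
    list of equal-length strings over the alphabet; a prohibited pattern is a
    tuple of M strings of length M.  The patch passes iff no M x M window of
    it equals a prohibited pattern."""
    rows, cols = len(patch), len(patch[0])
    windows = {tuple(patch[i + d][j:j + M] for d in range(M))
               for i in range(rows - M + 1)
               for j in range(cols - M + 1)}
    return windows.isdisjoint({tuple(p) for p in prohibited})
```

A configuration satisfies the rule if and only if every finite patch of it passes this check.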
Let A and B be two alphabets and let π : A → B be any mapping. Then
every A-configuration can be transformed into a B-configuration (its homomor-
phic image) by applying π to each letter. Assume that the local rule R for
A-configurations and mapping π are chosen in such a way that local rule R prohibits
patterns where letters a and a′ with π(a) ≠ π(a′) are vertical neighbors.
This guarantees that every A-configuration that satisfies R has an image where
vertically aligned B-letters are the same. Then for each B-configuration in the
image every vertical line carries one single letter of B. So we can say that π maps
Partially supported by NAFIT ANR-09-EMER-008-01 and RFBR 09-01-00709a
grants.
On leave from IITP RAS, Moscow.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 208–226, 2010.
© Springer-Verlag Berlin Heidelberg 2010
The first part of the statement is easy. The set L(A, B, R, π) is evidently shift
invariant; it remains to show that it is effectively closed. The set of all A-
configurations that satisfy R is a closed subset of a compact space and therefore
is a compact space itself. The mapping of A-configurations into B-configurations
is continuous. Therefore the set L(A, B, R, π) is compact (as a continuous image
of a compact set). This argument can be effectivized in a standard way. A B-
string is declared bad if it cannot appear in the π-image of any A-configuration
that satisfies R. The set of all bad strings is enumerable and L(A, B, R, π) is the
set of all bi-infinite sequences that have no bad factors.
The reverse implication is more difficult and is the main subject of this paper.
It cannot be proven easily since it implies the classical result of Berger [2]: the
existence of a local rule that makes all configurations aperiodic. Indeed, it is
easy to construct an effectively closed subshift S that has no periodic points; if
it is represented as L(A, B, R, π), then local rule R has no periodic configura-
tions (configurations that have two independent period vectors); indeed, those
configurations have a horizontal period vector.
So it is natural to expect a proof of Theorem 1 to be obtained by modifying
one of the existing constructions of an aperiodic local rule. It is indeed the
case: we use the fixed-point construction described in [7]. We do not repeat this
construction (assuming that the reader is familiar with that paper or has it at
hand) and explain only the modifications that are needed in our case. This is
done in sections 2–6; in the rest of this section we survey some other steps in
the same direction.
M. Hochman [13] proved a similar result for 3D implementations of 1D sub-
shifts (and, in general, (k + 2)-dimensional implementation of k-dimensional
subshifts) and asked whether a stronger statement is true where 3D is replaced
by 2D.
210 B. Durand, A. Romashchenko, and A. Shen
computations in the second layer are fed with the data from the first layer and
check that the first layer does not contain any forbidden string.
Indeed, the macro-tiles (at every level) in our construction contain some com-
putation used to guarantee their behavior as building blocks for the next level.
Could we run this computation in parallel with some other one that enumerates
bad patterns and terminates the computation (creating a violation of the rules)
if a bad pattern appears?
This idea immediately faces evident problems:
– The computation performed in macro-tiles (in [7]) was limited in time and
space (and we need unlimited computations since we have infinitely many
forbidden substrings and no limit on the computational resources used to
enumerate them).
– Computations on high levels do not have access to the bit sequence they need
to check: the bits that go through these macro-tiles are “deep in the
sub-conscious”, since macro-tiles operate on the level of their sons (cells of the
computation that are macro-tiles of the previous level), not individual bits.
– Even if every macro-tile checks all the bits that go through it (in some
mysterious way), a “degenerate case” could happen where an infinite vertical
line is not crossed by any macro-tile. Imagine a tile that is the left-most son
of a father macro-tile which in its turn is the left-most son of its father and so
on (see Fig. 2). They fill the right half-plane; the left half-plane is filled in
a symmetric way, and the vertical dividing line between them is not crossed
by any tile. Then, if each macro-tile takes care of forbidden substrings inside
its zone (bits that cross this macro-tile), some substrings (that cross the
dividing line) remain unchecked.
These problems are discussed in the following sections one after another; we
apologize that their description here is rather informal and vague, and hope
that they will become clearer once their solutions are discussed.
In our previous construction the macro-tiles of all levels were of the same size:
each of them contained N ×N macro-tiles of the previous level for some constant
zoom factor N . Now it is not enough any more, since we need to host arbitrarily
long computations in high-level macro-tiles. So we need an increasing sequence
of zoom factors N0 , N1 , N2 , . . .; macro-tiles of the first level are blocks of N0 ×N0
tiles; macro-tiles of the second level are blocks of N1 × N1 macro-tiles of level
1 (and have size of N0 N1 × N0 N1 if measured in individual tiles). In general,
macro-tiles of level k are made of Nk−1 × Nk−1 macro-tiles of level k − 1 and
have side N0 N1 . . . Nk−1 measured in individual tiles.
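The arithmetic of these sizes can be sketched in Python (`math.prod` of an empty prefix is 1, matching the convention that a level-0 macro-tile is a single tile; the sequence N of zoom factors is a placeholder):

```python
from math import prod

def side(N, k):
    """Side of a level-k macro-tile measured in individual tiles:
    L_k = N_0 * N_1 * ... * N_{k-1} for a sequence N of zoom factors
    (a level-(k+1) macro-tile is an N_k x N_k block of level-k ones)."""
    return prod(N[:k])
```

For example, with zoom factors N = [4, 6, 8], a level-3 macro-tile has side 4 · 6 · 8 = 192 individual tiles.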
However, all the macro-tiles (of different levels) carry the same program in
their computation zone. The difference between their behavior is caused by the
data: each macro-tile “knows” its level (consciously, as a sequence of bits on its
tape). Then this level k may be used to compute Nk which is then used as a
modulus for coordinates in the father macro-tile. (Such a coordinate is a number
between 0 and Nk − 1, and the addition is performed modulo Nk .)
Of course, we need to ensure that this information is correct. Two properties
are required: (1) all macro-tiles of the same level have the same idea about their
level, and (2) these ideas are consistent between levels (each father is one level
higher than its sons). The first is easy to achieve: the level should be a part
of the side macro-color and should match in neighbor tiles. (In fact an explicit
check that brothers have the same idea about their levels is not really needed,
since the first property follows from the second one: since all tiles on the level
zero “know” their level correctly, by induction we conclude that macro-tiles of
all levels have correct information about their levels.)
To achieve the second property (consistency between level information con-
sciously known to a father and its sons) is also easy, though we need some
construction. It goes as follows: each macro-tile knows its place in the father,
so it knows whether the father should keep some bits of his level information
in that macro-tile. If yes, the macro-tile checks that this information is correct.
Each macro-tile checks only one bit of the level information, but with brothers’
help they check all the bits.1
There is one more thing we need to take care of: the level information should
fit into the tiles (and the computation needed to compute Nk knowing k should
also fit into level k tile). This means that log k, log Nk and the time needed to
compute Nk from k should be much less than Nk−1 (since the computation zone
is some fraction of Nk−1). So Nk should not grow too slowly (say, Nk = log k is
too slow), should not grow too fast (say, Nk = 2^(Nk−1) is too fast) and should not
be too difficult to compute. However, these restrictions still leave a lot of room
for us: e.g., Nk can be proportional to √k, to k, to 2^k, to 2^(2^k), or to 2^(2^(2^k))
(any fixed height of the tower is OK). Recall that the computation deals with binary
encodings of k and Nk and normally is polynomial in their lengths.
In this way we are now able to embed computations of increasing sizes into
the macro-tiles. Now we have to explain which data these computations would
get and how the communication between levels is organized.
the program. At the same time each cell of yours is in fact a son macro-tile,
and the elementary operations of this cell (the relation between signals on its sides)
are in fact performed by a lower-level computation. But this computation is your
“sub-conscious”, you do not have direct access to its data, though the correct
functioning of the cells of your brain is guaranteed by the programs running in
your sons.
Please do not take this metaphor too seriously, and keep in mind that the time
axis of the computations is just a vertical axis on the plane; configurations are
static and do not change with time. However, it could be useful while thinking
about problems of inter-level communication.
Let us decide that for each macro-tile all the bits (of the bit sequence that
needs to be checked) that cross this macro-tile form its responsibility zone. More-
over, one of the bits of this zone may be delegated to the macro-tile, and in this
case the macro-tile consciously knows this bit (is responsible for this bit). The
choice of this bit depends on the vertical position of the macro-tile in its father.
More technically, recall that a macro-tile of level k is a square whose side is
Lk = N0 · N1 · . . . · Nk−1 , so there are Lk bits of the sequence that intersect this
macro-tile. We delegate each of these bits to one of the macro-tiles it intersects.
Note that every macro-tile of the next level is made of Nk × Nk macro-tiles of
level k. We assume that Nk is much bigger than Lk (more about choice of Nk
later); this guarantees that there are enough macro-tiles of level k (in the next
level macro-tile) to serve all bits that intersect them. Let us decide that the ith
macro-tile of level k (from bottom to top) in a (k + 1)-level macro-tile knows
the ith bit (from the left) in its zone. Since Nk is greater than Lk, we leave some
unused space in each macro-tile of level k + 1: many macro-tiles of level k are
not responsible for any bit, but this does not create any problems.
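Under the assumption Nk ≥ Lk, the delegation rule just described assigns the bits crossing a level-(k+1) macro-tile to its sons injectively. A sketch (the function name and the (column, row) convention are illustrative):

```python
def host_of_bit(b, Lk, Nk):
    """Delegation inside one level-(k+1) macro-tile, which is crossed by
    Lk * Nk bits and consists of Nk columns of Nk level-k sons.  Bit b lies
    over column number b // Lk of sons; within that column it is bit number
    b % Lk of the common responsibility zone, so it is delegated to the son
    at vertical position b % Lk (counting from the bottom).  Requires
    Nk >= Lk so that every bit finds a host."""
    assert Nk >= Lk, "zoom factor must dominate the macro-tile side"
    return (b // Lk, b % Lk)
```

With, say, Lk = 3 and Nk = 5, all 15 crossing bits land on distinct sons, and sons at vertical positions 3 and 4 of each column carry no bit, which is the unused space mentioned above.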
This is our plan; however, we need a mechanism that ensures that the dele-
gated bits are indeed represented correctly (are equal to the corresponding bits
“on the ground”, in the sequence that forms the first level of our construction).
This is done in the hierarchical way: since every bit is delegated to macro-tiles
Fig. 1. Bit delegation: bits assigned to vertical lines are distributed among the Nk
k-level macro-tiles (of size Lk × Lk each) according to their positions in the father
macro-tile of level k + 1
of all levels, it is enough to ensure that the ideas about bit values are consistent
between father and son.
For this hierarchical check let us agree that every macro-tile not only knows
its own delegated bit (or the fact that there is no delegated bit), but also knows
the bit delegated to its father (if it exists) as well as father’s coordinates (in the
grandfather macro-tile). This is still an acceptable amount of information (for
keeping father’s coordinates we need to ensure that log Nk+1, the size of father’s
coordinates, is much smaller than Nk−1). To make this information consistent, we
ensure that
– the data about the father’s coordinates and bits are the same among
brothers;
– if a macro-tile has the same delegated bit as its father (this fact can be
checked since a macro-tile knows its coordinates in the father and father’s
coordinates in the grandfather), these two bits coincide;
– if a macro-tile is in the place where its father keeps its delegated bit, the
actual father’s information is consistent with the information about what
the father should have.
We also reserve some time and space to check that none of the patterns that appeared
during the enumeration is a substring of the bit group under consideration.
This is not a serious time/space overhead since substring search in the given bit
group can be performed rather fast, and the size of the bit group and the number
of enumeration steps are chosen small enough (having in mind this overhead).
Then in the limit any violation inside some macro-tile will be discovered (and
only the degenerate-case problem remains: substrings that are not covered entirely
by any tile). The degenerate-case problem is considered in the next section; in
this section it remains to explain how the groups of (neighbor) bits are made
available to the computation and how they are assigned to macro-tiles.
Let us consider an infinite vertical stripe of macro-tiles of level k that share
the same Lk = N0 · . . . · Nk−1 columns. Together, these macro-tiles keep in their
memory all Lk bits of their common zone of responsibility. Each of them performs
a check for a small bit group (of length lk , which increases extremely slowly with
k and in particular is much less than Nk−1 ). We need to distribute somehow
these groups among macro-tiles of this infinite stripe.
It can be done in many different ways. For example, let us agree that the
starting point of the bit group checked by a macro-tile is the vertical coordinate of
this macro-tile in its father (if it is not too big; recall that Nk ≫ N0 N1 · . . . · Nk−1).
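This assignment already covers every group: the starting points that must occur are 0 through Lk − lk, while the available vertical coordinates run through Nk − 1, and Nk is much larger than Lk. A small Python check of this counting argument (names are illustrative):

```python
def checked_starts(Lk, Nk, lk):
    """Starting points of the bit groups checked within one vertical stripe of
    level-k macro-tiles: the macro-tile at vertical coordinate i in its father
    checks the lk-bit group starting at position i of the common responsibility
    zone, whenever that group fits inside the zone (i + lk <= Lk)."""
    return {i for i in range((Nk)) if i + lk <= Lk}
```

As soon as Nk ≥ Lk, every admissible start position occurs, so every length-lk substring of the zone is checked by some macro-tile of the stripe.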
It remains to explain how groups of (neighbor) bits are made available to the
computational zones of the corresponding macro-tiles.
We do it in the same way as for delegated bits; the difference (and simplifica-
tion) is that now we may use only two levels of hierarchy since all the bits are
available in the previous level (and not only in the “deep unconscious”, at the
ground level). We require that this group and the coordinate that determines
its position are again known to all the sons of the macro-tile where the group
is checked. Then the sons should ensure that (1) this information is consistent
between brothers; (2) it is consistent with delegated bits where delegated bits
are in the group, and (3) it is consistent with the information in the macro-tile
(father of these brothers) itself. Since lk is small, this is a small amount of in-
formation so there is no problem of its distribution between macro-tiles of the
preceding level.
If a forbidden pattern belongs to the zone of responsibility of macro-tiles of
arbitrarily high level, then this violation will be discovered inside a macro-tile
of some level, so the tiling of the plane cannot exist. Only the degenerate
case problem remains: so far we cannot catch forbidden substrings that are not
covered entirely by any macro-tile. We deal with the degenerate case problem in
the next section.
The problem we need to deal with: it can happen that one vertical line is not
crossed by any macro-tile of any level (see Fig. 2). In this case some substrings
are not covered entirely by any macro-tile, and we do not check them. After the
problem is realized, the solution is not difficult. We let every macro-tile check
bit groups in its extended responsibility zone that is three times wider and covers
not only the macro-tile itself but also its left and right neighbors.
Now a macro-tile of level k is given a small bit group which is a substring of
its extended responsibility zone (the width of the extended responsibility zone is
3Lk; it is composed of the zones of responsibility of the macro-tile itself and its
two neighbors). Respectively, a macro-tile of level (k − 1) keeps the information
about three groups of bits instead of one: for its father, left uncle, and right
uncle. This information should be consistent between brothers (since they have
the same father and uncles). Moreover, it should be checked across the boundary
between macro-tiles: if two macro-tiles A and B are neighbors but have different
fathers (B’s father is A’s right uncle and A’s father is B’s left uncle), then they
should compare the information they have (about bit groups checked by fathers
of A and B) and ensure it is consistent. For this we need to increase the amount
of information kept in a macro-tile by a constant factor (a macro-tile keeps three
bit groups instead of one, etc.), but this is still acceptable.
It is easy to see that now even in the degenerate case every substring is entirely
in the extended responsibility zone of arbitrarily large macro-tiles, so all the forbidden
patterns are checked everywhere.
7 Final Adjustments
We have finished our argument, but we were quite vague about the exact values of
parameters, saying only that some quantities should be much less than others. Now
we need to check again the entire construction and see that the relations between
parameters that were needed at different steps could be fulfilled together.
Let us recall the parameters used at several steps of the construction: macro-
tiles of level k + 1 consist of Nk × Nk macro-tiles of level k; thus, a k-level macro-tile
consists of Lk × Lk tiles (of level 0), where Lk = N0 · . . . · Nk−1 . Macro-tiles of
level k are responsible for checking bit blocks of length lk from their extended
responsibility zone (of width 3Lk ). We have several constraints on the values of
these parameters:
– log Nk+1 ≪ Nk and even log Nk+2 ≪ Nk, since every macro-tile must be able
to do simple arithmetic manipulations with its own coordinates in the father
and with coordinates of the father in the grandfather;
– Nk ≫ Lk since we need enough sons of a macro-tile of level k + 1 to keep all
bits from its zone of responsibility (we use one macro-tile of level k for each
bit);
– lk and even lk+1 should be much less than Nk−1 since a macro-tile of level
k must contain in its computational zone the bit block of length lk assigned
to itself and three bit blocks of length lk+1 assigned to its father and two
uncles (the left and right neighbors of the father);
– a k-level macro-tile should enumerate in its computational zone several for-
bidden patterns and check whether any of them is a substring of the given
(assigned to this macro-tile) lk -bits block; the number of steps in this enu-
meration must be small compared to the size of the macro-tile; for example,
let us agree that a macro-tile of level k runs this enumeration for exactly lk
steps;
– the values Nk and lk should be simple functions of k: we want to compute
lk in time polynomial in k, and compute Nk in time polynomial in log Nk
(note that typically Nk is much greater than k, so we cannot compute or
even write down its binary representation in time polynomial in k).
With all these constraints we are still quite free in the choice of parameters. For
example, we may let Nk = 2^(C·2^k) (for some large enough constant C) and lk = k.
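These choices can be sanity-checked numerically. The following Python sketch verifies the quantitative constraints for the suggested zoom factors with the arbitrary test value C = 3 and lk = k, reading each "much less than" simply as a strict inequality:

```python
from math import prod

C = 3                                  # an arbitrary large-enough constant
def N(k): return 2 ** (C * 2 ** k)     # zoom factor of level k
def l(k): return k                     # length of the checked bit group

for k in range(1, 8):
    Lk = prod(N(j) for j in range(k))     # side of a level-k macro-tile
    assert N(k + 2).bit_length() < N(k)   # log N_{k+2} << N_k
    assert N(k) > Lk                      # N_k >> L_k
    assert l(k + 1) < N(k - 1)            # l_{k+1} << N_{k-1}
```

The margins are in fact enormous: N(k) is doubly exponential in k, while the left-hand sides grow only singly exponentially or slower.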
8 Final Remarks
One may also use essentially the same construction to implement k-dimensional
effectively closed subshifts using (k + 1)-dimensional subshifts of finite type.
How much further can we go? Can we implement every k-dimensional effectively
closed subshift by a tiling of the same dimension k? Another question (posed
in [13]): let us replace a finite alphabet by a Cantor space (with the standard
topology); can we represent every k-dimensional effectively closed subshift over
a Cantor space as a continuous image of the set of tilings of dimension k + 1 (for
some finite tile set)? E. Jeandel noticed that the answers to both questions
are negative (this fact is also a corollary of results from [4] and [16]).
218 B. Durand, A. Romashchenko, and A. Shen
References
1. Aubrun, N., Sablik, M.: personal communication (February 2010) (submitted for
publication)
2. Berger, R.: The Undecidability of the Domino Problem. Mem. Amer. Math. Soc. 66
(1966)
3. Börger, E., Grädel, E., Gurevich, Y.: The Classical Decision Problem. Springer,
Heidelberg (1997)
4. Durand, B., Levin, L., Shen, A.: Complex Tilings. J. Symbolic Logic 73(2), 593–613
(2008)
5. Durand, B., Levin, L., Shen, A.: Local Rules and Global Order, or Aperiodic
Tilings. Mathematical Intelligencer 27(1), 64–68 (2005)
6. Durand, B., Romashchenko, A., Shen, A.: Fixed Point and Aperiodic Tilings. In:
Ito, M., Toyama, M. (eds.) DLT 2008. LNCS, vol. 5257, pp. 276–288. Springer,
Heidelberg (2008)
7. Durand, B., Romashchenko, A., Shen, A.: Fixed point theorem and aperiodic
tilings (The Logic in Computer Science Column by Yuri Gurevich). Bulletin of
the EATCS 97, 126–136 (2009)
8. Durand, B., Romashchenko, A., Shen, A.: Fixed-point tile sets and their applica-
tions. CoRR abs/0910.2415 (2009), http://arxiv.org/abs/0910.2415
9. Gács, P.: Reliable Computation with Cellular Automata. J. Comput. Syst.
Sci. 32(1), 15–78 (1986)
10. Gács, P.: Reliable Cellular Automata with Self-Organization. J. Stat.
Phys. 103(1/2), 45–267 (2001)
11. Grünbaum, B., Shephard, G.C.: Tilings and Patterns. W.H. Freeman & Co., New
York (1986)
12. Gurevich, Y., Koryakov, I.: A remark on Berger’s paper on the domino problem.
Siberian Mathematical Journal 13, 319–321 (1972)
13. Hochman, M.: On the dynamics and recursive properties of multidimensional sym-
bolic systems. Inventiones mathematicae 176, 131–167 (2009)
14. Mozes, S.: Tilings, Substitution Systems and Dynamical Systems Generated by
Them. J. Analyse Math. 53, 139–186 (1989)
15. von Neumann, J.: Theory of Self-reproducing Automata. In: Burks, A. (ed.). Uni-
versity of Illinois Press, Urbana (1966)
16. Rumyantsev, A., Ushakov, M.: Forbidden Substrings, Kolmogorov Complexity and
Almost Periodic Sequences. In: Durand, B., Thomas, W. (eds.) STACS 2006.
LNCS, vol. 3884, pp. 396–407. Springer, Heidelberg (2006)
17. Wang, H.: Proving theorems by pattern recognition II. Bell System Technical Jour-
nal 40, 1–42 (1961)
was first asked: all the tools were quite standard and widely used at that time.
However, history took a different path, and many nice geometric ad hoc
constructions were developed instead (by Berger, Robinson, Penrose, Ammann,
and many others, see [11]; a popular exposition of a Robinson-style construction
is given in [5]). In this note we try to correct this error and present a
construction that should have been discovered first but seems to have gone
unnoticed for more than forty years.
[Figure: a fragment of a periodic tiling with two alternating colors 1 and 2]
A.3 Self-similarity
The main idea of this more sophisticated approach is to construct a “self-similar”
set of tiles. Informally speaking, this means that any tiling can be uniquely split
Effective Closed Subshifts in 1D Can Be Implemented in 2D 221
by vertical and horizontal lines into M × M blocks that behave exactly like the
individual tiles. Then, if we see a tiling and zoom out with scale 1 : M , we get
a tiling with the same tile set.
Let us give a formal definition. Assume that a non-empty set of tiles τ and
positive integer M > 1 are fixed. A macro-tile is a square of size M × M filled
with matching tiles from τ . Let ρ be a non-empty set of macro-tiles.
Definition. We say that τ implements ρ if any τ -tiling can be uniquely split by
horizontal and vertical lines into macro-tiles from ρ.
Now we give two examples that illustrate this definition: one negative and one
positive.
Negative example: Consider a set τ that consists of one tile with all white
sides. Then there is only one macro-tile (of given size M × M ). Let ρ be a one-
element set that consists of this macro-tile. Any τ -tiling (i.e., the only possible
τ -tiling) can be split into ρ-macro-tiles. However, the splitting lines are not
unique, so τ does not implement ρ.
Positive example: Let τ be a set of M² tiles that are indexed by pairs of integers
modulo M : the colors are pairs of integers modulo M arranged as shown in Fig. 4.
Then there exists only one τ -tiling (up to translations), and this tiling can be
uniquely split into M × M squares whose borders have colors (0, j) and (i, 0).
Therefore, τ implements a set ρ that consists of one macro-tile (Fig. 5).
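This positive example is small enough to check by direct computation. A sketch (in Python; the dictionary encoding of side colors and the function names are ours, not the paper's notation), following the arrangement of colors in Fig. 4:

```python
# The "positive example": M^2 tiles indexed by pairs (i, j) of integers
# modulo M, with side colors as in Fig. 4: left and bottom carry (i, j),
# right carries (i+1, j), top carries (i, j+1).  (Encoding is ours.)
M = 4

def tile(i, j):
    return {"left": (i, j), "bottom": (i, j),
            "right": ((i + 1) % M, j), "top": (i, (j + 1) % M)}

def match_h(t1, t2):               # t2 placed to the right of t1
    return t1["right"] == t2["left"]

def match_v(t1, t2):               # t2 placed on top of t1
    return t1["top"] == t2["bottom"]

# The only tiling (up to translation) puts tile (x mod M, y mod M) at
# position (x, y); every adjacent pair matches, and a 0 coordinate appears
# on a border exactly on the lines x = 0 or y = 0 (mod M), which is why
# the splitting into M x M macro-tiles is unique.
for x in range(3 * M):
    for y in range(3 * M):
        t = tile(x % M, y % M)
        assert match_h(t, tile((x + 1) % M, y % M))
        assert match_v(t, tile(x % M, (y + 1) % M))
```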
Definition. A set of tiles τ is self-similar if it implements some set of macro-tiles
ρ that is isomorphic to τ .
[Fig. 4 here: a tile with side colors (i, j) on the left and bottom, (i + 1, j) on the right, and (i, j + 1) on top]
Fig. 5. The only element of ρ: border colors are pairs that contain 0
This means that there exists a one-to-one correspondence between τ and ρ such that
matching pairs of τ -tiles correspond exactly to matching pairs of ρ-macro-tiles.
The following statement follows directly from the definition:
Proposition. A self-similar tile set τ has only aperiodic tilings.
Proof. Let T be a period of some τ -tiling U . By definition, U can be uniquely
split into ρ-macro-tiles. The shift by T must respect this splitting (otherwise we
would get a different splitting), so T is a multiple of M . Zooming out, i.e., replacing
each ρ-macro-tile by the corresponding τ -tile, we get a τ -tiling with period T /M .
For the same reason T /M must be a multiple of M ; then we zoom out again, etc.
We conclude that T is a multiple of M^k for every k, i.e., T is the zero
vector.
Note also that any self-similar tile set τ admits at least one tiling. Indeed, by definition
we can tile an M × M square (since macro-tiles exist). Replacing each τ -tile by
the corresponding macro-tile, we get a τ -tiling of an M² × M² square, etc. In this
way we can tile arbitrarily large finite regions, and then a standard compactness
argument (König’s lemma) shows that we can tile the entire plane.
So it remains to construct a self-similar set of tiles (a set of tiles that imple-
ments itself, up to an isomorphism).
The construction of a self-similar tile set is done in two steps. First (in
Section A.5) we explain how to construct (for a given tile set σ) another tile
set τ that implements σ (i.e., implements a set of macro-tiles isomorphic to σ).
In this construction the tile set σ is given as a program pσ that checks whether
four bit strings (representing four side colors) appear in one σ-tile. The tile set τ
then guarantees that each macro-tile encodes a computation where pσ is applied
to these four strings (“macro-colors”) and accepts them.
This gives us a mapping: for every σ we have τ = τ (σ) that implements
σ and depends on σ. Now we need a fixed point of this mapping where τ (σ)
is isomorphic to σ. This is done (Section A.6) by a classical self-referential trick
that appeared as the liar’s paradox, Cantor’s diagonal argument, Russell’s paradox,
Gödel’s (first) incompleteness theorem, Tarski’s theorem, undecidability of the
Halting problem, Kleene’s fixed point (recursion) theorem and von Neumann’s
construction of self-reproducing automata — in all these cases the core argument
is essentially the same.
The same trick is used in a classical programming challenge: write
a program that prints its own text. Of course, for every string s it is trivial
to write a program t(s) that prints s, but how do we get t(s) = s? It seems
at first that t(s) should incorporate the string s itself plus some overhead, so
how can t(s) be equal to s? However, this first impression is false. Imagine that
our computational device is a universal Turing machine U where the program
is written in a special read-only layer of the tape. (This means that the tape
alphabet is a Cartesian product of two components, and one of the components
is used for the program and is never changed by U .) Then the program can get
access to its own text at any moment, and, in particular, can copy it to the
output tape.2 Now we explain in more detail how to get a self-similar tile set
according to this scheme.
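In an ordinary programming language, without such a read-only program layer, the same effect is achieved by the classical quine construction: the program carries a template of itself and prints the template applied to its own text. A minimal Python sketch:

```python
# A self-printing program (quine): s is a template that, when formatted
# with its own text via %r, reproduces the full two-line source, since
# repr(s) shows the newline as \n and keeps the doubled %% intact.
s = 's = %r\nprint(s %% s)'
print(s % s)
```

Running this program prints exactly its own two-line source text.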
A.5 Implementing a Given Tile Set

In this section we show how one can implement a given tile set σ, or, better to
say, how to construct a tile set τ that implements some set of macro-tiles that
is isomorphic to σ.
There are easy ways to do this. Though we cannot let τ = σ (recall that the
zoom factor M should be greater than 1), we can do essentially the same for
every M > 1. Let us extend our “positive” example (with one macro-tile and
M² tiles) by superimposing additional colors. Superimposing two sets of colors
means that we consider the Cartesian product of the color sets (so each edge carries
a pair of colors). One set of colors remains the same (M² colors for M² pairs of
integers modulo M ). Let us describe additional (superimposed) colors. Internal
edges of each macro-tile should have the same color and this color should be
different for all macro-tiles, so we allocate #σ colors for that. This gives #σ
macro-tiles that can be put into 1-1-correspondence with σ-tiles. It remains to
provide correct border colors, and this is easy to do since each tile “knows”
which σ-tile it simulates (due to the internal color). In this way we get M² · #σ
tiles that implement the tile set σ with zoom factor M .
However, this (trivial) simulation is not really useful. Recall that our goal is
to get isomorphic σ and τ , and in this implementation τ -tiles have more colors
than σ-tiles (and we have more tiles, too). So we need a more creative encoding
of σ-colors that makes use of the space available: a side of a macro-tile has a
“macro-color” that is a sequence of M tile colors, and we can have a lot of
macro-colors in this way.
So let us assume that colors in σ are k-bit strings for some k. Then the tile
set is a subset S ⊂ B^k × B^k × B^k × B^k , i.e., a 4-ary predicate on the set B^k of
k-bit strings. Assume that S is presented by a program that computes Boolean
value S(x, y, z, w) given four k-bit strings x, y, z, w. Then we can construct a tile
set τ as follows.
We start again with the set of M² tiles from our example and superimpose
additional colors but use them in a more economical way. Assuming that k ≪ M ,
we allocate k places in the middle of each side of a macro-tile and allow each
of them to carry an additional color bit; then a macro-color represents a k-bit
2
Of course, this looks like cheating: we use some very special universal machine as an
interpreter of our programs, and this makes our task easy. Teachers of programming
who are seasoned enough may recall the BASIC program
10 LIST
that indeed prints its own text. However, this trick can be generalized to
show that a self-printing program exists in every language.
string. Then we need to arrange the internal colors in such a way that macro-
colors (k-bit strings) x, y, z and w can appear on the four sides of a macro-tile
if and only if S(x, y, z, w) is true.
To achieve this goal, let us agree that the middle part (of size, say, M/2×M/2)
in every M × M -macro-tile is a “computation zone”. Tiling rules (for superim-
posed colors) in this zone guarantee that it represents a time-space diagram of
a computation of some (fixed) universal Turing machine. (We assume that time
goes up in a vertical direction and the tape is horizontal.) It is convenient to
assume that program of this machine is written on a special read-only layer of
the tape (see the discussion in Section A.4).
Outside the computation zone the tiling rules guarantee that bits are trans-
mitted from the sides to the initial configuration of a computation.
We also require that this machine accept its input before running out
of time (i.e., in fewer than M/2 steps); otherwise the tiling is impossible.
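The reason such a time-space diagram can be enforced by tiling rules is its locality: the content of each cell is determined by the three cells directly above it. A toy Python sketch of this local rule (the machine, its encoding, and all names are our own illustration, not part of the construction):

```python
# Each cell is a tape symbol, or a pair (state, symbol) where the head is.
# Cell (t+1, x) depends only on cells (t, x-1), (t, x), (t, x+1): a local rule,
# exactly the kind of condition that matching tile colors can check.
# Toy machine: state "q" walks right over 1s and accepts on the first blank.
delta = {("q", "1"): ("q", "1", +1),     # (state, read) -> (state', write, move)
         ("q", "_"): ("acc", "_", 0)}

def step_cell(left, mid, right):
    if isinstance(mid, tuple):                 # head is here: write, then leave/stay
        st2, wr, mv = delta.get(mid, mid + (0,))
        return (st2, wr) if mv == 0 else wr
    if isinstance(left, tuple):                # head may arrive from the left
        st2, _, mv = delta.get(left, left + (0,))
        if mv == +1:
            return (st2, mid)
    if isinstance(right, tuple):               # ... or from the right
        st2, _, mv = delta.get(right, right + (0,))
        if mv == -1:
            return (st2, mid)
    return mid                                 # no head nearby: symbol unchanged

def step_row(row, blank="_"):
    return [step_cell(row[x - 1] if x > 0 else blank,
                      row[x],
                      row[x + 1] if x + 1 < len(row) else blank)
            for x in range(len(row))]

row = [("q", "1"), "1", "1", "_", "_"]         # initial configuration
for _ in range(6):
    row = step_row(row)
assert ("acc", "_") in row                     # the machine has accepted in time
```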
Note that in this description different parts of a macro-tile behave differently;
this is OK since we start from our example where each tile “knows” its position
in a macro-tile (it keeps two integers modulo M ). So the tiles in the “wire” zone
know that they should transmit a bit, the tiles inside the computation zone know
that they should obey the local rules for the time-space diagram of the computation, etc.
This construction uses only a bounded number of additional colors since we
have fixed the universal Turing machine (including its alphabet and number of
states); we do not need to increase the number of colors when we increase M
and k (though k should be small compared to M to leave enough space for the
wires; we do not give an exact position of the wires but it is easy to see that if
Fig. 6. k-macro-colors are transmitted to the computation zone where they are checked
k/M is small enough, there is enough space for them). So the construction uses
O(M²) colors (and tiles).
Now we come to the crucial point in our argument: can we arrange things in
such a way that the predicate S (i.e., the tile set it generates) is isomorphic to
the set of tiles τ used to implement it?
Assume that k = 2 log M + O(1); then macro-colors have enough space to
encode the coordinates modulo M plus superimposed colors (which require O(1)
bits for encoding).
Note that many of the rules that define τ do not depend on σ (i.e., on the
predicate S). So the program for the universal Turing machine may start by
checking these rules.
This guarantees that on the next layer macro-tiles are grouped into macro-
macro-tiles where bits are transmitted correctly to the computation zone of
a macro-macro-tile and some computation of the universal Turing machine is
performed in this zone. But we need more: this computation should be the same
computation that is performed on the macro-tile level (fixed point!). This is also
easy to achieve since in our model the text of a running program is available
to it (recall that we assume the program is written on a read-only layer):
the program should also check that if a macro-tile is in the computation zone,
then the program bit it carries is correct (the program knows the x-coordinate of the
macro-tile, so it can look at the corresponding place of its own tape to find out
which program bit resides there).
This sounds like magic, but we hope that our previous example (a program
for the UTM that prints its own text) makes this trick less magical (indeed,
reliable and reusable magic is called technology).
A.7 So What?
We believe that our proof is rather natural. If von Neumann had lived a few
years longer and been asked about aperiodic tile sets, he would probably have
given this argument immediately as a solution. (He was especially well prepared
for it, since he used very similar self-referential tricks to construct self-reproducing
automata; see [15].) In fact this proof did appear, though not very explicitly, in P. Gács’
papers on cellular automata [10]; the attempts to understand these papers were
our starting points.
This proof is rather flexible and can be adapted to get many results usually
associated with aperiodic tilings: undecidability of the domino problem (Berger [2]),
recursive inseparability of periodic tile sets and inconsistent tile sets (Gurevich–
Koryakov [12]), enforcing substitution rules (Mozes [14]), and others (see [3,6]).
But does it give something new?
We believe that there are indeed some applications that could hardly be
achieved by previous arguments. Let us conclude by mentioning two of them.
The first is the construction of robust aperiodic tile sets. We can consider tilings with
holes (where no tiles are placed and therefore no matching rules are checked). A
robust aperiodic tile set should have the following property: if the set of holes is
“sparse enough”, then the tiling should still be far from any periodic pattern (say, in
the sense of Besicovitch distance, i.e., the limsup of the fraction of mismatched
positions in a centered square as the size of the square goes to infinity). The no-
tion of “sparsity” should not be too restrictive here; we guarantee, for example,
that a Bernoulli random set with small enough probability p (each cell belongs
to a hole independently with probability p) is sparse.
While the first example (robust aperiodic tile sets) is rather technical (see [6]
for details), the second is more basic. Let us split all tiles in some tile set into
two classes, say, A- and B-tiles. Then we consider the fraction of A-tiles in a tiling.
If a tile set is not restrictive (allows many tilings), this fraction could vary from
one tiling to another. For classical aperiodic tilings this fraction is usually fixed:
in a big tiled region the fraction of A-tiles is close to some limit value, usually an
eigenvalue of an integer matrix (and therefore an algebraic number). The fixed-
point construction allows us to get any computable number. Here is the formal
statement: for any computable real α ∈ [0, 1] there exists a tile set τ divided into
A- and B-tiles such that for any ε > 0 there exists N such that for all n > N
the fraction of A-tiles in any τ -tiling of n × n-square is between α − ε and α + ε.
The Model Checking Problem for Prefix Classes of
Second-Order Logic: A Survey
Abstract. In this paper, we survey results related to the model checking problem
for second-order logic over classes of finite structures, including word structures
(strings), graphs, and trees, with a focus on prefix classes, that is, where all quan-
tifiers (both first- and second-order ones) are at the beginning of formulas. A
complete picture of the prefix classes defining regular and non-regular languages
over strings is known, which nearly completely coincides with the tractability
frontier; some complexity issues remain to be settled, though. Over graphs and
arbitrary relational structures, the tractability frontier is completely delineated for
the existential second-order fragment, while it is less explored for trees. Besides
surveying some of the results, we mention some open issues for research.
1 Introduction
Logicians and computer scientists have been studying for a long time the relationship
between fragments of predicate logic and the solvability and complexity of decision
problems that can be expressed within such fragments. Among the studied fragments,
quantifier prefix classes play a predominant role. This can be explained by the syntacti-
cal simplicity of such prefix classes and by the fact that they form a natural hierarchy of
increasingly complex fragments of logic that appears to be deeply related to core issues
of decidability and complexity. In fact, one of the most fruitful research programs that
kept logicians and computer scientists busy for decades was the exhaustive solution of
Hilbert’s classical Entscheidungsproblem, that is, of the problem of determining those
prefix classes of first-order logic for which formula-satisfiability (resp. finite satisfiabil-
ity of formulas) is decidable. A landmark reference (also sometimes called the “bible”)
Most of the material contained in this paper stems, modulo editorial adaptations, from the
much longer papers [15,16,26]. This paper significantly extends the earlier report [20].
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 227–250, 2010.
© Springer-Verlag Berlin Heidelberg 2010
228 T. Eiter, G. Gottlob, and T. Schwentick
on this subject is the book by Börger, Gurevich, and Grädel [6], which gives an in-depth
treatment of the subject.
Quantifier prefixes emerged not only in the context of decidability theory (a com-
mon branch of recursion theory and theoretical computer science), but also in core
areas of computer science such as formal language and automata theory, and later in
complexity theory. In automata theory, Büchi [9,8], Elgot [18] and Trakhtenbrot [51]
independently proved that a language is regular iff it can be described by a sentence of
monadic second-order logic, in particular, by a sentence of monadic existential second-
order logic. In complexity theory, Fagin [19] showed that a problem on finite structures
is in NP iff it can be described by a sentence of existential second-order logic (ESO).
These fundamental results have engendered a large number of further investigations and
results on characterizing language and complexity classes by fragments of logic (see,
for instance the monographs [45,42,13,32]).
While the classical research programme of determining the prefix characterizations
of decidable fragments of first-order logic was successfully completed around 1984
(cf. [6]), until recently little was known on analogous problems on finite structures, in
particular, on the tractability/intractability frontier of the model checking problem for
prefix classes of second-order logic (SO), and in particular, of existential second-order
logic (ESO) over finite structures. In the late nineties, a number of scientists, including
Yuri Gurevich, Phokion Kolaitis, and the authors started to attack this new research
programme in a systematic manner.
By complexity of a prefix class C we mean the complexity of the following model-
checking problem: Given a fixed sentence Φ in C, decide for variable finite structures
A whether A is a model of Φ, which we denote by A |= Φ. Determining the complex-
ity of all prefix classes is an ambitious research programme, in particular the analysis
of various types of finite structures such as strings, that is, finite word structures with
successor, trees, graphs, or arbitrary finite relational structures (corresponding to rela-
tional databases). Over strings and trees, one of the main goals of this classification is
to determine the regular prefix classes, that is, those whose formulas express regular
languages only; note that by the Büchi-Elgot-Trakhtenbrot Theorem, regular fragments
over strings are (semantically) included in monadic second-order logic.
In the context of this research programme, three systematic studies were carried
out recently that shed light on the prefix classes of the existential fragment ESO (also
denoted by Σ^1_1 ) of second-order logic:
– In [15], the ESO prefix-classes over strings are exhaustively classified. In particular,
the precise frontier between regular and nonregular classes is traced out, and it
is shown that every class that expresses some nonregular language also expresses
some NP-complete language. There is thus a huge complexity gap in ESO: some
prefix classes can express only regular languages (which are well-known to have
extremely low complexity), while all others are intractable.
– In [16] this line of research was continued by systematically investigating the syn-
tactically more complex prefix classes Σ^1_k (Q) of second-order logic for each inte-
ger k > 1 and for each first-order quantifier prefix Q. An exhaustive classification
of the regular and nonregular prefix classes of this form is given, and complexity
results for the corresponding model-checking problems are derived.
The Model Checking Problem for Prefix Classes of Second-Order Logic 229
– In [26], the complexity of all ESO prefix-classes over graphs and arbitrary rela-
tional structures is analyzed, and the tractability/intractability frontier is completely
delineated. Unsurprisingly, several classes that are regular over strings become NP-
hard over graphs. Interestingly, the analysis shows that one of the NP-hard classes
becomes polynomial for the restriction to undirected graphs without self-loops.
– The precise tractability frontier of ESO and SO prefix classes over trees has not yet
been determined. There are some partial results, however. There are also important
complexity results for MSO over trees, as well as a number of expressiveness results
that show that MSO over trees is captured by regular automata and equivalent in
expressive power to simpler formalisms.
In this paper, we review these results. After giving relevant definitions and recalling some
classical results in the next section, we start with a brief survey of the results on ESO
prefix-classes over strings in [15] (Sect. 3), followed by full second-order logic over
strings (Sect. 4), where we also consider finer grained complexity than regularity vs.
non-regularity. After that, we turn to ESO over graphs [26] (Sect. 5) and then discuss
SO and MSO over trees (Sect. 6). The final Sect. 7 addresses further issues and open
problems.
We consider second-order logic with equality (unless explicitly stated otherwise) and
without function symbols of positive arity. Predicates are denoted by capitals and indi-
vidual variables by lower case letters; a bold face version of a letter denotes a tuple of
corresponding symbols.
A prefix is any string over the alphabet {∃, ∀}, and a prefix set is any language Q ⊆
{∃, ∀}∗ of prefixes. A prefix set Q is trivial if Q = ∅ or Q = {λ}, that is, if it consists
only of the empty prefix. In the rest of this paper, we focus on nontrivial prefix sets. We
often view a prefix Q as the prefix class {Q}. A generalized prefix is any string over
the extended prefix alphabet {∃, ∀, ∃∗ , ∀∗ }. A prefix set Q is standard, if either Q =
{∃, ∀}∗ or Q can be given by some generalized prefix.
For any prefix Q, the class Σ^1_0 (Q) is the set of all prenex first-order formulas (which
may contain free variables and constants) with prefix Q, and for every k ≥ 0, Σ^1_{k+1} (Q)
(resp., Π^1_{k+1} (Q)) is the set of all formulas ∃RΦ (resp., ∀RΦ) where Φ is from Π^1_k
(resp., Σ^1_k ). For any prefix set Q, the class Σ^1_k (Q) is the union Σ^1_k (Q) = ⋃_{Q∈Q} Σ^1_k (Q).
We write also ESO for Σ^1_1 . For example, ESO(∃∗ ∀∃∗ ) is the class of all formulas
∃R∃y∀x∃zϕ, where ϕ is quantifier-free; this is the class of ESO-prefix formulas
whose first-order part is in the well-known Ackermann class with equality.
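A generalized prefix over {∃, ∀, ∃∗, ∀∗} is literally a regular expression over the two-letter alphabet {∃, ∀}, so testing whether a concrete quantifier prefix belongs to a standard prefix class reduces to a regular-expression match. A small sketch (the function name is ours, and we use the ASCII `*` for the star):

```python
import re

# A generalized prefix such as ∃*∀∃* is itself a valid regular expression
# over the alphabet {∃, ∀}: ∃ and ∀ are literal characters, * is the star.
def in_prefix_class(prefix: str, generalized: str) -> bool:
    return re.fullmatch(generalized, prefix) is not None

assert in_prefix_class("∃∃∀∃", "∃*∀∃*")        # in the Ackermann class ∃*∀∃*
assert not in_prefix_class("∀∀∀", "∃*∀∃*")     # ∀∀∀ lies outside it
```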
Let A = {a_1 , . . . , a_m } be a finite alphabet. A string over A is a finite first-order
structure W = ⟨U, C^W_{a_1} , . . . , C^W_{a_m} , Succ^W , min^W , max^W ⟩ for the vocabulary
σ_A = {C_{a_1} , . . . , C_{a_m} , Succ, min, max }, where
– Succ^W is the usual successor relation on U , and min^W and max^W are the first and
the last element in U , respectively.
Observe that this representation of a string is a successor structure as discussed for in-
stance in [14]. An alternative representation uses a standard linear order < on U instead
of the successor Succ. In full ESO or monadic second-order logic, < is tantamount to
Succ since either predicate can be defined in terms of the other.
The strings W for A correspond to the nonempty finite words over A in the obvious
way; in abuse of notation, we often use W in place of the corresponding word from A∗
and vice versa.
An SO sentence Φ over the vocabulary σA is a second-order formula whose only free
variables are the predicate variables of the signature σA , and in which no constant sym-
bols except min and max occur. Such a sentence defines a language over A, denoted
L(Φ), given by L(Φ) = {W ∈ A∗ | W |= Φ}. We say that a language L ⊆ A∗ is ex-
pressed by Φ, if L(Φ) = L ∩ A+ (thus, for technical reasons, without loss of generality
we disregard the empty string); L is expressed by a set S of sentences, if L is expressed
by some Φ ∈ S. We say that S captures a class C of languages, if S expresses all and
only the languages in C.
Example 2.1. Let us consider some languages over the alphabet A = {a, b}, and how
they can be expressed using logical sentences.
– L1 = {a, b}∗ b{a, b}∗: this language is expressed by the simple sentence
∃x.Cb (x).
– L4 = {w ∈ {a, b}∗ | |w| = 2n, n ≥ 1}: we express this language by the sentence
Note that this is a monadic ESO sentence. It postulates the existence of a monadic
predicate E, that is, a “coloring” of the string such that neighboring positions have
different colors, and the first and last positions are uncolored and colored,
respectively.
– L5 = {an bn | n ≥ 1}: Expressing this language is more involved:
Observe that this sentence is not monadic. Informally, it postulates the existence of
an arc from the first to the last position of the string W , which must be an a and a
b, respectively, and recursively arcs from the i-th to the (|W | − i + 1)-th position.
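The coloring argument behind L4 can be checked mechanically: the constraints (first position uncolored, colors alternating along the successor relation, last position colored) determine the coloring uniquely, and such a coloring exists exactly for the nonempty strings of even length. A sketch (the function name is ours):

```python
# L4 revisited: a "coloring" E with E(min) = False, E alternating between
# successive positions, and E(max) = True exists iff |w| is even and positive.
def has_L4_coloring(w: str) -> bool:
    n = len(w)
    if n == 0:
        return False
    E = [i % 2 == 1 for i in range(n)]   # forced: E(min) = False, then alternate
    return (all(E[i] != E[i + 1] for i in range(n - 1))
            and not E[0] and E[-1])

assert has_L4_coloring("ab") and has_L4_coloring("abba")
assert not has_L4_coloring("aba") and not has_L4_coloring("")
```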
with a linear ordering precisely characterizes the star-free regular languages. This the-
orem was extended by Thomas [49] to ω-languages, that is, languages with infinite
words. Later several hierarchies of the star-free languages were studied and logically
characterized (see, for instance [49,41,42,43]). Straubing, Thérien and Thomas [46]
showed that first-order logic with modular counting quantifiers characterizes the regu-
lar languages whose syntactic monoids contain only solvable groups. These and many
other related results can be found in the books and surveys [45,49,41,42,43].
classifies all ESO(Q) classes as either regular or intractable. Among the main results
of [15] are the following findings.
(1) The class ESO(∃∗ ∀∃∗ ) is regular. This theorem is the technically most involved
result of [15]. Since this class is nonmonadic, it was not possible to exploit any of the
ideas underlying Büchi’s proof for proving it regular. The main difficulty consists in
the fact that relations of higher arity may connect elements of a string that are very
distant from one another; it was not a priori clear how a finite state automaton could
guess such connections and check their global consistency. To solve this problem, new
combinatorial methods (related to hypergraph transversals) were developed.
Interestingly, model checking for the fragment ESO(∃∗ ∀∃∗ ) is NP-complete over
graphs. For example, the well-known set-splitting problem can be expressed in it. Thus
the fact that our input structures are monadic strings is essential (just as for MSO).
(2) The class ESO(∃∗ ∀∀) is regular. The regularity proof for this fragment is easier but
also required the development of new techniques (more of logical than of combinatorial
nature). Note that model checking for this class, too, is NP-complete over graphs.
(3) Any class ESO(Q) not contained in ESO(∃∗ ∀∃∗ ) ∪ ESO(∃∗ ∀∀) is not regular.
Thus ESO(∃∗ ∀∃∗ ) and ESO(∃∗ ∀∀) are the maximal regular standard prefix classes.
The unique maximal (general) regular ESO-prefix class is the union of the two classes,
that is, ESO(∃∗ ∀∃∗ ) ∪ ESO(∃∗ ∀∀) = ESO(∃∗ ∀(∀ ∪ ∃∗ )).
As shown in [15], it turns out that there are three minimal nonregular ESO-prefix
classes, namely the standard prefix classes ESO(∀∀∀), ESO(∀∀∃), and ESO(∀∃∀).
All these classes express nonregular languages by sentences whose list of second-order
variables consists of a single binary predicate variable.
Thus, (1)–(3) give a complete characterization of the regular ESO(Q) classes.
(4) The following dichotomy theorem is derived: Let ESO(Q) be any prefix class.
Then, either ESO(Q) is regular, or ESO(Q) expresses some NP-complete language.
This means that model checking for ESO(Q) is either possible by a deterministic finite
automaton (and thus in constant space and linear time) or it is already NP-complete.
Moreover, for all NP-complete classes ESO(Q), NP-hardness holds already for sen-
tences whose list of second-order variables consists of a single binary predicate vari-
able. There are no fragments of intermediate difficulty between REG and NP.
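To make the low end of this dichotomy concrete: for a regular class, model checking amounts to running a deterministic finite automaton over the input string. For the Example 2.1 sentence ∃x.C_b(x), the automaton only needs to remember whether a b has been seen (a sketch; the function name is ours):

```python
# Model checking ∃x. C_b(x) on a string W is a two-state DFA scan:
# state 0 = "no b seen yet", state 1 = "a b has been seen" (accepting).
def models_exists_b(w: str) -> bool:
    state = 0
    for ch in w:
        if ch == "b":
            state = 1
    return state == 1

assert models_exists_b("aaba")      # contains a b, so it models the sentence
assert not models_exists_b("aaaa")  # no b anywhere: the sentence fails
```

The run uses constant space and linear time, exactly as stated in the dichotomy.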
(5) The above dichotomy theorem is paralleled by the solvability of the finite satisfia-
bility problem for ESO (and thus FO) over strings. As shown in [15], over finite strings
satisfiability of a given ESO(Q) sentence is decidable iff ESO(Q) is regular.
(6) In [15], a precise characterization is given of those prefix classes of ESO which
are equivalent to MSO over strings, that is of those prefix fragments that capture the
class REG of regular languages. This provides new logical characterizations of REG.
Moreover, in [15] it is established that any regular ESO-prefix class is over strings either
equivalent to full MSO, or is contained in first-order logic, in fact, in FO(∃∗ ∀).
It is further shown that ESO(∀∗ ) is the unique minimal ESO prefix class which
captures NP. The proof uses results in [36,14] and well-known hierarchy theorems.
The main results of [15] are summarized in Fig. 1. In this figure, the ESO-prefix classes
are divided into four regions. The upper two contain all classes that express nonregular
languages, and thus also NP-complete languages. The uppermost region contains those
classes which capture NP; these classes are called NP-tailored. The region next below,
234 T. Eiter, G. Gottlob, and T. Schwentick
[Fig. 1: the ESO prefix classes fall into four regions, from NP-tailored (e.g. ESO(∀∗)) at the top, through NP-hard nonregular classes, to the regular-tailored and remaining regular classes at the bottom.]
separated by a dashed line, contains those classes which can express some NP-hard
languages, but not all languages in NP. Its bottom is constituted by the minimal non-
regular classes, ESO(∀∀∀), ESO(∀∃∀), and ESO(∀∀∃). The lower two regions contain
all regular classes. The maximal regular standard prefix classes are ESO(∃∗ ∀∃∗ ) and
ESO(∃∗∀∀). The dashed line separates the classes which capture REG (called regular-tailored) from those which do not; the expressive capability of the latter classes is
restricted to first-order logic (in fact, to FO(∃∗ ∀)) [15]. The minimal classes which
capture REG are ESO(∀∃) and ESO(∀∀).
Potential Applications. Monadic second-order logic over strings is currently used in the
verification of hardware, software, and distributed systems. An example of a specific
tool for checking specifications based on MSO is the MONA tool developed at the
BRICS research lab in Denmark [3,31].
Observe that certain interesting desired properties of systems are most naturally formulated in nonmonadic second-order logic. Consider, as an unpretentious example², the following property of a ring P of processors of different types, where two types may either be compatible or incompatible with each other. We call P tolerant if for each processor p in P there exist two other distinct processors backup1(p) ∈ P and backup2(p) ∈ P, both compatible with p, such that the following conditions are satisfied:
1. for each p ∈ P and for each i ∈ {1, 2}, backupi(p) is not a neighbor of p;
2. for each i, j ∈ {1, 2}, backupi(backupj(p)) ∉ {p, backup1(p), backup2(p)}.
Intuitively, we may imagine that in case p breaks down, the workload of p can be reas-
signed to backup 1 (p) or to backup 2 (p). Condition 1 reflects the intuition that if some
processor is damaged, there is some likelihood that also its neighbors are (for instance
² Our goal here is merely to give the reader some intuition about a possible type of application.
The Model Checking Problem for Prefix Classes of Second-Order Logic 235
in case of physical effects such as radiation), thus neighbors should not be used as backup processors. Condition 2 states that the backup processor assignment is antisymmetric and anti-triangular; this ensures, in particular, that the system remains functional even if two processors of the same type are broken (further processors of incompatible type might be broken, provided that broken processors can simply be bypassed for communication).
Let T be a fixed set of processor types. We represent a ring of n processors numbered
from 1 to n where processor i is adjacent to processor i+1 (mod n) as a string of length
n from T ∗ whose i-th position is τ if the type of the i-th processor is τ ; logically, Cτ (i)
is then true. The property of P being tolerant is expressed by the following second-order
sentence Φ:
Φ = ∃R1 ∃R2 ∀x ∃y1 ∃y2 [ compat(x, y1) ∧ compat(x, y2) ∧
      R1(x, y1) ∧ R2(x, y2) ∧
      ⋀_{i=1,2} ⋀_{j=1,2} ( ¬Ri(yj, x) ∧ ¬R1(yj, yi) ∧ ¬R2(yj, yi) ) ∧
      x ≠ y1 ∧ x ≠ y2 ∧ y1 ≠ y2 ∧
      ¬Succ(x, y1) ∧ ¬Succ(y1, x) ∧ ¬Succ(x, y2) ∧ ¬Succ(y2, x) ∧
      ( x = max → (y1 ≠ min ∧ y2 ≠ min) ) ∧
      ( x = min → (y1 ≠ max ∧ y2 ≠ max) ) ],
where compat(x, y) abbreviates the formal statement that processor x is compatible with processor y (which can be encoded as a simple Boolean formula over the Cτ atoms).
Φ is the natural second-order formulation of the tolerance property of a ring of pro-
cessors. This formula is in the fragment ESO(∃∗ ∀∃∗ ); hence, by our results, we can im-
mediately classify tolerance as a regular property, that is, a property that can be checked
by a finite automaton.
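As a sanity check on the tolerance property itself (not on the finite automaton, which our results only guarantee to exist), one can test small rings by brute force; the representation of compat as a Python predicate and the search procedure below are our own illustrative choices:

```python
from itertools import product

def valid_backups(n, b1, b2, compat):
    """Check the tolerance conditions for candidate backup maps b1, b2 on a
    ring of n processors (positions 0..n-1, i adjacent to i+1 mod n)."""
    for p in range(n):
        y1, y2 = b1[p], b2[p]
        if len({p, y1, y2}) != 3:                    # two *other distinct* backups
            return False
        if not (compat(p, y1) and compat(p, y2)):    # both compatible with p
            return False
        if {y1, y2} & {(p - 1) % n, (p + 1) % n}:    # condition 1: not neighbors
            return False
        for yj in (y1, y2):                          # condition 2: antisymmetric
            for yi in (b1[yj], b2[yj]):              # and anti-triangular
                if yi in {p, y1, y2}:
                    return False
    return True

def is_tolerant(n, compat):
    """Brute-force search over backup assignments; only for tiny rings."""
    def local(p):
        nbrs = {(p - 1) % n, (p + 1) % n}
        return [(a, b) for a in range(n) for b in range(n)
                if len({p, a, b}) == 3 and not ({a, b} & nbrs)
                and compat(p, a) and compat(p, b)]
    for choice in product(*[local(p) for p in range(n)]):
        b1, b2 = [c[0] for c in choice], [c[1] for c in choice]
        if valid_backups(n, b1, b2, compat):
            return True
    return False

all_compatible = lambda x, y: True
```

For instance, with all types mutually compatible, rings of four or five processors are never tolerant (there are too few non-neighbors), while on a ring of seven processors the uniform assignment backup1(p) = p+4, backup2(p) = p+5 (mod 7) satisfies all conditions.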
In a similar way, one can exhibit examples of ESO(∃∗ ∀∀) formulas that naturally
express interesting properties whose regularity is not completely obvious a priori. We
thus hope that our results may find applications in the field of computer-aided
verification.
[Fig. 3: the non-regular classes Σ31(∀∀), Σ21(∃∃), Σ11(∀∀∀), Σ11(∀∀∃), Σ11(∀∃∀), Σ21(∃∀), and Σ31(∀∃) lie above the regular classes Σk1(∀), Σ21(∀∀), Σ11(∃∗∀∃∗), Σ11(∃∗∀∀), Σ21(∀∃), and Σk1(∃).]
Fig. 3. Semantic inclusion relations between Σ21(Q) classes over strings, |Q| = 2
Note that Grädel and Rosen have shown [28] that Σ11(FO2), that is, existential second-order logic with two first-order variables, is regular over strings. By the results in [16], Σk1(FO2), for k ≥ 2, is non-regular (in fact, intractable).
Figure 3 shows inclusion relationships between the classes Σ21(Q) where Q contains two quantifiers. Similar relationships hold for Σk1(Q) classes. Furthermore, as shown in [16], we have that Σ21(∀∃) = Σ11(⋀∀∃) and Σ31(∃∀) = Σ21(⋁∃∀), where Σk1(⋁Q) (resp., Σk1(⋀Q)) denotes the class of Σk1 sentences where the first-order part is a finite disjunction (resp., conjunction) of prefix formulas with quantifiers in Q.
We now look at these results in more detail.
Proposition 4.1 ([16]). Every formula in Σk1(∃j), where k ≥ 1 is odd, is equivalent to some formula in Σ1k−1(∃j), and every formula in Σk1(∀j), where k ≥ 2 is even, is equivalent to some formula in Σ1k−1(∀j).
Based on this and a generalization of the proof of Theorem 9.1 in [15], one obtains:
Theorem 4.1 ([16]). Over strings, Σ21 (∀∀) = MSO.
For the extension of the FO prefix ∃j (resp., ∀j) in Proposition 4.1 with a single universal (resp., existential) quantifier, a similar yet slightly weaker result holds. Let Σk1(⋁Q) (resp., Σk1(⋀Q)) denote the class of Σk1 sentences where the first-order part is a finite disjunction (resp., conjunction) of prefix formulas with quantifiers in Q.

Proposition 4.2 ([16]). Every formula in Σk1(∃j∀), where k ≥ 1 is odd and j ≥ 0, is equivalent to some formula in Σ1k−1(⋁∃j∀), and every formula in Σk1(∀j∃), where k ≥ 2 is even and j ≥ 1, is equivalent to some formula in Σ1k−1(⋀∀j∃).
From this and the regularity of ESO(∀∃) over strings, one can easily derive that Σ21 (∀∃)
over strings is regular.
Theorem 4.2 ([16]). Over strings, Σ21 (∀∃) = MSO.
Non-regular Fragments. We consider first Σ21 and then Σk1 with k > 2.
Σ21 (Q) where |Q| ≤ 2. While for the FO prefixes Q = ∀∃ and Q = ∀∀, regularity of
ESO(Q) generalizes to Σ21 , this is not the case for Q = ∃∀ and Q = ∃∃.
Theorem 4.3 ([16]). Σ21 (∃∀) is nonregular.
Indeed, [16] gave an example of a non-regular language defined by a Σ21 (∃∀) sentence.
Let A = {a, b} and consider the following sentence:
Informally, this sentence is true for a string W precisely if the number of b's in W (denoted #b(W)) is at most logarithmic in the number of a's in W (denoted #a(W)). More formally, L(Φ) = {W ∈ {a, b}∗ | #b(W) ≤ log #a(W)}; by well-known properties of regular languages, this language is not regular.
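The membership condition itself is easy to test; the following sketch assumes log means log base 2 (the text leaves the base unspecified) and compares 2^#b(W) with #a(W) to stay in exact integer arithmetic:

```python
def in_log_language(w):
    """Membership in L = { W in {a,b}* : #b(W) <= log #a(W) }, reading log
    as log base 2: #b(W) <= log2 #a(W) iff 2 ** #b(W) <= #a(W)."""
    na, nb = w.count("a"), w.count("b")
    return 2 ** nb <= na

# e.g. 8 a's allow at most 3 b's
```

Doubling the allowed number of b's requires squaring the number of a's, which is exactly the growth behavior a finite automaton cannot track.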
For the class Σ21 (∃∃), the proof of non-regularity in [16] is more involved. It uses the
following lemma, which shows how to emulate universal FO quantifiers using universal
SO quantifiers and existential FO quantifiers over strings.
Lemma 4.1 ([16]). Over strings, every universal first-order formula ∀xϕ(x) which
contains no predicates of arity > 2 is equivalent to some Π11 (∃∃) formula.
Proof. As we use techniques from the proof of this lemma later, we recall the sketch
from [16].
The idea is to emulate the universal quantifier ∀xi, for each xi from x = x1, . . . , xk, by a universally quantified variable Si ranging over singletons, and to express "∀xi" by "∃xi. Si(xi)". Then, ∀x ϕ(x) is equivalent to

∀S ∃x ⋀_{i=1}^{k} Si(xi) ∧ ϕ(x).
We can eliminate all existential variables x but two in this formula as follows.
Rewrite the quantifier-free part into a CNF ⋀_{i=1}^{ℓ} δi(x), where each δi(x) is a disjunction of literals. Denote by δi^{j,j′}(xj, xj′) the clause obtained from δi(x) by removing every literal which contains some variable from x different from xj and xj′. Since no predicate in ϕ has arity > 2, the formula ∀x ϕ(x) is equivalent to the formula

∀S ⋀_{i=1}^{ℓ} ⋁_{j≠j′} ∃x∃y δi^{j,j′}(x, y).
The conjunction ⋀_{i=1}^{ℓ} can be simulated by using universally quantified Boolean variables Z1, . . . , Zℓ and a control formula β which states that exactly one out of Z1, . . . , Zℓ is true. By pulling existential quantifiers, we thus obtain

∀S ∀Z ∃x∃y γ,

where

γ = β → ⋀_{i=1}^{ℓ} ( Zi → ⋁_{j≠j′} δi^{j,j′}(x, y) ).
Thus, it remains to express the variables Si ranging over singletons. For this, we use a
technique to express Si as the difference Xi,1 \ Xi,2 of two monadic predicates Xi,1
and Xi,2 which describe initial segments of the string. Fortunately, the fact that Xi,1 and
Xi,2 are not initial segments or their difference is not a singleton can be expressed by
a first-order formula ∃x∃yψi (x, y), where ψi is quantifier-free, by using the successor
predicate. Thus, we obtain

∀X ∀Z ( ⋁_{i=1}^{k} ∃x∃y ψi(x, y) ∨ ∃x∃y γ∗ ),

where γ∗ results from γ by replacing each Si with Xi,1 and Xi,2. By pulling existential quantifiers, we obtain a Π11(∃∃) formula, as desired.
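The last step of the proof is easy to visualize: over positions 0, …, n−1, a set S is the singleton {i} exactly if S = X1 \ X2 for the initial segments X1 = {0, …, i} and X2 = {0, …, i−1}. A minimal sketch (the function names are ours):

```python
def is_initial_segment(X):
    """X is an initial segment of the positions iff X = {0, 1, ..., |X|-1}."""
    return X == set(range(len(X)))

def singleton_as_difference(i):
    """Represent the singleton {i} as X1 \\ X2 with X1, X2 initial segments,
    as in the proof of Lemma 4.1."""
    return set(range(i + 1)), set(range(i))

X1, X2 = singleton_as_difference(3)
# X1 \ X2 == {3}, and both X1 and X2 are initial segments
```

The point of the trick is that "X1 or X2 fails to be an initial segment, or their difference is not a singleton" only needs two existential first-order witnesses.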
Given that Σ11 (∀∀∀) contains NP-complete languages (Fig. 1), it thus follows:
Theorem 4.4 ([16]). Σ21 (∃∃) is nonregular.
Therefore, the inclusions in Fig. 3 are both strict.
Σ21(Q) where |Q| > 2. By the results of the previous subsection and Sect. 3, we can derive that no Σ21(Q) prefix class where Q contains more than two quantifiers is regular.
Indeed, Theorem 4.3 implies this for every prefix Q which contains ∃ followed by ∀,
and Theorem 4.4 implies this for every prefix Q which contains at least two existential
quantifiers. For the remaining minimal prefixes Q ∈ {∀∀∀, ∀∀∃}, non-regularity of
Σ21 (Q) follows from the results summarized in Fig. 1. Thus,
Theorem 4.5 ([16]). Σ21 (Q) is nonregular for every prefix Q such that |Q| > 2.
Table 1. Complexity of model checking for Σk1 (Q) prefix classes, k ≤ 3 (η = {∀, ∃})
Σk1(Q) where k > 2. Let us now consider the higher fragments of SO over strings. The question is whether either of the regular two-quantifier prefixes Q ∈ {∀∀, ∀∃} for Σ21(Q) survives. However, as we shall see, this is not the case.
Since Π21 (∀∀) is contained in Σ31 (∀∀), it follows from Theorem 4.4 that Σ31 (∀∀)
is nonregular. For the remaining class Σ31 (∀∃), one can use a result that an existential
FO quantifier, followed by another existential FO quantifier, can be emulated using an
existential SO and a FO universal quantifier. This leads to the following result.
Theorem 4.6 ([16]). Over strings, Σk1 (∃∃) ⊆ Σk1 (∀∃) for every odd k, and Σk1 (∀∀) ⊆
Σk1 (∃∀) for every even k ≥ 2.
Thus, combined with Theorem 4.4, this shows that Σ31(∀∃) is nonregular.
In fact, the emulation of an existential FO quantifier as above via an existential SO and a universal FO quantifier is feasible under fairly general conditions; this leads to the following result.
Theorem 4.7 ([16]). Let P1 ∈ {∀}∗ and P2 ∈ {∃, ∀}∗∀{∃, ∀}∗ be first-order prefixes. Then,

Σk1(P1∀P2) ⊆ Σk1(P1∃P2).

Thus, for example, we obtain Σ21(∀∀∀) ⊆ Σ21(∃∀∀), and by repeated application Σ21(∀∀∀) ⊆ Σ21(∃∃∀).
4.2 Complexity
Generalizing Fagin’s Theorem, Stockmeyer [44] showed that full SO captures the poly-
nomial hierarchy (PH). Second-order variables turn out to be quite powerful. In fact,
already two first-order variables, a single binary predicate variable, and further monadic
predicate variables are sufficient to express languages that are complete for the levels
of PH.
The results in [15] and [16] imply that deciding whether W |= Φ for a fixed formula
Φ and a given string W is intractable for all prefix classes Σk1 (Q) which are (syntacti-
cally) not included in the maximal regular prefix classes shown in Fig. 2. Table 1 shows
prefix classes up to k = 3 that are C-complete for prominent complexity classes; the
precise complexity of some classes (e.g., Σ21 (∀∗ ∃∀∗ ) and its analogue in Σ31 ) is open.
Indeed, by Fig. 1, Σ21(∀∀∀) is intractable; hence, by Theorem 4.7, so is Σ21(∃∀∀). Furthermore, the proof of Theorem 4.4 via Lemma 4.1 and Fig. 1 establishes not only that Σ21(∃∃) is non-regular, but in fact that it is NP-hard.3 As Π21(∀∀) is contained in Π31(∀∀) and in Σ31(∀∃) (cf. Theorem 4.7), the latter prefix classes are intractable as well.
The complexity of SO over strings increases with the number of SO quantifier alter-
nations. Let us consider Σ21 (∃∃) more closely.
Theorem 4.8. Model checking over strings is Σ2p-complete for each Σ21(Q) where Q ∈ η∗∃η∗∃η∗ and η = {∀, ∃}.
Proof. (Sketch) Clearly the problem is in Σ2p. The Σ2p-hardness for Q = ∃∃ can be shown by encoding the evaluation of quantified Boolean formulas (QBFs) of the form

F = ∃p1 · · · ∃pn ∀q1 · · · ∀qm ϕ,    (1)

where ϕ is a propositional CNF. The proof builds on the string encoding of CNFs from [15]: for example, the CNF C = (p ∨ q ∨ ¬r) ∧ (¬p ∨ ¬q ∨ r) is encoded as

enc(C) = [(00)+(01)+(10)-][(00)-(01)-(10)+] .

Here, p, q, r are encoded by the binary strings 00, 01, 10, respectively. Clearly, enc(F) is obtainable from any standard representation of F in logspace.
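The clause encoding is mechanical; the following sketch reproduces it, assuming (to match the example) that variables are numbered from 0 and written in fixed-width binary:

```python
def enc(cnf, num_vars):
    """Encode a CNF as a string over the alphabet [ ] ( ) + - 0 1.
    A clause is a list of signed 1-based variable indices, e.g. [1, 2, -3]
    for (p or q or not r); each literal is the variable's fixed-width
    binary index in parentheses followed by its sign, each clause bracketed."""
    width = max(1, (num_vars - 1).bit_length())
    def lit(l):
        return "(" + format(abs(l) - 1, "0%db" % width) + ")" + ("+" if l > 0 else "-")
    return "".join("[" + "".join(lit(l) for l in c) + "]" for c in cnf)

# (p or q or not r) and (not p or not q or r), with p, q, r as 00, 01, 10:
example = enc([[1, 2, -3], [-1, -2, 3]], 3)
```

The call above yields exactly the example string from the text.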
The quantifier-free formulas eqcol(x, y) and varenc(x) state that the string has the same letter at positions x and y, and that x is a position within a variable encoding, respectively.
Then, let Φ be the following Σ11(∀∀∀) sentence:

Φ = ∃G ∃V ∃R ∃R′ ∀x∀y∀z ϕ(x, y, z),

where G and V are unary, R and R′ are binary, and ϕ(x, y, z) is the conjunction of the quantifier-free formulas ϕG, ϕV, ϕR, and ϕR′:

ϕV = ( C)(x) ∧ C)(y) ∧ R(x, y) ) → ( V(x) ↔ V(y) ),

ϕR = ( R(x, y) → (eqcol(x, y) ∧ varenc(x)) ) ∧
     ( (C((x) ∧ C((y)) → R(x, y) ) ∧
     ( (¬C((x) ∧ Succ(z, x)) → ( R(x, y) ↔ (R′(z, y) ∧ eqcol(x, y)) ) ),   (4)

and

ϕR′ = Succ(z, y) → ( R′(x, y) ↔ (R(x, z) ∧ ¬C)(z)) ).
As shown in [15], enc(F) |= Φ iff F is satisfiable. Back now to our QBF (1), we can choose an encoding where we simply encode the clauses in ϕ as described and mark in the string the occurrences of the variables qi with an additional predicate. Furthermore, we represent truth assignments to the qi's by a monadic variable V′. The sentence Φ for F is rewritten to

Ψ = ∃R ∃R′ ∃V ∀V′ [α1 ∧ (α2 ∨ α3)],

where α1 is a universal first-order formula which defines proper R, R′, and V using ϕR, ϕR′, and ϕV; α2 is an ∃∃-prenex first-order formula which states that V′ assigns two different occurrences of some universally quantified atom qi different truth values; and α3 states that the assignment to p1, . . . , pn, q1, . . . , qm given by V and V′ violates ϕ.
The latter can be easily checked by a finite state automaton, and thus is expressible as
a monadic Π11 (∃∃) sentence. As Ψ contains no predicate of arity > 2, by applying the
techniques of Lemma 4.1 we can rewrite Ψ to an equivalent Σ21 (∃∃) sentence.
Other fragments of Σ21 have lower complexity. For instance,
Theorem 4.9. Model checking for Σ21 (∀∗ ∃) over strings is NP-complete, and NP-hard
for each Σ21 (Q) where Q ∈ ∀∀∀∗ ∃.
On the other hand, by generalizing the QBF encoding, we can easily encode evaluating
Σkp -complete QBFs into Σk1 (∃∃) for even k > 2 and Πkp -complete QBFs into Πk1 (∃∃)
for odd k > 1 (by adding further leading quantifiers).
5 ESO over Graphs

[Fig. 4: the NP-complete classes E2eaa, E1ae, E1aaa, and E1E1aa lie above the PTIME classes E∗e∗a, E1e∗aa, and Eaa.]
Fig. 4. ESO on arbitrary structures, directed graphs and undirected graphs with self-loops
The first result of [26] completely characterizes the complexity of all ESO prefix classes over the collection of all finite structures over any relational vocabulary that contains a relation symbol of arity ≥ 2. This characterization is obtained by showing (assuming P ≠ NP)
that there are four minimal NP-hard and three maximal PTIME prefix classes, and that
these seven classes combine to give complete information about all other prefix classes.
This means that every other prefix either contains one of the minimal NP-hard prefix
classes as a substring (and, hence, is NP-hard) or is a substring of a maximal PTIME
prefix class (and, hence, is in PTIME). Figure 4 depicts the characterization of the NP-
hard and PTIME prefix classes of ESO on general graphs.
As seen in Fig. 4, the four minimal NP-hard classes are E2 eaa, E1 ae, E1 aaa, and
E1 E1 aa, while the three maximal PTIME classes are E ∗ e∗ a, E1 e∗ aa, and Eaa. The
NP-hardness results are established by showing that each of the four minimal prefix
classes contains ESO-sentences expressing NP-complete problems. For example, a SAT
encoding on general graphs can be expressed by an E1 ae sentence. Note that the first-
order prefix class ae played a key role in the study of the classical decision problem for
fragments of first-order logic (see [6]). As regards the maximal PTIME classes, E ∗ e∗ a
is actually FO, while the model checking problem for fixed sentences in E1 e∗ aa and
Eaa is reducible to 2SAT and, thus, is in PTIME (in fact, in NL).
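For comparison, 2SAT itself is decidable in linear time via the implication graph (the classical method of Aspvall, Plass, and Tarjan); the sketch below is this generic algorithm, not the specific reduction used in [26]:

```python
def two_sat(n, clauses):
    """Decide satisfiability of a 2CNF over variables 1..n.
    A clause (a, b) means (a or b); literals are +i / -i. Each clause adds
    implications not-a -> b and not-b -> a; the formula is satisfiable iff
    no variable shares a strongly connected component with its negation."""
    node = lambda l: 2 * (abs(l) - 1) + (1 if l < 0 else 0)
    g = [[] for _ in range(2 * n)]
    rg = [[] for _ in range(2 * n)]
    for a, b in clauses:
        for u, v in ((node(-a), node(b)), (node(-b), node(a))):
            g[u].append(v)
            rg[v].append(u)
    # Kosaraju: order nodes by DFS finish time, then label SCCs on the
    # reverse graph in decreasing finish order.
    order, seen = [], [False] * (2 * n)
    for s in range(2 * n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(g[s]))]
        while stack:
            v, it = stack[-1]
            w = next(it, None)
            if w is None:
                stack.pop()
                order.append(v)
            elif not seen[w]:
                seen[w] = True
                stack.append((w, iter(g[w])))
    comp = [-1] * (2 * n)
    c = 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = c
        stack = [s]
        while stack:
            for w in rg[stack.pop()]:
                if comp[w] == -1:
                    comp[w] = c
                    stack.append(w)
        c += 1
    return all(comp[2 * i] != comp[2 * i + 1] for i in range(n))
```

The implication-graph view is also what places the problem in NL: reachability queries suffice.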
The second result of [26] completely characterizes the computational complexity of
prefix classes of ESO on undirected graphs without self-loops. As mentioned earlier, it
was shown that a dichotomy still holds, but its boundary changes. The key difference
is that E ∗ ae turns out to be PTIME on undirected graphs without self-loops, while its
subclass E1 ae is NP-hard on general graphs. It can be seen that interesting properties of
graphs are expressible by E∗ae-sentences. Specifically, for each integer m > 0, there is an E∗ae-sentence expressing that a connected graph contains a cycle whose length is divisible by m. This property was shown to be decidable in polynomial time by Thomassen
[50]. The class E ∗ ae constitutes a maximal PTIME class, because all four extensions
of E1 ae by any single first-order quantifier are NP-hard on undirected graphs without
self-loops [26]. The other minimal NP-hard prefixes on general graphs remain NP-hard
also on undirected graphs without self-loops. Consequently, over such graphs, there
are seven minimal NP-hard and four maximal PTIME prefix classes that determine the
computational complexity of all other ESO-prefix classes (see Fig. 5).
Technically, the most difficult result of [26] is the proof that E ∗ ae is PTIME on
undirected graphs without self-loops. First, using syntactic methods, it is shown that
each E ∗ ae-sentence is equivalent to some E1∗ ae-sentence. After this, it is shown that
[Fig. 5: the NP-complete classes E2eaa, E1aae, E1aea, E1aee, E1eae, E1aaa, and E1E1aa lie above the PTIME classes E∗e∗a, E∗ae, E1e∗aa, and Eaa.]
Fig. 5. ESO on undirected graphs without self-loops. The dotted boxes in Figs. 4 and 5 indicate the difference between the two cases.
for each E1∗ae-sentence the model-checking problem over undirected graphs without self-loops is equivalent to a natural coloring problem called the saturation problem.
This problem asks whether there is a particular mapping from a given undirected graph
without self-loops to a fixed, directed pattern graph P which is extracted from the
E1∗ ae-formula under consideration. Depending on the labelings of cycles in P , two
cases of the saturation problem are distinguished, namely pure pattern graphs and
mixed pattern graphs. For each case, a polynomial-time algorithm is designed. In simplified terms, and focusing on the case of connected graphs, the algorithm for pure pattern graphs has three main ingredients. First, adapting results by Thomassen [50] and using a new graph coloring method, it is shown that if an E1∗ae-sentence Φ gives rise to a pure pattern graph, then a fixed integer k can be found such that every undirected graph without self-loops that has tree-width bigger than k satisfies Φ. Second, Courcelle's
Theorem [11] (see also Sect. 4) is used by which model-checking for MSO sentences
is polynomial on graphs of bounded tree-width. Third, Bodlaender’s result [5] is used
that, for each fixed k, there is a polynomial-time algorithm to check whether a given
graph has tree-width at most k.
The polynomial-time algorithm for mixed pattern graphs has a similar architecture,
but requires the development of substantial additional technical machinery, including a
generalization of the concept of graphs of bounded tree-width. The results of [26] can
be summarized in the following theorem.
Theorem 5.1. Figures 4 and 5 provide a complete classification of the complexity of
all ESO prefix classes on graphs.
6 SO over Trees
Trees are fundamental data structures widely used in computer science and mathemat-
ical linguistics. The importance of studying logical languages over trees has increased
dramatically with the advent of the World Wide Web. In fact, the Web can be consid-
ered as the world’s largest data and information repository, and most information on the
Web is semi-structured, that is, presented in tree-shaped form, for example formatted
in HTML or in XML [1]. Web pages and Web documents can thus be considered as fi-
nite labeled trees. Special query languages such as XPath [52] have been developed for
querying XML documents. The core fragments of these languages contain constructs
that are not first-order expressible, but can be defined in second-order logic. Similarly,
most relevant data extraction tasks for selecting relevant data from a HTML Web site
and for annotating the data and transforming it into a highly structured format can be
expressed in MSO [21,22,23]. It is thus not astonishing that there has been a renewed
interest in understanding complexity and expressiveness issues related to SO over finite
trees.
There are various possible formal definitions of trees. Usually a finite tree is defined
over a universe of nodes referred to as dom. We use the monadic predicates root (.) and
leaf (.) to say that a node is the root or a leaf, respectively.
One mostly considers labeled trees, that is, trees whose nodes are labeled by letters from a finite alphabet Σ. Each label e ∈ Σ can be represented by a monadic "color" predicate labele, such that for each node a ∈ dom, labele(a) is true iff a is labeled with the letter e. (Note that these label predicates have the same role as the Cτ predicates we used for strings.)
An important distinction is the one between ranked and unranked trees. A ranked tree is one in which the number of successors of each node, also called its children, is bounded by some constant K. In this case, the successors can be represented via binary relations childk, k ≤ K, where childk(a, b) means that node b is the k-th child of node a.
More formally, a finite ranked tree is thus defined as a finite relational structure

τrk = ⟨dom, root, leaf, (childk)k≤K, (labela)a∈Σ⟩,

where, as explained, dom is the set of nodes in the tree, the root, leaf, and labela relations are unary, and the childk relations are binary.
In an unranked tree, the number of successors of a node may be unbounded. This
means that we cannot directly encode successors by a fixed number of child predicates.
Rather, we use the predicate first child and next sibling such that first child (a, b)
is true whenever node b is the first (leftmost) child of a, and next sibling (a, b) if b
is the nearest sibling to the right of a. In addition, last sibling(a) is used to indicate
that a is the last of the siblings of a node. Note that the predicates root , leaf , and
last sibling can be logically defined from first child and next sibling. However, these
definitions would require extra quantifiers and negation, so we prefer to keep these
simple predicates in the signature. The signature τur for unranked trees thus looks as follows:

τur = ⟨dom, root, leaf, (labela)a∈Σ, first child, next sibling, last sibling⟩.
Unfortunately, to date, no complete complexity characterization for SO prefix classes
over trees is known. Even for ESO prefix classes over trees we do not have a precise
characterization. For ESO over (ranked or unranked) trees we know the following.
– All NP-hard classes for the string case remain NP-hard in the tree case. In particular, model checking for the classes ESO(∀∀∀), ESO(∀∀∃), and ESO(∀∃∀) remains NP-complete.
– Model checking for ESO(∃∗∀∀) was shown in [15] to be feasible in polynomial time over arbitrary structures and thus, in particular, over trees. However, the status of the (in)famous class ESO(∃∗∀∃∗) over trees remains unclear. The combinatorial proof arguments used in [15] to prove tractability of model checking in the
string case do not directly carry over to trees. It is currently unclear whether they
can be adapted to cover the tree case.
– For (non-monadic) full SO prefix classes, there is even less clarity. Of course, all SO
prefix classes known to be NP-hard (with respect to model checking) over strings
mentioned in Sect. 4 are trivially also NP-hard over trees, and all tractable ESO
prefix classes over graphs mentioned in Sect. 5 are also tractable over trees.
In the rest of this section, let us concentrate on monadic second-order logic (MSO).
The Büchi-Elgot-Trakhtenbrot Theorem that MSO over strings captures the regular
languages (see Sect. 2) carries over to ranked and unranked tree structures. The regular
tree languages (for ranked as well as for unranked alphabets) are precisely those tree
languages recognizable by a number of natural forms of finite automata [7]. For space
reasons, we cannot discuss details of tree automata here, but refer the interested reader
to standard compendia such as [10,49], as well as to [39].
The following is a classical result for ranked trees [47,12], which has been shown in
[39] to hold for unranked trees as well.
Proposition 6.1. A tree language is regular iff it is definable in MSO.
Given that tree automata can be run in polynomial time over trees, the above result
immediately yields the following well-known corollary:
Corollary 6.1. Model checking for MSO formulas over (ranked or unranked) trees is
feasible in linear time.
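A deterministic bottom-up tree automaton makes Corollary 6.1 concrete; the tree language chosen here (an even number of a-labeled nodes) is our own illustrative MSO-definable example, not one from the text:

```python
def run(tree, leaf_state, delta):
    """Bottom-up run of a deterministic tree automaton on a binary labeled
    tree; each node is visited once, so the run is linear in the tree size.
    A leaf is (label,), an inner node is (label, left, right)."""
    if len(tree) == 1:
        return leaf_state(tree[0])
    label, left, right = tree
    return delta(label, run(left, leaf_state, delta),
                 run(right, leaf_state, delta))

# States 0/1 track the parity of a-labeled nodes seen so far.
leaf_state = lambda lab: 1 if lab == "a" else 0
delta = lambda lab, s1, s2: (s1 + s2 + (1 if lab == "a" else 0)) % 2

t = ("a", ("b",), ("a", ("a",), ("b",)))   # contains three a-labeled nodes
```

Acceptance is then just a membership test on the root state, e.g. `run(t, leaf_state, delta) == 0` for the even-parity language.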
This result was generalized by Courcelle to tree-like structures, more specifically, struc-
tures of bounded treewidth, a concept originally defined in [11]:
Theorem 6.1 ([11]). Model checking for MSO formulas over structures of bounded
treewidth is feasible in polynomial time (in fact, in linear time).
MSO has been used as a language for computer-aided verification (CAV); model check-
ers for CAV such as MONA [17] are mainly based on MSO.
Other applications are, as we already mentioned, XML querying and Web data ex-
traction [21,24,22]. However, for various reasons discussed in [22], MSO is not well-
suited as a query language and query evaluation does not properly scale in the size of the
query. Consequently, in [22] a different language was considered, viz. monadic Data-
log, which restricts the well-known Datalog language to programs where all intensional
predicates are either Boolean or monadic. For monadic Datalog, the following could be
shown:
Theorem 6.2 ([22]). Over (ranked or unranked) trees, Monadic Datalog is exactly as
expressive as MSO.
Moreover, the complexity of evaluating monadic Datalog programs over ranked or unranked trees is as follows:
Theorem 6.3 ([22]). Evaluating a monadic Datalog program P over a tree T is feasible in time O(|P| × |T|).
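To illustrate monadic Datalog over trees, here is a naive fixpoint evaluation of a toy program (our own illustration, not from [22]) deriving the nodes whose subtree contains an a-labeled node:

```python
def has_a_in_subtree(children, labels):
    """Naive fixpoint evaluation of the monadic Datalog program
        desc(X) :- label_a(X).
        desc(X) :- child(X, Y), desc(Y).
    children maps each node to its list of children; labels maps each node
    to its letter. Returns the set of nodes for which desc is derived."""
    desc = {v for v, lab in labels.items() if lab == "a"}
    changed = True
    while changed:
        changed = False
        for x, ys in children.items():
            if x not in desc and any(y in desc for y in ys):
                desc.add(x)
                changed = True
    return desc

# root 0 with children 1 and 2; node 2 has child 3, the only a-labeled node
tree_children = {0: [1, 2], 1: [], 2: [3], 3: []}
tree_labels = {0: "b", 1: "b", 2: "b", 3: "a"}
```

A semi-naive evaluation that propagates facts bottom-up along the tree gives the O(|P| × |T|) bound of Theorem 6.3; the loop above trades that efficiency for brevity.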
The above result shows that, over trees, model checking for monadic Datalog, unlike for MSO, scales linearly in the size of the Datalog program P, and not only in the size of the structure T. Note that this by no means conflicts with the fact that, over trees, monadic Datalog is exactly as expressive as MSO. In fact, translating an
over trees, monadic Datalog is exactly as expressive as MSO. In fact, translating an
MSO formula into an equivalent monadic Datalog program may come with a huge
(exponential) blow-up in size. However, as noted in [22], it appears that most practical
queries and data extraction tasks can be easily formulated by small monadic datalog
programs, and hence, the worst-case blow-up is not really relevant in practice. The
commercial Web data extraction system Lixto [4,23], which is successfully used for
many industrial applications in various domains, is mainly based on monadic Datalog.
Theorem 6.3 was recently generalized to the setting of structures of bounded tree-
width:
Theorem 6.4 ([27]). Evaluating a monadic Datalog program P over a structure T of bounded treewidth is feasible in time O(|P| × |T|).
7 Conclusion

The main aim of this paper was to summarize results on determining the complexity of prefix classes over finite structures, and in particular on the prefix classes Σk1(Q) which over strings express regular languages only. Many of the prefix classes analyzed
so far represented mathematical challenges and required novel solution methods. Some
of them could be solved with automata-theoretic techniques, others with techniques of
purely combinatorial nature, and yet others required graph-theoretic arguments.
While the exact “regularity frontier” for the Σk1 (Q) classes has been charted (see
Fig. 2), their tractability frontier with respect to model checking misses a single class,
viz. the nonregular class Σ21 (∃∀). If model checking is NP-hard for it, then the tractabil-
ity frontier coincides with the regularity frontier (just as for ESO, cf. [15]). If, on the
other hand, model checking for Σ21 (∃∀) is tractable, then the picture is slightly different.
Moreover, it would be interesting to refine our analysis of the Σk1 fragments over
strings by studying the second-order quantification patterns taking the number of the
second-order variables and their arities into account, as done for ESO over graphs
in [25] (cf. Sect. 5) and for the classical decision problem in the book by Börger, Gure-
vich, and Grädel [6].
We conclude this paper by pointing out some interesting (and in our opinion important) open issues.
• While the work on word structures concentrated so far on strings with a successor
relation Succ, one should also consider the cases where an additional predefined
linear order < is available on the word structures or the successor relation Succ is
replaced by such a linear order. While for full ESO or MSO, Succ and < are freely
interchangeable, this is not the case for many of the limited ESO-prefix classes.
Preliminary results suggest that most of the results in this paper carry over to the
< case. Other variants would be structures with function symbols [2], additional
relations like set comparison operators [30], or position arithmetic (for instance
M (i, j) ≡ 0 ≤ i ≤ j < n ∧ i+j = n−1 in [34]).
• Delineate the tractability/intractability frontier for all SO prefix classes over graphs,
and settle the complexity characterization. Over strings, settle the complexity of the
open fragments (including Σ21 (∃∀), Σ21 (∃∗ ∀∃∗ ), etc).
• Study SO prefix classes over further interesting classes of structures (for instance
planar graphs).
• The scope of [15,16] is finite strings. However, infinite strings or ω-words are another important area of research. In particular, Büchi has shown that an analogue of his theorem (Proposition 2.1) also holds for ω-words [8]. For an overview of this and many other important results on ω-words, we refer the reader to the excellent survey by Thomas [48]. In this context, it would be interesting to see which of
the results established so far survive for ω-words. For some results, for instance
regularity of ESO(∃∗ ∀∀), this is obviously the case as no finiteness assumption on
the input word structures was made in the proof. For determining the regularity or
nonregularity of some classes such as ESO(∃∗ ∀∃∗ ), further research is needed.
References
1. Abiteboul, S., Buneman, P., Suciu, D.: Data on the Web: From Relations to Semistructured
Data and XML. Morgan Kaufmann, San Francisco (1999)
2. Barbanchon, R., Grandjean, E.: The minimal logically-defined NP-complete problem. In:
Diekert, V., Habib, M. (eds.) STACS 2004. LNCS, vol. 2996, pp. 338–349. Springer,
Heidelberg (2004)
3. Basin, D., Klarlund, N.: Hardware verification using monadic second-order logic. In: Wolper,
P. (ed.) CAV 1995. LNCS, vol. 939, pp. 31–41. Springer, Heidelberg (1995)
4. Baumgartner, R., Flesca, S., Gottlob, G.: Visual web information extraction with Lixto. In:
VLDB, pp. 119–128 (2001)
5. Bodlaender, H.L.: A Linear-Time Algorithm for Finding Tree-Decompositions of Small
Treewidth. SIAM Journal on Computing 25, 1305–1317 (1996)
6. Börger, E., Grädel, E., Gurevich, Y.: The Classical Decision Problem. Springer, Heidelberg
(1997)
7. Brüggemann-Klein, A., Murata, M., Wood, D.: Regular tree and regular hedge languages
over non-ranked alphabets: Version 1, April 3, 2001. Technical Report HKUST-TCSC-2001-
05, Hong Kong University of Science and Technology, Hong Kong SAR, China (2001)
8. Büchi, J.R.: On a Decision Method in Restricted Second-Order Arithmetic. In: Nagel, E., et
al. (eds.) Proc. International Congress on Logic, Methodology and Philosophy of Science,
pp. 1–11. Stanford University Press, Stanford (1960)
9. Büchi, J.R.: Weak second-order arithmetic and finite automata. Zeitschrift für mathematische
Logik und Grundlagen der Mathematik 6, 66–92 (1960)
10. Comon, H., Dauchet, M., Gilleron, R., Jacquemard, F., Lugiez, D., Löding, C.,
Tison, S., Tommasi, M.: Tree Automata Techniques and Applications (Web book) (2008),
http://tata.gforge.inria.fr/ (viewed September 25, 2009)
11. Courcelle, B.: The Monadic Second-Order Logic of Graphs I: Recognizable Sets of Finite
Graphs. Information and Computation 85, 12–75 (1990)
12. Doner, J.: Tree acceptors and some of their applications. Journal of Computer and System
Sciences 4, 406–451 (1970)
13. Ebbinghaus, H.D., Flum, J.: Finite Model Theory. In: Perspectives in Mathematical Logic.
Springer, Heidelberg (1995)
The Model Checking Problem for Prefix Classes of Second-Order Logic 249
14. Eiter, T., Gottlob, G., Gurevich, Y.: Normal Forms for Second-Order Logic over Finite Struc-
tures, and Classification of NP Optimization Problems. Annals of Pure and Applied Logic 78,
111–125 (1996)
15. Eiter, T., Gottlob, G., Gurevich, Y.: Existential Second-Order Logic over Strings. Journal of
the ACM 47, 77–131 (2000)
16. Eiter, T., Gottlob, G., Schwentick, T.: Second-Order Logic over Strings: Regular and Non-
Regular Fragments. In: Kuich, W., Rozenberg, G., Salomaa, A. (eds.) DLT 2001. LNCS,
vol. 2295, pp. 37–56. Springer, Heidelberg (2002)
17. Elgaard, J., Klarlund, N., Møller, A.: MONA 1.x: New techniques for WS1S and WS2S. In:
Vardi, M.Y. (ed.) CAV 1998. LNCS, vol. 1427, pp. 516–520. Springer, Heidelberg (1998)
18. Elgot, C.C.: Decision problems of finite automata design and related arithmetics. Transac-
tions of the American Mathematical Society 98, 21–51 (1961)
19. Fagin, R.: Generalized First-Order Spectra and Polynomial-Time Recognizable Sets. In:
Karp, R.M. (ed.) Complexity of Computation, pp. 43–74. AMS (1974)
20. Gottlob, G.: Second-order logic over finite structures - report on a research programme. In:
Basin, D., Rusinowitch, M. (eds.) IJCAR 2004. LNCS (LNAI), vol. 3097, pp. 229–243.
Springer, Heidelberg (2004)
21. Gottlob, G., Koch, C.: Monadic queries over tree-structured data. In: LICS, pp. 189–202
(2002)
22. Gottlob, G., Koch, C.: Monadic datalog and the expressive power of languages for web in-
formation extraction. Journal of the ACM 51, 74–113 (2004)
23. Gottlob, G., Koch, C., Baumgartner, R., Herzog, M., Flesca, S.: The Lixto data extraction
project - back and forth between theory and practice. In: PODS, pp. 1–12 (2004)
24. Gottlob, G., Koch, C., Pichler, R.: Efficient algorithms for processing XPath queries. ACM
Trans. Database Syst. 30, 444–491 (2005)
25. Gottlob, G., Kolaitis, P., Schwentick, T.: Existential second-order logic over graphs: Chart-
ing the tractability frontier. In: 41st Annual Symposium on Foundations of Computer Sci-
ence (FOCS 2000), Redondo Beach, California, USA, November 12-14, pp. 664–674. IEEE
Computer Society Press, Los Alamitos (2000)
26. Gottlob, G., Kolaitis, P., Schwentick, T.: Existential second-order logic over graphs: Charting
the tractability frontier. Journal of the ACM 51, 312–362 (2004)
27. Gottlob, G., Pichler, R., Wei, F.: Monadic datalog over finite structures with bounded
treewidth. In: PODS, pp. 165–174 (2007)
28. Grädel, E., Rosen, E.: Two-Variable Descriptions of Regularity. In: Proceedings 14th Annual
Symposium on Logic in Computer Science (LICS 1999), Trento, Italy, July 2-5, pp. 14–23.
IEEE Computer Science Press, Los Alamitos (1999)
29. Grandjean, E.: Universal quantifiers and time complexity of random access machines. Math-
ematical Systems Theory 13, 171–187 (1985)
30. Hachaïchi, Y.: Fragments of monadic second-order logics over word structures. Electr. Notes
Theor. Comput. Sci. 123, 111–123 (2005)
31. Henriksen, J., Jensen, J., Jørgensen, M., Klarlund, N., Paige, B., Rauhe, T., Sandholm,
A.: Mona: Monadic second-order logic in practice. In: Brinksma, E., Steffen, B., Cleave-
land, W.R., Larsen, K.G., Margaria, T. (eds.) TACAS 1995. LNCS, vol. 1019, pp. 89–110.
Springer, Heidelberg (1995)
32. Immerman, N.: Descriptive Complexity. Springer, Heidelberg (1999)
33. Kolaitis, P., Papadimitriou, C.: Some Computational Aspects of Circumscription. Journal of
the ACM 37, 1–15 (1990)
34. Langholm, T., Bezem, M.: A descriptive characterisation of even linear languages. Gram-
mars 6, 169–181 (2003)
35. Lautemann, C., Schwentick, T., Thérien, D.: Logics for context-free languages. In: Proc.
1994 Annual Conference of the EACSL, pp. 205–216 (1995)
1 Introduction
The existence of a logic capturing polynomial time remains the central problem
in descriptive complexity. A proof that such a logic does not exist would yield
that P ≠ NP. The problem was addressed by Yuri Gurevich in various papers
(e.g. [8,9]; recent articles on this question are [7,12]). The question originated
in database theory: In a fundamental paper [2] on the complexity and expres-
siveness of query languages, Chandra and Harel considered a related question,
namely they asked for a recursive enumeration of the class of all queries com-
putable in polynomial time.
By a result due to Immerman [10] and Vardi [13] least fixed-point logic LFP
captures polynomial time on ordered structures. However the property of an ar-
bitrary structure of having a universe of even cardinality is not expressible in
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 251–276, 2010.
© Springer-Verlag Berlin Heidelberg 2010
252 Y. Chen and J. Flum
LFP. There are artificial logics capturing polynomial time on arbitrary struc-
tures, but they do not fulfill a natural requirement on logics in this context:
p-Acc≤
Instance: A nondeterministic Turing machine M and
n ∈ N in unary.
Parameter: ‖M‖, the size of M.
Question: Does M accept the empty input tape in at
most n steps?
Then
L≤ satisfies (1) if and only if p-Acc≤ ∈ XP. (2)
in normal form) a factor of the form ‖A‖^{O(width(ϕ))} suffices, where width(ϕ), the
width of ϕ, essentially is the maximum number of free variables in a subformula
of ϕ. The main result of Sect. 4 shows that the existence of a bound of this type
for the model-checking problem of the logic L≤ is equivalent to p-Acc≤ ∈ FPT.
Let P[tc] ≠ NP[tc] mean that for all time constructible and increasing func-
tions h the class of problems decidable in deterministic polynomial time in h
and the class of problems decidable in nondeterministic polynomial time in
h are distinct, that is, DTIME(h^{O(1)}) ≠ NTIME(h^{O(1)}). In Sect. 6 we show
that P[tc] ≠ NP[tc] implies that p-Acc≤ ∉ FPT. Furthermore a stronger hy-
pothesis, where DTIME(h^{O(1)}) ≠ NTIME(h^{O(1)}) is replaced by NTIME(h^{O(1)}) ⊄
DTIME(h^{O(log h)}), implies that p-Acc≤ ∉ XP (and thus by (2), it implies that L≤
does not capture polynomial time). In [4] we related these hypotheses to other
statements of complexity theory; in particular, we saw that P[tc] ≠ NP[tc]
holds if there is a P-bi-immune problem in NP.
We also study some variants of p-Acc≤ . First we deal with p-Acc= , the
problem obtained from p-Acc≤ by asking for an accepting run of exactly n
steps. We show that p-Acc= is related to a logic L= as p-Acc≤ is to the logic
L≤ . In Sect. 5 we improve a result of [1] by showing that p-Acc= ∈ FPT if
and only if E = NE (that is, DTIME(2^{O(n)}) = NTIME(2^{O(n)})). Furthermore, in
Sect. 7 we introduce a halting problem for deterministic Turing machines, the
“deterministic version” of p-Acc≤ , and show that it is an example of a problem
nonuniformly fixed-parameter tractable but provably not contained in uniform
XP; to the best of our knowledge, this is the first natural such example.
Finally, in Sect. 8, we consider the construction problem associated with
p-Acc≤ and show that it is not fpt Turing reducible to p-Acc≤ in case p-Acc≤ ∉ XP.
A conference version of this paper appeared as [3].
2 Preliminaries
In this section we review some of the basic concepts of parameterized complexity
and of logics and their complexity. We refer to [5] for notions not defined here.
2.2 Logic
A vocabulary τ is a finite set of relation symbols. Each relation symbol has
an arity. A structure A of vocabulary τ , or τ -structure (or, simply structure),
consists of a nonempty set A called the universe, and an interpretation RA ⊆ Ar
of each r-ary relation symbol R ∈ τ . We say that A is finite, if A is a finite set.
All structures in this paper are assumed to be finite.
For a structure A we denote by ‖A‖ the size of A, that is, the length of a
reasonable encoding of A as a string in {0, 1}∗ (see, e.g., [5] for details). If necessary,
we can assume that the universe of a finite structure is [m] := {1, . . . , m} for
some natural number m ≥ 1, as all the properties of structures we consider are
invariant under isomorphisms; in particular, it suffices that from the encoding of
A we can recover A up to isomorphism. The reader will easily convince himself
that we can assume that there is a computable function lgth such that for every
vocabulary τ and m ≥ 1 (we just collect the properties of lgth we use in Sect. 4):
(a) ‖A‖ = lgth(τ, m) for every τ -structure A with universe of cardinality m (that
is, for fixed τ and m, the encoding of each τ -structure with universe of m
elements has length equal to lgth(τ, m));
(b) lgth(τ, m) ≥ max{2, m};
(c) for fixed τ , the function m → lgth(τ, m) is time constructible and lgth(τ, m)
is polynomial in m;
(d) lgth(τ, m) < lgth(τ ′, m′ ) for all τ, τ ′ with τ ⊆ τ ′ and m, m′ with m < m′ ;
(e) lgth(τ, m) = O(log |τ | · |τ | · m) for every τ containing only unary relation
symbols;
(f) lgth(τ ∪ {R}, m) = O(lgth(τ, m) + m2 ) for every binary relation symbol R not
in τ .
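To make (a)–(f) concrete, here is one possible choice of lgth (a sketch of one reasonable encoding; the paper only assumes that some function with these properties exists): the universe size m in unary, one separator bit, and for each r-ary relation symbol the m^r-bit characteristic string of its interpretation.

```python
def lgth(arities, m):
    """Bit length of a fixed-format encoding of a structure with
    universe [m]: m in unary, one separator bit, then for each r-ary
    relation symbol the characteristic bit-string (length m^r) of its
    interpretation.  The vocabulary is given as the tuple of the
    arities of its symbols."""
    assert m >= 1
    return m + 1 + sum(m ** r for r in arities)
```

Property (b) holds since the value is at least m + 1; (d) holds because every summand grows strictly in m and extra symbols only add summands; (f) holds because a new binary symbol contributes exactly m^2 further bits.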
Sometimes, for a structure A we denote by <A the ordering on A given by the
encoding of A.
We assume familiarity with first-order logic FO and its extension least fixed-
point logic LFP. We denote by FO[τ ] and LFP[τ ] the set of sentences of vocabu-
lary τ of FO and of LFP, respectively. It is known that LFP captures polynomial
time on the class of ordered structures.
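As a toy illustration of the last point (ours, not the paper's): an LFP formula is evaluated by iterating a monotone operator until it stabilizes, and since each stage adds at least one tuple out of at most |A|^r many, the iteration is polynomial for fixed arity. The classic example is transitive closure, TC(x, y) ≡ [LFP_{T,x,y} E(x, y) ∨ ∃z (E(x, z) ∧ T(z, y))](x, y):

```python
def lfp(op):
    """Least fixed point of a monotone operator on sets of tuples,
    computed by naive iteration starting from the empty set."""
    stage = set()
    while True:
        nxt = op(stage)
        if nxt == stage:
            return stage
        stage = nxt

def transitive_closure(edges):
    """Evaluate TC via the operator T -> E ∪ {(x,y) | E(x,z), T(z,y)}."""
    def op(T):
        return set(edges) | {(x, y) for (x, z) in edges for (u, y) in T if u == z}
    return lfp(op)
```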
A Logic for PTIME and a Parameterized Halting Problem 255
|ϕ| · ‖A‖^{O(|ϕ|)} .
Proof. For the reader’s convenience we present a proof. Let ϕ(x̄, Ȳ ) with x̄ =
x1 , . . . , xr and Ȳ = Y1 , . . . , Ys be an LFP-formula and let A be a structure. For
interpretations R̄ of Ȳ , we set
ϕ(A, R̄) := {ā ∈ A^r | A |=LFP ϕ(ā, R̄)}.
By induction on ϕ we show how the set ϕ(A, R̄) can be evaluated in the time
bound of the claim of our proposition.
If ϕ does not start with an LFP-operator, we can write ϕ as a “first-order com-
bination” of formulas ϕ1 , . . . , ϕs , which are atomic formulas or formulas starting
with an LFP-operator. Let x̄i be the sequence of first-order variables free in ϕi .
By induction hypothesis we know that we can compute the sets ϕi (A, R̄) ⊆ A^{|x̄i|}
Hence, if L is a P-bounded logic for P, then for every L-sentence ϕ the algorithm
A witnesses that ModL (ϕ) ∈ P. However, we do not necessarily know ahead of
time the bounding polynomial.
In this section we introduce the variants of least fixed-point logic relevant to our
paper.
For a vocabulary τ let τ< := τ ∪ {<}, where < is a binary relation symbol not
in τ . For every class of τ -structures C in P closed under isomorphisms the class
of τ< -structures
C< := {(A, <) | A ∈ C and < an ordering on A} (5)
is in P, too; hence, as the logic LFP captures polynomial time on the class of
ordered structures, there is an LFP[τ< ]-sentence axiomatizing C< . However, we
are interested in a sentence axiomatizing the class C.
In order to obtain a logic that captures polynomial time on all structures one
has considered variants of LFP obtained by restricting to order-invariant sen-
tences or by modifying the semantics such that all sentences are order-invariant.
In this section we recall the corresponding logics. We start by introducing the
respective notions of invariance.
Let ϕ be an LFP[τ< ]-sentence. For a τ -structure A we write (ϕ, A) ∈ Inv if
(A, <1 ) |=LFP ϕ ⇐⇒ (A, <2 ) |=LFP ϕ for all orderings <1 , <2 on A. For m ≥ 1 we write
(ϕ, m) ∈ Inv if (ϕ, A) ∈ Inv for every τ -structure A with |A| = m, and
(ϕ, ≤ m) ∈ Inv if (ϕ, A) ∈ Inv for every τ -structure A with |A| ≤ m.
We call ϕ order-invariant if (ϕ, A) ∈ Inv for all τ -structures A.
Note that every LFP[τ< ]-sentence axiomatizing a class of the form C< (see (5))
is order-invariant.
The different degrees of invariance lead to the following different logics. For
each of the logics L defined below we let
L[τ ] := LFP[τ< ].
Hence, these logics only differ in their semantics. The logic Linv is the first naive
attempt to get an (effectively) P-bounded logic for P. Its semantics is fixed by
A |=Linv ϕ ⇐⇒ ϕ is order-invariant and (A, <A ) |=LFP ϕ
(recall that <A denotes the ordering on A given by the encoding of A).
Clearly (and this remark will also apply to the logics Lstr , L= , and L≤ to be
defined yet)
all properties in P are expressible in Linv .
In fact, for a class C ∈ P of τ -structures closed under isomorphisms, every
LFP[τ< ]-sentence axiomatizing the class C< is an Linv [τ ]-sentence axiomatiz-
ing C.
The logic Linv is a logic for P, as
ModLinv (ϕ) = {A | (A, <A ) ∈ ModLFP (ϕ)}
if ϕ ∈ LFP[τ< ] is invariant, and ModLinv (ϕ) = ∅ otherwise. However, as already
remarked by Yuri Gurevich in [9], a simple application of a theorem of Tracht-
enbrot shows that the set of invariant LFP[τ< ]-sentences is not decidable and
thus |=Linv is not decidable; hence Linv is not a P-bounded logic for P.
For the logic Lstr we require invariance in the corresponding structure:
A |=Lstr ϕ ⇐⇒ (ϕ, A) ∈ Inv and (A, <A ) |=LFP ϕ .
For a binary relation symbol E, consider an FO[{E}< ]-sentence ϕ expressing
that E is not a graph or that in the ordering < there are two consecutive elements
which are not related by an edge. The class ModLstr (ϕ) is the complement of the
class of graphs having a Hamiltonian path and hence it is coNP-complete (a dif-
ferent coNP-complete class was axiomatized by Gurevich in [9, Theorem 1.16]).
As an easy consequence we get:
Proposition 3. The following statements are equivalent:
– Lstr is a logic for P.
– P = NP
– Lstr is an effectively P-bounded logic for P.
Proof. Assume that P = NP. To show that Lstr is an effectively P-bounded logic
for P we consider the problem
Proposition 4. The following statements are equivalent:
– L= is a logic for P.
– E = NE
– L= is an effectively P-bounded logic for P.
Finally we introduce the logic L≤ , where invariance in all structures of the same
or smaller cardinality is required:
A |=L≤ ϕ ⇐⇒ (ϕ, ≤ |A|) ∈ Inv and (A, <A ) |=LFP ϕ .
If an LFP[τ< ]-sentence ϕ is not order-invariant, then the class ModL≤ (ϕ) only
contains (up to isomorphism) finitely many structures and hence it is in P.
Therefore L≤ (like Linv ) is a logic for P.
In particular, Linv and L≤ have less expressive power than Lstr (if P ≠ NP)
and less than L= (if E ≠ NE). Clearly, if P = NP (and hence E = NE), then all of
Linv , Lstr , L= and L≤ have the same expressive power. Otherwise we have:
Proposition 5. 1. If P ≠ NP, then there is a class axiomatizable in Lstr but
not in L= .
2. If E ≠ NE, then there is a class axiomatizable in L= but not in Lstr .
Proof. To get (1) we observe that the complement of the class of graphs hav-
ing a Hamiltonian path, a class axiomatizable in Lstr as we have seen, is not
axiomatizable in L= if P ≠ NP; this is shown by the following claim.
Claim 1: Let C be a class of τ -structures. Assume that C ∉ P. Furthermore
assume that for every m ∈ N with m ≥ 2 there is a structure Am ∈ C such that
|Am | = m. Then C is not axiomatizable in L= .
Proof of Claim 1: Assume that C = ModL= (ϕ). For m ≥ 2 we have (ϕ, m) ∈
Inv, as Am |=L= ϕ. Clearly, (ϕ, 1) ∈ Inv. Hence ϕ is order-invariant and thus
ModLFP (ϕ) = C< . So C< and hence C are in P, a contradiction.
A proof of part (2), based on the following claim, will be presented in Sect. 4.1.
A ∈ C ⇐⇒ |A| ∈ X.
we get
(ϕ, ≤ |A|) ∈ Inv ⇐⇒ (A |=L≤ ϕ or A |=L≤ ¬ϕ). (6)
p-Acc≤
Instance: A nondeterministic Turing machine M and
n ∈ N in unary.
Parameter: M.
Question: Does M accept the empty input tape in at
most n steps?
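For intuition (a brute-force sketch of ours, not a construction from the paper), (M, n) ∈ p-Acc≤ can be decided by breadth-first search through all configurations reachable within n steps; the search is exponential in ‖M‖ in general, but for each fixed machine the slice is easy, which is the slicewise view taken below. A machine is modelled as a map from (state, scanned symbol) to the list of nondeterministic choices (next state, written symbol, head move):

```python
from collections import deque

def acc_le(delta, start, accept, n):
    """Does the NTM given by delta accept the empty input tape within
    at most n steps?  Single tape with blank '_'; a configuration is
    (state, head position, non-blank cells)."""
    init = (start, 0, frozenset())
    queue, seen = deque([(init, 0)]), {init}
    while queue:
        (state, pos, tape), steps = queue.popleft()
        if state == accept:
            return True
        if steps == n:
            continue
        cells = dict(tape)
        for (q, w, move) in delta.get((state, cells.get(pos, '_')), []):
            nxt = dict(cells)
            nxt[pos] = w
            nxt = {p: c for p, c in nxt.items() if c != '_'}
            cfg = (q, pos + move, frozenset(nxt.items()))
            if cfg not in seen:
                seen.add(cfg)
                queue.append((cfg, steps + 1))
    return False
```

Slicewise monotonicity is visible directly: enlarging n only enlarges the explored set, so a "yes" stays a "yes".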
We shall see in this section how the complexity of this problem is related to
properties of the logic L≤ . We start with the following simple observation on the
complexity of p-Acc≤ .
Proposition 6. The problem p-Acc≤ is in the class FPTnu .
Proof. Fix k ∈ N; then there are only finitely many nondeterministic Turing
machines M with ‖M‖ = k, say, M1 , . . . , Ms . For each i ∈ [s] let ℓi be the
smallest natural number such that there exists an accepting run of Mi , started
with empty input tape, of length ℓi . We set ℓi = ∞ if Mi does not accept the
empty input tape. Hence the algorithm Ak that on any instance (M, n) of p-Acc≤
with ‖M‖ = k determines the i with M = Mi , and then accepts if and only if
ℓi ≤ n, decides the kth slice of p-Acc≤ . It has running time O(‖M‖ + n); thus it
witnesses that p-Acc≤ ∈ FPTnu .
This observation can easily be generalized. We call a parameterized problem
(Q, κ) slicewise monotone if its instances have the form (x, n), where x ∈ {0, 1}∗
and n ∈ N is given in unary, if κ(x, n) = |x|, and finally if for all x ∈ {0, 1}∗ and
n, n′ ∈ N we have
(x, n) ∈ Q and n < n′ imply (x, n′ ) ∈ Q.
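Upward closure of the slices is exactly what makes such problems nonuniformly tractable: the x-th slice {n | (x, n) ∈ Q} equals {n | n ≥ t_x} for some threshold t_x ∈ N ∪ {∞}, so with t_x supplied as (in general uncomputable) nonuniform advice, the slice is decided by a single comparison. A minimal sketch:

```python
def slice_decider(threshold):
    """Decider for one slice of a slicewise monotone problem.
    threshold is the least n in the slice, or None if the slice is
    empty; it is nonuniform advice, not something we compute."""
    return lambda n: threshold is not None and n >= threshold
```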
In particular, p-Acc≤ is slicewise monotone and the preceding argument shows:
The following observations will lead to a proof of the direction “from right to
left” in the statements of Theorem 1.
For an L≤ -sentence ϕ let τϕ be the set of relation symbols distinct from <
that do occur in ϕ. For a suitable time constructible function t : N → N we
will need a nondeterministic Turing machine Mϕ (t) that, started with empty
tape, operates as follows: In a first phase it writes a word of the form 1m for
some m ≥ 1 on some tape. The second phase (the main phase) consists of at
most t(m) + 1 steps (this can be ensured as t is time constructible). If Mϕ (t)
does not stop during the first t(m) steps of the main phase, then it stops in the
next step and rejects. During these t(m) steps, Mϕ (t) guesses (the encoding of)
a τϕ -structure A with universe [m] and two orderings <1 and <2 on [m] and
checks whether (A, <1 ) |=LFP ϕ ⇐⇒ (A, <2 ) |=LFP ϕ . If this is not the case,
then Mϕ (t) accepts; otherwise it rejects.
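Deterministically, the guess of Mϕ(t) corresponds to the following exhaustive search (our sketch; for concreteness the vocabulary is a single unary symbol P, and the sentence is supplied as a Boolean function of the structure and an ordering): look for a structure with universe [m] and two orderings on which the sentence disagrees.

```python
from itertools import chain, combinations, permutations

def invariance_witness(check, m):
    """Return (P, o1, o2) with check(P, o1) != check(P, o2) if the
    sentence check is not order-invariant on universes of size m,
    else None.  Exhaustive where the machine simply guesses."""
    universe = range(1, m + 1)
    subsets = chain.from_iterable(combinations(universe, k) for k in range(m + 1))
    orders = list(permutations(universe))
    for P in map(frozenset, subsets):
        for o1 in orders:
            for o2 in orders:
                if check(P, o1) != check(P, o2):
                    return (P, o1, o2)
    return None
```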
The first phase takes m steps. To guess a τϕ -structure A with universe [m]
and two orderings <1 and <2 requires O(lgth(τϕ , m) + 2m^2 ) bits (see Sect. 2.2);
thus for some d1 ∈ N the machine Mϕ (t) needs
additional steps before it stops (assuming tϕ (m) ≤ t(m)). Note that tϕ is in-
creasing. Therefore we have
(by the definition (7) of the function tϕ and the properties of the lgth-
function mentioned in Sect. 2.2).
Now we can show the direction “from right to left” in the statements of Theo-
rem 1. We give the proof for the claims (1) and (2); obvious modifications yield
(3) and (4).
Assume p-Acc≤ ∈ FPTuni (p-Acc≤ ∈ FPT), that is, assume that (M, n) ∈
p-Acc≤ can be solved in time
f (‖M‖) · n^e
A |=L≤ ϕ
can be solved in time h(|ϕ|) · ‖A‖^{O((1+depth(ϕ))·width(ϕ))} .
Before we proceed with the proof of Theorem 1, it is worthwhile to extract from
the previous argument information relevant for the logic L= . We introduce the
corresponding halting problem:
p-Acc=
Instance: A nondeterministic Turing machine M and
n ∈ N in unary.
Parameter: M.
Question: Does M accept the empty input tape in exactly
n steps?
Then:
Lemma 2. If p-Acc= ∈ FPT, then L= is an effectively depth-width P-bounded
logic for P.
and thus,
A |=L= ϕ ⇐⇒ (Mϕ (tϕ ), |A| + tϕ (|A|)) ∉ p-Acc= and (A, <A ) |=LFP ϕ .
Thus our claim can be derived in exactly the same way as the corresponding
statement for L≤ .
(a) If |A| < m0 , then (A, <A ) |=LFP ϕM for all orderings <A on A.
(b) If |A| ≥ m0 and the subsets P0A , . . . , PkA do not form a partition of A, then
(A, <A ) |=LFP ϕM for all orderings <A on A.
(c) Let |A| ≥ m0 and assume that P0A , . . . , PkA form a partition of A and <A is
an ordering on A. Let a_1 , . . . , a_{|A|} be the enumeration of the elements of A
according to the ordering <A , and choose i_s ∈ {0, . . . , k} such that a_s ∈ P^A_{i_s}
for s ∈ [|A|].
(i) If there is a j ∈ [|A| − 1] such that 1, i_1 , . . . , i_j is the sequence of states
of a complete run of M, started with empty input tape (in particular,
i_s ≠ 0 for all s ∈ [j]), then (A, <A ) |=LFP ϕM if and only if this run of M
is a rejecting one.
(ii) If for all j ∈ [|A| − 1] the sequence 1, i_1 , . . . , i_j does not correspond to a
complete run of M with empty input tape, then (A, <A ) |=LFP ϕM .
As by (6)
we obtain by (9)
(M, m) ∉ p-Acc≤ ⇐⇒ (Am |=L≤ ϕM or Am |=L≤ ¬ϕM ). (12)
Therefore, by (11) and (10), we see that there is a (computable) function f and
a constant e ∈ N (recall that for all nondeterministic Turing machines M the
depth of ϕM is one and that there is a constant bounding the width of ϕM ) such
that (M, m) ∈ p-Acc≤ can be solved in time f (‖M‖) · m^e . This finishes the proof
of Theorem 1.
Again we extract from the proof the information on p-Acc= and L= that we
shall need in Sect. 5.
(M, m) ∉ p-Acc= ⇐⇒ (Am |=L= χM or Am |=L= ¬χM ) (13)
By refining the proof of the second part of Theorem 1 we obtain the following
result (related to Gurevich’s result [9, Theorem 1.16]):
Theorem 7. The problem of deciding, given a structure A and an L≤ -sentence ϕ
of bounded width, whether A |=L≤ ϕ, is coNP-complete.
Proof. Clearly the problem is in coNP. Now let Q be any problem in coNP. We
give a polynomial reduction of Q to the problem in the statement. Let M be a
polynomial time nondeterministic Turing machine such that for all x ∈ {0, 1}∗
we have
x ∈ Q ⇐⇒ no run of M accepts x.
Choose c, d ∈ N such that the running time of M on input x is bounded by c · |x|^d .
We fix x ∈ {0, 1}∗ . Again we assume that [k] is the set of states of M and that
1 is its starting state, and set τ := {P0 , . . . , Pk } with unary relation symbols
P0 , . . . , Pk . Along the lines of the proof of the second part of Theorem 1, in
time polynomial in |x| one can obtain an LFP[τ< ]-sentence ϕM,x in normal form
with the properties (a)–(c) (on page 264), where now in (c)(i),(ii) we consider
complete runs started with the word x in the input tape. Again we can achieve
that the width of ϕM,x does not depend on x (once more, we use the fact that
for i ∈ [|x|] we can address the ith element in the ordering < by a formula of
width 3). Let Ax be any τ -structure with |Ax | ≥ max{m0 (M), c · |x|^d }. Then
x ∈ Q ⇐⇒ Ax |=L≤ ϕM,x ,
and hence x → (Ax , ϕM,x ) is the desired reduction.
We close this section by a proof of Proposition 5 (2).
Proof of Proposition 5 (2): Let X be a set of natural numbers in binary in
NE \ E. Then X(un) ∈ NP \ P, where X(un) is the set of natural numbers of X
in unary. Hence there is a nondeterministic Turing machine M that given m ∈ N
in unary decides whether m ∈ X(un) in polynomial time, say, in time c · m^d .
We may assume that every run of M on input m has length c · m^d . Similarly to
the sentence ϕM in the proof of Theorem 7, we construct an LFP-sentence ρM
expressing that
if for some m ∈ N the universe has cardinality c · m^d and the relations
P0 , . . . , Pk code a run of M with input 1m , then it is not accepting.
Then for every {P0 , . . . , Pk }-structure A we have
A |=L= ρM ⇐⇒ |A| ∉ {c · m^d | m ∈ X(un)}. (14)
As the set {c · m^d | m ∈ X(un)} of natural numbers in unary is not in P, we
get that ModL= (ρM ) is not axiomatizable in Lstr by Claim 2 in the proof of
Proposition 5.
Hence p-Acc≤ ≤fpt p-Acc= . Recall that p-Acc≤ ∈ FPTnu . On the other hand,
p-Acc= ∉ FPTnu if E ≠ NE, as shown by the main result of this section:
Theorem 2. The following statements are equivalent:
– p-Acc= ∉ FPT.
– p-Acc= ∉ XPnu .
– E ≠ NE.
In [1] it is shown that p-Acc= ∈ XP implies E = NE. By Lemma 2 and
Lemma 3 we get as a consequence of this theorem the following improvement of
Proposition 4.
Corollary 2. The following statements are equivalent:
– L= is a logic for P.
– E = NE.
– L= is an effectively depth-width P-bounded logic for P.
We prove Theorem 2 by the following two lemmas.
Lemma 4. If E = NE, then p-Acc= ∈ FPT.
Proof. Consider the classical problem:
As p-Acc= ∈ XPnu , for some d ∈ N we can decide whether M∗ accepts the empty
string in exactly 2 · n(x)^c many steps in time
(2 · n(x)^c )^d .
Recall that P[tc] ≠ NP[tc] means that DTIME(h^{O(1)}) ≠ NTIME(h^{O(1)}) for all
time constructible and increasing functions h.
The assumption P[tc] ≠ NP[tc] implies P ≠ NP, even E ≠ NE, as seen by
taking as h the identity function and the function 2^n , respectively. At the end
of this section we are going to relate P[tc] ≠ NP[tc] to further statements of
complexity theory. The main result of this section is:
Theorem 3. If P[tc] ≠ NP[tc], then p-Acc≤ ∉ FPT.
The following idea underlies the proof of this result. Assume that p-Acc≤ ∈
FPT. Then, in particular we have a deterministic algorithm deciding p-Acc≤ ,
the (parameterized) acceptance problem for nondeterministic Turing machines.
This yields a way (different from brute force) to translate nondeterministic algo-
rithms into deterministic ones; a careful analysis of this translation shows that
NTIME(h^{O(1)}) ⊆ DTIME(h^{O(1)}) for a suitable time constructible and increasing
function h.
Proof. We define h0 : N → N by
h0 (n) := f (2^e · n^2 )
f (‖M‖) · n^{O(1)}
It seems that the statement (a) is much stronger than (b). In fact, as shown in [4],
“not (b)” implies that
there is an infinite I ∈ P such that for all Q ∈ NP at least one of the
sets Q ∩ I and ({0, 1}∗ \ Q) ∩ I is an infinite set in P,
NTIME(h^{O(1)}) ⊆ DTIME(h^{O(log h)}).
p-Dtm-Exp-Acc≤
Instance: A deterministic Turing machine M and n ∈ N
in unary.
Parameter: M.
Question: Does M accept the empty input tape in at
most 2^n steps?
Theorem 5. p-Dtm-Exp-Acc≤ ∉ XPuni .
p-Dtm-Inp-Exp-Acc≤
Instance: A deterministic Turing machine M, x ∈ {0, 1}∗,
and n ∈ N in unary.
Parameter: M + |x|.
Question: Does M accept x in ≤ 2^n steps?
M0 (x)
// x ∈ {0, 1}∗
p-Constr-Acc≤
Instance: A nondeterministic Turing machine M and
n ∈ N in unary.
Parameter: M.
Problem: Construct an accepting run of ≤ n steps of M
started with empty input tape if there is one
(otherwise report that there is no such run).
Just as we showed that p-Acc≤ ∈ FPTnu (cf. Proposition 6), one gets that
p-Constr-Acc≤ is nonuniformly fixed-parameter tractable (it should be clear
what this means).
Often the construction problem has the same complexity as the corresponding
decision problem, that is, the construction problem is fpt Turing reducible to
the decision problem; for p-Constr-Acc≤ we can show:
Theorem 6. 1. There is an fptuni Turing reduction from p-Constr-Acc≤ to
p-Acc≤ .
2. If p-Acc≤ ∉ XP, then there is no fpt Turing reduction from p-Constr-Acc≤
to p-Acc≤ .
It is not hard to see that the running time of T on the instance (M, n) can be
bounded by ‖M‖^{O(f (‖M‖))} · n.
entries, that is, O(n^{h(‖M‖)}) many for some computable h. For each such possibility
we simulate T by replacing the oracle queries accordingly. For those possibilities
where T yields a purported accepting run of M, we check whether it is really an
accepting run of M.
An analysis of the previous proof shows that we can even rule out the existence
of a Turing reduction with running time O(|x|^{f (κ(x))}) instead of f (κ(x)) · |x|^c . We
call such reductions xp Turing reductions.
Furthermore, Theorem 6 is a special case of a result for slicewise monotone
problems: Let (Q, κ) be a slicewise monotone parameterized problem (this con-
cept was defined just before Lemma 1) and assume that Q has a representation
of the form
(x, n) ∈ Q ⇐⇒ there is y ∈ {0, 1}∗ with |y| ≤ f (|x|) · n^c and (x, n, y) ∈ QW , (20)
9 Conclusions
We have studied the relationship between the complexity of the model-checking
problems of the logics L= and L≤ and the complexity of the parameterized prob-
lems p-Acc= and p-Acc≤ . We have introduced the assumption P[tc] ≠ NP[tc]
and seen that it implies that p-Acc≤ ∉ FPT. A slightly stronger hypothesis
shows that p-Acc≤ ∉ XP and hence that the logic L≤ is not an effectively P-
bounded logic for P. What are reasonable complexity-theoretic assumptions that
imply p-Acc≤ ∉ XPuni and hence that the logic L≤ is not a P-bounded logic
for P?
We believe that a study of the strength of the assumption P[tc] ≠ NP[tc]
and of its consequences deserves further attention.
References
1. Aumann, Y., Dombb, Y.: Fixed structure complexity. In: Grohe, M., Niedermeier,
R. (eds.) IWPEC 2008. LNCS, vol. 5018, pp. 30–42. Springer, Heidelberg (2008)
2. Chandra, A.K., Harel, D.: Structure and complexity of relational queries. Journal
of Computer and System Sciences 25, 99–128 (1982)
3. Chen, Y., Flum, J.: A logic for PTIME and a parameterized halting problem. In:
Proceedings of the 24th Annual IEEE Symposium on Logic in Computer Science
(LICS 2009), pp. 397–406. IEEE Computer Society, Los Alamitos (2009)
4. Chen, Y., Flum, J.: On the complexity of Gödel’s proof predicate. The Journal of
Symbolic Logic 75, 239–254 (2010)
5. Flum, J., Grohe, M.: Parameterized Complexity Theory. Springer, Heidelberg
(2005)
6. Gasarch, W.I., Homer, S.: Relativizations comparing NP and exponential time.
Information and Control 58, 88–100 (1983)
7. Grohe, M.: The quest for a logic capturing PTIME. In: Proceedings of the Twenty-
Third Annual IEEE Symposium on Logic in Computer Science (LICS 2008), pp.
267–271. IEEE Computer Society, Los Alamitos (2008)
8. Gurevich, Y.: Toward logic tailored for computational complexity. In: Computation
and Proof Theory, pp. 175–216. Springer, Heidelberg (1984)
9. Gurevich, Y.: Logic and the challenge of computer science. In: Current Trends in
Theoretical Computer Science, pp. 1–57. Computer Science Press, Rockville (1988)
10. Immerman, N.: Relational queries computable in polynomial time. Information and
Control 68, 86–104 (1986)
11. Mayordomo, E.: Almost every set in exponential time is p-bi-immune. Theoretical
Computer Science 136, 487–506 (1994)
12. Nash, A., Remmel, J.B., Vianu, V.: PTIME queries revisited. In: Eiter, T., Libkin,
L. (eds.) ICDT 2005. LNCS, vol. 3363, pp. 274–288. Springer, Heidelberg (2004)
13. Vardi, M.Y.: The complexity of relational query languages (extended abstract). In:
Proceedings of the Fourteenth Annual ACM Symposium on Theory of Computing
(STOC 1982), pp. 137–146. ACM, New York (1982)
14. Vardi, M.Y.: On the complexity of bounded-variable queries. In: Proceedings of
the Fourteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of
Database Systems (PODS 1995), pp. 266–276. ACM Press, New York (1995)
Inferring Loop Invariants Using Postconditions
1 Overview
Many of the important contributions to the advancement of program proving
have been, rather than grand new concepts, specific developments and simplifi-
cations; they have removed one obstacle after another preventing the large-scale
application of proof techniques to realistic programs built by ordinary program-
mers in ordinary projects. The work described here seeks to achieve such a
practical advance by automatically generating an essential ingredient of proof
techniques: loop invariants. The key idea is that invariant generation should use
not just the text of a loop but its postcondition. Using this insight, the gin-pink
tool can infer loop invariants for non-trivial algorithms including array parti-
tioning (for Quicksort), sequential search, coincidence count, and many others.
The tool is available for free download.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 277–300, 2010.
c Springer-Verlag Berlin Heidelberg 2010
278 C.A. Furia and B. Meyer
Surely, the proof will succeed, but it will not teach us anything since it loses the
fundamental property of independence between the mathematical property to
be achieved and the software artifact that attempts to achieve it – the problem
and the solution.
To mitigate the Assertion Inference Paradox objection, one may invoke the
following arguments:
– The Paradox only arises if the goal is to prove correctness. Specification infer-
ence can have other applications, such as reverse-engineering legacy software.
– Another possible goal of inferring a specification may be to present it to a
programmer, who will examine it for consistency with an intuitive under-
standing of its intended behavior.
– Specification inference may produce an inconsistent specification, revealing
a flaw in the implementation.
– It does not raise the risk of circular reasoning since the specification of every
program unit is explicitly provided, not inferred.
– Having this specification of a loop’s context available gives a considerable
boost to loop invariant inference techniques. While there is a considerable
literature on invariant inference, it is surprising that none of the references
with which we are familiar use postconditions. Taking advantage of post-
conditions makes it possible – as described in the rest of this paper – to
derive the invariants of many important and sometimes sophisticated loop
algorithms that had so far eluded other techniques.
2 Illustrative Examples
This section presents the fundamental ideas behind the loop-invariant generation
technique detailed in Section 5 and demonstrates them on a few examples. It
uses an Eiffel-like [30] pseudocode, which facilitates the presentation thanks to
the native syntax for contracts and loop invariants.
As already previewed, the core idea is to generate candidate invariants by
mutating postconditions according to a few commonly recurring patterns. The
patterns capture some basic ways in which loop iterations modify the program
state towards achieving the postcondition. Drawing both from classic literature
[19,29] and our own more recent investigations, we consider the following fundamental patterns.
Constant relaxation [29,19]: replace one or more constants by variables.
Uncoupling [29]: replace two occurrences of the same variable each by a dif-
ferent variable.
Term dropping [19]: remove a term, usually a conjunct.
Variable aging: replace a variable by an expression that represents the value
the variable had at previous iterations of the loop.
These patterns are then usually used in combination, yielding a number of mu-
tated postconditions. Each of these candidate invariants is then tested for initi-
ation and consecution (see Section 4.1) over any loop, and all verified invariants
are retained.
The following examples show each of these patterns in action. The tool de-
scribed in Sections 3 and 5 can correctly infer invariants of these (and more
complex) examples.
6    until i ≥ n
7    loop
8        i := i + 1
9        if Result ≤ A[i] then Result := A[i] end
10   end
11   ensure ∀ j • 1 ≤ j ∧ j ≤ n =⇒ A[j] ≤ Result
Lines 5–10 may modify variables i and Result but they do not affect the input
argument n, which is therefore a constant with respect to the loop body. The
constant relaxation technique replaces every occurrence of the constant n by the
variable i. The modified postcondition ∀ j • 1 ≤ j ∧ j ≤ i =⇒ A[j] ≤ Result
is indeed an invariant of the loop: after every iteration, the value of Result is
the maximum value of array A over range [1..i].
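To check concretely that this mutated postcondition is an invariant, one can execute the loop and test the candidate after every iteration. The following Python sketch is our own illustration, not part of gin-pink; the routine is transcribed with 1-based arrays, and the initialization (not shown in the excerpt above) is assumed to set i := 1 and Result := A[1].

```python
def check_max_invariant(A):
    # Run the max loop (shown above from line 6) and check, after Init and
    # after every iteration, the candidate invariant obtained by constant
    # relaxation: for all j in 1..i, A[j] <= Result.
    # Arrays are 1-based to match the pseudocode, so A[0] is a placeholder.
    n = len(A) - 1
    i, result = 1, A[1]  # assumed initialization (not shown in the excerpt)
    assert all(A[j] <= result for j in range(1, i + 1))
    while not i >= n:
        i = i + 1
        if result <= A[i]:
            result = A[i]
        # candidate invariant: Result is the maximum of A[1..i]
        assert all(A[j] <= result for j in range(1, i + 1))
    return result

assert check_max_invariant([None, 3, 1, 4, 1, 5]) == 5
```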
2.3 Uncoupling
The reader can check that this is indeed a loop invariant of all loops in routine
partition and that it allows a straightforward partial correctness proof of the
implementation.
When generating candidate invariants, gin-pink does not apply all heuristics
at once but it tries them incrementally, according to user-supplied options. Typ-
ically, the user starts out with just constant relaxation and checks if some non-
trivial invariant is found. If not, the analysis is refined by gradually introducing
the other heuristics – and thus increasing the number of candidate invariants as
well. In the examples below we briefly discuss how often and to what extent this
is necessary in practice.
on properties of the prover (here Boogie), orthogonal to the task of inferring in-
variants. Finally, we ran gin-pink on each of the examples after commenting out
all annotations except for pre and postconditions (but leaving the simple back-
ground theories in); in a few difficult cases (discussed next) we ran additional
experiments with some of the annotations left in. After running the tests, we
measured the relevance of every automatically inferred invariant: we call an in-
ferred invariant relevant if the correctness proof needs it. Notice that our choice
of omitting postcondition clauses that Boogie cannot prove does not influence
relevance, which only measures the fraction of inferred invariants that are useful
for proving correctness.
For each example, Table 1 reports: the name of the procedure under analysis;
the length in lines of code (the whole file including annotations and auxiliary
procedures and, in parentheses, just the main procedure); the total number of
loops (and the maximum number of nested loops, in parentheses); the total num-
ber of variables modified by the loops (scalar variables/array or map variables);
the number of mutated postconditions (i.e., candidate invariants) generated by
the tool; how many invariants it finds; the number and percentage of verified
invariants that are relevant; the total run-time of gin-pink in seconds; the source
(if any) of the implementation and the annotations. The experiments were performed
on a PC equipped with an Intel Quad-Core 2.40 GHz CPU and 4 GB of
RAM, running Windows XP as guest operating system on a VirtualBox virtual
machine hosted by Ubuntu GNU/Linux 9.04 with kernel 2.6.28.
Most of the experiments succeeded with the application of the most basic
heuristics. Procedures Coincidence Count and Longest Common Subsequence are
the only two cases that required a more sophisticated uncoupling strategy where
two occurrences of the same constant within the same formula were modified
to two different aged variables. This resulted in an explosion of the number
of candidate invariants and consequently in an experiment running for over an
hour.
A few programs raised another difficulty, due to Boogie’s need for user-
supplied loop invariants to help automated deduction. Boogie cannot verify any
invariant in Shortest Path, Topological Sort, or Longest Common Subsequence
without additional invariants obtained by means other than the application of
the algorithm itself. On the other hand, the performance with programs Array
Stack Reversal and Dutch National Flag improves considerably if user-supplied
loop invariants are included, but fair results can be obtained even without any
such annotation. Table 1 reports both experiments, with and without user-
supplied annotations.
More generally, Boogie’s reasoning abilities are limited by the amount of in-
formation provided in the input file in the form of axioms and functions that
postulate sound inference rules for the program at hand. We tried to limit this
amount as much as possible by developing the necessary theories before tackling
invariant generation. In other words, the axiomatizations provided are enough
for Boogie to prove functional correctness with a properly annotated program,
but we did not strengthen them merely to ease the inference of invariants.
A richer axiomatization may have removed the need for user-supplied invariants
in the programs considered.
4 Foundations
Having seen typical examples we now look at the technical choices that support
the invariant inference tools. To decouple the loop-invariant generation technique
from the specifics of the programming language, we adopt Boogie from Microsoft
Research [26] as our concrete programming language; Section 4.2 is then devoted
to a concise introduction to the features of Boogie that are essential for the
remainder. Sections 4.1 and 4.3 introduce definitions of basic concepts and some
notational conventions that will be used. We assume the reader is familiar with
standard formal definitions of the axiomatic semantics of imperative programs.
4.1 Invariants
It is impossible to establish these facts automatically for all programs but the
most trivial ones without programmer-provided annotations. The crucial aspect
is the characterization of loops, where the expressive power of universal com-
putation lies. A standard technique to abstract the semantics of any number of
iterations of a loop is by means of loop invariants.
Definition 1 (Inductive loop invariant). Formula φ is an inductive invari-
ant of loop
from Init until Exit loop Body end
iff:
– initiation: φ holds in the state reached by executing Init;
– consecution: φ is preserved by every execution of Body that starts in a state
where φ holds and the exit condition Exit does not, i.e., {φ ∧ ¬Exit} Body {φ}
is a correct Hoare triple.
In the rest of the discussion, inductive invariants will be called just invariants
for short. Note, however, that an invariant in the weaker sense of a property
that stays true throughout the loop’s execution is not necessarily an inductive
invariant: in
from x := 1 until False loop x := −x end
the formula x ≥ −1 will remain true throughout, but is not considered an
inductive invariant because {x ≥ −1} x := −x {x ≥ −1} is not a correct
Hoare triple. In the remainder we will deal solely with inductive loop invariants,
as is customary in the program proving literature.
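Definition 1's two conditions, initiation (φ holds after Init) and consecution (Body preserves φ from any state satisfying φ but not the exit condition), can be checked by brute force over a finite sample of states. The following Python sketch is our own illustration (the sampling makes it a bounded check, not a proof); it shows why x ≥ −1 above is not inductive while abs(x) = 1 is:

```python
def is_inductive(init, exit_cond, body, phi, samples):
    """Bounded inductiveness check for a loop over a single variable x.
    init: () -> x0; body: x -> x'; phi, exit_cond: x -> bool.
    samples: states on which consecution is tested."""
    # Initiation: phi holds right after Init.
    if not phi(init()):
        return False
    # Consecution: phi is preserved by Body from any sampled state
    # satisfying phi with the exit condition false.
    return all(phi(body(x)) for x in samples
               if phi(x) and not exit_cond(x))

samples = range(-10, 11)
body = lambda x: -x
# x >= -1 stays true in the actual run (x alternates between 1 and -1),
# but it is not inductive: from x = 5, where it holds, body yields -5.
assert not is_inductive(lambda: 1, lambda x: False, body,
                        lambda x: x >= -1, samples)
# abs(x) == 1, by contrast, is an inductive invariant of the same loop.
assert is_inductive(lambda: 1, lambda x: False, body,
                    lambda x: abs(x) == 1, samples)
```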
From a design methodology perspective, the invariant expresses a weakened
form of the loop’s postcondition. More precisely [31,19], the invariant is a form
of the loop’s postcondition that applies to a subset of the data, and satisfies the
following three properties:
1. It is strong enough to yield the postcondition when combined with the exit
condition (which states that the loop has covered the entire data).
2. It is weak enough to make it easy to write an algorithm (the loop initialization
Init) that will satisfy the invariant on a subset (usually empty or trivial) of
the data.
3. It is weak enough to make it easy to write an algorithm (the loop body Body)
that, given that the invariant holds on a subset of the data that is not the
entire data, extends it to cover a slightly larger subset.
“Easy”, in the last two conditions, means “much easier than solving the entire
original problem”. The loop consists of an approximation strategy that starts
with the initialization, establishing the invariant, then retains the invariant while
extending the scope by successive approximations to an ever larger set of the
input through repeated executions of the loop body, until it hits the exit condi-
tion, signaling that it now covers the entire data and hence satisfies the loop’s
postcondition. This explains why the various strategies of Section 2, such as
constant relaxation and uncoupling, are heuristics for mutating the loop’s post-
condition into a weaker form. The present work applies the same heuristics to
mutate postconditions of the routine that encapsulates the loop. The connec-
tion between the routine’s and the loop’s postcondition justifies the rationale
behind using weakening heuristics as mutation heuristics to generate invariant
candidates.
4.2 Boogie
Boogie, now in its second version, is both an intermediate verification language
and a verification tool.
The Boogie language combines a typed logical specification language with
an in-the-small imperative programming language with variables, procedures,
contracts, and annotations. The type system comprises a few basic primitive
types as well as type constructors such as one- and two-dimensional arrays.
It supports a relatively straightforward encoding of object-oriented language
constructs. Boogie is part of the Spec# programming environment; mappings
have been defined for other programming languages, including Eiffel [40] and C
[39]. This suggests that the results described here can be generalized to many
other contexts.
The Boogie tool verifies conformance of a procedure to its specification by gen-
erating verification conditions (VC) and feeding them to an automated theorem
prover (the standard one being Z3). The outcome of a verification attempt can
be successful or unsuccessful. In the latter case the tool provides some feedback
Also, there is a direct correspondence between Boogie’s while loop and Eiffel’s
from ... until loop, used in the examples of Section 2 and the definitions in
Section 4.1.
subExp(φ, SubType) denotes the set of sub-expressions of formula φ that are of
syntactic type SubType. For example, subExp(is upper(v,X,1,n), Map) denotes
all mapping sub-expressions in is upper(v,X,1,n), that is only X[j].
replace(φ, old, new, ∗) denotes the formula obtained from φ by replacing ev-
ery occurrence of sub-expression old by expression new. Similarly, replace(φ, old,
new, n) denotes the formula obtained from φ by replacing only the n-th oc-
currence of sub-expression old by expression new, where the total ordering of
sub-expressions is given by a pre-order traversal of the expression parse tree. For
example, replace(is upper(v,X,1,n), j, h, ∗) is
∀ h : int • low ≤ h ∧ h ≤ high =⇒ A[h] ≤ m ,
while replace(is upper(v,X,1,n), j, h, 4) is:
∀ j : int • low ≤ j ∧ j ≤ high =⇒ A[h] ≤ m .
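As a simplified executable model of these operators, the following Python sketch (our own illustration; formulas are encoded as nested tuples rather than Boogie expressions) implements replace with both the ∗ mode and the 1-based pre-order occurrence index. On a formula with the shape of the is upper example, the fourth pre-order occurrence of j is the one inside A[j], matching the example above:

```python
def replace(phi, old, new, which="*"):
    """Replace occurrences of sub-expression `old` in expression `phi`
    (nested tuples, e.g. ('<=', 'low', 'j')) by `new`. which='*' replaces
    every occurrence; which=n replaces only the n-th occurrence in a
    pre-order traversal (1-based). Our own simplified model of the
    paper's replace operator."""
    counter = [0]

    def walk(e):
        if e == old:
            counter[0] += 1
            if which == "*" or counter[0] == which:
                return new
            return e
        if isinstance(e, tuple):
            # Pre-order: the node itself was checked above, children next,
            # left to right.
            return tuple(walk(sub) for sub in e)
        return e

    return walk(phi)

# forall j : int . low <= j and j <= high implies A[j] <= m
phi = ('forall', 'j',
       ('implies', ('and', ('<=', 'low', 'j'), ('<=', 'j', 'high')),
        ('<=', ('select', 'A', 'j'), 'm')))
all_h = replace(phi, 'j', 'h', '*')  # every occurrence of j becomes h
one_h = replace(phi, 'j', 'h', 4)    # only the occurrence inside A[j]
```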
Given a while loop ℓ : while ( ... ) { Body }, targets(ℓ) denotes the set of its
targets: variables (including mappings) that can be modified by its Body; this
includes global variables that appear in the modifies clause of called procedures.
Given a procedure foo, variables(foo) denotes the set of all variables that are
visible within foo, that is its locals and any global variable.
A loop ℓ′ is nested within another loop ℓ, and we write ℓ′ ≺ ℓ, iff ℓ′ belongs
to the Body of ℓ. Notice that if ℓ′ ≺ ℓ then targets(ℓ′) ⊆ targets(ℓ). Given a
procedure foo, its outer while loops are those in its body that are not nested
within any other loop.
The pseudocode in Figure 2 describes the main algorithm. The algorithm op-
erates on a given procedure and returns a set of formulas that are invariant of
some loop in the procedure. Every postcondition post among all postconditions
postconditions(a_procedure) of the procedure is considered separately (line 5).
This is a coarse-grained yet effective way of implementing the term-dropping
strategy outlined in Section 2.4: the syntax of the specification language sup-
ports splitting postconditions into a number of conjuncts, each introduced by
the ensures keyword, hence each of these conjuncts is modified in isolation. It is
reasonable to assume that the splitting into ensures clauses performed by the
1 coupled mutations
2 ( post : FORMULA; constant, variable: EXPRESSION )
3 : SET OF [FORMULA]
4 do
5 Result := replace(post, constant, variable, ∗)
6 aged variable := aging(variable, loop)
7 Result := Result ∪
8 replace(post, constant, aged variable, ∗)
1 uncoupled mutations
2 (post : FORMULA; constant, variable: EXPRESSION)
3 : SET OF [FORMULA]
4 do
5 Result := ∅; index := 1
6 for each occurrence of constant in post do
7 Result := Result ∪
8 {replace(post, constant, variable, index)}
9 aged variable := aging(variable, loop)
10 Result := Result ∪
11 {replace(post, constant, aged variable, index)}
12 index := index + 1

6.1 Discussion
6.2 Limitations
Relevant invariants obtained by postcondition mutation are most of the time
significant, practically useful, and largely complementary to the categories that
are better tackled by other methods (see the next sub-section). Still, the
postcondition mutation technique cannot obtain every relevant invariant.
Failures have two main origins: conceptual limitations and shortcomings of
the currently used technology.
The first category covers invariants that are not expressible as mutations of
the postcondition. This is the case, in particular, whenever an invariant refers to
a local variable whose final state is not mentioned in the postcondition. For
example, the postcondition of procedure max in Section 2.1 does not mention
variable i because its final value n is not relevant to correctness. Correspondingly,
the invariant i ≤ n – which is involved in the partial correctness proof – cannot
be obtained by mutating the postcondition. A potential solution to these
conceptual limitations is two-fold: on the one hand, many of the invariants that
escape postcondition mutation can be obtained reliably by other inference
techniques that do not require postconditions – this is the case of invariant i ≤ n in
Static methods. Historically, the earliest methods for invariant inference were
static, as in the pioneering work of Karr [23]. Abstract interpretation and the
constraint-based approach are the two most widespread frameworks for static
invariant inference (see also [6, Chap. 12]).
Abstract interpretation is, roughly, a symbolic execution of programs over
abstract domains that over-approximates the semantics of loop iteration. Since
the seminal work by Cousot and Cousot [11], the technique has been updated
and extended to deal with features of modern programming languages such as
object-orientation and heap memory-management (e.g., [28,9]).
Constraint-based techniques rely on sophisticated decision procedures over non-
trivial mathematical domains (such as polynomials or convex polyhedra) to repre-
sent concisely the semantics of loops with respect to certain template properties.
Static methods are sound – as is the technique introduced in this paper – and
often complete with respect to the class of invariants that they can infer. Sound-
ness and completeness are achieved by leveraging the decidability of the underly-
ing mathematical domains they represent; this implies that the extension of these
techniques to new classes of properties is often limited by undecidability. In fact,
state-of-the-art static techniques can mostly infer invariants in the form of “well-
behaving” mathematical domains such as linear inequalities [12,10], polynomials
[38,37], restricted properties of arrays [7,5,20], and linear arithmetic with unin-
terpreted functions [1]. Loop invariants in these forms are extremely useful but
rarely sufficient to prove full functional correctness of programs. In fact, one of
the main successes of abstract interpretation has been the development of sound
but incomplete tools [2] that can verify the absence of simple and common pro-
gramming errors such as division by zero or void dereferencing. Static techniques
for invariant inference are now routinely part of modern static checkers such as
ESC/Java [18], Boogie/Spec# [26], and Why/Krakatoa/Caduceus [22].
The technique of the present paper is complementary to most static tech-
niques in terms of the kinds of invariant that it can infer, because it derives
invariants directly from postconditions. In this respect “classic” static inference
and our inference by means of postcondition mutation can fruitfully work to-
gether to facilitate functional verification; to some extent this happens already
when complementing Boogie’s built-in facilities for invariant inference with our
own technique.
The approaches of [34,21,8,25,24] are those that, for different reasons, share
the most similarities with ours. To our knowledge, [34,21,8,25] are the only other
works applying a static approach to derive loop invariants from annotations. [21]
relies on user-provided assertions nested within loop bodies and essentially tries
to check whether they hold as invariants of the loop. This does not relieve the
user of the burden of writing annotations nested within the code, which is
considerably more complex than providing only contracts in the form of pre-
and postconditions.
In practice, the method of [21] works only when the user-provided annotations
are very close to the actual invariant; in fact the few examples where the tech-
nique works are quite simple and the resulting invariants are usually obtainable
by other techniques that do not need annotations. [8] briefly discusses deriv-
ing the invariant of a for loop from its postcondition, within a framework for
reasoning about programs written in a specialized programming language. [25]
also leverages specifications to derive intermediate assertions, but focusing on
lower-level and type-like properties of pointers. On the other hand, [34] derives
candidate invariants from postconditions in a very different setting than ours,
with symbolic execution and model-checking techniques.
Finally, [24] derives complex loop invariants by first encoding the loop seman-
tics as recurring relations and then instructing a rewrite-based theorem prover
to try to remove the dependency on the iterator variable(s) in the relations. It
shares with our work a practical attitude that favors powerful heuristics over
completeness and leverages state-of-the-art verification tools to boost the infer-
ence of additional annotations.
Dynamic methods. More recently, dynamic techniques have been applied to
invariant inference. The Daikon approach of Ernst et al. [15] showed that dynamic
inference is practical and spawned much derivative work (e.g., [35,13,36] and many
others). In a nutshell, the Daikon approach consists of testing a large number of
candidate properties against several program runs; the properties that are not
violated in any of the runs are retained as “likely” invariants. This implies that
the inference is not sound but only an “educated guess”: dynamic invariant in-
ference is to static inference what testing is to program proofs. Nonetheless, just
like testing is quite effective and useful in practice, dynamic invariant inference
is efficacious and many of the guessed invariants are indeed sound.
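The guess-and-test scheme behind this family of tools can be sketched in a few lines of Python. This is a toy illustration of the idea only, not of Daikon itself; the template set and the trace format are our own:

```python
def likely_invariants(traces, variables):
    """Daikon-style guessing, reduced to its core idea (toy illustration).
    traces: recorded program states, each a dict from variable to value.
    Returns the template instances that no observed state violates."""
    templates = []
    for x in variables:
        # Fixed set of candidate templates over the program's variables.
        templates.append((f"{x} >= 0", lambda s, x=x: s[x] >= 0))
        for y in variables:
            if x != y:
                templates.append((f"{x} <= {y}",
                                  lambda s, x=x, y=y: s[x] <= s[y]))
    # Retain only the properties never violated in any run: "likely" invariants.
    return [name for name, holds in templates
            if all(holds(state) for state in traces)]

# States recorded at the loop head of a summation loop:
traces = [{"i": 0, "total": 0}, {"i": 1, "total": 4}, {"i": 2, "total": 7}]
print(likely_invariants(traces, ["i", "total"]))
# ['i >= 0', 'i <= total', 'total >= 0']
```

Note that "total <= i" is discarded because one of the observed states violates it; with too few runs, unsound guesses survive, which is exactly the sense in which the inference is only an "educated guess".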
Our approach shares with the Daikon approach the idea of guessing a candi-
date invariant and testing it a posteriori. There is an obvious difference between
our approach, which retains only invariants that can be soundly verified, and
dynamic inference techniques, which rely on a finite set of tests. A deeper dif-
ference is that Daikon guesses candidate invariants almost blindly, by trying
out a pre-defined set of user-provided templates (including comparisons between
variables, simple inequalities, and simple list comprehensions). On the contrary,
our technique assumes the availability of contracts (and postconditions in
particular) and leverages them to quickly restrict the search space and reach
good-quality loop invariants in a short time. As is the case for static techniques,
dynamic invariant inference methods can also be usefully combined with ours,
so that invariants discovered by dynamic methods boost the application of the
postcondition-mutation approach.
References
1. Beyer, D., Henzinger, T.A., Majumdar, R., Rybalchenko, A.: Invariant synthesis for
combined theories. In: Cook, B., Podelski, A. (eds.) VMCAI 2007. LNCS, vol. 4349,
pp. 378–394. Springer, Heidelberg (2007)
2. Blanchet, B., Cousot, P., Cousot, R., Feret, J., Mauborgne, L., Miné, A., Monniaux,
D., Rival, X.: A static analyzer for large safety-critical software. In: Proceedings
of the 2003 ACM SIGPLAN Conference on Programming Language Design and
Implementation (PLDI 2003), pp. 196–207. ACM, New York (2003)
3. Böhme, S., Leino, K.R.M., Wolff, B.: HOL-Boogie — an interactive prover for the
Boogie program-verifier. In: Mohamed, O.A., Muñoz, C., Tahar, S. (eds.) TPHOLs
2008. LNCS, vol. 5170, pp. 150–166. Springer, Heidelberg (2008)
4. Boyer, R.S., Moore, J.S.: MJRTY: A fast majority vote algorithm. In: Automated
Reasoning: Essays in Honor of Woody Bledsoe, pp. 105–118 (1991)
5. Bozga, M., Habermehl, P., Iosif, R., Konečný, F., Vojnar, T.: Automatic verifi-
cation of integer array programs. In: Bouajjani, A., Maler, O. (eds.) CAV 2009.
LNCS, vol. 5643, pp. 157–172. Springer, Heidelberg (2009)
6. Bradley, A.R., Manna, Z.: The Calculus of Computation. Springer, Heidelberg
(2007)
7. Bradley, A.R., Manna, Z., Sipma, H.B.: What’s decidable about arrays? In: Emer-
son, E.A., Namjoshi, K.S. (eds.) VMCAI 2006. LNCS, vol. 3855, pp. 427–442.
Springer, Heidelberg (2005)
8. de Caso, G., Garbervetsky, D., Gorín, D.: Reducing the number of annotations in a
verification-oriented imperative language. In: Proceedings of Automatic Program
Verification (2009)
9. Chang, B.Y.E., Leino, K.R.M.: Abstract interpretation with alien expressions and
heap structures. In: Cousot, R. (ed.) VMCAI 2005. LNCS, vol. 3385, pp. 147–163.
Springer, Heidelberg (2005)
10. Colón, M., Sankaranarayanan, S., Sipma, H.: Linear invariant generation using
non-linear constraint solving. In: Hunt Jr., W.A., Somenzi, F. (eds.) CAV 2003.
LNCS, vol. 2725, pp. 420–432. Springer, Heidelberg (2003)
11. Cousot, P., Cousot, R.: Abstract interpretation: A unified lattice model for static
analysis of programs by construction or approximation of fixpoints. In: Proceedings
of the 4th Annual ACM Symposium on Principles of Programming Languages
(POPL 1977), pp. 238–252 (1977)
12. Cousot, P., Halbwachs, N.: Automatic discovery of linear restraints among variables
of a program. In: Proceedings of the 5th Annual ACM Symposium on Principles
of Programming Languages (POPL 1978), pp. 84–96 (1978)
13. Csallner, C., Tillmann, N., Smaragdakis, Y.: DySy: dynamic symbolic execution for
invariant inference. In: Schäfer, W., Dwyer, M.B., Gruhn, V. (eds.) Proceedings
of the 30th International Conference on Software Engineering (ICSE 2008), pp.
281–290. ACM, New York (2008)
14. Dijkstra, E.W.: A Discipline of Programming. Prentice-Hall, Englewood Cliffs
(1976)
15. Ernst, M.D., Cockrell, J., Griswold, W.G., Notkin, D.: Dynamically discovering
likely program invariants to support program evolution. IEEE Transactions on
Software Engineering 27(2), 99–123 (2001)
16. Filliâtre, J.C.: The WHY verification tool (2009), version 2.18,
http://proval.lri.fr
17. Flanagan, C., Leino, K.R.M.: Houdini, an annotation assistant for ESC/Java. In:
Oliveira, J.N., Zave, P. (eds.) FME 2001. LNCS, vol. 2021, pp. 500–517. Springer,
Heidelberg (2001)
18. Flanagan, C., Leino, K.R.M., Lillibridge, M., Nelson, G., Saxe, J.B., Stata, R.:
Extended static checking for Java. In: Proceedings of the 2002 ACM SIGPLAN
Conference on Programming Language Design and Implementation (PLDI’02).
SIGPLAN Notices, vol. 37(5), pp. 234–245. ACM, New York (2002)
19. Gries, D.: The science of programming. Springer, Heidelberg (1981)
20. Henzinger, T.A., Hottelier, T., Kovács, L., Voronkov, A.: Invariant and type infer-
ence for matrices. In: Barthe, G., Hermenegildo, M. (eds.) VMCAI 2010. LNCS,
vol. 5944, pp. 163–179. Springer, Heidelberg (2010)
21. Janota, M.: Assertion-based loop invariant generation. In: Proceedings of the 1st
International Workshop on Invariant Generation, WING 2007 (2007)
22. Filliâtre, J.-C., Marché, C.: The Why/Krakatoa/Caduceus platform for de-
ductive program verification. In: Damm, W., Hermanns, H. (eds.) CAV 2007.
LNCS, vol. 4590, pp. 173–177. Springer, Heidelberg (2007)
23. Karr, M.: Affine relationships among variables of a program. Acta Informatica 6,
133–151 (1976)
24. Kovács, L., Voronkov, A.: Finding loop invariants for programs over arrays using a
theorem prover. In: Chechik, M., Wirsing, M. (eds.) FASE 2009. LNCS, vol. 5503,
pp. 470–485. Springer, Heidelberg (2009)
25. Lahiri, S.K., Qadeer, S., Galeotti, J.P., Voung, J.W., Wies, T.: Intra-module infer-
ence. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 493–508.
Springer, Heidelberg (2009)
26. Leino, K.R.M.: This is Boogie 2. Manuscript KRML 178 (June 2008),
http://research.microsoft.com/en-us/projects/boogie/
27. Leino, K.R.M., Monahan, R.: Reasoning about comprehensions with first-order SMT
solvers. In: Shin, S.Y., Ossowski, S. (eds.) Proceedings of the 2009 ACM Symposium
on Applied Computing (SAC 2009), pp. 615–622. ACM Press, New York (2009)
28. Logozzo, F.: Automatic inference of class invariants. In: Steffen, B., Levi, G. (eds.)
VMCAI 2004. LNCS, vol. 2937, pp. 211–222. Springer, Heidelberg (2004)
29. Meyer, B.: A basis for the constructive approach to programming. In: Lavington,
S.H. (ed.) Proceedings of IFIP Congress 1980, pp. 293–298 (1980)
30. Meyer, B.: Object-oriented software construction, 2nd edn. Prentice-Hall, Engle-
wood Cliffs (1997)
31. Meyer, B.: Touch of Class: learning to program well with objects and contracts.
Springer, Heidelberg (2009)
32. Morgan, C.: Programming from Specifications, 2nd edn. Prentice-Hall, Englewood
Cliffs (1994)
33. Parberry, I., Gasarch, W.: Problems on Algorithms (2002),
http://www.eng.ent.edu/ian/books/free/
34. Păsăreanu, C.S., Visser, W.: Verification of Java programs using symbolic execu-
tion and invariant generation. In: Graf, S., Mounier, L. (eds.) SPIN 2004. LNCS,
vol. 2989, pp. 164–181. Springer, Heidelberg (2004)
35. Perkins, J.H., Ernst, M.D.: Efficient incremental algorithms for dynamic detection
of likely invariants. In: Taylor, R.N., Dwyer, M.B. (eds.) Proceedings of the 12th
ACM SIGSOFT International Symposium on Foundations of Software Engineering
(SIGSOFT 2004/FSE-12), pp. 23–32. ACM, New York (2004)
36. Polikarpova, N., Ciupa, I., Meyer, B.: A comparative study of programmer-written
and automatically inferred contracts. In: Proceedings of the ACM/SIGSOFT Inter-
national Symposium on Software Testing and Analysis (ISSTA 2009), pp. 93–104
(2009)
37. Rodríguez-Carbonell, E., Kapur, D.: Generating all polynomial invariants in simple
loops. Journal of Symbolic Computation 42(4), 443–476 (2007)
38. Sankaranarayanan, S., Sipma, H., Manna, Z.: Non-linear loop invariant generation
using Gröbner bases. In: Jones, N.D., Leroy, X. (eds.) Proceedings of the 31st ACM
SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL
2004), pp. 318–329. ACM, New York (2004)
39. Schulte, W., Xia, S., Smans, J., Piessens, F.: A glimpse of a verifying C compiler
(extended abstract). In: C/C++ Verification Workshop (2007)
40. Tschannen, J.: Automatic verification of Eiffel programs. Master’s thesis, Chair of
Software Engineering, ETH Zürich (2009)
ASMs and Operational Algorithmic
Completeness of Lambda Calculus
Discussions with Yuri, during his many visits to Paris and Lyon, have
been a source of great inspiration for the authors. We thank him for so
generously sharing his intuitions around the many faces of the notion of
algorithm.
1 Introduction
Since the pioneering work of Church and Kleene, going back to 1935, many
computation models have been shown to compute the same class of functions,
namely, by the Turing Thesis, the class of all computable functions. Such models
are said to be Turing complete or denotationally algorithmically complete.
This is a result about crude input/output behaviour. What about the ways
to go from the input to the output, i.e., the executions of algorithms in each of
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 301–327, 2010.
c Springer-Verlag Berlin Heidelberg 2010
302 M. Ferbus-Zanda and S. Grigorieff
these computation models? Do they constitute the same class? Is there a Thesis
for algorithms analogous to the Turing Thesis for computable functions?
As can be expected, denotational completeness does not imply operational
completeness. Clearly, the operational power of machines using massive
parallelism cannot be matched by sequential machines. For instance, on networks
of cellular automata, integer multiplication can be done in real time (cf. Atrubin,
1962 [1], see also Knuth [21], pp. 394–399), whereas on Turing machines an
Ω(n/ log n) time lower bound is known. Keeping within sequential computation
models, multitape Turing machines have greater operational power than one-tape
Turing machines. Again, this is shown using a complexity argument: palindrome
recognition can be done in linear time on two-tape Turing machines,
whereas it requires computation time Ω(n²) on one-tape Turing machines
(Hennie, 1965 [18], see also [5,24]).
Though resource complexity theory may disprove operational algorithmic
completeness, there was no formalization of a notion of operational completeness,
since the notion of algorithm itself had no formal mathematical modeling.
Tackled by Kolmogorov in the 1950s [20], the question for sequential algorithms
was answered by Gurevich in the 1980s [11,12,13] (see [6] for a comprehensive
survey of the question), with their formalization as “evolving algebras”
(now called “abstract state machines” or ASMs), which has led to Gurevich’s
sequential Thesis.
Essentially, an ASM can be viewed as a first order multi-sorted structure and
a program which modifies some of its predicates and functions (called dynamic
items). Such dynamic items capture the moving environment of a procedural
program. The run of an ASM is the sequence of structures – also called states –
obtained by iterated application of the program. The program itself includes
two usual ingredients of procedural languages, namely assignment and the
conditional “if. . . then. . . else. . . ”, plus a notion of parallel block of instructions.
This last notion is a key idea which is somehow a programming counterpart to the
mathematical notion of a system of equations.
Gurevich’s sequential Thesis [12,16,17] asserts that ASMs capture the notion
of sequential algorithm. Admitting this Thesis, the question of operational com-
pleteness for a sequential procedural computation model is now the comparison
of its operational power with that of ASMs.
lambda term θ with the following property. Let a1^t, . . . , ap^t be the values (coded
as lambda terms) of all dynamic items of the ASM at step t. If the run does not
stop at step t then, in K reductions,

θ a1^t . . . ap^t → · · · → θ a1^{t+1} . . . ap^{t+1} .

If the run stops at step t then the left term reduces to a term in normal form
which gives the list of outputs if they are defined. Thus, representing the state
of the ASM at time t by the term θ a1^t . . . ap^t, a group of K successive reductions
gives the state at time t + 1. In other words, K reductions faithfully simulate one
step of the ASM run. Moreover, this group of reductions is the one obtained by the
leftmost redex reduction strategy, hence it is a deterministic process. Thus, lambda
calculus is operationally complete for deterministic sequential computation.
Let us just mention that adding to lambda calculus one-step reduction of
primitive operations is not an unfair trick. Every algorithm has to be “above”
some basic operations, which are a kind of oracle: the algorithm decomposes the
computation into elementary steps which are considered as atomic steps, though
they obviously themselves require some work. In fact, such basic operations can
be quite complex: when dealing with integer matrix product (as in Strassen’s
algorithm in time O(n^{log₂ 7})), one considers integer addition and multiplication
as basic... Building algorithms on such basic operations is indeed what ASMs do
with the so-called static items, cf. §2.3, Point 2.
The proof of our results uses Curry’s fixed point technique in lambda calculus
plus some padding arguments.
2 ASMs
2.1 The Why and How of ASMs on a Simple Example
Euclid’s Algorithm. Consider Euclid’s algorithm to compute the greatest common
divisor (gcd) of two natural numbers. It turns out that such a simple algorithm
already allows us to pinpoint an operational incompleteness in usual programming
languages. Denoting by rem(u, v) the remainder of u modulo v, this algorithm
can be described as follows¹:
Given data: two natural numbers a, b
While b ≠ 0 replace the pair (a, b) by (b, rem(a, b))
When b = 0 halt: a is the wanted gcd
Observe that the pair replacement in the above while loop involves some
elementary parallelism, which is the algorithmic counterpart to co-arity, i.e., the
consideration of functions with range in multidimensional spaces, such as the
function N² → N², (x, y) ↦ (y, rem(x, y)).
Yuri Gurevich has gathered as three Sequential Postulates (cf. [17,10]) some key
features of deterministic sequential algorithms for partial computable functions
(or type 1 functionals).
1. The base sets. Find out the underlying families of objects involved in the
given algorithm, i.e., objects which can be values for inputs, outputs or
environmental parameters used during the execution of the algorithm. These
families constitute the base sets of the ASM. In Euclid’s algorithm, a natural
base set is the set N of natural numbers.
2. Static items. Find out which particular fixed objects in the base sets are con-
sidered and which functions and predicates over/between the base sets are
viewed as atomic in the algorithm, i.e., are not given any modus operandi.
Such objects, functions and predicates are called the primitive or static items
of the ASM. They do not change value through transitions. In Euclid’s algo-
rithm, static items are the integer 0, the rem function and the < predicate.
3. Dynamic items. Find out the diverse objects, functions and predicates over
the base sets of the ASM which vary through transitions. Such objects,
functions and predicates are called the dynamic items of the ASM. In Euclid’s
algorithm, these are a, b.
4. States: from a multi-sorted partial structure to a multi-sorted partial algebra.
Collecting all the above objects, functions and predicates leads to a first-
order multi-sorted structure of some logical typed language: any function
² In ASM theory, an ASM is, in fact, a multialgebra (cf. point 1 of Remark 2.1).
goes from some product of sorts into some sort, any predicate is a relation
over some sorts. However, there is a difference with the usual logical notion
of multi-sorted structure: predicates and functions may be partial. A feature
which is quite natural for any theory of computability, a fortiori for any
theory of algorithms.
To such a multi-sorted structure one can associate a multi-sorted algebra
as follows. First, if not already there, add a sort for Booleans. Then replace
predicates by their characteristic functions. In this way, we get a multi-sorted
structure with partial functions only, i.e., a multialgebra.
5. Programs. Finally, the execution of the algorithm can be viewed as a se-
quence of states. Going from one state to the next one amounts to applying to
the state a particular program – called the ASM program – which modifies
the interpretations of the sole dynamic symbols (but the universe itself and
the interpretations of the static items remain unchanged). Thus, the execu-
tion of the algorithm appears as an iterated application of the ASM program.
It is called the run of the ASM.
Using the three above postulates, Gurevich [16,17] proves that quite elementary
instructions – namely blocks of parallel conditional updates – suffice
to get ASM programs able to simulate step by step any deterministic procedural
algorithm.
6. Inputs, initialization map and initial state. Inputs correspond to the values
of some distinguished static symbols in the initial state, i.e., we consider
that all inputs are given when the algorithm starts (though questionable
in general, this assumption is reasonable when dealing with algorithms to
compute a function). All input symbols have arity zero for algorithms computing
functions. Input symbols with nonzero arity are used when dealing
with algorithms for type 1 functionals.
The initialization map associates to each dynamic symbol a term built
up with static symbols. In an initial state, the value of a dynamic symbol is
required to be that of the associated term given by the initialization map.
7. Final states and outputs. There may be several outputs, for instance if the
algorithm computes a function N^k → N^ℓ with ℓ ≥ 2.
A state is final when, applying the ASM program to that state,
(a) either the Halt instruction is executed (Explicit halting),
(b) or no update is made (i.e., all conditions in conditional blocks of updates
get value False) (Implicit halting).
In that case, the run stops and the outputs correspond to the values of
some distinguished dynamic symbols. For algorithms computing functions,
all output symbols are constants (i.e. function symbols with arity zero).
8. Exceptions. There may be a finite run of the ASM ending in a non final state.
This corresponds to exceptions in programming (for instance a division by
0) and there is no output in such cases. This happens when
(a) either the Fail instruction is executed (Explicit failing),
(b) or there is a clash between two updates which are to be done simultane-
ously (Implicit failing).
Remark 2.1. Let us describe how our presentation of ASMs (slightly) departs
from [10].
1. We stick to what Gurevich says in §2.1 of [14] (Lipari Guide, 1993): “Actually,
we are interested in multi-sorted structures with partial operations”. Thus, we
do not regroup sorts into a single universe and do not extend functions with the
undef element.
2. We add the notion of initialization map which brings a syntactical counterpart
to the semantical notion of initial state. It also rules out any question about the
status of initial values of dynamic items which would not be inputs.
3. We add explicit acceptance and rejection as specific instructions in ASM
programs. Of course, they can be simulated using the other ASM instructions
(so, they are syntactic sugar) but it may be convenient to be able to explicitly
tell there is a failure when something like a division by zero is to be done. This
is what is done in many programming languages with the so-called exceptions.
Observe that Fail has some common flavor with undef. However, Fail is relative
to executions of programs whereas undef is relative to the universe on which the
program is executed.
4. As mentioned in §2.1, considering several outputs goes along with the idea of
parallel updates.
In the usual way, using variables typed by the n sorts of L, one constructs typed
L-terms and their types. The type of a term t is of the form si or si1 × · · · × sik →
si, where si1 , . . . , sik are the types of the different variables occurring in t. Ground
terms are those which contain no variable. The semantics of typed terms is the
usual one.
Definition 2.4. Let L be an ASM vocabulary and S an ASM L-state with
universe U. Suppose σ : {1, . . . , ℓ} → {1, . . . , p} is any map and τ : {1, . . . , p} →
{1, . . . , m} is a distribution of (indexes of) sorts. Suppose t is a typed term of type
sτ(σ(1)) × · · · × sτ(σ(ℓ)) → si. We let tS^{τ,σ} be the function Usτ(1) × · · · × Usτ(p) → Usi
such that, for all (a1 , · · · , ap ) ∈ Usτ(1) × · · · × Usτ(p) ,

tS^{τ,σ}(a1 , · · · , ap ) = tS (aσ(1) , · · · , aσ(ℓ) ) .
L-terms with no variable are used to name particular elements in the universe U
of an ASM whereas L-terms with variables are used to name particular functions
over U.
Using the lifting process described in Definition 2.4, one can use terms
containing fewer than k variables to name functions with arity k.
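The lifting of Definition 2.4 has a compact functional reading, sketched below in Python (the encoding and names are ours, given only as an illustration): composing with σ lets arguments be ignored or duplicated, which is exactly how a term with few variables names a function of higher arity.

```python
def lift(f, sigma):
    """Lift an l-ary function f to a p-ary one via a map
    sigma : {1,...,l} -> {1,...,p}, as in Definition 2.4: the lifted
    function feeds its sigma(j)-th argument into the j-th slot of f,
    so arguments may be ignored, permuted, or duplicated.
    """
    return lambda *args: f(*(args[s - 1] for s in sigma))

# Naming a binary function by a "term" with a single variable:
# lifting multiplication along sigma = (1, 1) yields a function of
# arity 2 that depends on its first argument only.
square = lift(lambda x, y: x * y, (1, 1))
swap_rem = lift(lambda u, v: u % v, (2, 1))   # rem with arguments swapped
print(square(5, 7), swap_rem(3, 10))  # 25 1
```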
Remark 2.6. Of course, the values of static symbols are basic ones, they are
not to be defined from anything else: either they are inputs or they are the
elementary pieces upon which the ASM algorithm is built.
is an L-program.
The intuition of programs is as follows.
– Skip is the program which does nothing. Halt halts the execution in a suc-
cessful mode and the outputs are the current values of the output symbols.
Fail also halts the execution but tells that there is a failure, so that there
is no meaningful output.
– Updates modify the interpretations of dynamic symbols; they are the basic
instructions. The left member has to be of the form α(· · · ) with α a dynamic
symbol because the interpretations of static symbols do not vary.
– The conditional constructor has the usual meaning whereas the parallel con-
structor is a new control structure to get simultaneous and independent ex-
ecutions of programs P1 , . . . , Pn .
Successor State
Definition 2.12. Let L be an ASM vocabulary and S be an L-state.
The successor state T = Succ(S, P ) of state S relative to an L-program P is
defined if and only if P does not clash, fail, or halt on S.
In that case, the successor is inductively defined via the following clauses.
Remark 2.13. In particular, αT (a) and αS (a) have the same value in case a =
(a1 , . . . , ak ) is not the value in S of any k-tuple of ground terms (t1 , . . . , tk ) such
that Active (S, P ) contains an update of the form α(t1 , . . . , tk ) := u for some
ground term u.
– S0 is J .
– i + 1 ∈ I if and only if P does not clash, fail, or halt on Si and
Active (Si , P ) ≠ ∅ (i.e., there is an active update³).
– If i + 1 ∈ I then Si+1 = Succ(Si , P ).
Remark 2.15. In case Active (Si , P ) ≠ ∅, P does not clash, fail, or halt
on Si , and Si = Si+1 (i.e., the active updates do not modify Si ), then the run
is infinite: Sj = Si for every j > i.
³ Nevertheless, it is possible that Si and Succ(Si , P ) coincide, cf. Remark 2.15.
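One ASM transition, with its parallel block of conditional updates, implicit halting, and implicit failing on a clash, can be sketched as follows (a hypothetical minimal encoding of our own, not the paper's formal definition):

```python
def asm_step(state, program):
    """One ASM transition on a toy encoding.

    `state` maps dynamic-item names to values; `program` is a list of
    conditional updates (guard, name, expr), where guard and expr are
    functions of the state.  All active updates are evaluated on the
    *current* state and applied simultaneously, as in an ASM parallel
    block; two contradictory updates clash (implicit failing), and an
    empty active set means the state is final (implicit halting).
    """
    active = {}
    for guard, name, expr in program:
        if guard(state):
            value = expr(state)
            if name in active and active[name] != value:
                raise RuntimeError("clash: implicit failing")
            active[name] = value
    if not active:
        return None                      # implicit halting: final state
    new_state = dict(state)
    new_state.update(active)
    return new_state

# Euclid's algorithm as a two-update ASM program.
program = [
    (lambda s: s["b"] != 0, "a", lambda s: s["b"]),
    (lambda s: s["b"] != 0, "b", lambda s: s["a"] % s["b"]),
]
state = {"a": 252, "b": 198}
while True:
    nxt = asm_step(state, program)
    if nxt is None:
        break
    state = nxt
print(state["a"])  # 18
```

The run of the ASM is then just the iteration of `asm_step` until it returns a final state, matching the definition of the sequence S0, S1, . . . above.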
where the Ii,j ’s are updates or Halt or Fail and the interpretations of C1 ,. . . ,
Cn in any state are Booleans such that at most one of them is True.
Proof. For Skip consider an empty parallel block. For an update or
Halt or Fail consider a block of one conditional with a tautological condition.
Simple Boolean conjunctions allow one to transform a conditional of two programs
of the wanted form into the wanted form. The same holds for parallel blocks of
such programs.
3 Lambda Calculus
As much as possible, our notations are taken from Barendregt’s book [3] (which
is a standard reference on Λ-calculus).
3.2 β-Reduction
Note 3.1. Symbols := are used for updates in ASMs and are also commonly used
in Λ-calculus to denote by M [x := N ] the substitution of all occurrences of a
variable x in a term M by a term N . To avoid any confusion, we shall rather
denote such a substitution by M [N/x].
(App)  If M →i M′ then M N →i M′ N; if N →i N′ then M N →i M N′.
(Abs)  If M →i M′ then λx.M →i λx.M′.
2. A λ-term M has a normal form if there exists some term N in normal form
such that M ↠ N.
Remark 3.4. There are terms with no normal form. The classical example is
Ω = ΔΔ where Δ = λx . xx. Indeed, Ω is a redex and reduces to itself.
In a λ-term, there can be several subterms which are redexes, so that iterating →
reductions is a highly nondeterministic process. Nevertheless, going to normal
form is a functional process.
Theorem 3.5 (Church-Rosser [7], 1936). The relation ↠ is confluent: if
M ↠ N and M ↠ N′ then there exists P such that N ↠ P and N′ ↠ P. In
particular, there exists at most one term N in normal form such that M ↠ N.
Remark 3.6. Theorem 3.5 deals with ↠ exclusively: relation →i is not confluent
for any i ≥ 1.
Theorem 3.8 (Curry & Feys [9], 1958). Reducing the leftmost redex of
terms not in normal form is a deterministic strategy which leads to the normal
form if there is one.
In other words, if M has a normal form N then the sequence M = M0 →
M1 → M2 → · · · where each reduction Mi → Mi+1 reduces the leftmost redex in
Mi (if Mi is not in normal form) is necessarily finite and ends with N .
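Leftmost (normal-order) reduction is easy to make concrete. The following Python sketch (the tuple encoding and all names are ours, not from the paper) contracts the leftmost redex at each step, so that K I Ω reaches a normal form even though Ω alone reduces to itself forever:

```python
import itertools

fresh = (f"v{i}" for i in itertools.count())    # supply of fresh variable names

def subst(t, x, s):
    """Capture-avoiding substitution t[s/x] on tuple-encoded terms."""
    if t[0] == 'var':
        return s if t[1] == x else t
    if t[0] == 'app':
        return ('app', subst(t[1], x, s), subst(t[2], x, s))
    _, y, body = t                               # abstraction λy.body
    if y == x:
        return t
    z = next(fresh)                              # always rename the binder, to be safe
    return ('lam', z, subst(subst(body, y, ('var', z)), x, s))

def leftmost_step(t):
    """Contract the leftmost redex; return None if t is in normal form."""
    if t[0] == 'app':
        f, a = t[1], t[2]
        if f[0] == 'lam':                        # head redex comes first
            return subst(f[2], f[1], a)
        r = leftmost_step(f)
        if r is not None:
            return ('app', r, a)
        r = leftmost_step(a)
        return None if r is None else ('app', f, r)
    if t[0] == 'lam':
        r = leftmost_step(t[2])
        return None if r is None else ('lam', t[1], r)
    return None                                  # a variable is normal

def normalize(t, max_steps=1000):
    for _ in range(max_steps):
        r = leftmost_step(t)
        if r is None:
            return t
        t = r
    return None                                  # gave up: no normal form found

K = ('lam', 'x', ('lam', 'y', ('var', 'x')))
I = ('lam', 'x', ('var', 'x'))
D = ('lam', 'x', ('app', ('var', 'x'), ('var', 'x')))
Omega = ('app', D, D)                            # Ω reduces to itself
print(normalize(('app', ('app', K, I), Omega)) is not None)  # True: K I Ω normalizes to I
print(normalize(Omega))                                       # None
```

Note that a strategy reducing the rightmost (innermost) redex first would loop on K I Ω, since it would keep contracting Ω; leftmost reduction discards Ω before ever reducing it, which is the content of Theorem 3.8.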
Proposition 3.10. Boolean elements True, False and usual Boolean functions
can be represented by the following λ-terms, all in normal form:

True = λxy.x        neg = λx . x False True
False = λxy.y       and = λxy . x y False
                    or = λxy . x True y
                    implies = λxy . x y True
                    iff = λxy . x y (neg y)
For a, b ∈ {True, False}, we have neg a ↠ ¬a, and a b ↠ a ∧ b, . . . .
Proposition 3.11 (If Then Else). For all terms M, N ,
(λz . zM N ) True →2 M , (λz . zM N ) False →2 N .
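Propositions 3.10 and 3.11 translate directly into any language with first-class functions. Here is a small Python sketch (the encoding is ours, used only for illustration): Church booleans are two-argument selectors, and the if-then-else term simply applies the boolean to the two branches.

```python
# Church booleans: True selects its first argument, False its second,
# exactly as in Proposition 3.10.
TRUE  = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

NEG = lambda a: a(FALSE)(TRUE)           # neg = λx . x False True
AND = lambda a: lambda b: a(b)(FALSE)    # and = λxy . x y False
OR  = lambda a: lambda b: a(TRUE)(b)     # or  = λxy . x True y

def decode(t):
    """Map a Church boolean back to a Python bool (for inspection only)."""
    return t(True)(False)

# If-Then-Else (Proposition 3.11): (λz . z M N) applied to a boolean
# reduces to M or N.
ite = lambda m: lambda n: lambda b: b(m)(n)

print(decode(AND(TRUE)(FALSE)))   # False
print(ite("then")("else")(TRUE))  # then
```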
We shall use the following version of iterated conditional.
Proposition 3.12. For every n ≥ 1 there exists a term Casen such that, for all
normal terms M1 , . . . , Mn and all t1 , . . . , tn ∈ {True, False},

Casen M1 . . . Mn t1 . . . tn →^{3n} Mi

relative to leftmost reduction, in case ti = True and tj = False for all j < i.
Proof. Let ui = yi (λxi+1 . I) . . . (λxn . I) and set

Casen = λy1 . . . yn z1 . . . zn . z1 u1 (z2 u2 (. . . (zn−1 un−1 (zn un I)) . . .))

and observe that, for leftmost reduction, letting Mi′ = ui [Mi /yi ],

Casen M1 . . . Mn t1 . . . tn →^{2n} t1 M1′ (t2 M2′ (. . . (tn−1 M′n−1 (tn Mn′ I)) . . .))
→^{i} Mi′ →^{n−i} Mi .
Definition 3.16. Let F be a family of functions with any arities over some
datatypes A1 , . . . , An . The ΛF -calculus is defined as follows:
Proof. Theorem 3.5 ensures that ↠β is confluent. It is immediate to see that any
two applications of the F axioms can be permuted: this is because two distinct
F-redexes in a term are always disjoint subterms. Hence ↠F is confluent. Observe
that ↠ is obtained by iterating finitely many times the relation →β ∪ →F . Using the
Hindley-Rosen Lemma (cf. Barendregt’s book [3], Proposition 3.3.5, or Hankin’s
book [19], Lemma 3.27), to prove that ↠ is confluent, it suffices to prove that
↠β and ↠F commute. One easily reduces to proving that →β and →F commute,
i.e.,

∃P (M →β P →F N ) ⇐⇒ ∃Q (M →F Q →β N ) .
Any length-two such sequence of reductions involves two redexes in the term M:
a β-redex R = (λx . A)B and an F-redex C = cf a1 · · · ak . There are three
cases: either R and C are disjoint subterms of M, or C is a subterm of A, or C is
a subterm of B. Each of these cases is straightforward.
of pattern F-terms, their types and semantics are defined as follows. Let f ∈ F
be such that f : Ai1 × · · · × Aik → Aq .
– If xj1^{Ai1} , . . . , xjk^{Aik} are typed variables then the term cf xj1^{Ai1} . . . xjk^{Aik} is a pattern
F-term with type Ai1 × · · · × Aik → Aq and semantics [[ cf xj1^{Ai1} . . . xjk^{Aik} ]] = f .
– For j = 1, . . . , k, let tj be a pattern F-term with datatype Aij or a typed
variable xi^{Aij}. Suppose the term t = cf t1 · · · tk contains exactly the typed
variables xj^{Ai} for (i, j) ∈ I and, for ℓ = 1, . . . , k, the term tℓ contains exactly
those with (i, j) ∈ Iℓ . Then

[[ t ]]((ai )i∈I ) = f ([[ t1 ]]((ai )i∈I1 ), . . . , [[ tk ]]((ai )i∈Ik )) .
The semantics of good F-terms is best illustrated by the following example: the
function f associated to the term cg (ch y) x (cg z z x) is the one given by the
equality f (x, y, z) = g(h(y), x, g(z, z, x)), which corresponds to Figure 3.9.

[Figure 3.9: the tree of the term cg (ch y) x (cg z z x): root g with children
h(y), x, and g(z, z, x)]
The reason for the above definition is the following simple result about reduc-
tions of good terms obtained via substitutions. It is proved via a straightforward
induction on good F-terms and will be used in §4.3 and §4.4.
We use Curry’s fixed point technique and the above padding technique to ensure
constant-length reductions for any given update function for tuples.
1. Using the leftmost reduction strategy, for all (a1 , . . . , ak ) ∈ Aτ(1) × · · · × Aτ(k) ,
denoting by aI the tuple (aj )j∈I ,
Since θ and the ϕi ’s have no F-redex, we have the following leftmost reduction:
(in particular, f1,j , . . . , fp,j all take values in Aτ(j) ). There exist constants
Kmin and Lmin such that, for all K ≥ Kmin and L ≥ Lmin , there exists a
λ-term θ such that,
1. Using the leftmost reduction strategy, for all (a1 , . . . , ak ) ∈ Aτ(1) × · · · × Aτ(k)
and s ∈ {1, . . . , p, p + 1, . . . , p + q},
If rs (aIs ) = True ∧ ∀t < s, rt (aIt ) = False     (†)s
Remark 5.3. A simple count in the proof of Lemma 4.5 allows one to bound K0 as
follows: K0 = O((size of P )²).
and the axioms associated to the following intuitions. For ε = (i1 , . . . , im , i),
i. Symbol Fε is to represent the function Lε → Bool such that, for σ ∈ Lε ,
Fε (σ) is True if and only if σ is functional in its first m components. In
other words, Fε checks if any two distinct sequences in σ always differ on
their first m components.
ii. Symbol Bε is to represent the function Lε × (Ai1 × · · · × Aim ) → Bool such
that, for σ ∈ Lε and a ∈ Ai1 × · · · × Aim , Bε (σ, a) is True if and only if a
is a prefix of some (m + 1)-tuple in the finite sequence σ.
iii. Symbol Vε is to represent the function Lε × (Ai1 × · · · × Aim ) → Ai such that,
for σ ∈ Lε and a ∈ Ai1 × · · · × Aim ,
- Vε (σ, a) is defined if and only if Fε (σ) = True and Bε (σ, a) = True,
ei^t is the list of (pi + 1)-tuples describing the differences between the
interpretations of (ηi )S0 and (ηi )St .
References
1. Atrubin, A.J.: A One-Dimensional Real-Time Iterative Multiplier. IEEE Transactions
on Electronic Computers EC-14(3), 394–399 (1965)
2. Börger, E., Stärk, R.: Abstract State Machines: A Method for High-Level System
Design and Analysis. Springer, Heidelberg (2003)
3. Barendregt, H.P.: The Lambda calculus. Its syntax and semantics. North-Holland,
Amsterdam (1984)
4. Barendregt, H., Statman, R.: Böhm’s Theorem, Church’s Delta, Numeral Systems,
and Ershov Morphisms. In: Middeldorp, A., van Oostrom, V., van Raamsdonk, F.,
de Vrijer, R. (eds.) Processes, Terms and Cycles: Steps on the Road to Infinity.
LNCS, vol. 3838, pp. 40–54. Springer, Heidelberg (2005)
5. Biedl, T., Buss, J.F., Demaine, E.D., Demaine, M.L., Hajiaghayi, M., Vinař, T.:
Palindrome recognition using a multidimensional tape. Theoretical Computer Sci-
ence 302(1-3), 475–480 (2003)
6. Börger, E.: The Origins and the Development of the ASM Method for High Level
System Design and Analysis. Journal of Universal Computer Science 8(1), 2–74
(2002)
7. Church, A., Rosser, J.B.: Some properties of conversion. Trans. Amer. Math.
Soc. 39, 472–482 (1936)
8. Church, A.: The Calculi of Lambda Conversion. Princeton University Press, Prince-
ton (1941)
9. Curry, H., Feys, R.: Combinatory logic, vol. I. North-Holland, Amsterdam (1958)
10. Dershowitz, N., Gurevich, Y.: A natural axiomatization of computability and proof
of Church’s Thesis. Bulletin. of Symbolic Logic 14(3), 299–350 (2008)
11. Gurevich, Y.: Reconsidering Turing’s Thesis: towards more realistic semantics
of programs. Technical Report CRL-TR-38-84, EEC Department. University of
Michigan (1984)
12. Gurevich, Y.: A new Thesis. Abstracts, American Math. Soc., Providence (1985)
13. Gurevich, Y.: Evolving Algebras: An Introductory Tutorial. Bulletin of the Euro-
pean Association for Theoretical Computer Science 43, 264–284 (1991); Reprinted
in Current Trends in Theoretical Computer Science, pp. 266–269. World Scientific,
Singapore (1993)
14. Gurevich, Y.: Evolving algebras 1993: Lipari guide. In: Specification and Validation
Methods, pp. 9–36. Oxford University Press, Oxford (1995)
15. Gurevich, Y.: May 1997 Draft of the ASM Guide. Tech. Report CSE-TR-336-97,
EECS Dept., University of Michigan (1997)
16. Gurevich, Y.: The Sequential ASM Thesis. Bulletin of the European Association
for Theoretical Computer Science 67, 93–124 (1999); Reprinted in Current Trends
in Theoretical Computer Science, pp. 363–392. World Scientific, Singapore (2001)
17. Gurevich, Y.: Sequential Abstract State Machines capture Sequential Algorithms.
ACM Transactions on Computational Logic 1(1), 77–111 (2000)
18. Hennie, F.C.: One-tape, off-line Turing machine computations. Information and
Control 8, 553–578 (1965)
19. Hankin, C.: Lambda Calculi: A Guide for Computer Scientists. Graduate Texts in
Computer Science. Oxford University Press, Oxford (1994)
20. Kolmogorov, A.N.: On the definition of algorithm. Uspekhi Mat. Nauk. 13(4), 3–28
(1958); Translations Amer. Math. Soc. 29, 217–245 (1963)
21. Knuth, D.: The Art of Computer Programming, 3rd edn., vol. 2. Addison-Wesley,
Reading (1998)
22. Krivine, J.L.: A call-by-name lambda-calculus machine. Higher Order and Symbolic
Computation 20, 199–207 (2007)
23. Mogensen, T.: Efficient Self-Interpretation in Lambda Calculus. J. of Functional
Programming 2(3), 345–363 (1992)
24. Paul, W.: Kolmogorov complexity and lower bounds. In: Budach, L. (ed.) Second
Int. Conf. on Fundamentals of Computation Theory, pp. 325–334. Akademie, Berlin
(1979)
25. Ronchi Della Rocca, S., Paolini, L.: The Parametric Lambda-Calculus: A Metamodel
for Computation. Springer, Heidelberg (2004)
26. Statman, R.: Church’s Lambda Delta Calculus. In: Parigot, M., Voronkov, A. (eds.)
LPAR 2000. LNCS (LNAI), vol. 1955, pp. 293–307. Springer, Heidelberg (2000)
Fixed-Point Definability and Polynomial Time
on Chordal Graphs and Line Graphs
Martin Grohe
For Yuri, in recognition of his inspiring work in finite model theory and
elsewhere.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 328–353, 2010.
© Springer-Verlag Berlin Heidelberg 2010
only apply to properties of ordered structures, that is, relational structures with
one distinguished relation that is a linear order of the elements of the structure.
It is still an open question whether there are logics that characterise these com-
plexity classes on arbitrary, not necessarily ordered structures. We focus on the
class PTIME from now on. In this section, which is an updated version of [32],
we give a short survey of the quest for a logic capturing PTIME.
After Ullman [2] had realised that SQL, the standard query language for relational
databases, cannot express all database queries computable in polynomial time,
Chandra and Harel [10] asked for a recursive enumeration of the class of all
relational database queries computable in polynomial time. It turned out that
Chandra and Harel’s question is equivalent to Gurevich’s question for a logic
capturing PTIME, up to a minor technical detail.1
The question of whether there is a logic that captures PTIME is still wide open,
and it is considered one of the main open problems in finite model theory and
database theory. Gurevich conjectured that there is no logic capturing PTIME.
This would not only imply that PTIME ≠ NP (remember that by Fagin’s
Theorem there is a logic capturing NP), but it would actually have interesting
consequences for the structure of the complexity class PTIME. Dawar [15] proved
a dichotomy theorem stating that, depending on the answer to the question, there
are two fundamentally different possibilities: If there is a logic for PTIME, then
the structure of PTIME is very simple; all PTIME-properties are variants or special
cases of just one problem. If there is no logic for PTIME, then the structure of
PTIME is so complicated that it eludes all attempts for a classification. The formal
statement of the first possibility is that there is a complete problem for PTIME
under first-order reductions. The formal statement of the second possibility is
that the class of PTIME-properties is not recursively enumerable.2
gave a characterisation of the class of line graphs (more precisely, the class of all
graphs isomorphic to a line graph) by a family of nine excluded subgraphs. An
extension of the class of line graphs, which has also received a lot of attention
in the literature, is the class of claw-free graphs. A graph is claw-free if it does
not have a vertex with three pairwise nonadjacent neighbours, that is, if it does
not have a claw (displayed in Fig. 2) as an induced subgraph. It is easy to see
that all line graphs are claw-free. Recently, Chudnovsky and Seymour (see [12])
developed a structure theory for claw-free graphs.
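Both notions just used, line graphs and claws, are easy to check mechanically. The sketch below (our own encoding of graphs as adjacency dictionaries, not from the paper) tests claw-freeness by looking for a vertex with three pairwise nonadjacent neighbours, and builds the line graph of a given edge set:

```python
from itertools import combinations

def is_claw_free(adj):
    """True iff no vertex has three pairwise nonadjacent neighbours,
    i.e. the graph has no induced claw K_{1,3}."""
    for v, nbrs in adj.items():
        for a, b, c in combinations(tuple(nbrs), 3):
            if b not in adj[a] and c not in adj[a] and c not in adj[b]:
                return False
    return True

def line_graph(edges):
    """Line graph L(G): vertices are the edges of G, two of them being
    adjacent when they share an endpoint."""
    edges = [frozenset(e) for e in edges]
    return {e: {f for f in edges if f != e and e & f} for e in edges}

# The claw itself is not claw-free ...
claw = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(is_claw_free(claw))  # False
# ... but line graphs are, as the text notes.
print(is_claw_free(line_graph([(0, 1), (1, 2), (1, 3), (2, 3)])))  # True
```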
It would be tempting to use this structure theory for claw-free graphs, or at
least the simple treelike structure of chordal graphs, to prove that IFP+C captures
PTIME on these classes in a similar way as the structure theory for classes of graphs
with excluded minors is used to prove that IFP+C captures PTIME on classes with
excluded minors. Unfortunately, this is only possible on the very restricted class
of graphs that are both chordal and line graphs (an example of such a graph is
shown in Fig. 3 on p.343). We prove the following theorem:
Theorem 2
1. IFP+C does not capture PTIME on the class of chordal graphs or on the class
of line graphs.
2. IFP+C captures PTIME on the class of chordal line graphs.
Our construction to prove (1) is so simple that it will apply to any reasonable
logic, which means that if a “reasonable” logic captures PTIME on the class of
chordal graphs or on the class of line graphs, then it captures PTIME on the class
of all graphs.
Fig. 1. (a) A chordal graph which is not a line graph, and (b) the line graph of K4 ,
which is not chordal
Fig. 2. A claw
Further interesting graph classes closed under taking induced subgraphs are
various classes of intersection graphs. Very recently, Laubner [51] proved that
IFP+C captures PTIME on the class of all interval graphs. To conclude our dis-
cussion of classes of graphs on which IFP+C captures PTIME, let me mention a
result due to Hella, Kolaitis, and Luosto [41] stating that IFP+C captures PTIME
on almost all graphs (in a precise technical sense). Thus it seems that the results
for specific classes of graphs are not very surprising, but it should be mentioned
that almost no graphs fall in one of the natural graph classes discussed before.
Instead of capturing all PTIME on a specific class of structures, Otto [55,56,57]
studied the question of capturing all PTIME properties satisfying certain invari-
ance conditions. Most notably, he proved that bisimulation-invariant properties
are decidable in polynomial time if and only if they are definable in the higher-
dimensional μ-calculus.
such a logic captures polynomial time on a class of structures, then this shows
that all polynomial time properties of structures in this class are based on the
principles underlying the logic. Thus even for classes for which we know that
there is a logic capturing PTIME through a polynomial-time canonisation algo-
rithm, I think it is important to find “natural” logics capturing PTIME on these
classes. In particular, I view it as an important open problem to find a natural
logic that captures PTIME on classes of graphs of bounded degree. It is known
that IFP+C does not capture PTIME on the class of all graphs of maximum degree
at most three.
Most known capturing results are proved by showing that there is a canonisa-
tion mapping that is definable in some logic. In particular, all capturing results
for IFP+C mentioned above are proved this way. It was observed by Cai, Fürer,
and Immerman [9] that for classes C of structures which admit a canonisation
mapping definable in IFP+C, a simple combinatorial algorithm known as the
Weisfeiler-Lehman (WL) algorithm [23,24] can be used as a polynomial time iso-
morphism test on C. Thus the WL-algorithm correctly decides isomorphism
on the class of chordal line graphs and on all classes of graphs with excluded
minors. A refined version of the same approach was used by Verbitsky and others
[35,49,61] to obtain parallel isomorphism tests running in polylogarithmic time
for planar graphs and graphs of bounded tree width.
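The 1-dimensional WL algorithm (colour refinement) can be sketched in a few lines of Python (a toy version with our own naming, assuming graphs are given as adjacency dictionaries): each round replaces a vertex's colour by a canonical code for the pair (old colour, multiset of neighbours' colours), until stable. The stable colour histogram is an isomorphism invariant, but, as the classical 6-cycle versus two-triangles example shows, not a complete one.

```python
def wl_colors(adj):
    """1-dimensional Weisfeiler-Lehman colour refinement (sketch).

    Returns the sorted list of stable vertex colours; graphs with
    different histograms are certainly non-isomorphic, while equal
    histograms are inconclusive.
    """
    colors = {v: 0 for v in adj}
    while True:
        sig = {v: (colors[v], tuple(sorted(colors[w] for w in adj[v])))
               for v in adj}
        # canonically renumber the signatures
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: palette[sig[v]] for v in adj}
        if new == colors:
            return sorted(colors.values())
        colors = new

path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
star4 = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
c6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
                 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
print(wl_colors(path4) != wl_colors(star4))       # True: distinguished
print(wl_colors(c6) == wl_colors(two_triangles))  # True: 1-WL cannot tell them apart
```

The higher-dimensional variants used in the results cited above refine colourings of k-tuples of vertices instead of single vertices, which is strictly more powerful.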
Both CP+C and IFP+R are known to be strictly more expressive than IFP+C.
Indeed, both logics can express the property used by Cai, Fürer, and Immerman
to separate IFP+C from PTIME. For both logics it is open whether they capture
polynomial time, and it is also open whether one of them semantically contains
the other.
2 Preliminaries
N0 and N denote the sets of nonnegative integers and positive integers, respec-
tively. For m, n ∈ N0 , we let [m, n] := {ℓ ∈ N0 | m ≤ ℓ ≤ n} and [n] := [1, n].
We denote the power set of a set S by 2^S and the set of all k-element subsets
of S by \binom{S}{k}.
We often denote tuples (v1 , . . . , vk ) by v̄. If v̄ denotes the tuple (v1 , . . . , vk ),
then by ṽ we denote the set {v1 , . . . , vk }. If v̄ = (v1 , . . . , vk ) and w̄ = (w1 , . . . , wℓ ),
then by v̄ w̄ we denote the tuple (v1 , . . . , vk , w1 , . . . , wℓ ). By |v̄ | we denote the
length of a tuple v̄, that is, |(v1 , . . . , vk )| = k.
2.1 Graphs
Graphs in this paper are always finite, nonempty, and simple, where simple
means that there are no loops or parallel edges. Unless explicitly called “di-
rected”, graphs are undirected. The vertex set of a graph G is denoted by V (G)
and the edge set by E(G). We view graphs as relational structures with E(G)
being a binary relation on V (G). However, we often find it convenient to view
edges (of undirected graphs) as 2-element subsets of V (G) and use notations
like e = {u, v} and v ∈ e. Subgraphs, induced subgraphs, union, and inter-
section of graphs are defined in the usual way. We write G[W ] to denote the
induced subgraph of G with vertex set W ⊆ V (G), and we write G \ W to
denote G[V (G) \ W ]. The set {w ∈ V (G) | {v, w} ∈ E(G)} of neighbours of a
node v is denoted by N G (v), or just N (v) if G is clear from the context, and the
degree of v is the cardinality of N (v). The order of a graph, denoted by |G|, is
the number of vertices of G. The class of all graphs is denoted by G. A homo-
morphism from a graph G to a graph H is a mapping h : V (G) → V (H) that
preserves adjacency, and an isomorphism is a bijective homomorphism whose
inverse is also a homomorphism.
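These definitions translate directly into code. A small sketch, with a graph encoded (our own convention) as a pair of a vertex set and a set of 2-element frozensets:

```python
# A homomorphism preserves adjacency; an isomorphism is a bijective
# homomorphism whose inverse is also a homomorphism, exactly as defined above.

def is_homomorphism(h, G, H):
    VG, EG = G
    VH, EH = H
    if not all(h[v] in VH for v in VG):
        return False
    # Every edge {u, v} of G must map to an edge {h(u), h(v)} of H.
    # Since H is simple, collapsing an edge to a single vertex fails this test.
    return all(frozenset({h[u], h[v]}) in EH for (u, v) in map(tuple, EG))

def is_isomorphism(h, G, H):
    VG, _ = G
    VH, _ = H
    if len(set(h.values())) != len(VG) or set(h.values()) != VH:
        return False  # not a bijection onto V(H)
    inverse = {w: v for v, w in h.items()}
    return is_homomorphism(h, G, H) and is_homomorphism(inverse, H, G)
```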
For every finite nonempty set V , we let K[V ] be the complete graph with
vertex set V , and we let Kn := K[[n]]. A clique in a graph G is a set W ⊆ V (G)
such that G[W ] is a complete graph. Paths and cycles in graphs are defined
in the usual way. The length of a path or cycle is the number of its edges.
Connectedness and connected components are defined in the usual way. A set
W ⊆ V (G) is connected in a graph G if W = ∅ and G[W ] is connected. For sets
W1 , W2 ⊆ V (G), a set S ⊂ V (G) separates W1 from W2 if there is no path from
a vertex in W1 \ S to a vertex in W2 \ S in the graph G \ S.
A forest is an undirected acyclic graph, and a tree is a connected forest. It
will be a useful convention to call the vertices of trees and forests nodes. A
336 M. Grohe
rooted tree is a triple T = (V (T ), E(T ), r(T )), where (V (T ), E(T )) is a tree and
r(T ) ∈ V (T ) is a distinguished node called the root.
We occasionally have to deal with directed graphs. We allow directed graphs
to have loops. We use standard graph theoretic terminology for directed graphs,
without going through it in detail. Homomorphisms and isomorphisms of di-
rected graphs preserve the direction of the edges. Paths and cycles in a directed
graph are always meant to be directed; otherwise we will call them “paths or
cycles of the underlying undirected graph”. Note that cycles in directed graphs
may have length 1 or 2. For a directed graph D and a vertex v ∈ V (D), we let
N D (v) := {w ∈ V (D) | (v, w) ∈ E(D)}. Directed acyclic graphs will be of par-
ticular importance in this paper, and we introduce some additional terminology
for them: Let D be a directed acyclic graph. A node w is a child of a node v,
and v is a parent of w, if (v, w) ∈ E(D). We let ⊴D be the reflexive transitive
closure of the edge relation E(D) and ◁D its irreflexive version. Then ⊴D is a
partial order on V (D).
A directed tree is a directed acyclic graph T in which every node has at
most one parent, and for which there is a vertex r called the root such that
for all t ∈ V (T ) there is a path from r to t. There is an obvious one-to-one
correspondence between rooted trees and directed trees: For a rooted tree T with
root r := r(T ) we define the corresponding directed tree T⃗ by V (T⃗ ) := V (T )
and E(T⃗ ) := {(t, u) | {t, u} ∈ E(T ) and t occurs on the path rT u}. We freely
jump back and forth between rooted trees and directed trees, depending on
which will be more convenient. In particular, we use the terminology introduced
for directed acyclic graphs (parents, children, the partial order ⊴, et cetera) for
rooted trees.
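The correspondence just described amounts to orienting every edge of a rooted tree away from the root; a brief sketch, with an adjacency-dict encoding of our own:

```python
from collections import deque

# Orient every edge of a rooted tree away from the root: an edge {t, u} becomes
# (t, u) exactly when t occurs on the path from the root to u, which a
# breadth-first search from the root discovers directly.

def orient_away_from_root(adj, root):
    """Return the edge set {(parent, child)} of the corresponding directed tree."""
    directed, seen, queue = set(), {root}, deque([root])
    while queue:
        t = queue.popleft()
        for u in adj[t]:
            if u not in seen:  # t lies on the path from the root to u
                directed.add((t, u))
                seen.add(u)
                queue.append(u)
    return directed
```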
2.3 Logics
We assume that the reader has a basic knowledge in logic. In this section, we
will informally introduce the two main logics IFP and IFP+C used in this paper.
For background and a precise definition, I refer the reader to one of the text-
books [21,28,47,52]. It will be convenient to start by briefly reviewing first-order
logic FO. Formulae of first-order logic in the language of graphs are built from
atomic formulae E(x, y) and x = y, expressing adjacency and equality of ver-
tices, by the usual Boolean connectives and existential and universal quantifiers
ranging over the vertices of a graph. First-order formulae in the language of
ordered graphs may also contain atomic formulae of the form x ≤ y with the
Fixed-Point Definability and Polynomial Time on Chordal Graphs 337
obvious meaning, and formulae in other languages may contain atomic formulae
defined for these languages. We write ϕ(x1 , . . . , xk ) to denote that the free vari-
ables of a formula ϕ are among x1 , . . . , xk . For a graph G and vertices v1 , . . . , vk ,
we write G |= ϕ[v1 , . . . , vk ] to denote that G satisfies ϕ if xi is interpreted by vi ,
for all i ∈ [k].
Inflationary fixed-point logic IFP is the extension of FO by a fixed-point oper-
ator with an inflationary semantics. To introduce this operator, let ϕ(X, x̄) be
a formula that, besides a k-tuple x̄ = (x1 , . . . , xk ) of free individual variables
ranging over the vertices of a graph, has a free k-ary relation variable X ranging
over k-ary relations on the vertex set. For every graph G we define a sequence
Ri = Ri (G, ϕ, X, x̄), for i ∈ N0 , of k-ary relations on V (G) as follows:
R0 := ∅ and
Ri+1 := Ri ∪ {v̄ | G |= ϕ[Ri , v̄]} for all i ∈ N0 .
As this sequence is increasing, it reaches a fixed point R∞ = R∞ (G, ϕ, X, x̄),
and IFP provides a formula ψ(x̄′ ) defining this fixed point. Here x̄′ is another
k-tuple of individual variables, which may coincide with x̄. The variables in the
tuple x̄′ are the free variables of the formula ψ(x̄′ ), and for every graph G and
every tuple v̄ ∈ V (G)k of vertices we let G |= ψ[v̄] ⇐⇒ v̄ ∈ R∞ .
These definitions can easily be extended to a situation where the formula ϕ
contains free variables other than X and the variables in x̃; these variables
remain free variables of ψ. Now formulae of inflationary fixed-point logic IFP
in the language of graphs are built from atomic formulae E(x, y), x = y, and
Xx for relation variables X and tuples of individual variables x whose length
matches the arity of X, by the usual Boolean connectives and existential and
universal quantifiers ranging over the vertices of a graph, and the ifp-operator.
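The inflationary fixed-point semantics just described can be simulated directly. In the following sketch the formula ϕ is modelled as a Python predicate over a relation and a tuple; this encoding is our own, not the logic's syntax.

```python
from itertools import product

# Inflationary fixed point: starting from the empty relation, repeatedly add
# all k-tuples satisfying phi until nothing changes. Since the sequence is
# increasing over a finite universe, it stabilises after at most |V|^k rounds.

def inflationary_fixed_point(universe, k, phi):
    R = set()
    while True:
        new = R | {v for v in product(universe, repeat=k) if phi(R, v)}
        if new == R:
            return R  # R_infinity
        R = new
```

For example, with phi(R, (x, y)) = "E(x, y) or there is z with R(x, z) and E(z, y)" the fixed point is the transitive closure of the edge relation E, a standard IFP-definable property.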
order on N (G). To avoid confusion, we always assume that V (G) and N (G) are
disjoint. We call the elements of the first sort V (G) vertices and the elements of
the second sort N (G) numbers. Individual variables of our logic range either over
the set V (G) of vertices of G or over the set N (G) of numbers of G. Relation
variables may range over mixed relations, having certain places for vertices and
certain places for numbers. Let us call the resulting logic, inflationary fixed-point
logic over the two-sorted extensions of graphs, IFP+ . We may still view IFP+ as
a logic over plain graphs, because the extension G+ is uniquely determined by
G. More precisely, we say that a sentence ϕ of IFP+ is satisfied by a graph G if
G+ |= ϕ. Inflationary fixed-point logic with counting IFP+C is the extension
of IFP+ by counting terms formed as follows: For every formula ϕ and every
vertex variable x we add a term #x ϕ; the value of this term is the number of
assignments to x such that ϕ is satisfied.
With each IFP+C-sentence ϕ in the language of graphs we associate the graph
property Pϕ := {G | G |= ϕ}. As the set of all IFP+C-sentences is computable,
we may thus view IFP+C as an abstract logic according to the definition given in
Section 1.1. It is easy to see that IFP+C satisfies condition (G.2) and therefore
condition (G.2)C for every class C of graphs. Thus to prove that IFP+C captures
PTIME on a class C it suffices to verify (G.1)C .
In the following examples, we use the notational convention that x and vari-
ants such as x1 , x′ denote vertex variables and that y and variants denote number
variables.
defines the successor relation associated with the linear order ≤. The following
IFP+C-formula defines the set of even numbers in N (G):
even(y) := [ifp Y y ← (y = 0 ∨ ∃y′ ∃y′′ (Y (y′′ ) ∧ succ(y′′ , y′ ) ∧ succ(y′ , y)))](y),
where conn is the sentence from Example 3 and even(y) is the formula from
Example 4. By standard techniques from finite model theory, it can be proved
that the class of Eulerian graphs is neither definable in IFP nor in the counting
extension FO+C of first-order logic.
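Deciding the Eulerian property itself is of course easy outside any logic; the following plain polynomial-time check mirrors the "connected and all degrees even" characterisation underlying the example (the adjacency-dict encoding is our own):

```python
# A graph is Eulerian iff it is connected and every vertex has even degree.
# Both conditions are checked directly, in time polynomial in the graph size.

def is_eulerian(adj):
    if any(len(adj[v]) % 2 != 0 for v in adj):
        return False  # some vertex has odd degree
    # Connectivity by depth-first search from an arbitrary vertex.
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)
```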
of L[λ]-formulae, where x̄, ȳ, ȳ1 , ȳ2 , and ȳR for R ∈ μ are tuples of individual
variables such that ȳ, ȳ1 , ȳ2 all have the same type, and for every k-ary R ∈ μ
the tuple ȳR can be written as ȳR,1 . . . ȳR,k , where the ȳR,i have the same type
as ȳ.
In the following, let Γ (x̄) be an L-interpretation of μ in λ. Let G be a λ-structure
and ā ∈ Gx̄ :
3. Γ (x̄) is applicable to (G, ā) if G |= γapp [ā].
4. If Γ (x̄) is applicable to (G, ā), we let Γ [G; ā] be the μ-structure with vertex
set
V (Γ [G; ā]) := {b̄ ∈ Gȳ | G |= γV [ā, b̄]} / ≈ ,
where ≈ is the reflexive, symmetric, transitive closure of the binary relation
{(b̄1 , b̄2 ) ∈ (Gȳ )2 | G |= γ≈ [ā, b̄1 , b̄2 ]}. Furthermore, for k-ary R ∈ μ, we let
R(Γ [G; ā]) := {(b̄1 , . . . , b̄k ) ∈ (Gȳ )k | G |= γR [ā, b̄1 , . . . , b̄k ]} / ≈ .
Syntactical interpretations map λ-structures to μ-structures. The crucial ob-
servation is that they also induce a reverse translation from L[μ]-formulae to
L[λ]-formulae.
Fact 7 (Lemma on Syntactical Interpretations). Let Γ (x̄) be an L-in-
terpretation of μ in λ. Then for every L[μ]-sentence ϕ there is an L[λ]-formula
ϕ−Γ (x̄) such that the following holds for all λ-structures G and all tuples ā ∈ Gx̄ :
If Γ (x̄) is applicable to (G, ā), then
G |= ϕ−Γ [ā] ⇐⇒ Γ [G; ā] |= ϕ .
A proof of this fact for first-order logic can be found in [22]. The proof for the
other logics considered here is an easy adaptation of the one for first-order logic.
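The way an interpretation Γ acts on a structure, including the quotient by ≈, can be sketched concretely. Here the formulae γV, γ≈, and a single γR are modelled as Python predicates over domain elements (an encoding of our own), and the reflexive-symmetric-transitive closure of γ≈ is computed by union-find.

```python
# Apply a (unary, single-relation) interpretation to a concrete domain:
# the new universe consists of ≈-classes of the elements satisfying gamma_V,
# and the new relation is induced by gamma_R on those classes.

def apply_interpretation(domain, gamma_V, gamma_equiv, gamma_R):
    elements = [b for b in domain if gamma_V(b)]
    # Union-find gives the reflexive, symmetric, transitive closure of gamma_equiv.
    parent = {b: b for b in elements}
    def find(b):
        while parent[b] != b:
            parent[b] = parent[parent[b]]  # path compression
            b = parent[b]
        return b
    for b1 in elements:
        for b2 in elements:
            if gamma_equiv(b1, b2):
                parent[find(b1)] = find(b2)
    classes = {b: find(b) for b in elements}
    universe = set(classes.values())
    relation = {(classes[b1], classes[b2])
                for b1 in elements for b2 in elements if gamma_R(b1, b2)}
    return universe, relation
```

For instance, interpreting "equal parity" as ≈ and "different parity" as the relation over the domain {1, 2, 3, 4} yields a two-element structure with a symmetric edge between the classes.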
Example 8. Let K be the class of all complete graphs. It is easy to see that
there is no IFP+C-formula ϕ(x1 , x2 ) such that for all K ∈ K the binary relation
ϕ[K; x1 , x2 ] is a linear order of V (K).
However, there is an FO+C-definable canonisation mapping for the class K:
Let
Γ = (γapp , γV (y), γ≈ (y1 , y2 ), γE (y1 , y2 ), γ≤ (y1 , y2 )) with
– γapp := ∀x x = x;
– γV (y) := 1 ≤ y ∧ y ≤ ord, where ord := #x x = x;
– γ≈ (y1 , y2 ) := y1 = y2 ;
– γE (y1 , y2 ) := ¬ y1 = y2 ;
– γ≤ (y1 , y2 ) := y1 ≤ y2 .
3 Negative Results
In this section, we prove that IFP+C does not capture PTIME on the classes of
chordal graphs and line graphs. Actually, our proof yields a more general result:
Any logic that captures PTIME on any of these two classes and that is “closed
under first-order reductions” captures PTIME on the class of all graphs. It will be
obvious what we mean by “closed under first-order reductions” from the proofs,
and it is also clear that most “natural” logics will satisfy this closure condition.
It follows from our constructions that if there is a logic capturing PTIME on one
of the two classes, then there is a logic capturing PTIME on all graphs.
Our negative results for IFP+C are based on the following theorem:
Fact 11 (Cai, Fürer, and Immerman [9]). There is a PTIME-decidable prop-
erty PCFI of graphs that is not definable in IFP+C.
Without loss of generality we assume that all G ∈ PCFI are connected and of
order at least 4.
G |= ϕ−Γ̂ ⇐⇒ Γ̂ [G] |= ϕ, and Ĝ ≅ Γ̂ [G].
Thus ϕ−Γ̂ defines PCFI , which is a contradiction.
The set γ(t) is called the cone of (T, β) at t. It is easy to see that for every
t ∈ V (T )\{r(T )} with parent s the set β(t)∩β(s) separates γ(t) from V (G)\γ(t).
Furthermore, for every clique X of G there is a t ∈ V (T ) such that X ⊆ β(t).
(See Diestel’s textbook [20] for proofs of these facts and background on tree
decompositions.) Another useful fact is that every tree decomposition (T, β) of a
graph G can be transformed into a tree decomposition (T ′ , β ′ ) such that for all
t′ ∈ V (T ′ ) there exists a t ∈ V (T ) such that β ′ (t′ ) = β(t), and for all t, u ∈ V (T ′ )
with t ≠ u it holds that β ′ (t) ⊈ β ′ (u).
Fact 20. A nonempty graph G is chordal if and only if G has a tree decomposi-
tion into cliques, that is, a tree decomposition (T, β) such that for all t ∈ V (T )
the bag β(t) is a clique of G.
For a graph G, we let MCL(G) be the set of all maximal cliques in G with
respect to set inclusion. If we combine Fact 20 with the observations about tree
decomposition stated before the fact, we obtain the following lemma:
Lemma 21. Let G be a nonempty chordal graph. Then G has a tree decompo-
sition (T, β) with the following properties:
(i) For every t ∈ V (T ) it holds that β(t) ∈ MCL(G).
(ii) For every X ∈ MCL(G) there is exactly one t ∈ V (T ) such that β(t) = X.
We call a tree decomposition satisfying conditions (i) and (ii) a good tree de-
composition of G.
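Chordal graphs, and hence the existence of such decompositions, can also be recognised by the classical simplicial-elimination test (a standard algorithm, not taken from this paper): a graph is chordal if and only if one can repeatedly delete a vertex whose neighbourhood is a clique until no vertex remains. A naive sketch:

```python
# Simplicial elimination: a vertex is simplicial if its neighbourhood induces
# a clique. A graph is chordal iff repeatedly removing simplicial vertices
# empties it; if at some point no simplicial vertex exists, the remaining
# graph contains a chordless cycle of length at least 4.

def is_chordal(adj):
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    while adj:
        simplicial = next(
            (v for v, ns in adj.items()
             if all(b in adj[a] for a in ns for b in ns if a != b)),
            None)
        if simplicial is None:
            return False
        for w in adj[simplicial]:  # delete the simplicial vertex
            adj[w].discard(simplicial)
        del adj[simplicial]
    return True
```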
Let us now turn to line graphs. Let L := L(G) be the line graph of a graph G.
For every v ∈ V (G), let X(v) := {e ∈ E(G) | v ∈ e} ⊆ V (L). Unless v is an
isolated vertex, X(v) is a clique in L. Furthermore, we have
L = ⋃v∈V (G) L[X(v)] .
Observe that for all v, w ∈ V (G), if e := {v, w} ∈ E(G) then X(v)∩X(w) = {e},
and if {v, w} ∉ E(G) then X(v) ∩ X(w) = ∅. The following proposition, which
is probably well-known, characterises the line graphs that are chordal:
Note that on the right hand side, we do not only consider chordless cycles.
Proof. For the forward direction, suppose that L ∈ CD, and let C ⊆ G be a
cycle. Then L[E(C)] is a chordless cycle in L. Hence |C| ≤ 3, that is, C is a
triangle.
For the backward direction, suppose that all cycles in G are triangles, and
let C ⊆ L be a chordless cycle of length k. Let e1 , . . . , ek be the vertices of C
in cyclic order. To simplify the notation, let e0 := ek . Then for all i ∈ [k] it
holds that {ei−1 , ei } ∈ E(L) and thus ei−1 ∩ ei ≠ ∅. Let v0 , v1 ∈ V (G) such that
e1 = {v0 , v1 }, and for i ∈ [2, k], let vi ∈ ei \ ei−1 . Then vi ≠ vj for all j ∈ [i − 2],
and if i < k even for j ∈ [0, i − 2], because the cycle C is chordless and thus
ei ∩ ej = ∅. Furthermore, vk = v0 . Thus {v1 , . . . , vk } is the vertex set of a cycle
in G, and we have k = 3.
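The line-graph construction used throughout this section can be sketched directly (the encoding is our own): the vertices of L(G) are the edges of G, and two of them are adjacent exactly when they share an endpoint, i.e. when both lie in some clique X(v).

```python
# Build the line graph L(G) from an edge list of G. Each edge becomes a
# vertex; two edge-vertices are adjacent iff the edges intersect.

def line_graph(edges):
    edges = [frozenset(e) for e in edges]
    return {e: {f for f in edges if f != e and e & f} for e in edges}
```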
Lemma 23. Let L = L(G) ∈ CD ∩ L, and let X ∈ MCL(L) and e = {v, w} ∈ X.
Then X = X(v) or X = X(w) or there is an x ∈ V (G) such that {x, v}, {x, w} ∈
E(G) and X = {e, {x, v}, {x, w}}.
Proof. Let L = L(G) for some graph G. Suppose for contradiction that |X1 ∩
X2 | ≥ 3. Then |X1 |, |X2 | ≥ 4, because X1 and X2 are distinct maximal cliques.
By Lemma 23, it follows that there are vertices v1 , v2 ∈ V (G) such that X1 =
X(v1 ) and X2 = X(v2 ), which implies |X1 ∩ X2 | ≤ 1. This is a contradiction.
4.2 Canonisation
Theorem 27. The class CD∩L of all chordal line graphs admits IFP+C-definable
canonisation.
Corollary 28. IFP+C captures PTIME on the class of all chordal line graphs.
Proof (of Theorem 27). The proof resembles the proof that classes of
graphs of bounded tree width admit IFP+C-definable canonisation [34] and also
the proof of Theorem 7.2 (the “Second Lifting Theorem”) in [31]. Both of these
proofs are generalisations of the simple proof that the class of trees admits
IFP+C-definable canonisation (see, for example, [36]). We shall describe an in-
ductive construction that associates with each chordal line graph G a canonical
copy G whose universe is an initial segment of the natural numbers. For read-
ers with some experience in finite model theory, it will be straightforward to
formalise the construction in IFP+C. We only describe the canonisation of con-
nected chordal line graphs that are not complete graphs. It is easy to extend
it to arbitrary chordal line graphs. For complete graphs, which are chordal line
graphs, cf. Example 8.
To describe the construction, we fix a connected graph G ∈ CD ∩ L that is
not a complete graph. Note that this implies |G| ≥ 3. Let (T, β T ) be a good tree
decomposition of G. As G is not a complete graph, we have |T | ≥ 2. Without
loss of generality we may assume that the root r(T ) has exactly one child in T ,
because every tree has at least one node of degree at most 1 and properties (i),
(ii) of a good decomposition do not depend on the choice of the root. It will be
convenient to view the rooted tree T as a directed graph, where the edges are
directed from parents to children.
Let U be the set of all triples (u1 , u2 , u3 ) ∈ V (G)3 such that u3 ≠ u1 , u2
(possibly, u1 = u2 ), and there is a unique X ∈ MCL(G) such that u1 , u2 , u3 ∈ X.
For all u = (u1 , u2 , u3 ) ∈ U , let A(u) be the connected component of G \
{u1 , u2 } that contains u3 (possibly, A(u) = G \ {u1 , u2 }). We define mappings
αT (s), β U (u) = β T (s), γ U (u) = γ T (s), and σ U (v ) = σ T (s). The set αU (v ) is the
vertex set of a connected component of G \ σ U (v ) which is contained in αU (u) ⊆
γ U (u) = γ T (s), and by (2) it holds that αU (v ) ∩ β U (u) = ∅. Hence there is a
child t of s such that αU (v ) ⊆ αT (t). Let v := g(t). If αU (v ) ⊂ αT (t) = αU (v ),
then u v v , which contradicts (u, v ) ∈ F . Hence αU (v ) = αT (t) and thus
σ U (v ) = σ T (t). This also implies γ U (v ) = γ T (t) and β U (v ) = β T (t). We let
h(v ) := t.
To prove that h is really a homomorphism, it remains to prove that for all
u′ ∈ U0 with (u′ , v) ∈ F0 we also have h(u′ ) = s. So let u′ ∈ U0 with (u′ , v) ∈ F0 ,
and let s′ := h(u′ ). Suppose for contradiction that s ≠ s′ . If s′ ⊴T s then
αU (u′ ) ⊃ αU (u) and thus u′ ◁ u, which contradicts (u′ , v) ∈ F0 . Thus s′ ⋬T s,
and similarly s ⋬T s′ . But then both σ T (s) and σ T (s′ ) separate γ T (s) from γ T (s′ )
in G. This contradicts αU (v) ⊆ αT (s) ∩ αT (s′ ) ⊆ (γ T (s) ∩ γ T (s′ )) \ (σ T (s) ∪ σ T (s′ )).
Thus essentially, the “treelike” decomposition (D0 , β U ) is the same as the tree
decomposition (T, β T ). However, the decomposition (D0 , β U ) is IFP-definable
with three parameters fixing the tuple u0 = g(r(T )).
Let us now turn to the canonisation. For every u ∈ U0 , we let G(u) := G[γ(u)].
Then G = G(u0 ). We inductively define for every u = (u1 , u2 , u3 ) ∈ U0 a graph
H(u) with the following properties:
(i) V (H(u)) = [nu ], where nu := |γ(u)| = |V (G(u))|.
(ii) There is an isomorphism fu from G(u) to H(u) such that if u1 = u2
it holds that fu (u1 ) = 1 and fu (u2 ) = 2, and if u1 = u2 it holds that
fu (u1 ) = 1.
For the induction basis, let u ∈ U0 with N D0 (u) = ∅. Then γ U (u) = β U (u), and
G(u) = K[β U (u)]. We let n := nu = |β U (u)| and H(u) := Kn . Then (i) and (ii)
are obviously satisfied.
For the induction step, let u ∈ U0 and N D0 (u) = {v1 , . . . , vn } ≠ ∅. It follows
from Claim 2 that for all i, j ∈ [n], either γ(vi ) = γ(vj ) or γ(vi ) ∩ γ(vj ) =
σ(vi ) ∩ σ(vj ) ⊆ β(u). We may assume without loss of generality that there are
i1 , . . . , im ∈ [n] such that i1 < i2 < . . . < im and for all j, j′ ∈ [m] with j ≠ j′ we
have γ(vij ) ≠ γ(vij′ ), and for all j ∈ [m], i ∈ [ij , ij+1 − 1] we have γ(vi ) = γ(vij ).
Here and in the following we let im+1 := n + 1.
The class of all graphs whose vertex set is a subset of N may be ordered
lexicographically; we let H ≤s-lex H ′ if either V (H) is lexicographically smaller
than V (H ′ ), that is, the first element of the symmetric difference V (H) △ V (H ′ )
belongs to V (H ′ ), or V (H) = V (H ′ ) and E(H) is lexicographically smaller than
E(H ′ ) with respect to the lexicographical ordering of unordered pairs of natural
numbers, or H = H ′ . Without loss of generality we may assume that for each
j ∈ [m] it holds that
j ∈ [m] it holds that
and, furthermore,
Note that, even though the graphs G(vi1 ), G(vi2 ), . . . , G(vim ) are vertex disjoint
subgraphs of G(u), they may be isomorphic, and hence not all of the inequalities
in (3) need to be strict. For all j ∈ [m], let vj := vij and Gj := G(vj ) and
Hj := H(vj ). Then H1 ≤s-lex H2 ≤s-lex . . . ≤s-lex Hm . Let j1 , . . . , jℓ ∈ [m] such
that j1 < j2 < . . . < jℓ and Hj = Hji for all i ∈ [ℓ], j ∈ [ji , ji+1 − 1], where
jℓ+1 := m + 1, and Hji ≠ Hji+1 for all i ∈ [ℓ − 1]. For all i ∈ [ℓ], let Ji := Hji .
Furthermore, let ni := |Ji | and ki := ji+1 − ji and qi := |σ U (vji )| and
q := |β U (u) \ ⋃j∈[m] β U (vj )| .
Remark 29. Implicitly, the previous proof heavily depends on the concepts in-
troduced in [31]. In particular, the definable directed graph D together with the
definable mappings σ and α constitute a definable tree decomposition. However,
our theorem does not follow directly from Theorem 7.2 of [31].
The class CD ∩ L of chordal line graphs is fairly restricted, and there may be
an easier way to prove the canonisation theorem by using Proposition 22. The
proof given here has the advantage that it generalises to the class of all chordal
graphs that have a good tree decomposition where the bags of the neighbours
of a node intersect in a “bounded way”. We omit the details.
5 Further Research
I mentioned several important open problems related to the quest for a logic
capturing PTIME in the survey in Section 1. Further open problems can be found
in [32]. Here, I will briefly discuss a few open problems related to classes closed
under taking induced subgraphs, or equivalently, classes defined by excluding
(finitely or infinitely many) induced subgraphs.
A fairly obvious, but not particularly interesting generalisation of our positive
capturing result is pointed out in Remark 29. I conjecture that our theorem for
chordal line graphs can be generalised to the class of chordal claw-free graphs,
that is, I conjecture that the class of chordal claw-free graphs admits IFP+C-
definable canonisation. Further natural classes of graphs closed under taking
induced subgraphs are the classes of disk intersection graphs and unit disk in-
tersection graphs. It is open whether IFP+C or any other logic captures PTIME
on these classes. A very interesting and rich family of classes of graphs closed
under taking induced subgraphs is the family of classes of graphs of bounded
rank width [58], or equivalently, bounded clique width [13]. It is conceivable
that IFP+C captures polynomial time on all classes of bounded rank width. To
the best of my knowledge, currently it is not even known whether isomorphism
testing for graphs of bounded rank width is in polynomial time.
Acknowledgements
I would like to thank Yijia Chen and Bastian Laubner for valuable comments
on an earlier version of this paper.
References
1. Abiteboul, S., Vianu, V.: Non-deterministic languages to express deterministic
transformations. In: Proceedings of the 9th ACM Symposium on Principles of
Database Systems, pp. 218–229 (1990)
2. Aho, A., Ullman, J.: The universality of data retrieval languages. In: Proceedings
of the Sixth Annual ACM Symposium on Principles of Programming Languages,
pp. 110–120 (1979)
3. Babai, L., Grigoryev, D., Mount, D.: Isomorphism of graphs with bounded eigen-
value multiplicity. In: Proceedings of the 14th ACM Symposium on Theory of
Computing, pp. 310–324 (1982)
4. Babai, L., Luks, E.: Canonical labeling of graphs. In: Proceedings of the 15th ACM
Symposium on Theory of Computing, pp. 171–183 (1983)
5. Beineke, L.: Characterizations of derived graphs. Journal of Combinatorial The-
ory 9, 129–135 (1970)
6. Blass, A., Gurevich, Y., Shelah, S.: Choiceless polynomial time. Annals of Pure
and Applied Logic 100, 141–187 (1999)
7. Blass, A., Gurevich, Y., Shelah, S.: On polynomial time computation over un-
ordered structures. Journal of Symbolic Logic 67, 1093–1125 (2002)
8. Bodlaender, H.: Polynomial algorithms for graph isomorphism and chromatic index
on partial k-trees. Journal of Algorithms 11, 631–643 (1990)
9. Cai, J., Fürer, M., Immerman, N.: An optimal lower bound on the number of
variables for graph identification. Combinatorica 12, 389–410 (1992)
10. Chandra, A., Harel, D.: Structure and complexity of relational queries. Journal of
Computer and System Sciences 25, 99–128 (1982)
11. Chudnovsky, M., Robertson, N., Seymour, P., Thomas, R.: The strong perfect
graph theorem. Annals of Mathematics 164, 51–229 (2006)
12. Chudnovsky, M., Seymour, P.: The structure of claw-free graphs. In: Webb, B.
(ed.) Surveys in Combinatorics. London Mathematical Society Lecture Note Series,
vol. 327, pp. 153–171. Cambridge University Press, Cambridge (2005)
13. Courcelle, B., Olariu, S.: Upper bounds to the clique-width of graphs. Discrete
Applied Mathematics 101, 77–114 (2000)
14. Dawar, A.: Generalized quantifiers and logical reducibilities. Journal of Logic and
Computation 5, 213–226 (1995)
15. Dawar, A.: A restricted second order logic for finite structures. In: Leivant, D. (ed.)
LCC 1994. LNCS, vol. 960, Springer, Heidelberg (1995)
16. Dawar, A., Grohe, M., Holm, B., Laubner, B.: Logics with rank operators. In:
Proceedings of the 24th IEEE Symposium on Logic in Computer Science, pp. 113–
122 (2009)
17. Dawar, A., Hella, L.: The expressive power of finitely many generalized quantifiers.
In: Proceedings of the 9th IEEE Symposium on Logic in Computer Science (1994)
18. Dawar, A., Richerby, D.: A fixed-point logic with symmetric choice. In: Baaz,
M., Makowsky, J.A. (eds.) CSL 2003. LNCS, vol. 2803, pp. 169–182. Springer,
Heidelberg (2003)
19. Dawar, A., Richerby, D., Rossman, B.: Choiceless polynomial time, counting and
the Cai-Fürer-Immerman graphs: (Extended abstract). Electronic Notes in Theo-
retical Computer Science 143, 13–26 (2006)
20. Diestel, R.: Graph Theory, 3rd edn. Springer, Heidelberg (2005)
21. Ebbinghaus, H.D., Flum, J.: Finite Model Theory, 2nd edn. Springer, Heidelberg
(1999)
22. Ebbinghaus, H.D., Flum, J., Thomas, W.: Mathematical Logic, 2nd edn. Springer,
Heidelberg (1994)
23. Evdokimov, S., Karpinski, M., Ponomarenko, I.: On a new high dimensional
Weisfeiler-Lehman algorithm. Journal of Algebraic Combinatorics 10, 29–45 (1999)
24. Evdokimov, S., Ponomarenko, I.: On highly closed cellular algebras and highly
closed isomorphism. Electronic Journal of Combinatorics 6, #R18 (1999)
25. Fagin, R.: Generalized first–order spectra and polynomial–time recognizable sets.
In: Karp, R.M. (ed.) Complexity of Computation. SIAM-AMS Proceedings, vol. 7,
pp. 43–73 (1974)
26. Filotti, I.S., Mayer, J.N.: A polynomial-time algorithm for determining the isomor-
phism of graphs of fixed genus. In: Proceedings of the 12th ACM Symposium on
Theory of Computing, pp. 236–243 (1980)
27. Gire, F., Hoang, H.: An extension of fixpoint logic with a symmetry-based choice
construct. Information and Computation 144, 40–65 (1998)
28. Grädel, E., Kolaitis, P., Libkin, L., Marx, M., Spencer, J., Vardi, M., Venema,
Y., Weinstein, S.: Finite Model Theory and Its Applications. Springer, Heidelberg
(2007)
29. Grädel, E., Otto, M.: On Logics with Two Variables. Theoretical Computer Sci-
ence 224, 73–113 (1999)
30. Grohe, M.: Fixed-point logics on planar graphs. In: Proceedings of the 13th IEEE
Symposium on Logic in Computer Science, pp. 6–15 (1998)
31. Grohe, M.: Definable tree decompositions. In: Proceedings of the 23rd IEEE Sym-
posium on Logic in Computer Science, pp. 406–417 (2008)
32. Grohe, M.: The quest for a logic capturing PTIME. In: Proceedings of the 23rd
IEEE Symposium on Logic in Computer Science, pp. 267–271 (2008)
33. Grohe, M.: Fixed-point definability and polynomial time on graphs with excluded
minors. In: Proceedings of the 25th IEEE Symposium on Logic in Computer Science
(2010) (to appear)
34. Grohe, M., Mariño, J.: Definability and descriptive complexity on databases
of bounded tree-width. In: Beeri, C., Bruneman, P. (eds.) ICDT 1999. LNCS,
vol. 1540, pp. 70–82. Springer, Heidelberg (1998)
35. Grohe, M., Verbitsky, O.: Testing graph isomorphism in parallel by playing a game.
In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP 2006, Part I.
LNCS, vol. 4051, pp. 3–14. Springer, Heidelberg (2006)
36. Grädel, E.: Finite model theory and descriptive complexity. In: Grädel, E., Kolaitis,
P.G., Libkin, L., Marx, M., Spencer, J., Vardi, M.Y., Venema, Y., Weinstein, S.
(eds.) Finite Model Theory and Its Applications, pp. 125–230. Springer, Heidelberg
(2007)
37. Gurevich, Y.: Logic and the challenge of computer science. In: Börger, E. (ed.)
Current trends in theoretical computer science, pp. 1–57. Computer Science Press,
Rockville (1988)
38. Gurevich, Y.: Sequential abstract-state machines capture sequential algorithms.
ACM Transaction on Computational Logic 1, 77–111 (2000)
39. Gurevich, Y., Shelah, S.: Fixed point extensions of first–order logic. Annals of Pure
and Applied Logic 32, 265–280 (1986)
40. Hella, L.: Definability hierarchies of generalized quantifiers. Annals of Pure and
Applied Logic 43, 235–271 (1989)
41. Hella, L., Kolaitis, P., Luosto, K.: Almost everywhere equivalence of logics in finite
model theory. Bulletin of Symbolic Logic 2, 422–443 (1996)
42. Hopcroft, J. E., Wong, J.: Linear time algorithm for isomorphism of planar graphs.
In: Proceedings of the 6th ACM Symposium on Theory of Computing, pp. 172–184
(1974)
43. Hopcroft, J.E., Tarjan, R.: Isomorphism of planar graphs (working paper). In:
Miller, R.E., Thatcher, J.W. (eds.) Complexity of Computer Computations.
Plenum Press, New York (1972)
44. Immerman, N.: Relational queries computable in polynomial time. Information and
Control 68, 86–104 (1986)
45. Immerman, N.: Expressibility as a complexity measure: results and directions. In:
Proceedings of the 2nd IEEE Symposium on Structure in Complexity Theory, pp.
194–202 (1987)
46. Immerman, N.: Languages that capture complexity classes. SIAM Journal on Com-
puting 16, 760–778 (1987)
47. Immerman, N.: Descriptive Complexity. Springer, Heidelberg (1999)
48. Immerman, N., Lander, E.: Describing graphs: A first-order approach to graph
canonization. In: Selman, A. (ed.) Complexity theory retrospective, pp. 59–81.
Springer, Heidelberg (1990)
49. Köbler, J., Verbitsky, O.: From invariants to canonization in parallel. In: Hirsch,
E.A., Razborov, A.A., Semenov, A., Slissenko, A. (eds.) Computer Science – Theory
and Applications. LNCS, vol. 5010, pp. 216–227. Springer, Heidelberg (2008)
50. Kreutzer, S.: Expressive equivalence of least and inflationary fixed-point logic. An-
nals of Pure and Applied Logic 130, 61–78 (2004)
51. Laubner, B.: Capturing polynomial time on interval graphs. In: Proceedings of the
25th IEEE Symposium on Logic in Computer Science (2010) (to appear)
52. Libkin, L.: Elements of Finite Model Theory. Springer, Heidelberg (2004)
53. Luks, E.: Isomorphism of graphs of bounded valence can be tested in polynomial
time. Journal of Computer and System Sciences 25, 42–65 (1982)
54. Miller, G.L.: Isomorphism testing for graphs of bounded genus. In: Proceedings of
the 12th ACM Symposium on Theory of Computing, pp. 225–235 (1980)
55. Otto, M.: Bounded variable logics and counting – A study in finite models. Lecture
Notes in Logic, vol. 9. Springer, Heidelberg (1997)
56. Otto, M.: Canonization for two variables and puzzles on the square. Annals of Pure
and Applied Logic 85, 243–282 (1997)
57. Otto, M.: Bisimulation-invariant PTIME and higher-dimensional μ-calculus. The-
oretical Computer Science 224, 237–265 (1999)
58. Oum, S.I., Seymour, P.: Approximating clique-width and branch-width. Journal of
Combinatorial Theory, Series B 96, 514–528 (2006)
59. Roussopoulos, N.: A max{m, n} algorithm for determining the graph H from its
line graph G. Information Processing Letters 2, 108–112 (1973)
60. Vardi, M.: The complexity of relational query languages. In: Proceedings of the
14th ACM Symposium on Theory of Computing, pp. 137–146 (1982)
61. Verbitsky, O.: Planar graphs: Logical complexity and parallel isomorphism tests. In:
Thomas, W., Weil, P. (eds.) STACS 2007. LNCS, vol. 4393, pp. 682–693. Springer,
Heidelberg (2007)
62. Whitney, H.: Congruent graphs and the connectivity of graphs. American Journal
of Mathematics 54, 150–168 (1932)
Ibn Sı̄nā on Analysis: 1. Proof Search. Or:
Abstract State Machines as a Tool
for History of Logic
Wilfrid Hodges
1 Introduction
This paper contains a translation and commentary on Sect. 9.6 of Ibn Sı̄nā’s
major work on logic, the volume ‘Syllogism’ (Qiyās) from his encyclopedic Šifā’,
a work written in Arabic in the 1020s. (Sect. 9.6 is the first of four sections,
9.6–9.9, on what Ibn Sı̄nā calls ‘analysis’; hence ‘Analysis: 1’ in the title of this
paper.) The section is itself a loose commentary on some lines in Aristotle’s
Prior Analytics i.32. It falls into two parts. In the first part Ibn Sı̄nā describes
what he sees as the task of logical ‘analysis’ (tah.lı̄l ). One ingredient of that task
is to complete formal proofs which have a piece missing, and Ibn Sı̄nā gives his
account of this in the second part. A special case of this problem (though not
one mentioned by Ibn Sı̄nā himself) is to find a formal proof where everything is
missing except the conclusion, and this is precisely the task of proof search. To
the best of my knowledge, Ibn Sı̄nā’s account is the first work to come anywhere
near describing a proof search algorithm in formal logic.
Abstract State Machines (ASMs [6]) were introduced by Yuri Gurevich [10],
in whose honour this essay is written. They give a framework for describing
algorithms with complete precision at whatever level of refinement we choose.
The main business of this paper is to describe Ibn Sı̄nā’s intended algorithm. The
fact that Ibn Sı̄nā himself is less than explicit about some details is no excuse for
us to lapse into vagueness. If we want to record with decent precision what Ibn
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 354–404, 2010.
c Springer-Verlag Berlin Heidelberg 2010
Sı̄nā used or understood, and what he didn’t, we need the best descriptive tools;
and so I turned to ASMs. Fortunately the work is already partly done, because
a famous early application of ASMs was Börger and Rosenzweig’s specification
of the proof search algorithm of Prolog to meet the ISO 1995 standard [5].
As far as I know, the use of ASMs below is the first application of ASMs to
the history of logic, and one of the first applications of ASMs in the humani-
ties. (A recent paper [11] calls for applications in linguistics, but these go in a
rather different direction.) In practice the task of constructing an ASM was an
invaluable research tool; it kept raising questions to be addressed to Ibn Sı̄nā’s
text. Remarkably often Ibn Sı̄nā does answer these questions in his text, though
I often had to refer to other sections of the Qiyās for clarifications. I doubt that
any other logician between Aristotle and Leibniz would have come through this
test as successfully as Ibn Sı̄nā does.
The paper has an unusually wide spread of prerequisites. First there is the
Arabic text of Ibn Sı̄nā and its historical background. Second there are the math-
ematical facts about syllogisms. Third there is the methodology of Abstract State
Machines. Unfortunately papers are linear strings of text, so some prerequisites
will have to wait their turn. The structure of the paper is as follows:
Sect. 1. Introduction.
Sect. 2. Historical background (on logic from Aristotle to Ibn Sı̄nā).
Sect. 3. tah.s.ı̄l (roughly the counterpart in Ibn Sı̄nā of Tarski’s notion of
setting up a deductive theory).
Sect. 4. Mathematical prerequisites on syllogisms.
Sect. 5. Extracting the algorithm.
Sect. 6. Review.
Appendix A. Translation of Qiyās 9.6.
Appendix B. Notes on the text translated.
Appendix C. The ASM.
The passage translated in Appendix A (Qiyās 9.6) needs to be matched up with
Qiyās Sects. 2.4 on categorical syllogisms, 9.3 on compound syllogisms, 9.4 on
supplying missing premises of simple syllogisms and 9.7–9 on other aspects of
analysis. I will do my best to get translations of these sections onto my website
at http://wilfridhodges.co.uk. Meanwhile Tony Street [30] gives a useful
summary of Ibn Sı̄nā’s theory of predicative syllogisms.
I thank Egon Börger, Jamal Ouhalla, Roshdi Rashed, Gabriel Sabbagh and
the referee for some valuable remarks and suggestions, and Amirouche Moktefi
for advice on the Arabic translation. But I take full responsibility for errors;
there are bound to be some, though I believe the use of ASMs has eliminated
many of the more serious ones.
2 Historical Background
In the middle of the 4th century BC, Aristotle noticed that many arguments in
mathematics, metaphysics and elsewhere have one of a small number of forms,
name Analytics, both for this book and for Aristotle’s work Posterior Analytics
on the theory of knowledge ([28] p. 400). He also calls attention to the mathemat-
ical use of analúein to mean working backwards from conclusions to premises.
This usage agrees with the part of Aristotle’s ‘analysis’ that consists of finding
a missing premise.
Ibn Sı̄nā’s Sect. 9.6 falls into two parts. The first part, from paragraph [9.6.1]
to [9.6.5], more or less matches Aristotle’s text. The second part, consisting of
paragraphs [9.6.6] to [9.6.12], is completely new. It picks up Aristotle’s brief
remark that the syllogism being analysed may have a premise missing, and it
discusses how to fill the hole. Ibn Sı̄nā’s presentation in this part is very unusual:
instead of explaining his method, he illustrates it with sixty-four examples and
some comments. For example here are two of his examples for completing a
syllogism whose conclusion (the ‘goal’) is the sentence ‘Some C is an A’:
If the h.ās.il [premises] are ‘Every D is a C’ and ‘Every B is an A’, and
‘Every (or some) D is a B’ is attached, then this makes the syllogism
h.ās.il. (1)
If the h.ās.il [premises] are ‘Every D is a C’ and ‘Some B is an A’, it
can’t be used. (From Problems 9, 10 in Appendix A below.)
Clearly this text needs some interpretation. We begin with the word ‘h.ās.il ’,
which expresses a central notion in Ibn Sı̄nā’s methodology.
3 tah.s.ı̄l
There are two notions to be brought together here. One is tah.lı̄l, which is the
Arabic word that Ibn Sı̄nā uses to translate Aristotle’s análusis. Ibn Sı̄nā re-
garded Prior Analytics i.32–46 as a manual of analysis, and he commented on
these sections in Sects. 9.6 to 9.9 of his Qiyās. The material in Sects. 9.7–9 is
not directly related to that in 9.6, but it is needed for a full picture of Ibn Sı̄nā’s
understanding of analysis.
The second notion is tah.s.ı̄l, which means ‘making h.ās.il ’. There are no easy
English translations of tah.s.ı̄l and h.ās.il, and even if there were, we would still
need to explain how the notions fit into Ibn Sı̄nā’s view of philosophical activity.
At the most literal level, h.ās.il means ‘available for use’, so that tah.s.ı̄l means
‘making available for use’. The word h.ās.il occurs nine times in Qiyās 9.6, and
its grammatical relatives many times more. A thing is muh.as.s.al if it has been
made available for use. Here is a remarkable example of the literal usage:
. . . some people demonstrate without any rule, like Archimedes who
demonstrated mathematically, since in his time logic wasn’t yet available
(lam yakun muh.as.s.al ). (Qiyās [15] 15.10f.) (2)
Ibn Sı̄nā has his history confused – Archimedes was born a hundred years later
than Aristotle. The idea that Archimedes demonstrated ‘without any rule’ is
puzzling. Roshdi Rashed (personal communication) suggests that the point is
that geometrical reasoning, of the kind that Archimedes used, is not algorithmic.
For Ibn Sı̄nā, one of the main tasks of a philosopher was to apply tah.s.ı̄l to
the ideas of earlier philosophers. He refers several times to commentators on
Aristotle as muh.as.s.ilūn, people who make h.ās.il. A typical example is in Išārāt :
Nothing but this has been stated by earlier scholars (muh.as.s.ilūn), but
in a manner overlooked by recent ones. ([20] I.9.2, p. 150 of Inati.) (3)
Here Ibn Sı̄nā sets out two independent classifications of kinds of knowledge. The
first classification is into those forms of knowledge which depend on reflective
thinking and those which come to us without our having to think reflectively.
The second classification, which is fundamental throughout Ibn Sı̄nā’s logic and
epistemology, is between two processes that lead to knowledge. The first of these
processes is conceptualisation (tas.awwur ); it leads us to having a concept, and
Ibn Sı̄nā counts this as a kind of knowledge. The second process is assent (tas.dı̄q),
i.e. coming to recognise that a proposition is true; it leads to knowledge of the
fact stated by the proposition. Although Ibn Sı̄nā in his first sentence uses ‘de-
terminate’ only for knowledge not dependent on reflective thinking, the rest of
his text shows that this is just an accident of style, and both kinds of knowl-
edge can be determinate. In fact the passage suggests that for knowledge, being
determinate and being ‘obtained’ amount to the same thing.
The passage gives us strong clues about usages (a) and (b), because tas.awwur
leads to knowledge of concepts and tas.dı̄q leads to knowledge of propositions.
Take concepts first. Here Ibn Sı̄nā’s usage slots in with a philosophical usage
that had been around already for many decades. The 9th century translator of
In fact one performs exactly the ‘analysis’ that we saw Aristotle himself describ-
ing in Prior Analytics i.32ff. But while for Aristotle and Alexander this kind of
analysis was one of the general tools of logic, Ibn Sı̄nā thought he could point
to a large body of published work specifically devoted to it, namely the philo-
sophical commentaries. (There is a hint of this view already in Philoponus [24]
p. 315 l. 20, where he says that the syllogism to be analysed may come from ‘the
ancients’.)
To fill in the history a little, the idea of commenting on a philosopher by
reducing that philosopher’s arguments to syllogistic form seems to have surfaced
first among the Middle Platonist commentators on Plato’s dialogues in the first
century AD. It may have been encouraged by a desire to show that Plato was just
as good a logician as Aristotle (a view that Ibn Sı̄nā explicitly rejects with con-
tempt [17] pp. 114f). For example Alcinous [1] 158.42–159.3 finds the following
second-figure syllogism in Plato’s Parmenides 137d8–138a1:
A thing that has no parts is neither straight nor circular. A thing that
has a shape is either straight or circular. Therefore a thing that has no
parts has no shape. (7)
(The second premise is obviously false. In any case Plato as I read him gives
‘straight’ and ‘circular’ as typical examples of shapes, not as an exhaustive list.
But Alcinous wasn’t the world’s greatest logician.)
Most of the surviving Roman Empire or Arabic commentaries on Aristotle,
including those of Ibn Sı̄nā, do contain explicit reductions of particular argu-
ments to syllogistic form. These reductions form a very small proportion of the
text of the commentaries. But probably Ibn Sı̄nā regarded it as a criterion of
the quality of a commentary that it should be straightforward to analyse the
commentator’s arguments into syllogistic form. The analogy with modern set
theory applies here too. We don’t expect set theorists to set out their arguments
as first-order deductions from the Zermelo-Fraenkel axioms, but we do take it as
a criterion of a sound set-theoretic argument that it should be routine to reduce
the argument to this form.
By implication we have already said what it should mean to describe a syl-
logism as determinate. We make a syllogism determinate by analysing it into
a form so that it makes its conclusion determinate. This involves putting it
into one of the standard syllogistic moods, and ensuring that its premises are
determinate.
There are a couple of nuts-and-bolts points about tah.s.ı̄l that can be made
here as well as anywhere. First, the notion of ‘determinate’ is relational: a thing
can be determinate for me but not for you. This is explicit in both (4) and (5). As
far as I’m aware, there is no notion in Ibn Sı̄nā of a thing being ‘determinate in
itself but not for us’, such as we might expect in 13th or 14th century Scholastics.
And second, the set of propositions that are determinate for you is dynamic:
you can add new items to the set by deducing them from things already in the
set. This causes some problems of terminology. In proof search we assume we
have a database T of sentences, and we search for proofs of given sentences from
assumptions that are in T . In Ibn Sı̄nā’s case the set T is the set of propositions
that are already determinate. But it’s natural for him to say that a successful
proof search makes another proposition φ determinate, and it could look as if
he is saying that φ is added to the database. Granted, Prolog has a function
assert which does exactly that. But adding φ to T is completely different from
deducing φ from things already in T , and it’s the latter that is important for
the proof search algorithm. The remedy is to distinguish strictly between those
propositions that were already determinate and those that become determinate
through application of the algorithm. Ibn Sı̄nā’s choice of words doesn’t always
help us to make this distinction; see Problem 32 in Appendix A and the note on
it in Appendix B.
The letters ‘A’ and ‘B’ are place-holders for two distinct ‘terms’ (h.add ). Warning:
terms in traditional logic are not at all the same thing as terms in modern logic.
For present purposes we can think of terms in Ibn Sı̄nā as being the meanings
of actual or possible common nouns. (There is no requirement that the nouns
describe nonempty classes.)
Ibn Sı̄nā believed that when reasoning we manipulate terms in our minds
through linguistic expressions that mean them. This allowed him to do the same
in his logical theory, for example using common nouns as surrogates for their
meanings. A syllogistic sentence of the form ‘Every A is a B’ is got by putting
common nouns in place of ‘A’ and ‘B’, with the sole restriction that the two
common nouns must have different meanings. By ‘syllogistic sentence’ we will
mean a sentence of one of the four forms in (8).
A syllogistic sentence can be identified by four features. The first is the ‘sub-
ject’ (mawd.ūc ), which is the term put for ‘A’. The second is the ‘predicate’
(mah.mūl ), which is the term put for ‘B’. The third is the ‘quantity’ (kam),
which is either ‘existentially quantified’ (juz’ı̄) or ‘universally quantified’ (kullı̄).
The fourth is the ‘quality’ (kaifa), which is either ‘affirmative’ (mūjib) or ‘nega-
tive’ (maslūb). For purposes of the ASM I treat a syllogistic sentence as a 4-tuple
[subject,predicate,quantity,quality] (9)
using 0 for existentially quantified and affirmative, and 1 for universally quanti-
fied and negative. (See (Def1) in Appendix C.)
The conditions for ‘Every A is a B’ to be true are satisfied exactly when those
for ‘Some A is not a B’ are not satisfied. So each of these syllogistic sentences
means the same as the negation of the other. We say they are ‘contradictories’
of each other, and we write φ̄ for the contradictory of φ. Likewise ‘No A is a B’
and ‘Some A is a B’ are contradictories.
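For readers who want to experiment, the 4-tuple encoding (9) and the operation of taking contradictories can be sketched in a few lines of Python. This is my own illustration, not part of Ibn Sı̄nā’s text or of the ASM in Appendix C, and the function names are invented; in the encoding, taking the contradictory amounts to flipping both the quantity bit and the quality bit.

```python
# Sketch of the 4-tuple encoding (9): (subject, predicate, quantity, quality),
# with 0 = existentially quantified / affirmative and 1 = universally
# quantified / negative. Illustrative names, not Ibn Sina's.

def sentence(subject, predicate, quantity, quality):
    # The two terms of a syllogistic sentence must be distinct.
    assert subject != predicate
    return (subject, predicate, quantity, quality)

def contradictory(s):
    # 'Every A is a B' <-> 'Some A is not a B', 'No A is a B' <-> 'Some A is a B':
    # in the encoding, flip both the quantity bit and the quality bit.
    subject, predicate, quantity, quality = s
    return (subject, predicate, 1 - quantity, 1 - quality)

every_A_is_B = sentence("A", "B", 1, 0)    # universal affirmative
no_A_is_B = sentence("A", "B", 1, 1)       # universal negative

assert contradictory(every_A_is_B) == ("A", "B", 0, 1)   # Some A is not a B
assert contradictory(no_A_is_B) == ("A", "B", 0, 0)      # Some A is a B
assert contradictory(contradictory(no_A_is_B)) == no_A_is_B
```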
By ‘formal sentences’ I mean the expressions that we get if we put uninter-
preted 1-ary relation symbols (we call them ‘term symbols’) in place of ‘A’, ‘B’
in (8) above. The truth conditions translate at once into conditions for a formal
sentence to be true in a structure. So we have a model-theoretic notion of entail-
ment: a set T of formal sentences entails a formal sentence ψ if and only if there
is no structure in which all the formal sentences in T are true but ψ is not true.
Though this notion was unknown to Ibn Sı̄nā, it gives us some mathematics that
will be helpful for understanding various things that Ibn Sı̄nā does.
For example it allows us to demonstrate all the cases where one formal sen-
tence entails another. They are as follows (where we write ⇒ for ‘entails’):
Every A is a B.           Every B is an A.
       ⇓                         ⇓
Some A is a B.      ⇔     Some B is an A.
                                                  (10)
Some A is not a B.        Some B is not an A.
       ⇑                         ⇑
No A is a B.        ⇔     No B is an A.
The top and bottom halves of this diagram are not independent. Each sentence
in the bottom half is the contradictory of its counterpart in the top half. Hence
the arrows in the bottom half go the opposite way to those in the top half. Ibn
Sı̄nā recognised all the instances of these entailments as examples of ‘following
from’.
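Diagram (10) can be checked mechanically. The sketch below is my own construction; it assumes the traditional truth conditions on which a universal sentence carries existential import, which is precisely what validates the downward arrows in (10). A structure for the two term symbols is recorded by which of the four regions (A and B, A without B, B without A, neither) are inhabited; syllogistic truth depends only on that, so sixteen cases decide model-theoretic entailment.

```python
from itertools import product

def true_in(form, regions):
    # regions = (A&B, A-B, B-A, outside); 1 means the region is inhabited.
    ab, a_minus_b, b_minus_a, _ = regions
    a_nonempty = ab or a_minus_b
    if form == "every":                           # Every A is a B
        return bool(a_nonempty) and not a_minus_b
    elif form == "some":                          # Some A is a B
        return bool(ab)
    elif form == "no":                            # No A is a B
        return not ab
    else:                                         # Some A is not a B
        return (not a_nonempty) or bool(a_minus_b)

def swap(regions):
    # Exchange the roles of A and B.
    ab, a_minus_b, b_minus_a, outside = regions
    return (ab, b_minus_a, a_minus_b, outside)

def entails(f, g, converted=False):
    # Does f(A, B) entail g(A, B) (or the converse g(B, A) when converted)?
    for regions in product((0, 1), repeat=4):
        conc = true_in(g, swap(regions) if converted else regions)
        if true_in(f, regions) and not conc:
            return False
    return True

assert entails("every", "some")                    # Every A is a B => Some A is a B
assert entails("no", "some-not")                   # No A is a B => Some A is not a B
assert entails("some", "some", converted=True)     # Some A is a B <=> Some B is an A
assert entails("no", "no", converted=True)         # No A is a B <=> No B is an A
assert not entails("some", "every")                # no upward arrow
assert not entails("some-not", "some-not", converted=True)  # o-sentences don't convert
```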
These two theorems are equivalent to results in §46 of Thom [31], which Thom
proves proof-theoretically. But they can be proved directly from the truth condi-
tions in (8). Ibn Sı̄nā himself probably knew Theorem 1 from experience, though
it’s hard to see how he could have proved it. On the other hand he almost cer-
tainly didn’t know Theorem 2. Any form of this result involves partitioning
occurrences of terms in syllogistic sentences into the two classes that we called
distributed and undistributed, and no such partition has been found in Ibn Sı̄nā’s
logical writings.
Now given an inconsistent circle as in (11), we can take out any one sentence,
say φi . Then the remaining sentences entail φ̄i ; moreover all entailments between
formal sentences, where there are no redundant sentences in the entailing set,
are formed in this way. List the entailing sentences in their order in the circle:
[φi+1 , . . . , φn , φ1 , . . . , φi−1 ]. (12)
Then the sequence (12) has the property that every term symbol occurs twice,
in two adjacent sentences of the sequence, except for one term symbol that
occurs only in the first sentence and another one that occurs only in the last
sentence. We describe a sequence (12) with this property as a ‘linkage’ (qarı̄na,
though strictly Ibn Sı̄nā uses the term only for such sequences of length 2). The
sequence (12) and the sentence φ̄i together form a ‘formal separated syllogism’
whose ‘premises’ (muqaddamāt ) are the sentences in (12) and whose ‘conclusion’
(natı̄ja) is the sentence φ̄i . The expression ‘separated syllogism’ (qiyās mafs.ūl )
is from Ibn Sı̄nā (Qiyās [15] p. 436.1), though strictly he uses it only when there
are more than two premises.
So we can speak of a ‘separated syllogism’, meaning an entailment between
syllogistic sentences, got by taking a formal separated syllogism and replacing
the distinct term symbols by distinct terms. The separated syllogisms that Ibn
Sı̄nā recognises all have the property that their premises entail their conclusion
(model-theoretically); in his terminology the conclusion ‘follows from’ (yalzam)
the premises. But later in this section it will take us some time to unpick the
relationship between Ibn Sı̄nā’s notion of following from and our notion of en-
tailment.
But first we turn to the notion that the proof search algorithm is meant to
deal with: separated syllogisms with a premise missing. Suppose for example that
we have a separated syllogism with premises [φ1 , . . . , φm ] and conclusion χ, and
we remove one or more adjacent premises, say φj and φj+1 . In the inconsistent
circle the contradictory of χ belongs at the beginning or the end; we will put it
at the end:
[φ1 , . . . , φj−1 , φj+2 , . . . , φm , χ̄]. (13)
Now we can describe the gap as follows. It comes immediately after the (j − 1)-
th sentence in the sequence (13); we call the number j − 1 the ‘gap site’. If φ1
and φ2 had been removed, the gap would be immediately after χ̄, which is the
(m − 1)-th sentence in [φ3 , . . . , φm , χ̄], so the gap site would be m − 1. Also when
the linkage (13) contains at least two sentences, there is a unique term shared by
the lefthand missing sentence and the one to the left of it; we call this term the
‘left edge’ of the gap. Likewise there is a unique term shared by the righthand
missing sentence and the one to the right of it in (13); we call this term the
‘right edge’ of the gap.
Thus in Problem 20 Ibn Sı̄nā gives the following example:
Conclusion (understood from Problem 12) ‘Some C is not an A’.
Premises ‘Some D is a C’ and ‘No A is a B’. (14)
Putting the contradictory of the conclusion at the end gives the sequence
[‘Some D is a C’, ‘No A is a B’, ‘Every C is an A’].
The gap site is 1, the left edge is D and the right edge is B. The definitions just
given are more formal than Ibn Sı̄nā himself uses. But he provides several types
of example with different gap sites. At Problem 7 he uses the left and right edges
of the gap. In this problem he does also include an irrelevant term, and clearly
he knows that it’s irrelevant; perhaps he wants to encourage the student to work
out that only the left and right gaps are needed at that stage in the algorithm.
(See the notes on Problem 7.)
Ibn Sı̄nā doesn’t consider the case where all the premises are missing – which
is actually the case that corresponds to the proof search problem for Prolog. In
this case the gap comes immediately after the contradictory of the conclusion, so
the gap site is 1. But with only one sentence present, there is no way of telling
which of its terms is the left edge and which is the right. We need a definite
choice; I stipulate that in this case the left edge is the subject of the conclusion
and the right edge is its predicate. (This is not quite arbitrary; it reconciles two
messages that Ibn Sı̄nā sends about which end of the gap to start with when we
fill it. Namely in Qiyās [15] Sect. 9.3 he works from the left side to the right, and
when finding middles in Sect. 9.4 he starts with the subject of the conclusion.)
Identifying the gap site and the left and right edges is necessary for the algo-
rithm, so I made it a module of the ASM. See (ASM3) in Appendix C for the
module Describe. I haven’t bothered to spell out the formal definition in cases
like this where there is a purely book-keeping manipulation that can be specified
unambiguously in English.
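As an illustration of that book-keeping (my own sketch, not the text of Appendix C), the edges can be computed by reading the datum as a circular linkage with a gap after the 1-based position called the gap site; sentences are reduced to their (subject, predicate) pairs, and the stipulation above for a datum of length 0 is built in.

```python
# Sketch of the book-keeping behind the module Describe (ASM3): given the
# circular sequence (13) with the gap immediately after the sentence at the
# 1-based position `site`, recover the left and right edges of the gap.

def terms(s):
    return set(s)

def describe_gap(sequence, site):
    m = len(sequence)
    if m == 1:
        # All premises missing: by the stipulation in the text, the left edge
        # is the subject of the conclusion and the right edge its predicate
        # (the one sentence present, the contradictory of the conclusion,
        # has the same subject and predicate as the conclusion itself).
        subject, predicate = sequence[0]
        return subject, predicate
    before = sequence[site - 1]      # sentence just to the left of the gap
    after = sequence[site % m]       # sentence just to the right of the gap
    # The edge is the term NOT shared with the neighbour on the far side,
    # reading the sequence circularly.
    left_edge = (terms(before) - terms(sequence[site - 2])).pop()
    right_edge = (terms(after) - terms(sequence[(site + 1) % m])).pop()
    return left_edge, right_edge

# Example (14): ['Some D is a C', 'No A is a B', 'Every C is an A'], gap site 1.
assert describe_gap([("D", "C"), ("A", "B"), ("C", "A")], 1) == ("D", "B")
# Datum of length 0: only the contradictory of the conclusion is present.
assert describe_gap([("C", "A")], 1) == ("C", "A")
```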
Theorem 3. Let T be a consistent set of formal sentences and C the set of all
formal sentences ψ such that T entails ψ and there is no proper subset of T that
entails ψ. Then if C is not empty, C contains a sentence ψ which entails all the
other sentences in C.
We call the sentence χ in the conclusion of Corollary 1 the ‘weakest fill’; it’s
unique up to the equivalences in (10).
Ibn Sı̄nā never attempts to apply any definition of ‘follows from’ directly to
separated syllogisms with more than two premises. For simple syllogisms he
understands ‘follows from’ in terms of how our minds manipulate ideas, and it
would hardly be plausible to assume that we could hold in our minds a set of a
thousand premises. Instead he maintains that a separated syllogism is shorthand
for a more complex kind of syllogism, namely a tree of simple syllogisms. At
Qiyās [15] p. 436.1 he describes such a tree as a ‘connected syllogism’ (qiyās
maws.ūl ). He explains at Qiyās [15] p. 442.8 that separated syllogisms are so-
called because in them the intermediate conclusions (the conclusions of all the
simple syllogisms except the one at the root of the tree) are separated from the
premises (presumably he means the premises at the leaves of the tree), so that
the premises are mentioned explicitly but the intermediate conclusions are left
out. At Burhān [16] 141.15ff he comments that a connected syllogism with a
thousand intermediate steps is no big deal provided we are ‘mentally prepared
for the drudgery’.
So part of the job of analysis is to find these intermediate conclusions. Ibn
Sı̄nā discusses an example in detail at Qiyās Sect. 9.3, p. 442.8–443.13. The
text is corrupt, but on one reconstruction Ibn Sı̄nā is discussing the separated
syllogism with premises
‘Every J is a D’, ‘Every D is an H’, ‘Every H is a Z’, ‘Every Z is an I’ (18)
and conclusion ‘Every J is an I’. The intermediate conclusions are ‘potential’,
he says. To find them, we start with two explicitly stated premises and draw a
conclusion φ from them, and then we form a syllogism with φ as first premise
and another of the explicit premises as second premise, and so on. An example
would be to prove ‘Every J is an H’ first, and then ‘Every J is a Z’. He warns
us against starting with the second and third premises to deduce ‘Every D is
Z’ – this is not ‘the arrangement that we chose’. (He adds that we could have
chosen a different arrangement.)
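For the all-universal-affirmative example (18) the chosen arrangement can be sketched mechanically: fold from the lefthand end, combining the running conclusion with the next premise by Barbara. This is my own minimal illustration of that one case, not the module Synthesise itself.

```python
# Left-to-right synthesis for (18), restricted to universal affirmatives.
# Barbara: Every X is a Y, Every Y is a Z |- Every X is a Z.

def barbara(p, q):
    _, x, y1 = p          # p = ('Every', X, Y)
    _, y2, z = q          # q = ('Every', Y, Z)
    assert y1 == y2, "premises must share the middle term"
    return ("Every", x, z)

premises = [("Every", "J", "D"), ("Every", "D", "H"),
            ("Every", "H", "Z"), ("Every", "Z", "I")]

running = premises[0]
intermediate = []
for nxt in premises[1:]:
    running = barbara(running, nxt)
    intermediate.append(running)

# 'Every J is an H' and 'Every J is a Z' are the potential intermediate
# conclusions; the last entry is the final conclusion 'Every J is an I'.
assert intermediate == [("Every", "J", "H"), ("Every", "J", "Z"),
                        ("Every", "J", "I")]
```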
Exactly this procedure, starting from the lefthand end, appears in Problem
3. (In Arabic of course it is the righthand end. I won’t say this again.) Ibn
Sı̄nā takes a supposed separated syllogism of length 3 with the middle premise
missing. He suggests a way of filling it, so that the three premises are
‘No C is a B’, ‘Every D is a B’ and ‘Every A is a D’. (19)
He first infers ‘No C is a D’ from the first two premises, and then he infers the
required conclusion ‘No C is an A’ from this and the third premise. Since this
is one of the first problems, it’s presumably meant as a strong clue about the
procedure to be followed.
So the procedure appears in the ASM of Appendix C as module (ASM4),
called Synthesise. Ibn Sı̄nā’s word for ‘synthesis’ is tarkı̄b, which means forming
a compound; he also uses it for the compound formed. At Qiyās [15] p. 434.11 he
explains that ‘synthesising a syllogism’ means forming a connected compound
syllogism, which is the main thing that this module does.
Now it’s clear that if φ1 , . . . , φ5 are formal sentences such that φ1 and φ2
entail φ4 , and φ3 and φ4 entail φ5 , then φ1 , φ2 , φ3 together entail φ5 . But Ibn
Sı̄nā needs more than this. His procedure is also meant to tell us when the raw
materials can’t be filled out into a syllogism. Suppose we infer φ4 from φ1 , φ2 and
then find that φ5 doesn’t follow from φ3 , φ4 ; what does this show? How do we
know we couldn’t have proved φ5 from φ1 , φ2 , φ3 by choosing φ4 differently, or
by starting at the righthand end? If Ibn Sı̄nā had tried to prove the correctness
of his algorithm, he would have had to face this question.
In fact there is a positive answer, at least in terms of model-theoretic entail-
ment. The heart of the matter is the following result.
Theorem 4. Suppose [φ1 , . . . , φn ] and [ψ1 , . . . , ψm ] are linkages of formal sen-
tences. Then the following are equivalent:
(a) [φ1 , . . . , φn , ψ1 , . . . , ψm ] forms an inconsistent minimal circle.
(b) The set ψ1 , . . . , ψm has a strongest consequence θ, and [φ1 , . . . , φn , θ] is an
inconsistent minimal circle.
The theorem tells us that (so long as there are no redundancies in the premises)
we can take any segment of the premises of a separated syllogism, and shrink it
down to its strongest consequence. The result will still be a separated syllogism
entailing the same conclusion. At least this is true for model-theoretic entailment.
But consider for example the syllogism
Every B is a C. Every D is a B. Some D is an A. Therefore some C is
an A. (20)
Model-theoretically the three premises do entail the conclusion. But if we try
to build a connected syllogism, starting from the lefthand end as in Ibn Sı̄nā’s
examples, we immediately hit a problem. The first two premises violate the
fourth figure condition.
A possible way around this is to start by drawing a conclusion from the second
and third premises. By the premise order condition this conclusion must be ‘Some
B is an A’, by the third-figure mood Disamis. So we have the intermediate
syllogism ‘Every D is a B. Some D is an A. Therefore some B is an A.’
In [9.6.4] and [9.6.5] of Qiyās 9.6, Ibn Sı̄nā refers briefly to some other kinds of
syllogism.
Earlier in the Qiyās ([15] p. 106) Ibn Sı̄nā has distinguished between two kinds
of syllogism which he calls respectively ‘recombinant’ (iqtirānı̄) and ‘duplicative’
(istit̲nā’ı̄). A recombinant syllogism has two premises, each of them built out
of two parts; one of these parts is the same in both premises. The conclusion is
formed by recombining the two remaining parts. Simple syllogisms as in Subsect.
4.3 above fit this description. But so do some propositional (šart.ı̄) syllogisms,
for example
the gap in the original datum. Then by Theorem 4 above, it suffices to continue
with φ as new goal and an empty datum. But even this would add 0 to the
possible lengths of data. I haven’t followed this route, because it would imply
some mechanism for feeding back the result of the calculation with φ as goal
into the original problem.
But the case of length 0 is interesting anyway, not least because it corresponds
to the Prolog proof search problem. For that reason I set up the ASM to handle
data of length 0. Ibn Sı̄nā himself may have reckoned that he had said enough
about the case of data of length 0 already in Qiyās [15] Sect. 9.4 ‘On obtaining
premises, and on tah.s.ı̄l of syllogisms with a given goal’.
The second reason for doubting that Ibn Sı̄nā intends a restriction to lengths
1 and 2 is his statement at 465.2 that he will deal with the case of ‘more than
two premises’ in the appendices. We don’t have the promised appendix; see the
note on this passage. Of course he might have said in the appendix that these
longer data can be handled, but only by a different procedure. I think this is
unlikely, for the first reason just given.
Nevertheless there is a good reason for Ibn Sı̄nā to concentrate on the case
of length 2. If the datum has length greater than 2, it always contains two
adjacent sentences that share a term. So we can reduce the length of the datum
immediately, by replacing these two sentences by their strongest consequence –
unless they are sterile, in which case the problem has no positive solution. We
can’t be sure that Ibn Sı̄nā intended this way of working, but it makes good
sense and I have built it into the ASM.
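This reduction step can be simulated by brute force over small structures. The sketch below is my own construction, using the model-theoretic entailment of Sect. 4 rather than anything in Ibn Sı̄nā: two linked sentences, encoded as the 4-tuples of (9), either have a strongest consequence relating their extreme terms, or they are sterile.

```python
from itertools import product

# A structure is recorded by which Boolean combinations ("types") of the
# relevant terms are inhabited; syllogistic truth depends only on that, so
# the finite enumeration below decides model-theoretic entailment. Universal
# affirmatives carry existential import, as in diagram (10).

def true_in(sent, inhabited):
    subject, predicate, quantity, quality = sent
    with_subject = [t for t in inhabited if subject in t]
    if (quantity, quality) == (1, 0):            # Every subject is a predicate
        return bool(with_subject) and all(predicate in t for t in with_subject)
    if (quantity, quality) == (0, 0):            # Some subject is a predicate
        return any(predicate in t for t in with_subject)
    if (quantity, quality) == (1, 1):            # No subject is a predicate
        return not any(predicate in t for t in with_subject)
    return (not with_subject) or any(predicate not in t for t in with_subject)

def entails(premises, conclusion, all_terms):
    types = [frozenset(t for t, bit in zip(all_terms, bits) if bit)
             for bits in product((0, 1), repeat=len(all_terms))]
    for bits in product((0, 1), repeat=len(types)):
        inhabited = [ty for ty, bit in zip(types, bits) if bit]
        if all(true_in(p, inhabited) for p in premises) and \
                not true_in(conclusion, inhabited):
            return False
    return True

def strongest_consequence(p, q):
    shared = set(p[:2]) & set(q[:2])             # the middle term
    x, y = sorted((set(p[:2]) | set(q[:2])) - shared)
    all_terms = sorted(set(p[:2]) | set(q[:2]))
    candidates = [(s, t, quantity, quality)
                  for s, t in ((x, y), (y, x))
                  for quantity in (0, 1) for quality in (0, 1)]
    entailed = [c for c in candidates if entails([p, q], c, all_terms)]
    for c in entailed:
        if all(entails([c], d, all_terms) for d in entailed):
            return c          # strongest consequence, unique up to (10)
    return None               # sterile: nothing about the extremes follows

# 'Every D is a B' + 'Some D is an A' yield 'Some B is an A' (equivalently
# 'Some A is a B'), the Disamis conclusion discussed for example (20).
assert strongest_consequence(("D", "B", 1, 0), ("D", "A", 0, 0)) in \
    {("A", "B", 0, 0), ("B", "A", 0, 0)}
# Two particular premises sharing a term are sterile.
assert strongest_consequence(("D", "C", 0, 0), ("C", "A", 0, 0)) is None
```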
The other possibility is that Ibn Sı̄nā intends his procedure to apply where
the datum contains more than one gap, or perhaps even when it contains no
gap at all. He does in fact discuss the case of more than one gap in paragraph
[9.6.7]. His view is that it can be handled but at the cost of a more complicated
procedure, which again he will describe in the appendices. The main thing we
would need to do in order to extend our ASM to more than one gap would be to
incorporate some further machinery to control the search; see Subsect. 6.2 below
for a discussion of what would be required. Presumably Ibn Sı̄nā’s appendix
would have said something about this too. The case of no gaps is covered by
the procedures of Qiyās Sect. 9.3, which we have incorporated into the module
Synthesise; so this case is at least implicitly in Ibn Sı̄nā’s algorithm already.
In his initial remarks on analysis in [9.6.1], Ibn Sı̄nā says that the text to be
analysed may contain ‘something superfluous’, and our rule will need to tell us
how to ‘strip off defects’. This suggests that the procedure should also eliminate
redundant parts of the datum. None of the 64 problems suggests any way of
doing this. Indeed it’s not clear what the aim would be if Ibn Sı̄nā did allow
this. One could always start by removing the entire datum and working from
the goal alone; would this count? If not, would the aim be to throw away as little
as possible of the datum? This could lead to serious complexities. So I think we
can sensibly assume that the procedure is not meant to eliminate redundant
parts of the datum.
studied in Sect. 3 above. First and foremost, there is the wording that we quoted
in the affirmative case: ‘[the syllogism] has been made determinate’. Add to this
that in 10 problems he says that the premises in the datum are determinate;
this is irrelevant for the logical task. In 6 of the problems with an affirmative
answer, he requires that the attached sentence is ‘true’ or ‘true for you’ or ‘clear’
(bayyin – this must mean ‘clearly true’). Finally there are two problems (1 and
2) where Ibn Sı̄nā finds a sentence χ that solves the logical task, and then adds
that if the sentence is not ‘clear’ or true, then it doesn’t solve the problem and
one ‘needs a middle’ (i.e. has to look for a two-sentence filling for the gap).
So there is clear evidence that Ibn Sı̄nā also has in mind another task:
Given that the datum consists of sentences that are already determinate,
discover whether or not there is a sequence of sentences [χ1 , . . . , χm ] that
are already determinate, which can be put into the gap of the datum so
that the datum becomes the premise sequence of a determinate separated
syllogism whose conclusion is the goal. When the answer is Yes, supply
a sequence [χ1 , . . . , χm ] with this property. (27)
I call this the ‘tah.s.ı̄l task’. The two tasks are connected by the fact that a
negative answer to the logical task implies a negative answer to the tah.s.ı̄l task,
but otherwise the tasks are independent.
I think it’s inconceivable that Ibn Sı̄nā was in any way confused about the
difference between the logical task and the tah.s.ı̄l task. But I wouldn’t put it
past him to be deliberately ambiguous in hopes of catching both tasks under
the same general description. There is some evidence of deliberate ambiguity. In
Subsect. 5.1 we interpreted the word ‘found’ (mawjūd ) as meaning datum, i.e.
‘the thing you found in front of you when you were given the problem’; but it
would be entirely in keeping with Ibn Sı̄nā’s logical vocabulary if we read it as
‘found to be true’, i.e. determinate. Likewise the phrase ‘you have’ (kāna ʿindak)
could also mean ‘according to you’, in other words, ‘it’s determinate for you that
. . . ’.
It would also be in character for Ibn Sı̄nā to leave the ambiguity as a deliberate
trap for idle or unintelligent students.
In sum, we have identified two tasks that the procedure is meant to perform.
The logical task is well-defined apart from the uncertainty about what separated
syllogisms Ibn Sı̄nā accepts. But at least we can rigorously check the correctness
of Ibn Sı̄nā’s own solutions of his 64 problems. The tah.s.ı̄l task is well-defined
apart from the same uncertainty about separated syllogisms, though it does re-
quire us to know what sentences are ‘already determinate’. The set of things that
are already determinate is the counterpart of the set of clauses of the Prolog pro-
gram in the Prolog case. Börger and Rosenzweig [5] build this set of clauses into
their ASM through a predicate P ROGRAM and a basic operation clause list.
I prefer not to do that here, because it would pre-empt a question we have to
discuss in a moment, namely whether Ibn Sı̄nā considers that the set of sentences
that are already determinate can be read off mechanically.
a B’. Then we should unpack the definition of the term A, and extract from it
sentences of the form ‘Every A is a C’. For each of these, we should see whether
we can also prove ‘Every C is a B’. If we have no success with the definition
of A, Ibn Sı̄nā advises looking next at the properties that we can prove for A,
using the principles of the relevant science.
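The search Ibn Sı̄nā describes here can be put in procedural form. The sketch below is mine, not from the text: the helpers `definition_of` (the terms extracted from a definition) and `provable` (the student's background knowledge of the relevant science) are assumptions standing in for what the text leaves informal.

```python
# Hypothetical sketch of the middle-term search described above, for a
# sentence chi of the form 'Every A is a B'.  `definition_of` and
# `provable` stand in for the student's background knowledge; they are
# assumptions, not part of Ibn Sina's text.

def find_middle(A, B, definition_of, provable):
    """Look for a term C with 'Every A is a C' drawn from the definition
    of A, such that 'Every C is a B' is provable."""
    for C in definition_of(A):
        if provable(('Every', C, B)):
            return C
    # Failing this, one would turn to the provable properties of A
    # (not modelled here).
    return None

# Toy knowledge base: 'human' is defined as 'rational animal', and it is
# known that every animal is a body.
defs = {'human': ['rational', 'animal']}
facts = {('Every', 'animal', 'body')}

middle = find_middle('human', 'body',
                     lambda t: defs.get(t, []),
                     lambda s: s in facts)
# middle == 'animal'
```

The fallback to provable properties, and the symmetrical cases for the other sentence forms, would extend the same loop.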
In the cases where χ has the form ‘No A is a B’ or ‘Some A is a B’, the
situation is symmetrical and we can start with either A or B. In the case of
‘Some A is not a B’, Ibn Sı̄nā’s wording suggests – I can’t put it stronger than
that – that we start with properties that some A is known to have. So a general
rule that covers all cases would be that we start by looking for determinate
sentences that involve the subject term of χ. (Note that the subject term could
be either the left edge or the right edge of the gap.)
Ibn Sı̄nā comes back to the matter at Burhān [16] pp. 138.22ff and 139.10ff.
He claims that in mathematics most sentences have the form ‘Every A is a B’
(here he is agreeing with Aristotle Posterior Analytics A14). He suggests that
when χ has this form in mathematics, if there is a middle as required, then
one can be found by unpacking the definition of the subject term of χ. (This
seems to me a gross oversimplification outside elementary linear algebra.) In this
case it would be reasonable to say that the list of possible terms can be found
mechanically from the definition of the subject term, so we would only need to
include in the ASM a basic function for finding the definitions of terms. But Ibn
Sı̄nā goes on to say that outside mathematics things are not so straightforward.
We would need to consider the inherent accidents of the subject term of χ, and
in the worst case even its non-inherent accidents.
He comes back again to the same question in his autobiography. He tells us
that sometimes he was ‘at a loss about a problem, concerning which I was unable
to find the middle term in a syllogism’, and so he resorted to prayer, then to
alcohol and then to sleep; ‘many problems became clear to me while asleep’ ([12]
p. 27f). Prayer, alcohol and sleep are not mechanical procedures.
All in all, I think it would be very unwise to assume that Ibn Sı̄nā thinks we
can list in advance all the determinate sentences that involve the subject term
of χ. This is a pity, because the backtracking algorithm of [5] (which Börger and
Stärk display as an ASM module on page 114 of [6]) assumes that we can make
this list.
At this point I am going to cheat and call on a relatively advanced kind of
Abstract State Machine called an asynchronous multi-agent ASM ([6] Chapter
6). This multi-agent ASM has a family of ‘agents’ who each perform according
to their own ASMs, at their own speeds and for the most part independently.
But there can be super-global procedures that pass messages to and from the
agents. The set of agents can be ‘potentially dynamic’, in other words there can
be super-global procedures that add new agents. In ASMs one can treat the set
of threads in a Java program as a dynamic set of agents; I thank Egon Börger
for this example. (The term ‘super-global’ is to distinguish from those features
of the agent ASMs that are global within these ASMs.)
In this setting, suppose an agent reaches a point where ‘it needs a middle’.
The agent then sends a message to the super-global agent who operates the
super-global procedures; prayer, alcohol and sleep might be ways of sending this
message. The super-global agent responds by listing all the possible options; but
instead of sending the list to the agent, it splits the agent into a family of agents,
each of whom has one of the options to work on. I see Ibn Sı̄nā identifying the
global agent as the Active Intellect, and the agents who carry out the algorithm
as possible intellects, so that
when a connection occurs between our souls and [the Active Intellect],
there are imprinted from it in them the intellected forms which are
specific for this specific preparation for specific judgements. (Išārāt [20]
II.3 iš. 13.) (28)
But that’s an aside – the super-global agent has a precise job to do, which is
encoded in the ASM as a super-global basic function.
All the agents do the same calculation for the logical task. When the logical
task has delivered an affirmative answer, they switch to the tah.s.ı̄l task and may
have to split. So for the tah.s.ı̄l task we need to clarify the notion of correctness
of the ASM, as follows. The ASM is correct for the tah.s.ı̄l task if: (1) when the
task has a negative answer, all (lower level) agents return a negative answer; (2)
when the task has an affirmative answer, at least one agent returns an affirma-
tive answer; and (3) every agent returning an affirmative answer also returns a
sequence of sentences which is a correct fill for the gap in the datum.
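The three conditions can be written down directly. In the sketch below the encodings are mine, not from Appendix C: each agent's report is modelled as a tuple ('no',) or ('yes', fill), and `is_correct_fill` is an assumed oracle for condition (3).

```python
# A minimal sketch of the correctness conditions (1)-(3) for the
# multi-agent ASM on the tahsil task.  Reports and the oracle
# `is_correct_fill` are modelling assumptions, not Hodges' notation.

def asm_correct(task_answer, reports, is_correct_fill):
    if task_answer == 'no':
        # (1) all (lower level) agents must return a negative answer
        return all(r[0] == 'no' for r in reports)
    # (2) at least one agent returns an affirmative answer, and
    # (3) every affirmative answer carries a correct fill for the gap
    return (any(r[0] == 'yes' for r in reports) and
            all(is_correct_fill(r[1]) for r in reports if r[0] == 'yes'))
```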
A fragment of the backtracking procedure is still needed, but for a more
limited purpose, namely to find the weakest fill in a datum. Ibn Sı̄nā shows at
Problems 3 and 7 that he expects the student to find it by listing possibilities
and trying each in turn. The edges of the gap are known, and they provide the
two terms of the weakest fill. So there are eight possible sentences to consider.
Given this approach, it makes sense to list the possibilities in an order where
ψ comes before χ whenever χ entails ψ; so when we first find a possible fill we
know it is a weakest one. The function listsentences at (Def2) in Appendix C
provides such a list.
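One way to realise such a listing uses the traditional square of opposition with existential import (a-sentences entail i-sentences, e-sentences entail o-sentences). The sketch below is my reconstruction, not the listsentences of Appendix C, and for simplicity it uses the article ‘a’ uniformly.

```python
# Reconstruction (not Hodges' actual listsentences) of the eight candidate
# fills for edge terms X and Y, ordered so that whenever chi strictly
# entails psi, psi is listed before chi; hence the first fill that works
# is a weakest one.

def list_sentences(X, Y):
    a1 = f'Every {X} is a {Y}';    a2 = f'Every {Y} is a {X}'
    e1 = f'No {X} is a {Y}';       e2 = f'No {Y} is a {X}'
    i1 = f'Some {X} is a {Y}';     i2 = f'Some {Y} is a {X}'
    o1 = f'Some {X} is not a {Y}'; o2 = f'Some {Y} is not a {X}'
    # Strict entailments, stronger -> weaker (square of opposition with
    # existential import; conversion makes i and e symmetric in X and Y).
    entails = {a1: {i1, i2}, a2: {i1, i2}, e1: {o1, o2}, e2: {o1, o2}}
    sents = [i1, i2, o1, o2, a1, a2, e1, e2]   # particulars before universals
    # Sanity check: the listing respects the ordering requirement.
    for strong, weaker in entails.items():
        for weak in weaker:
            assert sents.index(weak) < sents.index(strong)
    return sents
```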
Ibn Sı̄nā allows the student to use background knowledge to cut down from
eight to a shorter list of possible fills; see the notes on Problems 3 and 7. I count
this move as a shortcut, not as a part of the algorithm.
Are we sure that no further backtracking is needed? For example, perhaps we
find a weakest fill, but then further down the line we discover that the resulting
connected syllogism runs into trouble with the fourth figure condition, so that
we need to backtrack and try the converse of the weakest fill instead. I believe
that this problem doesn’t arise, because the premise order condition fixes the
order of the terms in all the intermediate sentences in the connected syllogism,
independent of the order of the terms in the premises of the separated syllogism.
To be sure of this we need a correctness proof; but I think this would be wasted
effort until we have an answer to the question about which connected syllogisms
to accept.
Ibn Sı̄nā on Analysis: 1. Proof Search
6 Review
We must do two things here. The first is to give an informal summary of the
algorithm, and the second is to place it in the history of logic and mathematics.
A more formal description of the algorithm is given in Appendix C, in the form
of an asynchronous multi-agent ASM, where each agent follows its own agent
ASM within the multi-agent ASM.
[Flow diagram (not reproduced here): arrows showing the goal-datum pair passing around the modules, including Ramify and ActiveIntellect.]
φj but not vice versa, then j < i. Then the module splits the goal-datum into
eight clones, and it fills the gap in the i-th clone with the sentence φi . So now
there are eight goal-datum pairs, none of which has a gap.
There is a subtlety if the goal-datum pair that passes to Ramify has an empty
datum. In this case there is always a sentence that fills the gap and entails the
goal, namely the goal itself. So in this case Ramify makes just one new page,
in which the datum is changed to the goal sentence.
After Ramify has done its work, the first of the resulting gap-free goal-datum
pairs passes to Synthesise, which shrinks down any adjacent pair of sentences
in the datum with a term in common, until either the datum consists of a single
sentence, or a sterile pair of sentences has come to light. If a sterile pair of
sentences comes to light, the goal-datum pair is discarded and the next of the
eight clones passes into Synthesise for similar treatment; and so on. If none of
the eight clones are left, the module reports failure.
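The shrinking loop just described can be sketched as follows. Sentences are opaque objects; `shares_term` and `consequence` (returning the strongest consequence of an adjacent pair, or None for a sterile pair) are assumed helpers, not defined in the paper. The demo encodes sentences as (quantifier, subject, predicate) triples and knows only the Barbara step.

```python
# A sketch of the Synthesise module's shrinking loop described above.
# `shares_term` and `consequence` are assumed helpers: `consequence`
# returns the strongest consequence of an adjacent pair, or None if the
# pair is sterile.

def synthesise(datum, shares_term, consequence):
    """Shrink adjacent pairs with a term in common until a single
    sentence remains, or a sterile pair comes to light."""
    datum = list(datum)
    while len(datum) > 1:
        for i in range(len(datum) - 1):
            if shares_term(datum[i], datum[i + 1]):
                c = consequence(datum[i], datum[i + 1])
                if c is None:
                    return ('sterile', None)
                datum[i:i + 2] = [c]   # one shrinking step = one new page
                break
        else:
            return ('sterile', None)   # no adjacent pair shares a term
    return ('ok', datum[0])

# Toy demo: 'Every C is a B' + 'Every B is an A' shrink to 'Every C is an A'.
status, sent = synthesise(
    [('Every', 'C', 'B'), ('Every', 'B', 'A')],
    lambda p, q: p[2] == q[1],
    lambda p, q: ('Every', p[1], q[2]) if p[0] == q[0] == 'Every' else None)
# status == 'ok', sent == ('Every', 'C', 'A')
```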
If a goal-datum pair with a single-sentence datum survives, it passes to the
module Select. This module checks which of three cases holds: (1) the datum
equals the goal, and it is already determinate; (2) the datum equals the goal,
but it is not already determinate; (3) the datum doesn’t equal the goal. In case
(1) the module reports success in the tah.s.ı̄l task and the algorithm halts. In
case (2) the module reports success in the logical task (if it hasn’t already been
reported), restores the gappy goal-datum pair that Ramify had filled, and sends
this pair to the Active Intellect with a request for a determinate sentence that
attaches at one side of the restored gap. The Active Intellect compiles a list of
all the determinate sentences that could be used, and it makes one clone of the
goal-datum pair for each such sentence ψ. The clone that goes with ψ has ψ
inserted into its gap; but the gap is not completely filled, so we once again have
a goal-datum pair with a gap. All these new goal-datum pairs are sent back into
Describe in parallel, and so on around the cycle. In case (3) the same happens
as the failure case in the previous paragraph: the goal-datum pair is discarded
and the next of the eight is called for, unless none of the eight are left, in which
case the module reports failure.
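The three-way case split in Select can be put as a tiny dispatch. The returned labels abstract the module's actions, and `determinate` is an assumed oracle for ‘already determinate’ (which, as argued above, may not be mechanically computable).

```python
# A sketch of the three cases checked by the Select module described
# above.  `determinate` is an assumed oracle; the labels abstract the
# actions the module takes.

def select(datum_sentence, goal, determinate):
    if datum_sentence == goal:
        if determinate(datum_sentence):
            return 'tahsil success'   # case (1): report success and halt
        return 'needs a middle'       # case (2): restore the gap, call the
                                      # Active Intellect for determinate fills
    return 'discard'                  # case (3): try the next of the clones
```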
There are several places where a module reports success or failure. If no success
has been reported yet, then the first report of success or failure is a report on the
logical task, except in case (1) for Select. If logical failure has been reported,
the algorithm halts. If logical success has been reported, a later report of failure
is a report on the tah.s.ı̄l task, and again the algorithm halts. If logical success
has been reported, the only further report of success that makes any difference
is a report of tah.s.ı̄l success in case (1) for Select.
This is the algorithm in broad outline. We need to clarify what are the separate
steps, and how the algorithm decides which step happens when – what Ibn Sı̄nā
refers to as the ‘order’. The description below is very much based on Gurevich’s
notion of an ASM and the use made of it by Börger and Rosenzweig in [5].
The idea of goal-datum pairs swimming around between modules is only a
metaphor. A different metaphor is more realistic: the calculator (or ‘agent’)
does each piece of calculation by writing out one or more pages that state the
results of the calculation. (The pages are the ‘nodes’ of [5].) A step of the cal-
culation could involve writing several pages, but only where the pages can be
written simultaneously. For example when Ramify makes eight clones and fills
them, in principle this can be done on eight pages simultaneously (though eight
hands would be useful), so it counts as a single step. But when Synthesise
shrinks down the datum, the result of shrinking down the first pair of sentences
is generally an input to the operation of shrinking the next pair. So shrinking
down a single pair of sentences to their strongest consequence is a whole step. In
general Synthesise will process a goal-datum pair for several steps until there
is no fat left on the datum; this will involve producing a succession of new pages
with shorter datum sequences.
In principle the agent could go to work on any existing page at any time,
using any one of the four modules Describe, Synthesise, Ramify or Select.
What decides which page and which module the agent will take next?
Written in a separate place, not on the pages, there are three further pieces
of information stored in ‘global variables’. The first is the label of the ‘current
page’, i.e. the page now being processed. The agent reads the current page and
acts according to instructions in the algorithm; these instructions refer to the
contents of the current page, and to the values of the global variables. The
instructions tell the agent what new pages to produce, and what changes to
make to the global variables. So for example if the agent is looking at page 5,
the instructions may tell the agent to change the current page variable to 6; the
effect is that when page 5 has been dealt with, the agent turns next to page 6.
And so on.
There are two other global variables besides ‘current page’. One of them
records the goal (which is fixed at the start and never changes). The other
global variable stores reports of success or failure (and starts with the value
‘ignorance’).
The rest of the information needed for controlling the calculations consists of
six records on each page, as follows. (In Appendix C these six records are called
‘properties’ of the page.) The first is a record of the datum on that page. (The
starting page carries the datum given by the problem to be solved.) The second
records the gap site for the current goal-datum pair; the record may also show
that there is no gap, or that the gap site needs calculating. The third is a record
of the left and right edges of the gap. The fourth is a record of the fill, i.e. the
sentence that was put in the gap when Ramify was last used.
The fifth and sixth records on the page store information about the movement
between the pages. One of them records ‘previous page’; what this means is that
when a page p is being read and a new page q is constructed according to the
information in p, then p is recorded as ‘previous page’ on q. (After the algorithm
has reported success, one will need to work backwards from the final page to
its previous page, its previous page’s previous page and so on in order to recon-
struct the required connected syllogism.) The other record is called ‘next’. The
main function of ‘next’ is that when a group of pages p1 , . . . , pn are constructed
simultaneously, ‘next’ on page pi (where i < n) indicates pi+1 . When the agent
is reading pi and has to discard it, the value of ‘next’ on pi tells the agent which
page to try next. The agent makes this happen by changing the value of the
global variable ‘current page’ to pi+1 .
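The bookkeeping just described fits in a few lines. The field names below are mine, chosen to mirror the six page records and three global variables as the text summarises them; Appendix C calls the page records ‘properties’.

```python
# A sketch of the control data described above: six records per page and
# three global variables.  Field names are mine, not Hodges'.

from dataclasses import dataclass

@dataclass
class Page:
    datum: list                         # 1: the datum on this page
    gap_site: object = 'uncalculated'   # 2: gap position, 'none', or 'uncalculated'
    edges: tuple = (None, None)         # 3: left and right edges of the gap
    fill: object = None                 # 4: sentence Ramify last put in the gap
    previous: object = None             # 5: page this one was constructed from
    next: object = None                 # 6: clone to try if this page is discarded

@dataclass
class Globals:
    current_page: object = None         # the page now being processed
    goal: object = None                 # fixed at the start, never changes
    report: str = 'ignorance'           # reports of success or failure
```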
If we have the algorithm set up correctly, then at any stage the records on
the current page p will determine uniquely which module takes care of this stage
of the calculation – unless the algorithm has halted with a report of success or
failure. For example if the record of the gap site on p says that there is no gap,
the module that applies will be one of the two at the bottom of the diagram.
If and only if the record says that the gap needs calculating, the module that
applies will be Describe. If the record says that there is a gap, the module that
applies will be one of Synthesise, Ramify and ActiveIntellect.
The module ActiveIntellect, which is operated by a higher force, comes
into play when and only when the record ‘next’ on the current page indicates
prayer, alcohol or sleep – or more prosaically when it has the value ‘needs a
middle’.
Assuming that neither Describe nor ActiveIntellect has been called,
what settles the choice between Synthesise, Ramify and Select? The answer
is that Synthesise applies if and only if the datum on the current page has
two adjacent sentences with a term in common. (The module appears twice in
the flow diagram above, but it has the same job to perform in both cases.) If
Synthesise doesn’t apply, then Ramify applies if there is a gap in the goal-
datum pair, and Select applies if there isn’t.
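The dispatch rules of the last few paragraphs can be put together in one function. This is a sketch: `shares_term` is an assumed helper, and the precedence given to Describe over ActiveIntellect is my own choice where the text does not fix it.

```python
# A sketch of how the records on the current page uniquely determine
# which module acts next, following the rules stated above.

from types import SimpleNamespace

def next_module(page, shares_term):
    if page.gap_site == 'uncalculated':
        return 'Describe'
    if page.next == 'needs a middle':
        return 'ActiveIntellect'
    if any(shares_term(p, q) for p, q in zip(page.datum, page.datum[1:])):
        return 'Synthesise'
    return 'Ramify' if page.gap_site != 'none' else 'Select'

# Demo: a gap-free page whose datum is a single sentence goes to Select.
page = SimpleNamespace(gap_site='none', next=None,
                       datum=[('Every', 'C', 'A')])
module = next_module(page, lambda p, q: p[2] == q[1])
# module == 'Select'
```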
It may be helpful to note that when Ramify or Select applies to a page,
then the datum on the page has been slimmed down as much as possible by
Synthesise. So if the goal-datum pair has a gap (as at Ramify), the datum
consists of at most two sentences; if the pair has no gap (as at Select), the
datum is a single sentence. I suggested earlier that this explains why Ibn Sı̄nā
confines his 64 problems to cases where the datum has at most two sentences.
For further details refer to Appendix C.
My remarks here will be very incomplete – this is already a long paper. I am very
much indebted to Roshdi Rashed for his comments and information, though he
should not be held responsible for any particular claims I make.
A ‘search algorithm’ is a mechanical procedure which allows its user to find
a solution of a problem, or establish that there is no solution, by running sys-
tematically through a set of possible partial or total solutions. (The set is called
the ‘search space’.) Ibn Sı̄nā’s algorithm, insofar as it really is an algorithm, is
a search algorithm for finding solutions to the logical and tah.s.ı̄l problems. It
searches through partial or total compound syllogisms that extend the datum.
I know of no other examples of search algorithms in the medieval Arabic litera-
ture. In modern times search algorithms go back at least to Tarry’s maze-solving
algorithm of 1895 ([4] p. 18ff), though the best known examples are from the
second half of the 20th century.
– then this would have brought him close to al-Ḵalı̄l’s listing procedures, and
he would very likely have given us the earliest description of a search through
lexicographic listing. This makes it all the more painful that we don’t have the
appendix which he said would discuss problems with more than one gap. (See
the note on 465.2.)
There is a major difference between Ibn Sı̄nā’s discussion and that of
al-Ḵwārizmı̄. Namely, Ibn Sı̄nā never makes any attempt to show that his
algorithm is correct. (If he had done, he would certainly have given a much
better algorithm.) There are several aspects to this difference. First,
al-Ḵwārizmı̄ is following every mathematician’s dream: to solve a problem by reducing it to some
apparently quite different problem that is easy to solve or has already been
solved. The reduction of a problem in algebra to one in geometry is a beautiful
example; and incidentally it runs clean counter to the aristotelian tendency to
keep the various sciences in a rigid hierarchy. I doubt that Ibn Sı̄nā ever had this
mathematical dream. In the same way as Aristotle, he writes mathematics like
an intelligent outsider, not like a true addict.
The second aspect concerns how Ibn Sı̄nā sees the nature of logic. For Ibn
Sı̄nā logic is not about when this follows from that; it’s about how we can see
from first principles that this follows from that. For example if we are given a
linkage and a sentence, Theorem 2 gives a fast way of testing whether the linkage
entails the sentence without needing to construct any simple syllogisms at all. In
Ibn Sı̄nā’s time this theorem wasn’t yet known. But even if it had been, it would
have established a logical fact by going outside the basic processes of deduction,
and so Ibn Sı̄nā very probably wouldn’t have used it. The fact that Ibn Sı̄nā
uses only direct and bottom-level methods was a great help for extracting the
algorithm from Ibn Sı̄nā’s text. One knew in advance that there were no hidden
tricks or changes of viewpoint or appeals to intuition. The student was expected
to solve the problems by direct application of basic facts of logic, and all that
Ibn Sı̄nā was teaching him was how to apply the steps in the right order (as he
himself says at [9.6.12]).
For balance one should add that in general Ibn Sı̄nā was certainly prepared to
use metatheorems of logic as well as theorems. In fact he despised logicians who
couldn’t do this. But the metatheorems that he used were ones that summed up
elementary facts about syllogisms, not ones that introduced new ideas.
In one other respect Ibn Sı̄nā’s algorithm matches the mathematics of his time.
He achieves the effect of induction by reducing more complex cases to simpler
ones, until he reaches ground level. We might compare Proposition 8 of
Ṯābit ibn Qurra in Rashed [26] p. 337ff, and Rashed’s analysis on page 159.
Ṯābit computes an n-term sum by writing the terms to be summed, then below
them n−1 terms to be summed, and so on down to a single term. This produces
a two-dimensional array, and Ṯābit computes the sum of the top line from
properties of the whole array. For his exposition he takes n = 4 as a typical
case. We saw that Ibn Sı̄nā takes cases of length 2 or 3, but here the
parallel may break down, because we found that these cases play a special
role in the calculation.
Appendices
461.4,5 [9.6.3] So when you have found a syllogism, you start by looking for
its two premises. You do this before looking for the terms, because
gathering up fewer things is easier [than gathering up many]. Also
when you start with the terms, it can be that there are more than
two ways of combining them into two premises, so that the cases you
would need to consider would ramify. The reason for that is that by
locating the terms you don’t thereby locate the premises as things
composed [from the terms]. You would have to examine the case of
each term, and then examine four possible ways of combining [pairs
461.10 of terms]. So you would have to consider five items: first you would
consider the terms [themselves], and then you would consider the
four cases which arise from the ways of composing the premises
from two terms. But if you locate the two premises, it’s enough
for you to consider one more thing, namely to list the terms. Thus
when you have found two premises, locating the syllogism and how
it behaves will be easy for you.
461.12 [9.6.4] Then the first step is to investigate whether each of the
premises shares one of its terms with the goal but is distinguished
from the goal by another [term]. Suppose [it does, and] one of the
two premises shares both its terms with one part of the second
premise, while another part of the second premise – not the whole
of it – shares both the terms of the goal. Then the syllogism is
duplicative, and the premise which has one part overlapping the
goal and another part overlapping the other premise is a proposi-
462.1 tional compound, while the other premise is a duplication. So look
carefully at [the sentence] which has a part overlapping the goal in
two terms: is it meet-like or difference-like? If it is meet-like then
find out whether its overlap [with the goal] is its first or second
clause, and find out whether that other [sentence] is the same [as
this part of the premise], or is its contradictory. If the premise is
difference-like, then find out whether the overlapping [clauses] are
the same or contradictories. Do the same with the other [premise],
which is the duplicating one. In this way your syllogism is analysed
462.5 into the propositional moods.
462.5 [9.6.5] If this is not the case, and for every [sentence] of the syllogism
the goal (which is proved through [the syllogism]) overlaps it in just
one term, then you know that the syllogism is recombinant. If you
have found that each of the premises overlaps the conclusion, then
look for the middle term, so that you find the figure. Then connect
the terms to the conclusion, so as to find the major and minor
[premises] and the other things that you should be looking for. If
you can’t find a middle term, then the syllogism is not simple;
462.10 instead you have a compound syllogism with at least four terms.
[9.6.6] [First case: two given premises, each sharing one term with
the goal]
[9.6.8] [Second case.] If you have found two premises that share [a
term] with each other, and one of them shares [a term] with the
goal, then this shared [term] is either the subject or the predicate
of the goal. 465.5 Suppose it is the subject.
Work through the remaining cases of this kind for yourself, taking the com-
pound [syllogisms] in turn.
466.3 [9.6.9] You should know that when we said: ‘This makes [the syllo-
gism] determinate’, this meant determinate without having to alter
[the syllogism] by making a conversion in the found [premises]. Also
you should know that we are not putting ourselves to the trouble of
telling you now what figure the determinate [syllogism] is [proved]
466.5 in. If you don’t understand that, and didn’t memorise what was
said [about it earlier], you won’t have been able to make any use of
this [lesson].
[9.6.10] [Third case: Two premises which share one term with each
other, and one of them shares a term with the predicate of the goal.]
466.6 [Problem 37.] If the shared [term] is in the predicate of the goal,
and the goal is universally quantified affirmative [thus: ‘Every C is
an A’]; and you have [the premises] ‘Every D is a B’ and ‘Every B
is an A’, and ‘Every C is a D’ is attached, this makes [the syllogism]
determinate.
466.7 [Problem 38.] If the goal is universally quantified negative [thus:
‘No C is an A’], and the found [premises] are ‘Every D is a B’ and
‘No B is an A’, and ‘Every C is a D’ is attached, this makes [the
syllogism] determinate.
466.9 [Problem 39.] If the found [premises] that you have are ‘No D is
a B’ and ‘Every A is a B’, and ‘Every C is a D’ is attached, this
makes [the syllogism] determinate.
466.10 [Problem 40.] If you have [the premises] ‘Every D is a B’ and ‘No A
is a B’, and ‘Every C is a D’ is attached, this makes [the syllogism]
determinate.
466.11 [Problem 41.] If the goal is existentially quantified affirmative [thus:
‘Some C is an A’], and you have [the premises] ‘Some B is a D’
and ‘Every D is an A’, and ‘Every B is a C’ is attached, you can
use it.
466.13 [Problem 42.] If you have: ‘Some B is a D’, and ‘Every A is a D’,
it can’t be used.
466.13 [Problem 43.] If you have ‘Some D is a B’ and ‘Every B is an A’,
and [the attached premise] is ‘Every D is a C’, you can use it.
466.14 [Problem 44.] If you have ‘Some D is a B’ and ‘Some A is a D’,
it can’t be used, even with the order [of the terms in a premise]
converted.
466.15 [Problem 45.] If your goal is existentially quantified negative [thus:
‘Some C is not an A’], and you have [the premises] ‘Some B is a
D’ and ‘No D is an A’, and ‘Every B is a C’ is attached, you can
use it.
467.1 [Problem 46.] Or you have ‘Every B is a D’ and ‘Some D is not an
A’ – then you can’t use it.
467.2 [Problem 47.] If you have [the premises] ‘Not every B is a D’ and
‘Every D is an A’, you can’t use it.
467.2 [Problem 48.] If you have ‘No B is a D’ and ‘Some D is an A’, you
can’t use it.
467.3 [Problem 49.] If you have ‘Some D is a B’ and ‘No A is a B’, and
‘Every D is a C’ is attached, you can use it.
467.4 [Problem 50.] If you have ‘No D is a B’ and ‘Every A is a B’, and
‘Some C is a D’ is attached, you can use it.
467.5 [Problem 51.] If you have ‘Not every D is a B’, and ‘Some A is a
B’, it can’t be used.
467.7 Try out for yourself the compound [syllogisms] where the overlap
is with the predicate of the goal, in the same relation as above.
These, and similar [examples] that we handle by comparison with
467.9 them, are instances of analysis where you have two premises.
[9.6.11] [Fourth case: One premise, which shares a term with the
goal.]
467.9 [Problem 52.] In the case where you have a single premise, which
overlaps the predicate of the conclusion, and the goal is universally
quantified affirmative, namely ‘Every C is an A’, and you have [the
premise] ‘Every D is an A’, then if ‘Every C is a D’ is attached,
this makes [the syllogism] determinate.
467.12 [Problem 53.] If you have ‘Every A is a D’, it can’t be used.
467.12 [Problem 54.] If the goal is universally quantified negative [thus:
‘No C is an A’], and you have [the premise] ‘No D is an A’ or
‘No A is a D’, and ‘Every C is a D’ is attached, this makes [the
syllogism] determinate.
467.14 [Problem 55.] If you have [the premise] ‘Every D is an A’, then [the
syllogism] can’t be made determinate.
467.14 [Problem 56.] Rather, if you have ‘Every A is a D’, and it’s true
that ‘No C is a D’, this makes [the syllogism] determinate.
467.15 [Problem 57.] If the goal is existentially quantified affirmative [thus:
‘Some C is an A’], and you have [the premise] ‘Some D is an A’,
and ‘Every D is a C’ is attached, you can use it.
467.16 [Problem 58.] If you have [the premise] ‘Every D is an A’, and
‘Some C is a D’ is attached, you can use it.
467.17 [Problem 59.] If you have ‘Some A is a D’, you can’t use it at all,
unless you convert [the premise].
467.18 [Problem 60.] If the goal is existentially quantified negative [thus:
‘Some C is not an A’], and you have [the premise] ‘Every D is an
A’, you can’t use it at all.
467.19 [Problem 61.] Rather, if [the premise] is ‘No D is an A’, and ‘Some
468.1 C is a D’ is attached, you can use it.
468.1 [Problem 62.] Likewise if you have ‘Some D is an A’, it can’t be
used.
468.2 [Problem 63.] If you have [the premise] ‘Not every D is an A’, and
‘Every D is a C’ is attached, you can use it.
468.3 [Problem 64.] If [the premise] is ‘Not every A is a D’, it can’t be
used.
468.4 [9.6.12] When you put the steps in this order, as I have shown you,
you will reach the [required] terms, figures and moods. And the
terms that you encounter will be ones within the formats mentioned
above as ones that can be used.
468.7 Apply exactly the same considerations to propositional compounds.
The text above is translated from the Arabic text in [15], which is a volume from
the Cairo edition of the Šifā’, published under the overall editorship of Ibrahim
Madkour.
Title
Ibn Sı̄nā writes ‘intafaᶜ X bi-Y’ to express ‘X can use Y’. The passive form,
which occurs in the title, is ‘untufiᶜ bi-Y’, meaning ‘Y can be used’. I haven’t
found this meaning in the dictionaries, including Goichon [9]. But it’s fairly
common in Ibn Sı̄nā’s logical writing. For example in Burhān [16] 13.14 one
can’t use (lam yantafiᶜ bi-) what a teacher says unless one thinks for oneself;
63.8 there are students who can use (intafaᶜ bi-) a compass but are still
stupid; 141.13 in debate one can’t use (lā yantafiᶜ bi-) a proof that requires
very many middle terms; ᶜIbāra [14] 2.12f sciences are developed so that
later generations can use (yantafiᶜ bi-) them. Dozy [7] comes nearest with
the meaning ‘trouver son compte à’, which Gabriel Sabbagh kindly tells me
can be translated as ‘finds advantageous or useful’.
[9.6.1]
460.5f ‘not connected but separated’ (gair maws.ūl bal mafs.ūl ): See Subsects.
4.2 and 4.4 above for these notions.
[9.6.2]
461.13 ‘its terms’: Ibn Sı̄nā is discussing propositional syllogisms here, so for
example the ‘terms’ of the proposition ‘if p then q’ are p and q, both of
which are sentences and not terms in the usual sense. See Subsect. 4.5
above.
[9.6.6]
Problem 3. The two terms that occur once only in the given goal and premises
are B and D, so we are looking for a sentence φ with terms B
and D. The goal is universally quantified, so all the premises are
universally quantified, and in particular φ is universally quantified.
The goal is negative, so there is exactly one negative premise; hence
the remaining two premises including φ must be affirmative. Thus
φ must be either ‘Every B is a D’ or ‘Every D is a B’. We try
both in turn. If we combine ‘Every B is a D’ with ‘No C is a B’
as the premises of a simple syllogism, then since B is subject in
one and predicate in the other, the syllogism is in first figure, and
its minor premise is ‘No C is a B’ since this is the one with the
middle term D as its predicate. But the only mood in first figure
with two universally quantified premises and one of them negative
is the second mood (Celarent in the Latin nomenclature), whose
minor premise is affirmative. So ‘Every B is a D’ can’t be used, and
we have to try ‘Every D is a B’ instead. The result is the following
connected compound syllogism, which meets the requirements:
No C is a B. Every D is a B.
(29)
No C is a D. Every A is a D.
No C is an A.
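The compound syllogism (29) can be cross-checked semantically; the sketch below is illustrative only and not from the paper. Note that Ibn Sīnā's rejection of 'Every B is a D' rests on the first-figure mood restriction, though the semantic check happens to agree that this route fails.

```python
# Illustrative semantic check of compound syllogism (29); not part of the paper.
from itertools import combinations, product

U = range(3)
SUBSETS = [frozenset(c) for r in range(len(U) + 1)
           for c in combinations(U, r)]

def holds(sentence, ext):
    quant, s, p = sentence
    S, P = ext[s], ext[p]
    return {'every': S <= P, 'no': not (S & P), 'some': bool(S & P)}[quant]

def entails(premises, goal):
    # Collect the terms that actually occur, then try all interpretations.
    terms = sorted({t for _, s, p in premises + [goal] for t in (s, p)})
    for exts in product(SUBSETS, repeat=len(terms)):
        ext = dict(zip(terms, exts))
        if all(holds(q, ext) for q in premises) and not holds(goal, ext):
            return False
    return True

# The full chain of (29) reaches the goal:
assert entails([('no', 'C', 'B'), ('every', 'D', 'B'), ('every', 'A', 'D')],
               ('no', 'C', 'A'))
# Its first simple syllogism (second figure):
assert entails([('no', 'C', 'B'), ('every', 'D', 'B')], ('no', 'C', 'D'))
# The discarded candidate 'Every B is a D' fails even semantically:
assert not entails([('no', 'C', 'B'), ('every', 'B', 'D'), ('every', 'A', 'D')],
                   ('no', 'C', 'A'))
```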
465.2 ‘in the appendices’ (bil-lawāh. iq): In several places in the Šifā’ Ibn
Sı̄nā refers to things that will appear in the appendices. But no
work of this name or with exactly the required contents has been
found. It has been suggested that Ibn Sı̄nā’s two other works
Taᶜlı̄qāt and Mubāh.at̲āt contain material that was intended for
the appendices. (Gutas [12] pp. 141–144.) But the published
versions of these two works contain only philosophical material, and
nothing about proof search. More’s the pity, because Ibn Sı̄nā’s
treatment of incomplete syllogisms with two or more gaps would
have shown us more about how he handled problems of search. See
Subsect. 6.2 for more on the historical context.
Problem 27. Ibn Sı̄nā doesn’t say what premise linking D to A will work. There
may be a subtle reason. This is the first example with two premises
φ1 , φ2 to the left of the gap, so the student has a choice between
first combining φ1 with φ2 before combining the result with the
test sentence; or first combining φ2 with the test sentence and
then bringing in φ1 . The first route is clearly more sensible, be-
cause the result of combining φ1 with φ2 will be the same for each
test sentence. Ibn Sı̄nā forces the student to see this, by putting
pressure on the student to try several test sentences. But the effect
is slightly spoiled by the fact that in this particular case the answer
‘Every D is an A’ is obvious without any calculation.
Problem 29. Paragraph [9.6.9] below suggests that Ibn Sı̄nā is talking about con-
verting a premise. But why should anybody think of converting a
premise in this example? A possible explanation lies in the fact that
this is the first example in this block where a premise has its terms
out of the obvious order. We might expect (C, B)(B, D), (D, A),
but instead the last premise gives (A, D). Perhaps Ibn Sı̄nā had
students who (apparently like Smith [3] Note to 42b5–26) assumed
that switches like this don’t occur. Ibn Sı̄nā had made the same
point already at Qiyās 444.5, where it seems to have confused the
copyists.
Problem 32. The Arabic contains two occurrences of qad h.us.s.il, but they must
mean different things. In general qad with the past tense is a per-
fective marker: it indicates that the present state is the outcome
of a previous action described by the verb. But previous to what?
At the first occurrence here the phrase must mean previous to
the problem having been posed, hence ‘already determinate’. But
the second occurrence describes the outcome of the algorithm, so it
can’t mean that; it must mean that the application of the algorithm
created the present situation, hence ‘this makes it determinate’.
Also the ’a at the end of the sentence in line 465.13 should be
deleted (as in one ms); cf. Problem 7.
Problem 33. This is the one problem where Ibn Sı̄nā gives the premises in an
order that doesn’t form a linkage where the goal subject points
leftwards and the goal predicate points rightwards. The reason for
this is explained in Subsect. 4.4 above.
Ibn Sı̄nā on Analysis: 1. Proof Search 397
[9.6.10]
Problem 37. Delete mād̲ā at the beginning of line 466.6. Also the ’i in the Cairo
edition is a misprint for ’in.
Problem 44. The student might worry that these two premises violate the fourth
figure condition. Strictly this is not relevant, because the connected
syllogism wouldn’t combine these two premises in a simple syllo-
gism; but it may explain why Ibn Sı̄nā remarks that no conversion
is needed.
Problem 48. In the Cairo text the first premise is ‘No B is a C’, violating the
case assumption for [9.6.10]. Read ‘No B is a D’, following two
manuscripts.
Problem 51. The Cairo text has ‘Every A is a B’ for the second found premise.
There must be a slip, because in that case we get a syllogism by
attaching ‘Every D is a C’. But on the Cairo reading this is also
the only example in this block where the second found premise is
the same as in the previous example. So I have replaced ‘Every A’
by ‘Some A’.
[9.6.11]
Problem 59. The only sentence that will complete the syllogism logically is ‘Ev-
ery D is a C’. The middle term is D, which is subject in ‘Every D
is a C’ and predicate in ‘Some A is a D’, so the syllogism violates
the fourth figure condition. Converting the premise ‘Some A is a
D’ to ‘Some D is an A’ yields a third-figure syllogism in Disamis.
Problem 61. We have to correct ‘Some C is an A’ to ‘Some C is a D’.
Problem 62. The Cairo text reads ‘Likewise if [the premise] is ‘No A is a D’, and
you have (ᶜindak) ‘Some D is an A’ or ‘Some A is a C’, it can’t be
used.’ There are several problems with this. First, with the datum
‘No A is a D’ we get the goal by appending ‘Some C is a D’; so
the datum is presumably wrong. Second, this is the one problem
where Ibn Sı̄nā seems to introduce the appended sentence with
ᶜindak; in 28 other problems ᶜindak introduces the datum. Third,
the sentence ‘Some A is a C’ is silly here, because it has the same
terms as the goal. We can get a reasonable problem by deleting
the first and third syllogistic sentences and the text around them,
as I have done in the translation. Then Ibn Sı̄nā is saying correctly
that the goal can’t be reached from the datum ‘Some D is an A’.
[9.6.12]
where p, q and r are declarative sentences. Ibn Sı̄nā brings this to a form
analogous to a predicative syllogism by the device of ‘replacing “if” by
“whenever” ’ (e.g. Qiyās 471.5), so that the sense becomes
C The ASM
Briefly, an ASM consists of a set of rules operated by modules; for example the
module ProofSearch below has four modules, namely Describe, Synthe-
sise, Ramify and Select. At each step in a computation, all the rules are
applied once and simultaneously; if they clash, the machine stops. Rules nor-
mally begin with a condition, so they do nothing unless the condition is met.
They can activate other rules by resetting parameters so that the conditions
for the other rules are met. For example when the conditions for Synthesise
are met, the rules of Synthesise have the effect of shortening the datum by 1
every time they operate. They continue to operate until there are no consecutive
formulas in the datum with a term in common; at this point the condition for
Synthesise fails, but that for Ramify may be met, so that the rules of Ramify
take over.
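The step semantics just described — all enabled rules fire at once, and a clash stops the machine — can be sketched as a miniature interpreter. The function and rule names below are hypothetical illustrations; this is not Hodges' Perl program.

```python
# A miniature synchronous-update interpreter in the spirit of the ASM
# semantics described above; all names here are hypothetical illustrations.

def asm_run(state, rules, max_steps=100):
    """Each step: collect the updates of all enabled rules, apply them
    simultaneously.  Clashing updates (same location, different values)
    stop the machine, as do steps that produce no updates at all."""
    for _ in range(max_steps):
        updates = {}
        for rule in rules:
            for loc, val in rule(state).items():
                if loc in updates and updates[loc] != val:
                    return state          # clash: the machine stops
                updates[loc] = val
        if not updates:                   # no rule enabled: run terminates
            return state
        state = {**state, **updates}
    return state

# A toy rule echoing Synthesise: while two consecutive items of the datum
# share a letter, merge them, shortening the datum by one each step.
def synthesise(state):
    d = state['datum']
    for i in range(len(d) - 1):
        if set(d[i]) & set(d[i + 1]):
            return {'datum': d[:i] + [d[i] + d[i + 1]] + d[i + 2:]}
    return {}                             # condition fails: rule does nothing

final = asm_run({'datum': ['ab', 'bc', 'cd']}, [synthesise])
assert final['datum'] == ['abbccd']
```

When the merging condition fails, `synthesise` returns no updates, the run terminates, and (in the full machine) a rule such as Ramify whose condition is now met would take over.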
The notation X := Y means that the value of the parameter X becomes Y .
The notation X⁺ means the set of nonempty finite sequences of elements of X.
I hope the rest is reasonably self-explanatory.
The logical part of this ASM was implemented in Perl 5 and run on all of
Ibn Sı̄nā’s 64 problems. There were discrepancies from Ibn Sı̄nā’s solution (as
reported in the Cairo text) at Problems 23, 33, 51, 61 and 62. These are all
Multi-agent ASM
(ASM1) ActiveIntellect =
forall ι ∈ INTELLECT if next(currpage).ι = needsmiddle then
let k = |hasils(fill(currpage).ι)|
if k > 0 then
let ιk := ι
if k > 1 then
let ι1 , . . . , ιk−1 = new(INTELLECT )
forall 1 ≤ i ≤ k
let p = new(PAGE ).ιi
datum(p).ιi := insert(datum(currpage).ιi ,
i-th(hasils(fill(currpage).ιi )), gapsite(currpage).ιi )
gapsite(p).ιi := needscalculating
previous(p).ιi := currpage.ιi
next(p).ιi := 0
X(p).ιi := X(currpage).ιi
where X = edges, fill
currpage.ιi := p
where hasils(φ).ι is the set of all sentences that are determinate for the intellect
ι and have exactly one term in common with φ; we assume this set is finite. In
general the i-th sentence in the set will contain a term t that is not already in
TERM .ιi , so the Active Intellect will need to add t to TERM .ιi ; the term t is
the imprinted form that we met in (28).
Globals:
goal ∈ SENTENCE
(input from the problem)
currpage ∈ N
(initially 0)
report ∈ {ignorance, logicalfailure, logicalsuccess, tahsilsuccess}
(initially = ignorance)
Page properties:
datum : PAGE → SENTENCE⁺
(initially input from the problem)
gapsite : PAGE → N ∪ {needscalculating}
(initially = needscalculating)
edges : PAGE → TERM²
fill : PAGE → SENTENCE
previous : PAGE → PAGE
next : PAGE → PAGE ∪{needsmiddle}
(initially 0)
Agent modules
(ASM3) Describe =
if gapsite(currpage) = needscalculating then
let p = new(PAGE )
take datum(currpage), goal and identify the gap site, the left
edge and the right edge. (If no gap then the gap site is 0.)
gapsite(p) := calculated gap site
edges(p) := (left edge,right edge)
previous(p) := currpage
X(p) := X(currpage)
where X = datum, fill, next
currpage := p
(ASM4) Synthesise =
if gapsite(currpage) ≥ 0 and next(currpage) ≥ 0 and
(length(datum(currpage)) > 2 or
(length(datum(currpage)) = 2 and gapsite(currpage) = 1)) then
let k = (1 if gapsite(currpage) = 1, 2 otherwise)
let ℓ = length(datum(currpage))
let φ = consequence(k-th(datum(currpage)),
(k+1)-th(datum(currpage)))
if φ = sterile then
let α = replacepair(datum(currpage), φ, k)
let p = new(PAGE )
datum(p) := α
previous(p) := currpage
if gapsite(currpage) > 1 then
gapsite(p) := gapsite(currpage) − 1
else
gapsite(p) := gapsite(currpage)
X(p) := X(currpage)
where X = edges, fill, next
currpage := p
else
if next(currpage) > 0 then
currpage := next(currpage)
else
if report = ignorance then
report := logicalfailure
(ASM5) Ramify =
if gapsite(currpage) > 0 and next(currpage) ≥ 0 and
(length(datum(currpage)) ≤ 1 or
(length(datum(currpage)) = 2 and gapsite(currpage) ≠ 1)) then
if length(datum(currpage)) ≤ 1 then
let p1 , . . . , p8 = new(PAGE )
forall 1 ≤ i ≤ 8
let φ = listsentences(1-th(edges(currpage)),
2-th(edges(currpage)), i)
datum(pi ) := insert(datum(currpage),φ, gapsite(currpage))
fill(pi ) := φ
gapsite(pi ) := 0
edges(pi ) := edges(currpage)
previous(pi ) := currpage
forall 1 ≤ j ≤ 7
next(pj ) := pj+1
next(p8 ) := 0
currpage := p1
else
let p = new(PAGE )
datum(p) := insert(datum(currpage), goal, 1)
gapsite(p) := 0
edges(p) := edges(currpage)
fill(p) := goal
previous(p) := currpage
next(p) := 0
(ASM6) Select =
if gapsite(currpage) = 0 and length(datum(currpage)) = 1 then
if 1-th(datum(currpage)) = goal then
if hasil(fill) = true then
report := tahsilsuccess
else
let k = least k ≥ 1 such that
Basic functions
References
25. Ramsey, F.P.: Foundations: Essays in Philosophy, Logic, Mathematics and Eco-
nomics. In: Mellor, D.H. (ed.). Routledge & Kegan Paul, London (1978)
26. Rashed, R.: Les Mathématiques Infinitésimales du IXe au XIe Siècle, vol. 1, Fon-
dateurs et Commentateurs. Al-Furqān, London (1996)
27. Rashed, R.: Al-Khwārizmı̄, Le Commencement de l’Algèbre. Blanchard, Paris
(2007)
28. Ross, W.D.: Aristotle’s Prior and Posterior Analytics. Clarendon Press, Oxford
(1949)
29. Shehaby, N.: The Propositional Logic of Avicenna. Reidel, Dordrecht (1973)
30. Street, T.: An outline of Avicenna’s syllogistic. Archiv für Geschichte der Philoso-
phie 84(2), 129–160 (2002)
31. Thom, P.: The Syllogism. Philosophia Verlag, Munich (1981)
32. Versteegh, K.: Landmarks in Linguistic Thought III: The Arabic Linguistic Tradi-
tion. Routledge, London (1997)
33. Zermelo, E.: Untersuchungen über die Grundlagen der Mengenlehre I. Mathema-
tische Annalen 65, 261–281 (1908)
Abstract State Machines and the Inquiry
Process
1 Introduction
The idea of a mathematically rigorous semantic basis for software has long held
an allure for software engineers, for a variety of reasons. Software problems and
products can be captured in a precise language; sophisticated analysis tech-
niques, including automated verification, can be brought to bear; complex sys-
tems can be synthesized from multiple partial specifications. The advent of tools
for “lightweight” analysis [21] has made formal methods an option for a broad
range of applications.
Another, less heralded benefit is the role formal methods can play as a catalyst
for inquiry, provoking the constructive questioning that uncovers tacit assump-
tions and unforeseen consequences. This holds special interest for us as software
engineering educators. From the narrow perspective of professional training, we
must prepare our students to engage in the challenging workplace tasks of re-
quirements elicitation and analysis – tasks that are complicated by the invisibility
of software and the wide range of application domains [10]. From a wider ped-
agogical perspective, our primary mission is to expose our students to complex
problems and to promote active, inquiry-based problem solving.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 405–413, 2010.
© Springer-Verlag Berlin Heidelberg 2010
406 J.K. Huggins and C. Wallace
At the heart of software engineering and computer science lies problem solving, a
practice that is fundamentally heuristic due to epistemic, cognitive and temporal
constraints on the human problem solver [25,24]. An expert problem solver typ-
ically employs heuristics tacitly, in a way that is invisible to any observer [23].
This presents a problem for the educator: how can we convey our internalized
“know-how” to our students? Clearly, presenting the “final product” is not suffi-
cient; students must be explicitly encouraged to evaluate and refine the work at
hand, rather than accepting it at face value. This requires constant, systematic
questioning – both internal (analyzing given information) and external (eliciting
further information from other stakeholders).
In the field of technical communication, overreliance on the “final copy” has
generally been acknowledged as a mistake in past educational practice. In cur-
rent practice, revision is viewed as a valuable practice in itself, not simply as
a disposable step toward an end product [14]. Inspired by this example, some
software engineering educators are emphasizing a process of inquiry mediated
by writing and revision. For example, students in Wright’s software engineering
course [27] use a template structure to document the design decisions made dur-
ing a project. The author notes that “the processes students use to think about
and organize what they know and do not know about the design problems is more
important to the students’ learning than the artifacts they generate.” Also, case
studies developed at Penn State [12] and Michigan Tech [10] contain documents
produced through the lifespan of various software projects. The documents are
hyperlinked to show the evolution of the project over time.
In his classic treatise How to Solve It [25], Pólya paraphrases Pappus’ explanation
of Heuristic (analyomenos), the “art of problem solving” expounded by Euclid
and others. Included in this paraphrase is the following:
Pólya remarks that the original text sounds as strange and counterintuitive as
the English paraphrase: “is it not mere self-deception to assume that the prob-
lem we have to solve is solved?” (146). Upon reflection, however, the meaning
is clear: “[i]n order to examine the condition, we have to conceive, to repre-
sent to ourselves, or to visualize geometrically the relations which the condition
prescribes between [the unknown] and [the data]; how could we do so without
conceiving, representing, or visualizing [the unknown] as existent?” (146). To
derive the consequences of the problem condition, thereby uncovering hidden
contradictions or ambiguities, we must take a provisional step of representing
the unknown as known. Doing so allows us to inhabit and explore the problem
space.
How best to represent the unknown depends on the nature of the problem.
In geometrical problems, for instance, sketching figures is a natural and effective
approach. In the context of invisible, arbitrarily complex software, the choice
of representation is not as simple. Accuracy is important, yet excessive effort
in building an accurate representation draws time and attention away from the
main problem-solving task. Furthermore, in a problem solving session involv-
ing multiple stakeholders, complex notation may marginalize those without the
necessary expertise.
Börger makes the case for ASMs as the right choice for software (and hard-
ware, for that matter) [4]. Most relevant to us is his discussion of constructing an
appropriate ground model – an expression of the software problem as stated by
the stakeholders, and the basis for subsequent refinements. Subsequent analysis
and synthesis “will remain an intellectual exercise if we do not ‘know’ that [the
ground model] is ’correct’ ” (261). Börger enumerates the qualities of a good
ground model, some of which are shared by formal methods in general. The pos-
itive characteristics that seem particularly applicable to ASMs are abstraction,
simplicity and conciseness, to ensure acceptance and understanding by all stake-
holders. Thus ASM models can be sketched in a quick, flexible way – “on the
back of an envelope”, to use a popular metaphor – yet with sufficient accuracy
(abstraction and falsifiability) to elicit critical questions.
In the early days of ASMs, much work focused on applying ASMs to complex
real-world software problems, investigating the thesis (later proved by Gurevich
[16]) that ASMs are powerful enough to capture (sequential) algorithms at any
level of detail. A pattern emerges throughout this investigatory genre: while
the process of constructing the ASM models invariably digs up ambiguities or
contradictions buried in the original natural language, the authors typically do
not reflect on the role that ASMs played in uncovering them. Usually, the ASM
models are presented as faits accomplis: fully specified, with little or no
discussion of alternatives. The process of envisioning, selecting and discarding
alternatives, and the rationale for selecting a particular design [11], are left to
the reader’s imagination. Given the audience of this work – primarily academics
and professionals – and the space limitations enforced on scholarly work, it is
perhaps not surprising that reflection on analysis and design is omitted; such
readers have an implicit understanding of these activities and are more interested
in the details of the final product.
One can find occasional hints of the design process buried in the exposition
of ASM papers. For instance, in an overview of their ASM models of the ARM
processor [19], the authors state, “[o]ne can always tailor an ASM to a higher or
a lower level of abstraction, depending on one’s interests [. . . ] Our models were
chosen in order to effectively demonstrate the difference between pipelined and
non-pipelined versions of the same microprocessor.” (576) But this describes only
the design motivation, not the process. In other papers, there are intimations of
a design process in cases where peculiarities of the problem domain necessitate
unusual design choices. For example, in the ASM paper on the C programming
language [17], the authors justify the use of a dynamic external function to
resolve operator evaluation order, which is officially underspecified (and therefore
potentially dynamically determined). In the ASM describing the compilation of
Prolog code to the Warren Abstract Machine [5], the authors discuss the dilemma
presented by occur checks during unification: either follow the mathematical
definition, or follow the example of numerous implementations and ignore the
issue. In the ASM for the Kerberos authentication protocol [1], the authors must
first formalize the limits of an adversary before the security of the system can
be proven; several alternative adversaries are discussed before one is chosen and
formalized as an ASM.
By and large, these hints at a design process are the exception rather than
the rule. For newcomers to ASMs, the models given in the research literature
can be difficult to use as templates for their own applications. Generally, students
have little trouble understanding the ASM models as presented in the papers –
activity corresponding to the “comprehension” level of Bloom’s taxonomy [3].
However, the paucity of information on how to create their own ASMs presents
an obstacle to the higher levels of “application”, “analysis” and “synthesis”.
While reading ASMs may certainly grant insight into the particular problems
being studied, and may convince readers of the general utility of ASMs, there
are few clues as to how to develop a new ASM for oneself.
watches, handheld video games, pocket electronic dictionaries), all obtained at low
cost from various secondhand stores. Many of these “black boxes” exhibit unusual
behavior, and documentation is nonexistent. In the first phase of the exercise, each
team of 3–4 students “plays” with a device, exploring its behavior in reaction to
human input. The team then maps the machine-environment relationships to a
set of Problem Frames. Finally, the students represent the behavior they have
observed in terms of an ASM. In the second phase, students present what they
have discovered to the entire class, and audience members join in with their own
lines of questioning. In the exercise, the Problem Frame and ASM documents
serve as focus points from which team members explore possible behaviors. Fur-
thermore, the value of these formalisms as a communication medium is made
evident during the class presentations; the documents are not created purely to
satisfy the instructor, but to convey information to classmates.
6 Results
Our experience shows that ASMs can be a valuable classroom tool – if introduced
with care. Evaluations of the second author’s software quality assurance course
showed that students significantly broadened their range of inquiry techniques
[9]. However, students can easily develop a cynical attitude toward ASMs as just
another form of “useless” documentation. The pragmatic value of ASM-mediated
inquiry must be made clear early and often.
One of the major risks is also one of the major advantages touted by ASM
supporters: the “natural” character of ASM code and its similarity in form to
pseudocode. This similarity may lead students to treat ASMs as a freeform
“style” rather than a well-defined language. This in turn can lead to a skeptical
or cynical attitude that anything is a valid ASM. The use of automated tools can
mitigate this risk. The error checking functionality of these tools can help to avoid
fundamental misunderstandings, and simulation and automated test generation
allows for deeper investigation than possible by hand. On the other hand, with
the introduction of a programming environment, we lose the spontaneous “back
of the envelope” feel that is useful in an active inquiry session.
A related risk stems from the (well-placed) ASM emphasis on abstraction. To
a student who has spent many hours writing long programs in earlier projects, a
high-level ASM of only a few lines may seem like a worthless or even fraudulent
artifact. In one humorous episode, a clever student presented his ASM program,
consisting of a single update CurrentState := NextState, and praised its “high
level of abstraction”. Holding to the dogma that “abstraction is good”, without
illustrating why it is good, can actually have the effect of immunizing students
against the concept.
Above all, formalization for the sake of formalization must be avoided. As
Jackson and Wing contend [21], “[t]here can be no point embarking on the con-
struction of a specification until it is known exactly what the specification is for;
which risks it is intended to mitigate; and in which respects it will inevitably
prove inadequate.” While students who only encounter complete, fully
developed ASMs may attain a passive, “read-only” attitude to ASMs, those who
write their own without using them in a meaningful way may pick up an equally
unproductive “write-only” attitude. Students must be presented with the techni-
cal details of ASMs within a teleological context that gives purpose to the whole
enterprise.
7 Conclusion
In the introduction to the ASM Primer [20], we claimed a need “to provide a
gentler introduction [to ASMs], focusing more on the use of the technique than
on formal definitions.” (1) In retrospect, it seems that our focus on “use” can
be more precisely described as a contextualization of ASMs within a culture of
inquiry. The formal documents that comprise the ASM literature reflect such a
culture. One admires the mathematical beauty of the resulting work, and the skill
of those who produce it. Yet such documents yield few clues regarding how those
works were produced, or how someone could produce a similar result on their
own. Problem Frames, which combine a simple problem representation technique
with heuristics for problem analysis, may provide an instructive example. Along
similar lines, a set of design patterns could be developed for ASMs, gently guiding
design and analysis.
ASMs are a powerful tool, capable of representing a wide variety of types of
algorithms at multiple levels of abstraction. But their value depends crucially
upon the training given to the practitioner. One does not train a craftsman by
simply showing completed works; one usually works alongside a senior craftsman,
who shows the potential of the tools in the proper hands. Much of the growth of
the ASM community has happened precisely because of this type of mentoring,
as each generation of researchers mentors the next into maturity. But if ASMs
are to become a widely-used tool, we will need to find different ways to teach
about ASMs. The spirit of inquiry common to the ASM community will need to
find expression in forms in addition to our collective oral history.
References
1. Bella, G., Riccobene, E.: Formal Analysis of the Kerberos Authentication System.
Journal of Universal Computer Science 3(12), 1337–1381 (1997)
2. Berry, D.M.: The Importance of Ignorance in Requirements Engineering. Journal
of Systems and Software 28(2), 179–184 (1995)
3. Bloom, B.S.: Taxonomy of Educational Objectives. In: Handbook I: The Cognitive
Domain. Longman, White Plains, New York (1956)
4. Börger, E.: Why Use Evolving Algebras for Hardware and Software Engineer-
ing? In: Bartosek, M., Staudek, J., Wiedermann, J. (eds.) SOFSEM 1995. LNCS,
vol. 1012, pp. 236–271. Springer, Heidelberg (1995)
5. Börger, E., Rosenzweig, D.: The WAM – Definition and Compiler Correctness.
In: Beierle, C., Plümer, L. (eds.) Logic Programming: Formal Methods and
Practical Applications. North-Holland Series in Computer Science and Artificial
Intelligence (1994)
6. Börger, E., Stärk, R.: Abstract State Machines: A Method for High-Level System
Design and Analysis. Springer, Heidelberg (2003)
7. Börger, E., Stärk, R., Schmid, J.: Java and the Java Virtual Machine: Definition,
Verification, Validation. Springer, Heidelberg (2001)
8. Börger, E., Schulte, W.: Initialization Problems for Java. Software: Principles and
Tools 19(4), 175–178 (2000)
9. Brady, A., Seigel, M., Vosecky, T., Wallace, C.: Addressing Communication Is-
sues in Software Development through Case Studies. In: Conference on Software
Engineering Education & Training (2007)
10. Brady, A., Seigel, M., Vosecky, T., Wallace, C.: Speaking of Software: Case Studies
in Software Communication. In: Ellis, H.J.C., Demurjian, S.A., Naveda, J.F. (eds.)
Software Engineering: Effective Teaching and Learning Approaches and Practices
(2008)
11. Burge, J.E., Carroll, J.M., McCall, R., Mistrı́k, I.: Rationale-Based Software En-
gineering. Springer, Heidelberg (2008)
12. Carroll, J.M., Rosson, M.B.: A Case Library for Teaching Usability Engineering:
Design Rationale, Development, and Classroom Experience. Journal of Educational
Resources in Computing 5(1), 3 (2005)
13. DeSanto, F.: Gurevich Abstract State Machines. Communicator: EECS Depart-
ment Newsletter, University of Michigan (December 1997)
14. Flower, L.: Problem Solving Strategies for Writing in College and Community.
Wadsworth Publishing, Belmont (1997)
15. Gurevich, Y.: Evolving Algebras: An Attempt to Discover Semantics. In: Rozenberg,
G., Salomaa, A. (eds.) Current Trends in Theoretical Computer Science, pp.
266–292. World Scientific, Singapore (1993)
16. Gurevich, Y.: Sequential Abstract State Machines Capture Sequential Algorithms.
ACM Transactions on Computational Logic 1(1), 77–111 (2000)
17. Gurevich, Y., Huggins, J.K.: The Semantics of the C Programming Language.
In: Martini, S., Börger, E., Kleine Büning, H., Jäger, G., Richter, M.M. (eds.)
CSL 1992. LNCS, vol. 702, pp. 274–308. Springer, Heidelberg (1993)
18. Gurevich, Y., Schulte, W., Wallace, C.: Investigating Java Concurrency using Ab-
stract State Machines. In: Gurevich, Y., Kutter, P.W., Odersky, M., Thiele, L.
(eds.) ASM 2000. LNCS, vol. 1912. Springer, Heidelberg (2000)
19. Huggins, J.K., van Campenhout, D.: Specification and Verification of Pipelining
in the ARM2 RISC Microprocessor. ACM Transactions on Design Automation of
Electronic Systems 3(4), 563–580 (1998)
20. Huggins, J.K., Wallace, C.: An Abstract State Machine Primer. Technical Report
02-04, Computer Science Department, Michigan Technological University (2002)
21. Jackson, D., Wing, J.: Lightweight Formal Methods. IEEE Computer 29(4), 21–22
(1996)
22. Jackson, M.: Problem Frames. Addison-Wesley, Reading (2000)
23. Johnson, R.R.: User Centered Technology: A Rhetorical Theory for Computers
and Other Mundane Artifacts. SUNY Press (1998)
Abstract State Machines and the Inquiry Process 413
24. Newell, A., Simon, H.A.: Human Problem Solving. Prentice-Hall, Englewood Cliffs
(1972)
25. Pólya, G.: How to Solve It. Princeton University Press, Princeton (1948)
26. Wallace, C., Wang, X., Bluth, V.: A Course in Problem Analysis and Structuring
through Problem Frames. In: Conference on Software Engineering Education and
Training (2006)
27. Wright, D.R.: The Decision Pattern: Capturing and Communicating Design Intent.
In: ACM International Conference on Design of Communication, pp. 69–74 (2007)
The Algebra of Adjacency Patterns:
Rees Matrix Semigroups with Reversion
For Yuri Gurevich, who built many bridges between logic and algebra,
on the occasion of his seventieth birthday.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 414–443, 2010.
© Springer-Verlag Berlin Heidelberg 2010
Speaking about naturalness, we want to stress that the algebraic objects (ad-
jacency semigroups) that we use here to interpret graphs have not been invented
for this specific purpose. Indeed, adjacency semigroups belong to a well established class of unary semigroups that have been considered by many authors.
We shall demonstrate how graph theory both sheds a new light on some pre-
viously known algebraic results and provides their extensions and generaliza-
tions. By surjectivity we mean that, on the level of appropriate classes of graphs
and unary semigroups, the interpretation map introduced in this paper becomes
“nearly” onto; moreover, the map induces a lattice isomorphism between the
lattices of such classes provided one excludes just one element on the semigroup
side. This implies that our approach allows one to interpret both graphs within
unary semigroups and unary semigroups within graphs.
The paper is structured as follows. In Sect. 2 we recall some notions related
to graphs and their classes and present a few results and examples from graph
theory that are used in the sequel. Section 3 contains our construction and the
formulations of our main results: Theorems A and B. These theorems are proved
in Sect. 4 and 5 respectively while Sect. 6 collects some of their applications.
We assume the reader’s acquaintance with basic concepts of universal algebra
and first-order logic, such as ultraproducts or the HSP-theorem; see, e.g., [4]. As
far as graphs and semigroups are concerned, we have tried to keep the presen-
tation to a reasonable extent self-contained. We do occasionally mention some
non-trivial facts of semigroup theory but only in order to place our considera-
tions in a proper perspective. Thus, most of the material should be accessible to
readers with very basic semigroup-theoretic background (such as some knowl-
edge of Green’s relations and of the Rees matrix construction over the trivial
group, cf. [12]).
Lemma 2.2. Let L be an ultraproduct closed class of graphs and let G be a graph.
We have G ∈ ISP+ Pu (L) if and only if there is at least one homomorphism from
G into a member of L and the following two separation conditions hold:
The 1-vertex looped graph 1 always satisfies the two separation conditions, yet
it fails every uH sentence of the second kind; this is why the lemma asks addi-
tionally that there be at least one homomorphism from G into some member of
L. If G = 1 and no such homomorphism exists, then evidently no member of L contains a loop, and so L |= x ≁ x, a law failing on 1. Hence 1 ∉ ISP+ Pu (L) by Lemma 2.1. Conversely, if there is such a homomorphism, then 1 is isomorphic to an induced subgraph of some member of L and hence 1 ∈ ISP+ Pu (L). If
the condition that there is at least one homomorphism from G into some mem-
ber of L is dropped, then Lemma 2.2 instead characterizes membership in the
quasivariety generated by L.
We now list some familiar uH sentences.
– reflexivity: x ∼ x,
– anti-reflexivity: x ≁ x,
– symmetry: x ∼ y → y ∼ x,
– anti-symmetry: x ∼ y & y ∼ x → x ≈ y,
– transitivity: x ∼ y & y ∼ z → x ∼ z.
418 M. Jackson and M. Volkov
In fact it is easy to see that, along with the 1-vertex partial orders and the trivial
class {0}, this exhausts the list of all uH classes of preorders, see Fig. 1 (the easy
proof is sketched before Corollary 6.4 of [7], for example).
Sub-uH classes of simple graphs have been heavily investigated, and include some
very interesting families. In order to describe some of these families, we need a
series of graphs introduced by Nešetřil and Pultr [21]. For each integer k ≥ 2,
let Ck denote the graph on the vertices 0, . . . , k + 1 obtained from the complete
loopless graph on these vertices by deleting the edges (in both directions) con-
necting 0 and k + 1, 0 and k, and 1 and k + 1. Fig. 2 shows the graphs C2 and
C3 ; here and below we adopt the convention that an undirected edge between
two vertices, say a and b, represents two directed edges a ∼ b and b ∼ a.
Recall that a simple graph G is said to be n-colorable if there exists a homomorphism from G into the complete loopless graph on n vertices.
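As an illustration (our own sketch, not part of the paper): n-colorability is precisely the existence of a graph homomorphism into the complete loopless graph on n vertices, and on small graphs this can be checked by brute force. All function names below are ours.

```python
from itertools import product

def is_homomorphism(h, edges_g, edges_h):
    """Check that h maps every edge of G to an edge of H."""
    return all((h[u], h[v]) in edges_h for (u, v) in edges_g)

def is_n_colorable(vertices, edges, n):
    """A simple graph is n-colorable iff it maps homomorphically
    into the complete loopless graph K_n (brute-force search)."""
    kn = {(i, j) for i in range(n) for j in range(n) if i != j}
    return any(
        is_homomorphism(dict(zip(vertices, assignment)), edges, kn)
        for assignment in product(range(n), repeat=len(vertices))
    )

# A 4-cycle (undirected: both directions of each edge) is 2-colorable;
# a triangle needs 3 colors.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
cycle4 = cycle4 + [(v, u) for (u, v) in cycle4]
triangle = [(0, 1), (1, 2), (2, 0), (1, 0), (2, 1), (0, 2)]
assert is_n_colorable(range(4), cycle4, 2)
assert not is_n_colorable(range(3), triangle, 2)
assert is_n_colorable(range(3), triangle, 3)
```

The exponential search is of course only for illustration; deciding 3-colorability is NP-complete, a fact the paper exploits in Sect. 6.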
[Fig. 1: the lattice of uH classes of preorders]
[Fig. 2: the graphs C2 and C3]
x0 ∼ x1 & x1 ∼ x2 & x2 ∼ x3 → x0 ∼ x3 ;
[Fig. 3: the graph G1]
Example 2.5. A generator for the class of all graphs.
The class of all graphs is generated as a uH class by a single finite graph. Indeed,
it is trivial to see that for any graph G, there is a family of 3-vertex graphs such
that the separation conditions of Lemma 2.2 hold. Since there are only finitely
many non-isomorphic 3-vertex graphs, any graph containing these as induced
subgraphs generates the uH class of all graphs. Alternatively, the reader can
easily verify using Lemma 2.2 that the graph G1 in Fig. 3 generates the uH class
of all graphs.
Example 2.6. A generator for the class Gsymm of all symmetric graphs.
Using Lemma 2.2, it is easy to prove that the class of symmetric graphs is
generated as a uH class by the graph S1 shown in Fig. 4.
[Fig. 4: the graph S1]
[Figure: the graph S2]
Example 2.8. A generator for the class Gref of all reflexive graphs.
The class of reflexive graphs is generated by the following graph R1 , while the
class of reflexive and symmetric graphs is generated by the graph RS1 .
[Figure: the graphs R1 and RS1]
The Rees–Sushkevich Theorem (see [12, Theorem 3.3.1]) states that, up to iso-
morphism, the completely 0-simple semigroups with trivial subgroups are pre-
cisely the Rees matrix semigroups over the trivial group and for which each row
and each column of the sandwich matrix contains a nonzero element. If the matrix P has no 0 entries, then the set M[P] = M⁰[P] \ {0} is a subsemigroup. Semigroups of the form M[P] are called rectangular bands, and they are precisely the completely simple semigroups with trivial subgroups.
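A minimal sketch (ours) of the Rees matrix multiplication over the trivial group just described; the function name and the 0/1 encoding of the sandwich matrix are our own choices.

```python
def rees_product(a, b, P):
    """Multiplication in M0[P], the Rees matrix semigroup over the
    trivial group: (i, j)(k, l) = (i, l) if the sandwich-matrix entry
    P[j][k] is nonzero, and 0 otherwise."""
    if a == 0 or b == 0:
        return 0
    (i, j), (k, l) = a, b
    return (i, l) if P[j][k] else 0

# If P has no zero entries, the nonzero elements form a rectangular
# band M[P]: products never fall to 0 and (i, j)(k, l) = (i, l).
P = [[1, 1], [1, 1]]
assert rees_product((0, 1), (1, 0), P) == (0, 0)
# With a zero entry in P, some products vanish:
Q = [[0, 1], [1, 1]]
assert rees_product((0, 0), (0, 1), Q) == 0
```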
Back to adjacency semigroups, we always think of A(G) as endowed with an additional unary operation a → a′ which we call reversion and define as follows:
(x, y)′ = (y, x), 0′ = 0.
Notice that by this definition (a′)′ = a for all a ∈ A(G).
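The construction is easy to animate in code. The following sketch (ours; `adj_product` and `rev` are hypothetical names) implements the multiplication of A(G) together with reversion and checks that reversion is an involution:

```python
def adj_product(a, b, edges):
    """Multiplication in the adjacency semigroup A(G):
    (x, y)(z, t) = (x, t) if y ~ z in G, and 0 otherwise."""
    if a == 0 or b == 0:
        return 0
    (x, y), (z, t) = a, b
    return (x, t) if (y, z) in edges else 0

def rev(a):
    """The reversion (x, y)' = (y, x), 0' = 0."""
    return 0 if a == 0 else (a[1], a[0])

# On any graph, reversion is an involution: (a')' = a.
edges = {(0, 1), (1, 0), (1, 1)}
for a in [(0, 1), (1, 0), 0]:
    assert rev(rev(a)) == a
# (x, y)(y, x) is nonzero exactly when the middle vertex has a loop:
assert adj_product((0, 1), rev((0, 1)), edges) == (0, 0)  # vertex 1 is looped
assert adj_product((1, 0), rev((1, 0)), edges) == 0       # vertex 0 is not
```

The last two checks preview the correspondence discussed below between loops in G and laws such as xx′x ≈ x in A(G).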
The main contribution in this paper is the fact that uH classes of graphs are
in extremely close correspondence with unary semigroup varieties generated by
adjacency semigroups, and our proof of this will involve a translation of uH sen-
tences of graphs into unary semigroup identities. However, before we proceed
with precise formulations and proofs of general results, the reader may find it
useful to check that several of the basic uH sentences used in Sect. 2 correspond
via the adjacency semigroup construction to rather natural semigroup-theoretic
properties. Indeed, all the following are quite easy to verify:
– reflexivity of G is equivalent to A(G) |= xx′x ≈ x;
– anti-reflexivity of G is equivalent to A(G) |= xx′z ≈ zxx′ ≈ xx′ (these laws can be abbreviated to xx′ ≈ 0);
Observe that the unary semigroup identities that appear in the above examples
are in fact used to define the most widely studied types of semigroups endowed
with an extra unary operation modelling various notions of an inverse in groups.
For instance, a semigroup satisfying the identities
x′′ ≈ x (1)
(xy)′ ≈ y′x′ (2)
xx′x ≈ x (3)
and in fact it can be shown that the class SB of all square bands constitutes a
variety of unary semigroups defined within the variety of all regular ∗-semigroups
by the identities (4).
Let L(Gref ) denote the lattice of sub-uH classes of Gref and let L(Aref ) denote
the lattice of subvarieties of Aref . Let L+ denote the result of adjoining a new
element S to L(Gref ) between the class of single block equivalence relations and
the class containing the empty graph. (The reader may wish to look at Fig. 1 to
see the relative location of these two uH classes.) Meets and joins are extended
to L+ in the weakest way. So L+ is a lattice in which L(Gref ) is a sublattice
containing all but one element.
We are now in a position to formulate our second main result.
Theorem B. Let ι be the map from L+ to L(Aref ) defined by S → SB and K →
HSP(A(K)) for K ∈ L(Gref ). Then ι is a lattice isomorphism. Furthermore, a
variety in L(Aref ) is finitely axiomatized (finitely generated as a variety) if and
only if it is the image under ι of either S or a finitely axiomatized (finitely
generated, respectively) uH class of reflexive graphs.
We prove Theorems A and B in the next two sections.
4 Proof of Theorem A
4.1 Equations Satisfied by Adjacency Semigroups
The variety of semigroups generated by the class of Rees matrix semigroups
over trivial groups is reasonably well understood: it is generated by a 5-element
semigroup usually denoted by A2 (see [19] for example). (In the context of this paper, A2 can be thought of as the semigroup reduct of the adjacency semigroup A(S2 )
where S2 is the 2-vertex graph from Example 2.7.) This semigroup was shown
to have a finite identity basis by Trahtman [23], who gave the following elegant
description of the identities: an identity u ≈ v (where u and v are semigroup
words) holds in A2 if and only if u and v start with the same letter, end with the
same letter and share the same set of two letter subwords. Thus the equational
theory of this variety corresponds to pairs of words having the same “adjacency
patterns”, in the sense that a two letter subword xy records the fact that x
occurs next to (and before) y. This adjacency pattern can also be visualized as
a graph on the set of letters, with an edge from x to y if xy is a subword, and
two distinct markers indicating the first and last letters respectively.
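Trahtman's description turns the equational theory of A2 into a purely combinatorial check. A small sketch (ours; plain semigroup words over one-character variables):

```python
def pattern(word):
    """Trahtman's invariant for A2: first letter, last letter, and
    the set of two-letter subwords (the adjacency pattern)."""
    return word[0], word[-1], {word[i:i + 2] for i in range(len(word) - 1)}

def holds_in_A2(u, v):
    """The identity u ≈ v holds in A2 iff u and v share the same
    first letter, last letter, and set of two-letter subwords."""
    return pattern(u) == pattern(v)

# xyxzx and xzxyx share the adjacencies {xy, yx, xz, zx} and the same
# first and last letter, so the identity holds in A2:
assert holds_in_A2("xyxzx", "xzxyx")
# xy ≈ yx fails: the first (and last) letters differ.
assert not holds_in_A2("xy", "yx")
```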
In this subsection we show that the equational theory of A has the same
kind of property with respect to a natural unary semigroup notion of adjacency.
The interpretation is that each letter has two sides – left and right – and that
the operation reverses these. A subword xy corresponds to the right side of x
matching the left side of y, while x′y or any subword (x . . .)′y corresponds to the
left side of x matching the left side of y. To make this more precise, we give an
inductive definition. Under this definition, each letter x in a word will have two
associated vertices corresponding to the left and right side. The graph will have
an initial vertex, a final vertex as well as a set of (directed) edges corresponding
to adjacencies.
Let u be a unary semigroup word, and X be the alphabet of letters appearing
in u. We construct a graph G[u] on the set
{ℓx | x ∈ X} ∪ {rx | x ∈ X}
with two marked vertices. If u is a single letter (say x), then the edge set (or adjacency set) of G[u] is empty. The initial vertex of a single letter x is ℓx and the final (or terminal) vertex is rx.
If u is not a single letter, then it is of the form v′ or vw for some unary semigroup words v, w. We deal with the two cases separately. If u is of the form v′, where v has set of adjacencies S, initial vertex pa and final vertex qb (where {p, q} ⊆ {ℓ, r} and a, b are letters appearing in v), then the set of adjacencies of u is also S, but the initial vertex of u is equal to the final vertex qb of v and the final vertex of u is equal to the initial vertex pa of v.
Now say that u is of the form vw for some unary semigroup words v, w,
with adjacency sets Sv and Sw respectively and with initial vertices pav , paw
respectively and final vertices qbv and qbw respectively. Then the adjacency set
of G[u] is Sv ∪ Sw ∪ {(qbv , paw )}, the initial vertex is pav and the final vertex is
qbw . Note that the word u may be broken up into a product of two unary words in a number of different ways; however, it is reasonably clear that this gives rise to
the same adjacency set and initial and final vertices (this basically corresponds
to the associativity of multiplication).
For example, the word a′(baa′)′ decomposes as a′ · (baa′)′, and so has initial vertex equal to the initial vertex of a′, which in turn is equal to the terminal vertex of a, which is ra. Likewise, its terminal vertex should be the terminal vertex of (baa′)′, which is the initial vertex of baa′, which is ℓb. Continuing, we see that the edge set of the corresponding graph has edges {(ℓa, ℓa), (ra, ra), (rb, ℓa)}. This
graph is the first graph depicted in Fig. 7 (the initial and final vertices are indi-
cated by a sourceless and targetless arrow respectively). The second is the graph
[Fig. 7: the graphs of the words a′(baa′)′ and a(bc)′]
of either of the words a(bc)′ or (b(ac′)′)′. The fact that G[a(bc)′] = G[(b(ac′)′)′] will be of particular importance in constructing a basis for the identities of A.
We can also construct a second kind of graph from a word w, in which all loops are added to the graph G[w] (that is, it is the reflexive closure of the edge set); we call this Gref [w]. For example, it is easy to see that Gref [a′(baa′)′] = Gref [(ba)′] (most of the work was done in the previous example). Lastly, we define the graph Gsymm [w] corresponding to the symmetric closure of the edge set of G[w].
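As a sanity check on the inductive definition above, here is a short Python sketch (our own illustration; the term representation and names are ours) that computes G[u] and reproduces the worked example:

```python
def graph_of(u):
    """Return (edges, initial, final) for a unary semigroup word u,
    following the inductive definition of G[u] above. Words are
    letters (strings), ('rev', v) for v', or ('mul', v, w) for vw.
    The two sides of a letter x are the vertices ('l', x), ('r', x)."""
    if isinstance(u, str):                      # a single letter x
        return frozenset(), ('l', u), ('r', u)
    if u[0] == 'rev':                           # v': swap initial/final
        edges, init, fin = graph_of(u[1])
        return edges, fin, init
    ev, iv, fv = graph_of(u[1])                 # vw: add an edge from
    ew, iw, fw = graph_of(u[2])                 # v's final to w's initial
    return ev | ew | {(fv, iw)}, iv, fw

def rev(v):
    return ('rev', v)

def mul(*ws):
    out = ws[0]
    for w in ws[1:]:
        out = ('mul', out, w)
    return out

# The worked example a'(baa')' from the text:
word = mul(rev('a'), rev(mul('b', 'a', rev('a'))))
edges, init, fin = graph_of(word)
assert edges == {(('l', 'a'), ('l', 'a')), (('r', 'a'), ('r', 'a')),
                 (('r', 'b'), ('l', 'a'))}
assert (init, fin) == (('r', 'a'), ('l', 'b'))
```

By Proposition 4.1 below, comparing the outputs of `graph_of` (or its reflexive/symmetric closures) decides whether an identity holds in A (respectively, Aref or Asymm).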
Lemma 4.2. Let u and θ be as in Notation 1. Then θ(u) ≠ 0 if and only if the map defined by ℓx → ix and rx → jx is a graph homomorphism from G[u] to H.
that θ(u1 ) has right coordinate jb̄ , and θ(u2 ) has left coordinate iā . But u1 u2 is a subword, so θ(u1 )θ(u2 ) ≠ 0, whence (jb̄ , iā ) is an edge of H, as required.
(Sufficiency.) This is easy.
Lemma 4.2 is easily adapted to the graphs Gref [u] and Gsymm [u], where the graph H is assumed to be reflexive or symmetric, respectively.
Proposition 4.1. An identity u ≈ v holds in A if and only if G[u] = G[v]. An identity u ≈ v holds in Aref if and only if Gref [u] = Gref [v]. An identity u ≈ v holds in Asymm if and only if Gsymm [u] = Gsymm [v].
Proof. We prove only the first case; the other two cases are similar.
First we show sufficiency. Let us assume that G[u] = G[v], and consider an
assignment θ into an adjacency semigroup A(H). Now the vertex sets are the same, so u and v have the same alphabet; in particular, if θ sends some letter to 0, then θ(u) = 0 = θ(v). So we may assume that θ maps the alphabet to nonzero elements of A(H). By Lemma 4.2, we have θ(u) = 0 if and
only if θ(v) = 0. By Lemma 4.1, we have θ(u) = θ(v) whenever both sides are
nonzero. Hence θ(u) = θ(v) always.
Necessity. Say that G[u] ≠ G[v]. If the vertex sets are distinct, then u ≈ v fails
on A(1), which is isomorphic to the unary semigroup formed by the integers 0
and 1 with the usual multiplication and the identity map as the unary operation.
Now say that G[u] and G[v] have the same vertices. Without loss of generality,
we may assume that either G[v] contains an edge not in G[u], or that the two
graphs are identical but have different initial vertices. Let Au := A(G[u]) and
consider the assignment into Au that sends each variable x to (ℓx, rx). Observe that the value of u is equal to (λa, ρb) where λa is the initial vertex of G[u] and ρb is the final vertex, while the value of v is either 0 (if G[v] contains an adjacency not in G[u]: we fail to get a graph homomorphism) or has a different first coordinate (if G[v] has a different initial vertex). So u ≈ v fails in A.
For later reference we refer to the second and third laws in Ψ as the first associa-
tivity across reversion law (FAAR) and second associativity across reversion law
(SAAR), respectively. We let B denote the unary semigroup variety defined by Ψ .
Surprisingly, the laws in Ψ are sufficient to reduce every unary semigroup word to one in which the nesting height of the unary operation ′ is at most 2. The proof of this is the main result of this subsection.
Lemma 4.3. Ψ implies (a(bcd)′e)′ ≈ (b′e)′c(ad′)′, where c is possibly empty.
Let X := {x1 , x2 , . . .}. Let F(X) denote the free unary semigroup freely gener-
ated by X and FΨ (X) denote the B-free algebra freely generated by X. We let
ψ denote the fully invariant congruence on F(X) giving FΨ (X) = F(X)/ψ. We
find a subset N ⊆ F(X) with X ⊆ N and show that multiplying two words from
N in F(X), or applying ′ to a word in N, produces a word that is ψ-equivalent to a word in N. It follows that every word in F(X) is ψ-equivalent to a word
in N . In this way the members of N are a kind of weak normal form for terms
modulo Ψ (we do not claim that distinct words in N are not ψ-equivalent; for example, Proposition 4.1 shows that A |= x′(x′y)′ ≈ x′(x′y′)′, but the two words are distinct elements of N).
We let N consist of all (nonempty) words of the form
in which case it reduces to x ∈ N modulo Ψ. Now say that the result holds for breadth k members of N, and say that the breadth of s is k + 1. So s can be written in the form p(y1 · · · ym)′u where p is either empty or is a word from N of breadth k, u = uk+2 is a possibly empty semigroup word in the alphabet X ∪ X′ and y1, . . . , ym is a possibly repeating sequence of variables from X ∪ X′ with vk+1 ≡ y1 · · · ym (so m > 1). Note that p can be empty only if k = 0.
Let us write w for y2 · · · ym−1 (if m = 2, then w is empty). If both p and u are empty, then s ∈ N already. If neither p nor u is empty, then by Lemma 4.3, Ψ implies s ≈ (y1′u)′w(pym′)′. The breadth of pym′ is k, so the induction hypothesis and Lemma 4.4 complete the proof.
Now say that p is empty and u is not. By SAAR we have ((y1wym)′u)′ ≈ (y1′u)′wym, and the latter word is contained in N (modulo x′′ ≈ x).
Lastly, if u is empty and p is not, then by FAAR we have s ≡ (p(wym)′)′ ≈ w(pym′)′, and the induction hypothesis applies to (pym′)′ since pym′ is of breadth k. By Lemma 4.4, s is ψ-equivalent to a member of N.
As explained above, Lemmas 4.4 and 4.5 give us the following result.
Proposition 4.2. Every unary semigroup word reduces modulo Ψ to a word
in N .
An algorithm for making such a reduction is to iterate the method of proof of
Lemmas 4.4 and 4.5; however we will not need this here.
α(u1 , v1 )[i]α(u2 , v2 )[i] = (πi (u1 ), πi (v1 ))(πi (u2 ), πi (v2 )) = (πi (u1 ), πi (v2 ))
as required.
Claim 2. Say (u1 , v1 ) and (u2 , v2 ) are nonzero elements of A(G). If v1 ≁ u2 then α(u1 , v1 )α(u2 , v2 ) ∈ J.
Proof. Since v1 ≁ u2 , there is i ∈ I with πi : G → Hi such that πi (v1 ) ≁ πi (u2 ). Then (u1 , v1 )[i](u2 , v2 )[i] = 0.
Claims 1 and 2 show that α is a semigroup homomorphism from A(G) onto B/J
(at least, if we adjust the co-domain of α to be B/J and identify the constant 0
with J). Now we show that this map is injective. Say (u1 , v1 ) ≠ (u2 , v2 ) in A(G). Without loss of generality, we may assume that u1 ≠ u2 . So there is a coordinate i with πi (u1 ) ≠ πi (u2 ). Then α(u1 , v1 ) differs from α(u2 , v2 ) on the i-coordinate.
So we have a semigroup isomorphism from A(G) to B/J. Lastly, we observe that
α trivially preserves the unary operation, so we have an isomorphism of unary
semigroups as well. This completes the proof of Lemma 4.6.
To prove the other half of Theorem A we take a syntactic approach by translating
uH sentences into unary semigroup identities. To apply our technique, we first
need to reduce arbitrary uH sentences to logically equivalent ones of a special
form.
Our goal is to show that if G ∉ ISP+ Pu (K) then A(G) ∉ HSP(A(K)). We first
consider some degenerate cases.
If K = {0}, then A(K) is the class consisting of the one-element unary semigroup and HSP(A(K)) |= x ≈ y. The statement G ∉ ISP+ Pu (K) simply means that |G| ≥ 1, and so A(G) ⊭ x ≈ y. So A(G) ∉ HSP(A(K)).
Now suppose that K contains a nonempty graph. We can then further assume that the empty graph is not in K. If G is the 1-vertex looped graph 1, then the statement G ∉ ISP+ Pu (K) simply means that K consists of antireflexive graphs. In this case, A(K) |= xx′ ≈ 0, while A(G) ⊭ xx′ ≈ 0. So again, A(G) ∉ HSP(A(K)).
So now it remains to consider the case where G is not the 1-vertex looped
graph and K does not contain the empty graph. Lemma 2.1 shows that there
u1 , . . . , un , v1 , . . . , vn ∈ {a1 , . . . , am }
are not necessarily distinct variables. For each adjacency ui ∼ vi in Φ, let wi denote the word (ui vi )′ si (ui vi′)′ si (ui′vi )′ si (ui′vi′)′, where si is a new variable. Now let
σ : {1, . . . , m} → {1, . . . , n} be some finite sequence of numbers from {1, . . . , n} with the property that for each pair i, j ∈ {1, . . . , n} there is k < m with σ(k) = i and σ(k + 1) = j, and such that σ(1) = σ(m) = 1. Define a word DΦ (depending on
σ) as follows:
( ∏1≤i<m wσ(i) tσ(i),σ(i+1) ) wσ(m) ,
where the ti,j are new variables. As an example, consider the conjunction Φ :=
x ∼ y & y ∼ z, where n = 2, u1 = x, v1 = u2 = y and v2 = z. Using the
sequence σ = 1, 2, 2, 1, 1 we get DΦ equal to the following expression:
(xy)′s1 (xy′)′s1 (x′y)′s1 (x′y′)′ t1,2 (yz)′s2 (yz′)′s2 (y′z)′s2 (y′z′)′ t2,2 (yz)′s2 (yz′)′s2 (y′z)′s2 (y′z′)′ t2,1 (xy)′s1 (xy′)′s1 (x′y)′s1 (x′y′)′ t1,1 (xy)′s1 (xy′)′s1 (x′y)′s1 (x′y′)′.
(ℓi , ri ) say. Let γ be any member of {L, R}^m. If θ(DΦ ) ≠ 0 then the map φγ from {a1 , . . . , am } into the vertices VH of H defined by
φγ (ai ) = ℓi if γ(i) = L, and φγ (ai ) = ri if γ(i) = R,
satisfies Φ.
Proof. Let ai ∼ aj be one of the adjacencies in Φ. So all of ai aj , ai aj′, ai′aj and ai′aj′ appear as subwords in DΦ and hence are given nonzero values by θ. We have ℓi ∼ ℓj , ℓi ∼ rj , ri ∼ ℓj , ri ∼ rj in H. So regardless of the choice of γ we have φγ (ai ) ∼ φγ (aj ) in H.
Lemma 4.8. Let Φ = &1≤i≤n ui ∼ vi be a nonempty conjunction in the vari-
ables a1 , . . . , am and let θ be an assignment of these variables into a graph H
such that H |= θ(Φ). Define an assignment θ+ of the variables of DΦ into A(H)
by ai → (θ(ai ), θ(ai )), θ+ (ti,j ) := (θ(vi ), θ(uj )) and θ+ (si ) = θ+ (ti,i ). We have
θ+ (DΦ ) = (θ(v1 ), θ(u1 )).
Proof. This is a routine calculation. For each adjacency ui ∼ vi in Φ (here
{ui , vi } ⊆ {a1 , . . . , am }) we have
all taking the same nonzero value (θ(vi ), θ(ui )). Then we also have θ+ (wi ) = (θ(vi ), θ(ui )), which shows that
θ+ (DΦ ) = [θ(v1 ), θ(u1 )] . . . [θ(vi ), θ(ui )] θ+ (ti,j ) [θ(vj ), θ(uj )] . . . [θ(v1 ), θ(u1 )]
= [θ(v1 ), θ(u1 )] . . . [θ(vi ), θ(ui )][θ(vi ), θ(uj )][θ(vj ), θ(uj )] . . . [θ(v1 ), θ(u1 )]
= [θ(v1 ), θ(u1 )]
Proof. We prove the case of τ4 and leave the remaining (very similar) cases to
the reader. First assume that H |= Φ → ui ∼ v. Consider some assignment θ
into A(H) that gives DΦ a nonzero value. As w appears on both sides, we may
further assume that θ(w) is nonzero. Observe that the graph of the right hand
side of the identity is identical to that of the left side except for the addition of a
single edge from ui to w . Also, the initial and final vertices are the same. So to
show that the two sides are equal, it suffices to show that θ(ui )θ(w) is nonzero.
Choose any map γ from the variables of τ4 to {L, R} with γ(ui ) = R. By
Lemma 4.7 we have H |= φγ (Φ). Using Φ → ui ∼ v it follows that for any vertex
w we have φγ (ui ) ∼ w. In other words, θ(ui )θ(w) is nonzero as required.
Now say that Φ → ui ∼ v fails on H under some assignment θ. Extend θ+ to
w by w → (θ(v), θ(u1 )). Under this assignment the left hand side of τ4 takes the
value (θ(v), θ(ui )), while the right hand side equals 0.
Lastly we need to consider the case where Γ has empty premise, that is, where
Γ is a universally quantified atomic formula τ . In the language of graphs, there
are essentially four different possibilities for τ (up to a permutation of letter
names): x ∼ y, x ∼ x, x ≈ y and x ≈ x. The last of these is a tautology.
The first three are nontautological and correspond to the uH classes of complete looped graphs, reflexive graphs, and one-element graphs, respectively. For Φ one of the three nontautological atomic formulas, we let τΦ denote the identities xx ≈ x, xx′x ≈ x, and x ≈ x′, respectively.
Lemma 4.12. Let H be a graph and Φ be one of the three nontautological atomic
formulas in the language of graphs. We have H |= Φ if and only if A(H) |= τΦ .
5 Proof of Theorem B
In contrast to the proof of Theorem A, this section requires some basic notions
and facts from semigroup theory such as Green’s relations J , L , R, H and
their manifestation on Rees matrix semigroups. For details, refer to the early
chapters of any general semigroup theory text; Howie [12] for example.
The first step to proving Theorem B is the following.
x ≈ xx′x, (x′x)′ ≈ x′x, x′′ ≈ x, x(yz)′ ≈ (y(xz′)′)′, (xy)′z ≈ ((x′z)′y)′.
We now let Σref denote the following set of unary semigroup identities:
x′′ ≈ x, x(yz)′ ≈ (y(xz′)′)′, (xy)′z ≈ ((x′z)′y)′, (Ψ)
xx′x ≈ x, (6)
(xx′)′ ≈ xx′, (7)
x³ ≈ x², (8)
xyxzx ≈ xzxyxzx ≈ xzxyx, (9)
x′yxzx ≈ (xzx)′yxzx, (10)
xyxzx′ ≈ xyxz(xyx)′. (11)
Proposition 4.1 easily shows that all but identity (6) hold in A, while (6) obvi-
ously holds in the subvariety Aref . Hence, to prove that Σref is a basis for Aref ,
we need to show that every model of Σref lies in Aref . Before we can do this, we
need some further consequences of Σref .
In the identities that occur in the next lemma we use ū, where u is either x or xyx, to denote either u or u′. We assume that the meaning of the ¯ operation is fixed within each identity: either it changes nothing or it adds ′ to all its arguments.
Lemma 5.4. The following identities all follow from Σref :
– (x̄u1 )′u2 xyx ≈ ((xyx)‾u1 )′u2 xyx;
– xyxu2 (u1 x̄)′ ≈ xyxu2 (u1 (xyx)‾)′;
– (u1 x̄)′u2 xyx ≈ (u1 (xyx)‾)′u2 xyx;
– xyxu2 (x̄u1 )′ ≈ xyxu2 ((xyx)‾u1 )′,
where u1 and u2 are possibly empty unary semigroup words.
Proof. In each of the eight cases, if u1 is empty, then the identity is equivalent modulo x′′ ≈ x to one in Σref up to a change of letter names. So we assume that u1 is non-empty. We can ensure that u2 is non-empty by rewriting u2 xyx and xyxu2 as (u2 xx′)xyx and xyx(x′xu2 ) respectively (a process we reverse at the end of each deduction). For the first identity, SAAR shows that Ψ implies (x̄u1 )′u2 xyx ≈ ((x̄′u2 xyx)′u1 )′, and then we use (9) or (10) to replace x by xyx. Reversing the application of SAAR, we obtain the corresponding right hand side.
The second identity is just a dual to the first so follows by symmetry. Similarly,
the fourth will follow from the third by symmetry.
For the third identity, Lemma 5.3 can be applied to the left hand side to get x̄′x̄(u1 x̄)′u2 xyx. Now, the subword x̄′x̄ is either x′x or xx′. We will write it as t(x, x′) (where t(x, y) is one of the words xy or yx). Using (9), we have t(x, x′)(u1 x̄)′u2 xyx ≈ t(xyx, x′)(u1 x̄)′u2 xyx. But the subword t(xyx, x′)(u1 x̄)′ is of the form required to apply the second identity in the lemma we are proving. Since this second identity has been established, we can use it to deduce t(xyx, x′)(u1 (xyx)‾)′u2 xyx and then reverse the procedure to get
t(xyx, x′)(u1 (xyx)‾)′u2 xyx ≈ x̄′x̄(u1 (xyx)‾)′u2 xyx ≈ (u1 (xyx)‾)′u2 xyx
(the last equality requires a few extra easy steps in the x̄ = x′ case).
p(a) = t(a, a1 , . . . , an )
So far the proof is identical to that of [11, Lemma 3.2]. In the semigroup setting,
both ρz and λz are congruences; however, this is no longer true in the unary semigroup setting. Instead, we replace ρz and λz by their syntactic congruences Syn(ρz ) and Syn(λz ).
Let a and b be distinct elements of S. Our goal is to show that one of the
congruences Syn(ρa ), Syn(ρb ), Syn(λa ) and Syn(λb ) separates a and b, and that
S/ Syn(ρz ) and S/ Syn(λz ) are isomorphic to a square band or an adjacency
semigroup of a reflexive graph. The first part is essentially identical to a corre-
sponding part of the proof of [11, Lemma 3.2]. We include it for completeness
only.
First suppose that a ∉ SbS. So b ∈ Ia . Choose t = a′a ∈ SaS, so that a = at ≢ bt mod Ia . Hence (a, b) ∉ ρ̂a . Now suppose that SaS = SbS, so that a
and b lie in the same J -class SaS\Ia of S. One of the following two equalities must fail: ab′b = b or aa′b = a, for otherwise a = aa′b = aa′ab′b = ab′b = b. Hence, as neither a nor b is in Ia = Ib , we have either (a, b) ∉ ρa ⊇ ρ̂a or (a, b) ∉ λa ⊇ λ̂a .
Now it remains to prove that S/ Syn(ρz ) and S/ Syn(λz ) are adjacency semi-
groups or square bands. Lemmas 5.1, 5.2 and 5.3 show that it suffices to prove
that the underlying semigroup of S/ Syn(ρz ) is completely 0-simple or completely
simple. We look at the Syn(ρz ) case only (the Syn(λz ) case follows by symme-
try). Now it does no harm to assume that Iz is empty or {0}, since v, w ∈ Iz
obviously implies that (v, w) ∈ Syn(ρz ). Hence Kz := SzS/(Iz ∩ SzS) is a 0-
simple semigroup or a simple semigroup. Since S is periodic (by identity (8) of
Σref ), we have that Kz is completely 0-simple or completely simple. We need to
prove that every element of S\Iz is Syn(ρz )-related to a member of SzS\Iz .
Let c ∈ S. If c ∈ SzS or c ∈ Iz we are done, so let us assume that c ∉ SzS ∪ Iz . So z = pcq for some p, q ∈ S¹, and hence z = pcqz′pcq. Put w = qz′p. Note that w ∈ SzS and cwc ≠ 0. Our goal is to show that c Syn(ρz ) cwc. Let s(x, ȳ) be any unary semigroup word in some variables x, y1 , . . . and let t ∈ SzS. We need to prove that for any d̄ in S¹ we have s(c, d̄)t ≡ s(cwc, d̄)t modulo Iz . Write t as ucwcv, which is possible since t and cwc are J -related. (Note that modulo the identity xx′x ≈ x we may assume both u and v are nonempty.) We want to
obtain
s(c, d̄)ucwcv = s(cwc, d̄)ucwcv . (12)
Now using Corollary 5.1, we may rewrite s(c, d̄) as a word in which each application of ′ covers either a single variable or a word of the form gh, where g, h are either letters or ′ applied to a letter. There may be many occurrences of c in this
word. We show how to replace an arbitrary one of these by cwc and by repeat-
ing this to each of these occurrences we will achieve the desired equality (12).
Let us fix some occurrence of c. So we may consider the expression s(c, d̄)ucwc as being of one of the following forms: w1 cw2 cwc; w1 c′w2 cwc; w1 (cz)′w2 cwc; w1 (c′z)′w2 cwc; w1 (zc′)′w2 cwc; w1 (zc)′w2 cwc. In each case, we can make the required replacement using a single application of Lemma 5.4. This gives equality (12), which completes the proof.
Lemma 5.6. SL ∨ SB = U.
Proof. The direct product of the semigroup A(1) with an I × I square band
has a unique maximal ideal and the corresponding Rees quotient is (isomorphic
Lemma 5.7. Let V be a subvariety of Aref containing the variety SB. Either
V = SB or V ⊇ U and V = HSP(A(K)) for some class of (necessarily reflexive)
graphs K.
If S is amongst the Ri then either the join is a join of S with the trivial uH
class {0} (and the join is obviously preserved by ι), or using Lemma 5.6, we
can replace S by the uH class of universal relations, and proceed as above. This
completes the characterization of L(Aref ).
Next we must show that a class K of graphs generates a finitely axiomatizable
uH class if and only if HSP(A(K)) is finitely axiomatizable. The “only if” case is
Corollary 6.3. Now say that K has a finite basis for its uH sentences. Following
the methods of Subsect. 4.3, we may construct a finite set Ξ of identities such
that an adjacency semigroup A lies in HSP(A(K)) if and only if A |= Ξ. We
claim that Σref ∪ Ξ is an identity basis for HSP(A(K)). Indeed, if S is a unary
semigroup satisfying Σref ∪ Ξ, then by Lemma 5.5, S is a subdirect product of
adjacency semigroups (or possibly square bands) satisfying Ξ. So these adjacency
semigroups lie in HSP(A(K)), whence so does S.
The proof that ι preserves the property of being finitely generated (and the
property of being nonfinitely generated) is very similar and left to the reader.
6 Applications
The universal Horn theory of graphs is reasonably well developed, and the link to
unary Rees matrix semigroups that we have just established provides numerous
corollaries. We restrict ourselves to just a few, all of which are based on the
examples of uH classes presented in Sect. 2.
We start by presenting finite generators for unary semigroup varieties that
we have considered.
Proposition 6.1. The varieties A, Asymm , and Aref are generated by A(G1 ),
A(S1 ) and A(R1 ) respectively.
Proof. This follows from Theorem A and Examples 2.5, 2.6, and 2.8.
Observe that the generators are of fairly modest size, with 17, 17 and 10 elements
respectively.
Recall that C3 is a 5-vertex graph generating the uH class of all 3-colorable
graphs (Example 2.4, see also Fig. 2).
Proposition 6.2. The finite membership problem for the variety generated by
the 26-element unary semigroup A(C3 ) is NP-hard.
Proof. Let G be a simple graph. By Theorem A the adjacency semigroup A(G)
belongs to HSP(A(C3 )) if and only if G is 3-colorable. Thus, we have a reduction
to the finite membership problem for HSP(A(C3 )) from 3-colorability of simple
graphs, a known NP-complete problem, see [8]. Of course, the construction of
A(G) can be carried out in polynomial time, so this is a polynomial reduction.
A similar (but more complicated) example in the plain semigroup setting has
been found in [14]. Observe that we do not claim that the finite membership
problem for HSP(A(C3 )) is NP-complete since it is not clear whether or not the
problem is in NP.
One can also show that the equational theory of A(C3 ) is co-NP-complete. (This
means that the problem whose instance is a unary semigroup identity u ≈ v and
whose question is whether or not u ≈ v holds in A(C3 ) is co-NP-complete.) This
follows from the construction of identities modelling uH sentences in Subsect. 4.3.
The argument is an exact parallel to that associated with [14, Corollary 3.8] and
we omit the details.
Fig. 8. The lattice of uH classes of reflexive symmetric graphs vs the lattice of varieties
of strict regular semigroups
Example 6.1. The adjacency semigroup A(2) of the two element chain 2 (as a
partial order) generates a variety with a lattice of subvarieties isomorphic to the
four element chain. The variety is a cover of the variety BR of combinatorial
strict inverse semigroups.
Proof. This follows from Example 2.1, Theorem B and the fact that the uH class
of universal relations is not a sub-uH class of the partial orders (so SB is not a
subvariety of HSP(A(2))).
The underlying semigroup of A(2) is again the semigroup A2 . Thus, Example 6.1
makes an interesting contrast to Corollary 6.1.
For our final application, consider the 3-vertex graph P shown in Fig. 9.
It is known (see [3]) and easy to verify that the uH-class ISP+ Pu (P) is not
finitely axiomatizable and the class of partial orders is the unique maximal sub-
uH class of ISP+ Pu (P). Recall that a variety V is said to be a limit variety if V has
no finite identity basis while each of its proper subvarieties is finitely based. The
existence of limit varieties is an easy consequence of Zorn’s lemma but concrete
examples of such varieties are quite rare. We can use the properties of the graph
P just established to produce a new example of a finitely generated limit
variety of I-semigroups.
Fig. 9. The 3-vertex graph P (vertices 1, 2, 3)
7 Conclusion
We have found a transparent translation of facets of universal Horn logic into the
apparently much more restricted world of equational logic. A general translation
of this sort has been established for uH classes of arbitrary structures (even
partial structures) by the first author [13]. We think, however, that the special
case considered in this paper is of interest because it deals with very natural
objects on both the universal Horn logic and the equational logic sides.
We have shown that the unary semigroup variety Aref whose equational logic
captures the universal Horn logic of the reflexive graphs is finitely axiomatizable.
The question of whether or not the same is true for the variety A corresponding
to all graphs still remains open. A natural candidate for a finite identity basis
of A is the system consisting of the identities (Ψ ) and (7)–(11), see Sect. 5.
References
1. Almeida, J.: Finite Semigroups and Universal Algebra. World Scientific, Singapore
(1994)
2. Auinger, K.: Strict regular ∗-semigroups. In: Howie, J.M., Munn, W.D., Weinert,
H.-J. (eds.) Proceedings of the Conference on Semigroups and Applications, pp.
190–204. World Scientific, Singapore (1992)
3. Benenson, I.E.: On the lattice of quasivarieties of models. Izv. vuzov. Matem-
atika (12), 14–20 (1979) (Russian); English translation in Soviet Math. (Iz. VUZ)
23(12), 13–21 (1979)
4. Burris, S., Sankappanavar, H.P.: A Course in Universal Algebra. Springer,
Heidelberg (1981)
5. Caicedo, X.: Finitely axiomatizable quasivarieties of graphs. Algebra Universalis 34,
314–321 (1995)
6. Clark, D.M., Davey, B.A., Freese, R., Jackson, M.: Standard topological algebras:
syntactic and principal congruences and profiniteness. Algebra Universalis 52, 343–
376 (2004)
7. Clark, D.M., Davey, B.A., Jackson, M., Pitkethly, J.: The axiomatisability of topo-
logical prevarieties. Adv. Math. 218, 1604–1653 (2008)
8. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory
of NP-Completeness. W.H. Freeman and Company, New York (1979)
9. Gorbunov, V.: Algebraic Theory of Quasivarieties. Consultants Bureau, New York
(1998)
10. Gorbunov, V., Kravchenko, A.: Antivarieties and colour-families of graphs. Algebra
Universalis 46, 43–67 (2001)
11. Hall, T.E., Kublanovsky, S.I., Margolis, S., Sapir, M.V., Trotter, P.G.: Algorithmic
problems for finite groups and finite 0-simple semigroups. J. Pure Appl. Alge-
bra 119, 75–96 (1997)
1 Introduction
1.1 The Speed of a Class of Finite Relational Structures
Let P be a graph property, and P n be the set of graphs with vertex set [n] =
{1, . . . , n} which have property P. We denote by spP (n) = |P n | the number of
labeled graphs in P n . The function spP (n) is called the speed of P, or in earlier
literature the counting function of P.¹ Instead of graph properties we also study
Partially supported by a grant of the Graduate School of the Technion–Israel Insti-
tute of Technology.
Partially supported by a grant of the Fund for Promotion of Research of the
Technion–Israel Institute of Technology and grant ISF 1392/07 of the Israel Sci-
ence Foundation (2007-2010).
¹ In the recent monograph by P. Flajolet and R. Sedgewick [19] the counting function
is called the "counting sequence of P". The term "speed" is used mostly when the
counting function is monotone increasing.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 444–462, 2010.
© Springer-Verlag Berlin Heidelberg 2010
Definability of Combinatorial Functions 445
For R unary we can interpret [n], R as a binary word where position i is occu-
pied by letter 1 if i ∈ R and by letter 0 otherwise. Similarly, for R̄ = (R1 , . . . , Rs )
which consists of unary relations only, we can interpret [n], R1 , . . . , Rs as a
word over an alphabet of size 2s . With this way of viewing languages, the
celebrated theorem of R. Büchi (obtained later but independently by C. Elgot and
by B. Trakhtenbrot), cf. [22,13], states:
Theorem 1. Let K be a language and K be the corresponding set of ordered
structures with unary predicates for the occurrence of letters. Then K is regular
iff K is definable in MSOL using the natural order <nat on [n].
spK (n) ≡ ∑_{j=1}^{d_m} a_j^{(m)} spK (n − j) (mod m),
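A concrete instance of such a modular linear recurrence (our own illustration, not taken from the text): the Bell numbers Bn count the labeled equivalence relations on [n], a class with a single binary relation symbol, and for every prime p they obey Touchard's congruence B_{n+p} ≡ B_{n+1} + B_n (mod p). A minimal Python check, generating the sequence with the Bell triangle:

```python
def bell_numbers(limit):
    """Return B_0, ..., B_{limit-1}, computed with the Bell triangle."""
    bells = [1]                      # B_0
    row = [1]
    while len(bells) < limit:
        new_row = [row[-1]]          # each row starts with the previous row's last entry
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
        bells.append(row[0])
    return bells

# Touchard's congruence B_{n+p} = B_{n+1} + B_n (mod p), a modular linear
# recurrence with d_p = p and coefficients a_1 = ... = a_{p-2} = 0, a_{p-1} = a_p = 1.
B = bell_numbers(40)
for p in (2, 3, 5, 7):
    assert all((B[n + p] - B[n + 1] - B[n]) % p == 0 for n in range(40 - p))
```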
In [18] it was shown that in Proposition 1 and in Theorem 2 the logic MSOL
can be augmented by modular counting quantifiers.
Furthermore, E. Fischer showed in [17]
Theorem 3. For every prime p ∈ N there is an FOL4 -definable function spKp (n),
where Kp consists of finite (E, R)-structures with E binary and R quaternary, which
is not ultimately periodic modulo p.
The definability status of various combinatorial functions from the literature will
be discussed in Sect. 4.
448 T. Kotek and J.A. Makowsky
Example 3
(i) Every formula φ(R̄, <1 ) ∈ TVSOLk is equivalent to the formula ψ(R̄) =
∃<1 (φ(R̄, <1 ) ∧ φlinOrd (<1 )) ∈ SOLk , where φlinOrd (<1 ) says that <1 is a
linear ordering of the universe.
(ii) Every TVMSOLk -Specker function is also a counting order invariant
MSOLk -Specker function.
(iii) We shall see in Sect. 3 that there are counting order invariant MSOL2 -
definable Specker functions which are not TVMSOL2 -definable.
Proof. We give a sketch of the proof for the Ca,b MSOL formula φeven = C0,2 (x =
x), which says the size of the universe is even. The general proof is similar. φeven
can be written as φ(R̄, <1 ) = ∃U φmin (U ) ∧ ∀x∀y(φsucc (x, y) → (x ∈ U ↔ y ∉
U )) ∧ ∀x(x ∈ U → ∃y x <1 y), where φmin (U ) = ∀x ((¬∃y y <1 x) → x ∈ U )
says the minimal element x in the order <1 belongs to U , and φsucc (x, y) =
(x <1 y) ∧ ¬∃z(x <1 z ∧ z <1 y) says y is the successor of x in <1 .
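The witness U in this construction can be checked mechanically on small universes (our sketch, assuming [n] carries the natural order): φmin and φsucc force U to be exactly the odd positions, and the last conjunct then fails precisely when n is odd.

```python
def even_universe(n):
    """Evaluate phi_even's forced witness on the universe [n] = {1, ..., n}
    with the natural order: phi_min and phi_succ force U to be exactly the
    odd positions; the remaining conjunct says no element of U is maximal."""
    universe = range(1, n + 1)
    U = {x for x in universe if x % 2 == 1}
    return all(any(x < y for y in universe) for x in U)

assert all(even_universe(n) == (n % 2 == 0) for n in range(1, 60))
```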
Our first result is a characterization of functions over the natural numbers which
satisfy a linear recurrence relation over Z.
We shall only prove that (i) implies (ii), the other cases are either known or
easily derived.
In the terminology of rational functions we get the following corollary:
Corollary 1, as stated, is known, cf. [26, Corollary 8.2 in Chapter II]. However,
to prove Theorem 4 from Corollary 1, one would need that for every N-rational
function α(x) = ∑_{n=0}^{∞} a(n)xⁿ the function a(n) is the counting function of some
regular language (*). To the best of our knowledge, this is not known to be
true. The characterization of regular languages via generating functions uses the
multivariate version, see [26].
In the proof of Theorem 4 we introduce the notion of Specker polynomials,
which can be thought of as a special case of graph polynomials where graphs are
replaced by linear orders.
Next we show that the Specker-Blatter Theorem cannot be extended to count-
ing order invariant Specker functions which are definable in MSOL2 . More pre-
cisely:
In Sect. 4 we shall show in Corollary 3 the same also for the Catalan number.
However, if we require that the defining formula φ of a Specker function is
itself order invariant, i.e. φ ∈ TVMSOL2 , then the Specker-Blatter Theorem
still holds.
[Table: summary of (modular) linear recurrence results by arity of the relation
symbols — arity 4: No MLR throughout; arity 3: mostly open ("?"), one No MLR;
arity 2: MLR; bottom row: No LR, No LR, No LR, No MLR]
where v̄ stands for (v1 , . . . , vk ), R̄ stands for (R1 , . . . , Rt ) and the Ri ’s are
relation variables of arity ρi at most k. The Ri ’s range over relations of
arity ρi over [n] and the vi range over elements of [n] satisfying the iteration
formulas Φi , Ψi ∈ L.
(ii) Simple ordered Lk -Specker polynomials and order invariance thereof are de-
fined analogously to Specker functions.
Every Specker function can be viewed as a Specker polynomial in zero indetermi-
nates. Conversely, if we evaluate a Specker polynomial at x = 1 we get a Specker
function.
In this subsection we prove a stronger version of Theorem 4.
Lemma 1. Let A(n, z̄) be a c.o.i. MSOL1 -Specker polynomial with in-
determinates z̄ = (z1 , . . . , zs ) and let h1 (w̄), . . . , hs (w̄) ∈ Z[w̄]. Let
A(n, (h1 (w̄), . . . , hs (w̄))) denote the variable substitution in A(n, z̄) where, for
i ∈ [s], zi is substituted by hi (w̄). Then A(n, h̄) is an integer evaluation of
a c.o.i. MSOL1 -Specker polynomial.
Proof. We look at A(n, z̄) with z1 substituted by the polynomial

h1 (w̄) = ∑_{j=1}^{d} cj w1^{α_{j1}} · · · wt^{α_{jt}}
We note that for every α(v) ∈ MSOL we can define an MSOL formula
with d unary relation variables φP art(α) (U1 . . . , Ud ) which holds iff U1 , . . . , Ud
are a partition of the set of elements of [n] which satisfy α(v). Then
A(n, (h1 (w̄), z2 , . . . , zs )) =

∑_{R1 :Φ1 (R1 )} · · · ∑_{Rm :Φm (R1 ,...,Rm−1 )} ∑_{U1 ,...,Ud :φ_{Part(Ψ1 )} (Ū )} ( ∏_{v1 :Ψ2 (R̄,v1 )} z2 · · ·
∏_{v1 :Ψs (R̄,v1 )} zs · ∏_{v1 :v1 ∈U1 } c1 w1^{α_{11}} · · · wt^{α_{1t}} · · · ∏_{v1 :v1 ∈Ud } cd w1^{α_{d1}} · · · wt^{α_{dt}} )
We now replace all cj with new indeterminates wj and thus obtain that
A(n, (h1 (w̄), z2 , . . . , zs )) is an evaluation of a c.o.i. MSOL1 -Specker polyno-
mial.
Doing the same for the other zi , we get that A(n, (h1 (w̄), . . . , hs (w̄))) is an
evaluation of a c.o.i. MSOL1 -definable Specker polynomial, as required.
An (x̄) = ∑_{i=1}^{r} fi (x̄) · An−i (x̄) ,
where fi (x̄) ∈ Z [x̄] and initial conditions A1 (x̄), . . . , Ar (x̄) ∈ Z [x̄]. To write
An (x̄) as a c.o.i. MSOL1 -Specker polynomials, we sum over the paths of the
recurrence tree. A path in the recurrence tree corresponds to the successive
application of the recurrence
where φrec (Ū , Ī, S) says
– φ_{Part} (Ū , Ī, S) holds, i.e. Ū , Ī, S is a partition of [n],
– n ∈ ⋃_{i=1}^{r} Ui , i.e. the path in the recurrence tree starts from n,
– |⋃_{i=1}^{r} Ii | = 1, i.e. the path reaches exactly one initial condition,
– if v ∈ [n] − [r], then v ∉ ⋃_{i=1}^{r} Ii , i.e. the path may not reach an initial
condition until v ∈ [r],
– if v ∈ [r], then v ∉ ⋃_{i=1}^{r} Ui , i.e. the path ends when reaching the initial
conditions, and
– for every v ∈ Ui , {v − 1, . . . , v − (i − 1)} ⊆ S and v − i ∈ ⋃_{i=1}^{r} (Ui ∪ Ii ), i.e.
the next element in the path is v − i.
The formula φrec is MSOL definable using the given order. Let B(n, z̄) be

B(n, z̄) = ∑_{Ū ,Ī,S:φrec (Ū ,Ī,S)} ∏_{v:v∈U1 } z1 · · · ∏_{v:v∈Ur } zr · ∏_{v:v∈I1 } zr+1 · · · ∏_{v:v∈Ir } z2r .
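The path-sum decomposition encoded by φrec can be sketched for constant coefficients (the values f = (2, −1, 3) and the initial conditions below are our hypothetical example, not from the text): each path of the recurrence tree descends from n by steps i ∈ {1, . . . , r} until it lands in [r], contributing the product of the coefficients used times the initial condition reached.

```python
def by_recurrence(n, f, init):
    """Evaluate A_n directly from A_n = sum_i f_i * A_{n-i}."""
    r = len(f)
    A = dict(enumerate(init, start=1))          # initial conditions A_1, ..., A_r
    for k in range(r + 1, n + 1):
        A[k] = sum(f[i - 1] * A[k - i] for i in range(1, r + 1))
    return A[n]

def by_paths(n, f, init):
    """Evaluate A_n by summing over the paths of the recurrence tree: each
    path takes steps i in {1, ..., r} from n down into [r], contributing the
    product of the coefficients f_i used times the initial condition reached."""
    r = len(f)
    if n <= r:                                  # the path has reached [r]
        return init[n - 1]
    return sum(f[i - 1] * by_paths(n - i, f, init) for i in range(1, r + 1))

f, init = [2, -1, 3], [1, 1, 2]                 # hypothetical coefficients
assert all(by_paths(n, f, init) == by_recurrence(n, f, init) for n in range(1, 15))
```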
So,

A(n, a) = ∑_{R,Y,Z̄:βa (R,Y,Z̄)} 1 = |{(R, Y, Z̄) | βa (R, Y, Z̄)}| ,
where βa (R, Y, Z̄) = Φ(R) ∧ Ψ (R, Y ) ∧ φpart (Y, Z̄) and φpart (Y, Z1 , . . . , Za ) says
Z1 , . . . , Za form a partition of Y . We note that φpart is definable in MSOL. For
a = 0,

A(n, a) = ∑_{R:γ(R)} 1 = |{R | γ(R)}| ,
where γ(R) = Φ(R) ∧ ∀v1 ¬Ψ (R, v1 ). Thus, since the constant function 0 is de-
finable in MSOL, we get that if a ≥ 0 then A(n, a) is the difference of two c.o.i.
MSOL1 -Specker functions.
and we have

A(n, a) = ∑_{R,Y,Z̄:αEven (Y )∧β_{|a|} (R,Y,Z̄)} 1 − ∑_{R,Y,Z̄:¬αEven (Y )∧β_{|a|} (R,Y,Z̄)} 1 ,
Lemma 2. Let C be a class of R̄−structures of finite Specker index with all rela-
tion symbols in R̄ at most binary. Then fC (n) satisfies modular linear recurrence
relations for every m ∈ N.
Subst(G, s , A1 ) ∈ C iff Subst(G, s , A2 ) ∉ C.
where U and R are unary and F is binary, <1 is a linear order of [n], and Φ
says
(i) F is a function,
(ii) U is the domain of F ,
(iii) R is the range of F ,
(iv) U and R form a partition of [n],
(v) the first element of [n], is in U ,
(vi) F : U → R is a bijection, and
(vii) F is monotone with respect to <1 .
We note C is MSOL2 definable. We note also that ospC (n, <1 ) is counting order
invariant. ospC (n, <1 ) counts the number of partitions of [n] into two equal parts,
because there is exactly one monotone bijection between any two subsets of [n]
of equal size. The condition that 1 ∈ U assures that we do not count the same
partition twice. So ospC (n, <1 ) = E2,= (n).
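This count can be checked by brute force (our sketch; the function name osp is ours): for even n, ospC (n, <1 ) enumerates the subsets U of [n] of size n/2 that contain 1, of which there are C(n, n/2)/2, i.e. E2,= (n).

```python
from itertools import combinations
from math import comb

def osp(n):
    """Count the structures of C on [n]: subsets U with 1 in U and |U| = n/2;
    the monotone bijection from U to its complement is then unique, so each
    unordered partition into two equal parts is counted exactly once."""
    if n % 2:
        return 0
    return sum(1 for U in combinations(range(1, n + 1), n // 2) if 1 in U)

assert all(osp(n) == comb(n, n // 2) // 2 for n in range(2, 13, 2))
```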
We know that E2,= is not ultimately periodic modulo 2, and hence the Specker-
Blatter theorem cannot be extended to c.o.i. MSOL2 -Specker functions.
4 Examples
4.1 Examples of MSOLk -Specker Functions
Fibonacci and Lucas Numbers. The Fibonacci numbers Fn satisfy the linear
recurrence Fn = Fn−1 + Fn−2 for n > 1, F0 = 0 and F1 = 1. The Lucas numbers
Ln , a variation of the Fibonacci numbers, satisfy the same recurrence for n > 1,
Ln = Ln−1 + Ln−2 , but have different initial conditions, L1 = 1 and L0 = 2.
It follows from the proof of Theorem 4 that a function which satisfies a lin-
ear recurrence relation over N is a c.o.i. MSOL1 -Specker function. Thus, the
Fibonacci and Lucas numbers are natural examples of c.o.i. MSOL1 -Specker
functions.
Stirling Numbers. The Stirling numbers of the first kind, denoted [n r], are
defined as the number of ways to arrange n objects into r cycles. For fixed r,
this is an MSOL2 -Specker function, since for E ⊆ [n]2 and U ⊆ E, the property
that U is a cycle in E and the property that E is a disjoint union of cycles
are both MSOL2 -definable. Using again the growth argument from Example
1(iii), we can see that the Stirling numbers of the first kind do not satisfy a
linear recurrence relation, because [n 1] grows like the factorial (n − 1)!. However,
from the Specker-Blatter Theorem it follows that they satisfy a modular linear
recurrence relation for every m.
The Stirling numbers of the second kind, denoted {n r}, count the number
of partitions of a set [n] into r many non-empty subsets. For fixed r, this is
MSOL2 -definable: We count the number of equivalence relations with r non-
empty equivalence classes. From the Specker-Blatter Theorem it follows that
they satisfy a modular linear recurrence relation for every m. We did not find in
the literature a linear recurrence relation for the Stirling numbers of the second
kind which fits our context. But we show below that such a recurrence relation
exists.
Proposition 6. For fixed r, the Stirling numbers of the second kind are c.o.i.
MSOL1 -Specker functions.
Proof. We use r unary relations U1 , . . . , Ur and say that they partition the set
[n] into non-empty sets. However, when we permute the indices of the Ui 's we
count each such partition several times. To avoid this we use a linear ordering on [n] and
require that, with respect to this ordering, the minimal element in Ui is smaller
than all the minimal elements in Uj for j > i.
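The normalisation in this proof can be verified by brute force for small n and r (an illustrative sketch, not part of the paper; function names are ours):

```python
from itertools import product
from math import comb, factorial

def stirling2(n, r):
    """{n r} by inclusion-exclusion."""
    return sum((-1) ** i * comb(r, i) * (r - i) ** n for i in range(r + 1)) // factorial(r)

def ordered_count(n, r):
    """Count surjections [n] -> [r] in which the minimum of class i is smaller
    than the minimum of class j whenever i < j (the proof's normalisation)."""
    total = 0
    for colouring in product(range(r), repeat=n):
        classes = [[v for v in range(n) if colouring[v] == i] for i in range(r)]
        if all(classes) and all(classes[i][0] < classes[i + 1][0] for i in range(r - 1)):
            total += 1
    return total

# With the minima ordered, each partition is counted once, giving {n r}.
assert all(ordered_count(n, r) == stirling2(n, r)
           for n in range(1, 7) for r in range(1, n + 1))
```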
Corollary 2. For every r there exists a linear recurrence relation with constant
coefficients for the Stirling numbers of the second kind {n r}. Furthermore, there
are constants cr such that {n r} ≤ 2^{cr ·n} .
Our proof is not constructive, and we did not bother here to calculate the explicit
linear recurrence relations or the constants cr for each r.
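One explicit instance of Corollary 2 (our own computation, not taken from the text): since {n 3} = (3ⁿ − 3·2ⁿ + 3)/6 for n ≥ 1, the characteristic roots are 1, 2, 3, and (x − 1)(x − 2)(x − 3) = x³ − 6x² + 11x − 6 yields the constant-coefficient recurrence {n 3} = 6{n−1 3} − 11{n−2 3} + 6{n−3 3} for n ≥ 4.

```python
def stirling2_table(max_n, max_r):
    """S[n][r] = {n r} via the standard recurrence {n r} = r*{n-1 r} + {n-1 r-1}."""
    S = [[0] * (max_r + 1) for _ in range(max_n + 1)]
    S[0][0] = 1
    for n in range(1, max_n + 1):
        for r in range(1, max_r + 1):
            S[n][r] = r * S[n - 1][r] + S[n - 1][r - 1]
    return S

s = [row[3] for row in stirling2_table(30, 3)]
assert s[3:8] == [1, 6, 25, 90, 301]
# Constant-coefficient recurrence with characteristic roots 1, 2, 3:
assert all(s[n] == 6 * s[n - 1] - 11 * s[n - 2] + 6 * s[n - 3] for n in range(4, 31))
```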
(i) a0 = 1
(ii) ai−1 − ai ∈ {1, −1} for i = 1, . . . , 2n − 2
(iii) a2n−1 = 0
We can express this in MSOL2 using a linear order and two unary functions.
The two functions F1 and F2 are used to describe a0 , . . . , an−1 and an , . . . , a2n−1
respectively. Let ΦCatalan be the formula that says:
Bell Numbers. The Bell numbers Bn count the number of equivalence relations
on n elements. We note that f (n) = Bn is an MSOL2 -Specker function. However,
Bn is not c.o.i. MSOL1 -definable, due to a growth argument.
where {n k} is the Stirling number of the second kind. Tn (x) is c.o.i. MSOL2 -
definable; to see this we note that it is defined by

Tn (x) = ∑_{E:Φcliques (E)} ∏_{u:Φfirst-in-cc (E,u)} x ,
Acknowledgements
We are grateful to the anonymous referee for a very careful reading of the
manuscript, for stylistic suggestions, and for pointing out misprints, and to Daniel
Marx for his useful comments.
References
1. Balogh, J., Bollobás, B., Weinreich, D.: The speed of hereditary properties of
graphs. J. Comb. Theory, Ser. B 79, 131–156 (2000)
2. Balogh, J., Bollobás, B., Weinreich, D.: The penultimate rate of growth for graph
properties. EUROCOMB: European Journal of Combinatorics 22 (2001)
3. Balogh, J., Bollobás, B., Weinreich, D.: Measures on monotone properties of graphs.
Discrete Applied Mathematics 116, 17–36 (2002)
4. Barcucci, E., Lungo, A.D., Frosini, A., Rinaldi, S.: A technology for reverse-
engineering a combinatorial problem from a rational generating function. Advances
in Applied Mathematics 26, 129–153 (2001)
5. Bateman, H.: The polynomial of Mittag-Leffler. Proceedings of the National
Academy of Sciences 26, 491–496 (1940)
6. Bernardi, O., Noy, M., Welsh, D.: On the growth rate of minor-closed classes of
graphs (2007)
7. Berstel, J., Reutenauer, C.: Rational Series and Their Languages. In: EATCS
Monographs on Theoretical Computer Science, vol. 12. Springer, Heidelberg (1988)
8. Blatter, C., Specker, E.: Le nombre de structures finies d'une théorie à caractère
fini. Sciences Mathématiques, Fonds National de la Recherche Scientifique, Brux-
elles, pp. 41–44 (1981)
9. Blatter, C., Specker, E.: Recurrence relations for the number of labeled structures
on a finite set. In: Börger, E., Hasenjaeger, G., Rödding, D. (eds.) Logic and Ma-
chines: Decision Problems and Complexity. LNCS, vol. 171, pp. 43–61. Springer,
Heidelberg (1984)
10. Bollobás, B., Thomason, A.: Projections of bodies and hereditary properties of
hypergraphs. J. London Math. Soc. 27 (1995)
11. Büchi, J.: Weak second–order arithmetic and finite automata. Zeitschrift für math-
ematische Logik und Grundlagen der Mathematik 6, 66–92 (1960)
12. Chomsky, N., Schützenberger, M.: The algebraic theory of context free languages.
In: Brafford, P., Hirschberg, D. (eds.) Computer Programming and Formal Sys-
tems, pp. 118–161. North-Holland, Amsterdam (1963)
13. Ebbinghaus, H., Flum, J.: Finite Model Theory. In: Perspectives in Mathematical
Logic. Springer, Heidelberg (1995)
14. Ebbinghaus, H., Flum, J., Thomas, W.: Mathematical Logic. Undergraduate Texts
in Mathematics, 2nd edn. Springer, Heidelberg (1994)
15. Eilenberg, S.: Automata, Languages, and Machines, Vol. A. Academic Press, Lon-
don (1974)
16. Erdös, P., Frankl, P., Rödl, V.: The asymptotic enumeration of graphs not contain-
ing a fixed subgraph and a problem for hypergraphs having no exponent. Graphs
Combin. 2, 113–121 (1986)
17. Fischer, E.: The Specker-Blatter theorem does not hold for quaternary relations.
Journal of Combinatorial Theory, Series A 103, 121–136 (2003)
18. Fischer, E., Makowsky, J.: The Specker-Blatter theorem revisited. In: Warnow,
T.J., Zhu, B. (eds.) COCOON 2003. LNCS, vol. 2697, pp. 90–101. Springer, Hei-
delberg (2003)
19. Flajolet, P., Sedgewick, R.: Analytic Combinatorics. Cambridge University Press,
Cambridge (2009)
20. Graham, L., Knuth, D.E., Patashnik, O.: Concrete Mathematics: A Foundation for
Computer Science. Addison-Wesley Longman Publishing Co., Inc., Boston (1994)
21. Knuth, D.E.: The Art of Computer Programming. Fascicle 4: Generating All Trees–
History of Combinatorial Generation (Art of Computer Programming), vol. 4.
Addison-Wesley Professional, Reading (2006)
22. Libkin, L.: Elements of Finite Model Theory. Springer, Heidelberg (2004)
23. Lidl, R., Niederreiter, H.: Finite Fields. Encyclopedia of Mathematics and its Ap-
plications, vol. 20. Cambridge University Press, Cambridge (1983)
24. Mason, J.C., Handscomb, D.C.: Chebyshev Polynomials. Chapman & Hall/CRC,
Boca Raton (2003)
25. Prömel, H., Steger, A.: Excluding induced subgraphs: Quadrilaterals. Random
Structures & Algorithms 3, 19–31 (1992)
26. Salomaa, A., Soittola, M.: Automata theoretic aspects of formal power series.
Springer, Heidelberg (1978)
27. Scheinerman, E., Zito, J.: On the size of hereditary classes of graphs. J. Comb.
Theory, Ser. B 61, 16–39 (1994)
28. Sidi, A.: Practical Extrapolation Methods, Theory and Applications. Cambridge
Monographs on Applied and Computational Mathematics, vol. 10. Cambridge Uni-
versity Press, Cambridge (2003)
29. Soittola, M.: Positive rational sequences. Theoretical Computer Science 2, 317–322
(1976)
30. Specker, E.: Application of logic and combinatorics to enumeration problems. In:
Börger, E. (ed.) Trends in Theoretical Computer Science, pp. 141–169. Computer
Science Press (1988); Reprinted in: Ernst Specker, Selecta, pp. 324–350. Birkhäuser
Halting and Equivalence of Program Schemes in
Models of Arbitrary Theories
Dexter Kozen
When T is empty, these two problems are the classical halting and equiva-
lence problems for program schemes, respectively. We show that problem
(i) is Σ1α -complete and problem (ii) is Π2α -complete. Both problems re-
main hard for their respective complexity classes even if Σ is restricted
to contain only a single constant, a single unary function symbol, and
a single monadic predicate. It follows from (ii) that there can exist no
relatively complete deductive system for scheme equivalence over models
of theories of any Turing degree.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 463–469, 2010.
© Springer-Verlag Berlin Heidelberg 2010
464 D. Kozen
Note that for both problems, the theory T is part of the input in the form of an
oracle.
If T is the empty ground theory, these are the classical halting and equivalence
problems for program schemes, respectively. Classical lower bound proofs (see
[7]) establish the r.e. hardness of the two problems for this case. The Π20 -hardness
of (ii) for this case can also be shown to follow without much difficulty from a
result of [4].
symbol. For now we will assume that we have two unary relation symbols Q, R;
we can later encode these in a single P by taking P (f 2n (a)) = Q(f n (a)) and
P (f 2n+1 (a)) = R(f n (a)).
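The interleaving can be sketched on finite prefixes of the two truth-value strings (our function names; a term f ⁿ(a) is represented by its position n):

```python
def encode(q_bits, r_bits):
    """Merge the truth values of Q and R into a single predicate P, with
    P(f^{2n}(a)) = Q(f^n(a)) and P(f^{2n+1}(a)) = R(f^n(a))."""
    p_bits = []
    for q, r in zip(q_bits, r_bits):
        p_bits += [q, r]           # even positions carry Q, odd positions carry R
    return p_bits

def decode(p_bits):
    """Recover the Q- and R-strings from the interleaved P-string."""
    return p_bits[0::2], p_bits[1::2]

q, r = [1, 0, 1, 1], [0, 0, 1, 0]
assert decode(encode(q, r)) == (q, r)
```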
Let A ⊆ N be any set. We will show how to encode the halting problem for
deterministic oracle Turing machines M with oracle A. This problem is Σ1A -
complete. Given an input x over M ’s input alphabet, we will construct a ground
theory T0 Turing-equivalent to A and a scheme p with no input or output such
that p halts on all models of T0 iff M halts on input x. Later, we will extend
T0 to a first-order theory T of the same complexity. The encoding technique
used here is fairly standard, but we include the argument for completeness and
because we need the resulting scheme p in a certain special form for the proof
of (ii).
Consider the Herbrand domain consisting of all terms over a and f . This
domain is isomorphic to the natural numbers with 0 and successor under the
correspondence n → f n (a). An Herbrand model H over this domain is repre-
sented by a pair of semi-infinite binary strings representing the truth values of
Q(f n (a)) and R(f n (a)) for n ≥ 0. The correspondence is one-to-one. We will
use the string corresponding to Q to encode a computation history of M and
the string corresponding to R to encode the oracle A.
Each string x over M ’s input alphabet determines a unique finite or infinite
computation history #αx0 #αx1 #αx2 # · · · , where αxi is a string over a finite al-
phabet Δ encoding the instantaneous configuration of M on input x at time i
consisting of the tape contents, head position, and current state. We also as-
sume that configurations encode all oracle queries and the answers returned by
the oracle A (we will be more explicit about the precise format of the encoding
below). The configurations αxi are separated by a symbol # ∈ Δ. The computa-
tion history in turn can be encoded in binary, and this infinite binary string can
be encoded by the truth values of Q(f n (a)), n ≥ 0.
The ground theory T0 describes the oracle A using R and the starting con-
figuration #αx0 # of M on input x using Q. The description of the starting
configuration consists of a finite set Sx of ground literals of the form Q(f n (a))
or ¬Q(f n (a)). The oracle is described by the set
x is f n (a), then testing Q(x) reads the n-th bit of the string. The pointer is
advanced by the assignment x := f (x).
The scheme p must also verify that oracle responses are correct. Without loss
of generality, we can assume that M uses the following mechanism to query the
oracle. We assume that M has an integer counter initially set to 0. In each step,
M may add one to the counter or not, depending on its current state and the
tape symbol it is scanning, according to its transition function. It queries the
oracle by entering a distinguished oracle query state. If the current value of the
counter is n, then M transits to a distinguished "yes" state if n ∈ A and to a
distinguished "no" state if n ∉ A. The counter is reset to 0.
For p to verify the correctness of the oracle responses, we assume that the
format of the encoding of configurations is β$0n , where β is the description of
the current state, tape contents, and head position of M and n is the current
value of the counter. If p discovers that M is in the oracle query state while
scanning β, then after encountering the $, it sets a variable z := a and executes
z := f (z) for each occurrence of 0 after the $, so that z will have the value
f n (a) when the next # is seen. Then it tests R(z) to determine whether n ∈ A.
It then checks that in the subsequent configuration, M is in the “yes” or “no”
state according as R(z) is true or false, respectively, and that the counter has
been reset.
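The counter-reading step can be sketched as follows (a hypothetical rendering, not the paper's scheme itself: the configuration is given as a string in the format β$0ⁿ#, the oracle A as a set of integers, and a term f ⁿ(a) is represented by the integer n):

```python
def oracle_answer(configuration, oracle):
    """Return the 'yes'/'no' answer the scheme's check expects for one query:
    walk the run of 0s after '$' with z := f(z), so z represents f^n(a) at the
    next '#', then test R(z), i.e. ask whether n is in the oracle."""
    after_dollar = configuration.split("$", 1)[1]
    z = 0                                  # z := a, i.e. f^0(a)
    for symbol in after_dollar:
        if symbol == "#":
            break
        assert symbol == "0"               # format check: only 0s before '#'
        z += 1                             # z := f(z)
    return "yes" if z in oracle else "no"

A = {0, 2, 3, 5}                           # a sample oracle set
assert oracle_answer("beta$000#", A) == "yes"   # counter n = 3
assert oracle_answer("beta$0#", A) == "no"      # counter n = 1
```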
If p discovers an error, so that the string does not represent a computation
history of M on some input, it halts immediately. It also halts if it ever encounters
a halting state of M anywhere in the string. Thus the only Herbrand model of T0
that would cause p not to halt is the one describing the infinite valid computation
history of M on x in the case that M does not halt on x. Thus p halts on all
Herbrand models of T0 , thus on all models of T0 , iff M halts on x.
We can further restrict the set Sx describing the start configuration of M
to be empty by observing that Sx is finite, so it can be hard-wired into the
scheme p itself. Thus the initial format check that p performs can be modified to
check whether Sx holds and halt immediately if not. This gives a ground theory
T0 consisting of (1) only, independent of the input x, at the expense of coding
information about x in the scheme p. However, for purposes of the proof of (ii)
below, it will be important that p not depend on the input x but only on the
machine M .
Finally, we must produce a first-order theory T extending T0 such that T is
of no higher Turing degree than T0 (that is, T is still Turing-equivalent to A)
and every Herbrand model of T0 extends to a model of T . Since the halting of
p depends only on the Herbrand substructure, p will halt on all models of T
iff it halts on all Herbrand models of T0 . The main issue here is that we must
be careful to construct a T whose Turing complexity is no greater than that of
the ground theory T0 , otherwise the lower bound will not hold. Note that the
first-order theory generated by T0 may not be suitable, because the best we can
guarantee is that it is Σ1 in A.
Halting and Equivalence of Program Schemes 467
where C and D are disjoint finite subsets of N. We take T to be the set of logical
consequences of T0 and the formulas (2). Every Herbrand model of T0 extends to
a model of T , because new elements outside the Herbrand domain can be freely
added as needed to satisfy the existential formulas (2).
To show that T is Turing-equivalent to A, we observe that since the theory is
monadic, every first-order sentence reduces effectively via the laws of first-order
logic to a Boolean combination of ground formulas P (f n (a)) and existential
formulas (2). The latter are all true in T , so every sentence is equivalent modulo
T to a Boolean combination of ground formulas P (f n (a)). Any such formula is
consistent with T iff it is consistent with T0 , since as previously observed, every
Herbrand model of T0 extends to a model of T . Thus T Turing-reduces to T0 .
This argument shows that the halting problem for program schemes over
models of T0 or T is hard for Σ1A . Since A was arbitrary and both T0 and T are
Turing-equivalent to A, we are finished with the proof of (i).
Now we turn to problem (ii). For the upper bound, first we show that equiva-
lence of schemes over models of T is Π2T . Equivalently, inequivalence of schemes
over models of T is Σ2T . It suffices to show that inequivalence of schemes over
models of T can be determined by an IND program over N with oracle T with an
∃ ∀ alternation structure [5] (see also [6]). As above, we need only that ground
entailment relative to T is decidable.
Let p and q be two schemes with input variables x̄ = x1 , . . . , xn . The schemes
p and q are not equivalent over models of T iff there exists a complete extension
of T with extra constants c̄ = c1 , . . . , cn in which either
1. both p and q halt on input c̄ and produce different output values;
2. p halts on c̄ and q does not; or
3. q halts on c̄ and p does not.
We start by existentially selecting which of the alternatives 1, 2, or 3 to check.
If alternative 1 was selected, we simulate p and q on input c̄, maintaining a
finite set E of ground literals and using T and E as in the proof of (i) to resolve
tests. Whenever a test is encountered that is not determined by T and E, we
guess the truth value and extend E accordingly. Thus we nondeterministically
guess a complete extension of T using existential branching in the IND program.
We continue the simulation until both p and q halt, then compare output values,
accepting if they differ.
If alternative 2 was selected, we simulate p on c̄ until it halts, maintaining
the guessed truth values of undetermined tests in the set E as above. When p
has halted, we have a consistent extension T ∪ E of T , where E consists of the
finitely many tests that were guessed during the computation of p. So far we
have only used existential branching. We must now verify that there exists a
complete extension of T ∪ E in which q does not halt on input c̄. By (i), this
468 D. Kozen
Acknowledgments
Thanks to Andreas Blass for insightful comments, which inspired a strengthening
of the results. This work was supported in part by NSF grant CCF-0635028.
References
1. Angus, A., Kozen, D.: Kleene algebra with tests and program schematology. Tech.
Rep. TR2001-1844, Computer Science Department, Cornell University (July 2001)
2. Barth, A., Kozen, D.: Equational verification of cache blocking in LU decomposi-
tion using Kleene algebra with tests. Tech. Rep. TR2002-1865, Computer Science
Department, Cornell University (June 2002)
3. Ferrante, J., Rackoff, C.: The computational complexity of logical theories. Lecture
Notes in Mathematics, vol. 718. Springer, Heidelberg (1979)
4. Harel, D., Meyer, A.R., Pratt, V.R.: Computability and completeness in logics of
programs. In: Proc. 9th Symp. Theory of Comput., pp. 261–268. ACM, New York
(1977)
5. Harel, D., Kozen, D.: A programming language for the inductive sets, and applica-
tions. Information and Control 63(1-2), 118–139 (1984)
6. Harel, D., Kozen, D., Tiuryn, J.: Dynamic Logic. MIT Press, Cambridge (2000)
7. Manna, Z.: Mathematical Theory of Computation. McGraw-Hill, New York (1974)
Metrization Theorem for Space-Times: From
Urysohn’s Problem towards Physically Useful
Constructive Mathematics
Vladik Kreinovich
Abstract. In the early 1920s, Pavel Urysohn proved his famous lemma
(sometimes referred to as “first non-trivial result of point set topology”).
Among other applications, this lemma was instrumental in proving that
under reasonable conditions, every topological space can be metrized.
A few years before that, in 1919, a complex mathematical theory
was experimentally proven to be extremely useful in the description of
real world phenomena: namely, during a solar eclipse, General Relativ-
ity theory – that uses pseudo-Riemann spaces to describe space-time
– was (spectacularly) experimentally confirmed. Motivated by this suc-
cess, Urysohn started working on an extension of his lemma and of the
metrization theorem to (causality-)ordered topological spaces and cor-
responding pseudo-metrics. After Urysohn’s early death in 1924, this
activity was continued in Russia by his student Vadim Efremovich, Efre-
movich’s student Revolt Pimenov, and by Pimenov’s students (and also
by H. Busemann in the US and by E. Kronheimer and R. Penrose in the
UK). By the 1970s, reasonably general space-time versions of Urysohn’s
lemma and metrization theorem have been proven.
However, these 1970s results are not constructive. Since one of the
main objectives of this activity is to come up with useful applications
to physics, we definitely need constructive versions of these theorems –
versions in which we not only claim the theoretical existence of a pseudo-
metric, but we also provide an algorithm enabling the physicist to gen-
erate such a metric based on empirical data about the causality relation.
An additional difficulty here is that for this algorithm to be useful, we
need a physically relevant constructive description of a causality-type
ordering relation.
In this paper, we propose such a description and show that, for this
description, a combination of the existing constructive ideas with the
known (non-constructive) proof leads to successful constructive space-
time versions of Urysohn's lemma and of the metrization theorem.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 470–487, 2010.
© Springer-Verlag Berlin Heidelberg 2010
1 Introduction
Urysohn’s lemma. In the early 1920s, Pavel Urysohn proved his famous lemma
(sometimes referred to as “first non-trivial result of point set topology”). This
lemma deals with normal topological spaces, i.e., spaces in which every two
disjoint closed sets have disjoint open neighborhoods; see, e.g., [12]. As the very
term “normal” indicates, most usual topological spaces are normal, including
the n-dimensional Euclidean space.
Urysohn’s lemma states the following:
and for which the original topology on X coincides with the topology generated
by the open balls Br (x) = {y : ρ(x, y) < r}.
Specifically, from Urysohn’s lemma, we can easily conclude that:
Comment. It is worth mentioning that the normality condition is too strong for
the theorem: actually, it is sufficient to require that the space is:
Space-time geometry and how it inspired Urysohn. A few years before Urysohn’s
lemma, in 1919, a complex mathematical theory was experimentally proven to
be extremely useful in the description of real world phenomena. Specifically,
during a solar eclipse, General Relativity theory – that uses pseudo-Riemann
spaces to describe space-time – was (spectacularly) experimentally confirmed;
see, e.g., [20].
From the mathematical viewpoint, the basic structure behind space-time geom-
etry is not simply a topological space, but a topological space with an order a b
whose physical meaning is that the event a can causally influence the event b.
For example, in the simplest case of the Special Relativity theory (see Fig. 1),
the event a = (a0 , a1 , a2 , a3 ) can influence the event b = (b0 , b1 , b2 , b3 ) if we can
get from the spatial point (a1 , a2 , a3 ) at the moment a0 to the point (b1 , b2 , b3 )
(Fig. 1: the light cone of Special Relativity in the (x, t) plane, bounded by the lines x = −c · t and x = c · t.)
at the moment b0 > a0 while traveling with a speed which is smaller than or
equal to the speed of light c:
√((a1 − b1 )2 + (a2 − b2 )2 + (a3 − b3 )2 ) ≤ c · (b0 − a0 ).
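As an aside (not part of the paper), this condition is straightforward to check numerically; the helper name `can_influence` below and the sample events are ours:

```python
import math

# A check of the special-relativistic causality condition quoted above:
# a = (a0, a1, a2, a3) can influence b = (b0, b1, b2, b3) iff the spatial
# distance between them can be covered at speed <= c during the time b0 - a0.
C = 299_792_458.0  # speed of light, m/s

def can_influence(a, b, c=C):
    dt = b[0] - a[0]
    if dt < 0:
        return False  # b happens before a: no causal influence
    dist = math.sqrt(sum((a[i] - b[i]) ** 2 for i in (1, 2, 3)))
    return dist <= c * dt

origin = (0.0, 0.0, 0.0, 0.0)
print(can_influence(origin, (1.0, 1.0e8, 0.0, 0.0)))  # 10^8 m in 1 s: inside the cone
print(can_influence(origin, (1.0, 1.0e9, 0.0, 0.0)))  # would require v > c
```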
Motivated by this practical usefulness of ordered topological spaces, Urysohn
started working on an extension of his lemma and of the metrization theorem to
(causality-)ordered topological spaces and corresponding pseudo-metrics.
Space-time metrization after Urysohn. P. S. Urysohn did not have time to work
on the space-time extension of his results, since he died in 1924, at the early
age of 26.
After Urysohn’s early death, this activity was continued in Russia by his
student Vadim Efremovich, by Efremovich’s student Revolt Pimenov, and by
Pimenov’s students – and also by H. Busemann in the US and by E. Kronheimer
and R. Penrose in the UK [10,13,22] (see also [16]).
This research actively used the general theory of ordered topological spaces;
see, e.g., [21].
By the 1970s, reasonably general space-time versions of Urysohn’s lemma and
metrization theorem have been proven; see, e.g., [14,15].
Need for a more practice-oriented definition. On the theoretical level, the causal-
ity relation is all we need to know about the geometry of space-time.
However, from the practical viewpoint, we face an additional problem – that
measurements are never 100% accurate and, therefore, we cannot locate events
exactly. When we are trying to locate an event a in space and time, then, due to
measurement uncertainty, the resulting location ã is only approximately equal
to the actual one: ã ≈ a.
From this viewpoint, when we observe that an event a influences the event b,
we record it as a relation between the corresponding approximations – i.e., we
conclude that ã ≼ b̃; see Fig. 2. However, this may be a wrong conclusion: for
example, if an event b is at the border of the future cone Fa =def {b : a ≼ b} of
the event a, then
– we have a ≼ b, but
– the approximate location b̃ may be outside the cone,
so the conclusion ã ≼ b̃ is wrong.
(Fig. 2: events a and b and their approximate locations ã and b̃ near the border of the light cone x = ±c · t; b̃ may lie outside the cone.)
a ⊀ a;  ∀a ∃a̲, a̅ (a̲ ≺ a ≺ a̅);  a ≺ b ⇒ ∃c (a ≺ c ≺ b);
Comment. In principle, we can use a dual definition a ≺ b ≡def a ∈ b− , where
b− =def {c : c ≺ b}. To make sure that these two definitions lead to the same
result, the following additional property is usually required:
b ∈ a+ ⇔ a ∈ b− .
The usual physical meaning of this definition is that ρ(a, b) is the length of the
shortest path between a and b. This meaning leads to a natural explanation for
the triangle inequality ρ(a, c) ≤ ρ(a, b) + ρ(b, c). Indeed, the shortest path from
a to b (of length ρ(a, b)) can be combined with the shortest path from b to c (of
length ρ(b, c)) into a single combined path from a to c of length ρ(a, b) + ρ(b, c).
Thus, the length ρ(a, c) of the shortest possible path between a and c must be
smaller than or equal to this combined length: ρ(a, c) ≤ ρ(a, b) + ρ(b, c).
In space-time, we do not directly measure distances and lengths. The only
thing we directly measure is (proper) time along a path. So, in space-time ge-
ometry, we talk about times and not lengths.
It is well known that if we travel with a speed close to the speed of light,
then the proper travel time (i.e., the time measured by a clock that travels with
us) goes to 0. Thus, in space-time, the smallest time does not make sense: it is
always 0. What makes sense is the largest time. In view of this, we can define a
“kinematic metric” τ (a, b) as the longest (= proper) time along any path from
event a to event b.
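For a concrete instance (our illustration, in units where c = 1, not the paper's general construction), the kinematic metric of flat Minkowski space-time is the proper time of the straight inertial path, and it satisfies the anti-triangle inequality τ(a, b) + τ(b, c) ≤ τ(a, c) familiar from the twin paradox:

```python
import math

# Kinematic metric of flat (1+3)-dimensional Minkowski space-time, c = 1:
# the longest proper time from a to b is sqrt(dt^2 - |dx|^2) when a precedes b.
def tau(a, b):
    dt = b[0] - a[0]
    d2 = sum((b[i] - a[i]) ** 2 for i in (1, 2, 3))
    if dt < 0 or dt * dt < d2:
        return 0.0  # b is not in the causal future of a
    return math.sqrt(dt * dt - d2)

a = (0.0, 0.0, 0.0, 0.0)
b = (1.0, 0.5, 0.0, 0.0)
c = (2.0, 0.0, 0.0, 0.0)
# Anti-triangle ("twin paradox") inequality: a detour through b accumulates
# less proper time than the direct inertial path from a to c.
assert tau(a, b) + tau(b, c) <= tau(a, c)
```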
Space-time analog of Urysohn’s lemma. The main condition under which the
space-time analog of Urysohn’s lemma is proven is that the space X is separable,
i.e., there exists a countable dense set {x1 , x2 , . . . , xn , . . .}.
This lemma is similar to the original Urysohn’s lemma, because it proves the
existence of a function f(a,b) that separates two disjoint closed sets:
The new statement is different from the original Urysohn’s lemma, because:
Proof. First, we prove that for every x, there exists a ≼-monotonic function
fx : X → [0, 1] for which fx (b) > 0 ⇔ x ≺ b.
The proof of this statement is reasonably straightforward.
The only technically cumbersome part of this proof is to show that if a space
with a kinematic causality is separable, i.e., if there exists an everywhere dense
sequence {x1 , . . . , xn , . . .}, then there exists a decreasing sequence yi that
converges to x. Moreover, we can select this sequence in such a way that for every
i, if x ≺ xi then yi ≺ xi .
Indeed, since the relation ≺ is a kinematic causality, there exists a point x̅ for
which x ≺ x̅. We then take y0 =def x̅. By our choice of y0 , we thus have x ≺ y0 .
Let us assume that we have already selected points y0 , . . . , yk for which x ≺
yk ≺ yk−1 ≺ . . . ≺ y0 . Let us construct a point yk+1 for which, first, x ≺ yk+1 ≺
yk and, second, if x ≺ xk+1 , then yk+1 ≺ xk+1 .
If x ⊀ xk+1 , then, due to the properties of the kinematic causality, there exists
a point c for which x ≺ c ≺ yk . We will then take yk+1 = c.
If x ≺ xk+1 , then, due to the properties of the kinematic causality, from
x ≺ xk+1 and x ≺ yk , we can conclude that there exists a point d for which
x ≺ d ≺ xk+1 , yk . We can then take yk+1 = d.
Let us now prove that yn → x, i.e., that for every open neighborhood U of
the point x, there exists an index n0 for which yn ∈ U for all n ≥ n0 . Indeed,
let U be such a neighborhood. Since open intervals form a base, there exists an
open interval (a, b) ⊆ U that contains the point x. By definition of the interval,
x ∈ (a, b) means that a ≺ x and x ≺ b. By definition of the kinematic causality,
there exists a point c for which x ≺ c ≺ b. Thus, the open interval (x, b) is
non-empty. Since the sequence {xn } is everywhere dense, it has a point xn0 in
this interval, for which x ≺ xn0 ≺ b. By the properties of the sequence yi , this
implies that x ≺ yn0 ≺ xn0 ≺ b. Since the sequence {yn } is decreasing, we thus
conclude that x ≺ yn ≺ b for all n ≥ n0 . From a ≺ x ≺ yn , we then deduce that
a ≺ yn . Hence, yn ∈ (a, b) ⊆ U and so, yn ∈ U for all n ≥ n0 . The statement is
proven.
Once a decreasing sequence yi that converges to x is constructed, we can take

fx (b) = Σ_{i=1}^{∞} 2^{−i} · f(x,yi ) (b).
Next, we prove that for every x, there exists a -decreasing function
gx : X → [0, 1]
for which gx (a) > 0 ⇔ a ≺ x. The proof of this second statement is similar to
the proof of the first statement.
Once these two auxiliary statements are proven, we can use the countable
everywhere dense sequence {x1 , x2 , . . . , xn , . . .} to construct the desired
kinematic metric as

τ (a, b) = Σ_{i=1}^{∞} 2^{−i} · min(gxi (a), fxi (b)).
It is reasonably easy to prove that the function thus defined is indeed a kinematic
metric.
a ⊀ a;  ∀a ∃a̲, a̅ (a̲ ≺ a ≺ a̅);  a ≺ b ⇒ ∃c (a ≺ c ≺ b);
xi ≺ xj ⇔ ∃n (xi ≺n xj ).
Constructive meaning: reminder. The main difference between this new defi-
nition and the original definition of the kinematic causality is that the exis-
tential quantifier ∃ (and the disjunction ∨) are understood constructively: as
the existence of an algorithm that provides the corresponding objects; see, e.g.,
[1,2,3,4,9,18,19]. In these terms:
– The formula ∀a ∃a̲, a̅ (a̲ ≺ a ≺ a̅) means that there exists an algorithm that,
given an event a, returns events a̲ and a̅ for which a̲ ≺ a ≺ a̅.
– The formula a ≺ b ⇒ ∃c (a ≺ c ≺ b) means that there exists an algorithm
that, given two events a and b for which a ≺ b, returns an event c for which
a ≺ c ≺ b.
– The formula a ≺ b, c ⇒ ∃d (a ≺ d ≺ b, c) means that there exists an algo-
rithm that, given events a, b, and c for which a ≺ b, c, returns an event d for
which a ≺ d ≺ b, c.
– The formula b, c ≺ a ⇒ ∃d (b, c ≺ d ≺ a) means that there exists an algo-
rithm that, given events a, b, and c for which b, c ≺ a, returns an event d for
which b, c ≺ d ≺ a.
– The formula a ≺ b ⇒ ∀c (a ≺ c ∨ b ⊀ c) means that there exists an algorithm
that, given events a, b, and c for which a ≺ b, returns either a true statement
a ≺ c or a true statement b ⊀ c.
– The formula a ≺ b ⇒ ∃i (a ≺ xi ≺ b) means that there exists an algorithm
that, given events a and b for which a ≺ b, returns a natural number i for
which a ≺ xi ≺ b.
– The formula xi ≺ xj ⇔ ∃n (xi ≺n xj ) means that there exists an algorithm
that, given natural numbers i and j for which xi ≺ xj , returns a natural
number n for which xi ≺n xj .
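As a toy illustration of this constructive reading (our assumption: rational time points with the usual order < standing in for events with ≺), each existential axiom becomes a witness-producing function:

```python
from fractions import Fraction

# Constructive reading of two of the axioms: the existential quantifier is an
# algorithm that actually produces the witness.
def density_witness(a, b):
    """Given a < b, return c with a < c < b (the density axiom)."""
    assert a < b
    return (a + b) / 2

def lower_bound_witness(a, b, c):
    """Given a < b and a < c, return d with a < d < b and a < d < c."""
    assert a < b and a < c
    return (a + min(b, c)) / 2

a, b = Fraction(0), Fraction(1)
w = density_witness(a, b)
assert a < w < b                       # the returned value really is a witness
d = lower_bound_witness(a, b, Fraction(1, 2))
assert a < d < b and d < Fraction(1, 2)
```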
Comment. In strictly constructive terms, we can say that points xi are simply
natural numbers, xi ≺n xj is a ternary relation between natural numbers, and
an arbitrary constructive event a can be described by two constructive sequences
mi and Mi for which xmi ≺ a ≺ xMi , xmi → a, and xMi → a.
In these terms, if an event a is described by sequences mi and Mi and an
event b is described by sequences ni and Ni , then a ≺ b means that there exist
i and j for which xMi ≺ xnj .
Part 1. Let us define ≺-monotonic values γ(p/2q ) for all natural numbers p and
q for which p ≤ 2q . We will define them inductively, first for q = 0, then for
q = 1, etc.
Since 0 ≤ gxi (a) ≤ 1 and 0 ≤ fxi (b) ≤ 1, we have 0 ≤ min(gxi (a), fxi (b)) ≤ 1.
One can easily check that this formula defines a computable function: to compute
it with accuracy 2−p , it is sufficient to compute the sum of the terms i = 1, . . . , p;
the remaining terms are bounded from above by the sum
2−(p+1) + 2−(p+2) + . . . = 2−p .
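The tail estimate translates directly into code (a sketch; `term(i)` is a placeholder for the i-th term min(gxi (a), fxi (b)), about which the proof only uses that it lies in [0, 1]):

```python
# Truncating the series sum_{i>=1} 2^(-i) * term(i), with 0 <= term(i) <= 1,
# after p terms introduces an error of at most
#   2^(-(p+1)) + 2^(-(p+2)) + ... = 2^(-p).
def truncated_sum(term, p):
    return sum(2.0 ** (-i) * term(i) for i in range(1, p + 1))

worst = lambda i: 1.0   # worst case: every term at its upper bound 1
exact = 1.0             # sum_{i>=1} 2^(-i) * 1 = 1
for p in (3, 10, 20):
    assert abs(exact - truncated_sum(worst, p)) <= 2.0 ** (-p)
```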
So, to complete the proof, we need to prove:
– that the function τ (a, b) is in correct relation with the kinematic causality
relation, and
– that this function satisfies the anti-triangle inequality.
Let us first prove that a ≺ b ⇔ τ (a, b) > 0:
– If a ≺ b, then there exists i for which a ≺ xi ≺ b. Thus, by the properties
of the functions fxi and gxi , we have gxi (a) > 0 and fxi (b) > 0 and thus,
min(gxi (a), fxi (b)) > 0. Hence, we have τ (a, b) > 0.
– Vice versa, if τ (a, b) > 0, this means that there exists an i for which
min(gxi (a), fxi (b)) > 0, i.e., for which gxi (a) > 0 and fxi (b) > 0. By the
properties of the functions fxi and gxi , this means that a ≺ xi and xi ≺ b.
By transitivity, we can now conclude that a ≺ b.
To prove the anti-triangle inequality, let us prove that a similar anti-triangle
inequality holds for each of the expressions min(gxi (a), fxi (b)), i.e., that
a≺b≺c
implies that
min(gxi (a), fxi (b)) + min(gxi (b), fxi (c)) ≤ min(gxi (a), fxi (c)).
Once we prove this, the desired anti-triangle inequality can be obtained by simply
multiplying each of these inequalities by 2−i and adding them.
To prove the above inequality, let us take into account that for every real
number x, it is not possible not to have x > 0 ∨ x ≤ 0: ¬¬(x > 0 ∨ x ≤ 0). Thus,
we can consider separately
– situations when min(gxi (a), fxi (b)) > 0 and
– situations when min(gxi (a), fxi (b)) = 0,
and conclude that the double negation of the desired inequality holds. Since for
constructive real numbers, ¬¬(p ≤ q) is constructively equivalent to p ≤ q, we
get the desired inequality.
If min(gxi (a), fxi (b)) > 0, this means that xi ∈ (a, b). Since a ≺ b ≺ c, this
implies that we cannot have xi ∈ (b, c), and hence, that min(gxi (b), fxi (c)) = 0.
Since the function fxi (b) is ≼-monotonic and b ≺ c, we have fxi (b) ≤ fxi (c) and
thus, min(gxi (a), fxi (b)) ≤ min(gxi (a), fxi (c)). Due to min(gxi (b), fxi (c)) = 0,
we have min(gxi (a), fxi (b)) + min(gxi (b), fxi (c)) = min(gxi (a), fxi (b)) and thus,
we get the desired inequality
min(gxi (a), fxi (b)) + min(gxi (b), fxi (c)) ≤ min(gxi (a), fxi (c)).
If min(gxi (b), fxi (c)) > 0, this means that xi ∈ (b, c). Since a ≺ b ≺ c, this
implies that we cannot have xi ∈ (a, b), and hence, that min(gxi (a), fxi (b)) = 0.
Since the function gxi (b) is ≼-decreasing and a ≺ b, we have gxi (b) ≤ gxi (a) and
thus, min(gxi (b), fxi (c)) ≤ min(gxi (a), fxi (c)). Due to min(gxi (a), fxi (b)) = 0,
we have min(gxi (a), fxi (b)) + min(gxi (b), fxi (c)) = min(gxi (b), fxi (c)) and thus,
we get the desired inequality
min(gxi (a), fxi (b)) + min(gxi (b), fxi (c)) ≤ min(gxi (a), fxi (c)).
Finally, if min(gxi (a), fxi (b)) = 0 and min(gxi (b), fxi (c)) = 0, then
min(gxi (a), fxi (b)) + min(gxi (b), fxi (c)) = 0,
and hence, since min(gxi (a), fxi (c)) ≥ 0, we also have the desired anti-triangle
inequality.
6 Additional Results
Similar techniques enable us to prove constructive versions of other results about
space-time models.
Definition 4. By a time coordinate t on a space X with a kinematic causality
relation ≺, we mean a function t : X → IR for which:
• a ≺ b ⇒ t(a) < t(b); and
• a ≼ b ⇒ t(a) ≤ t(b).
Proposition 1. On every constructive space-time X with a constructive kine-
matic causality relation ≺, there exists a constructive time coordinate.
Proof. The desired constructive version of a time coordinate can be designed as
follows:
t(b) =def Σ_{i=1}^{∞} 2^{−i} · fxi (b).
Since fxi (b) ∈ [0, 1], this is constructively defined (computable): to compute
t(b) with accuracy 2−p , it is sufficient to add first p terms in the sum.
Let us prove that this function is indeed the time coordinate. Indeed, since
each of the functions fxi (b) is ≼-monotonic, their convex combination t(b) is also
≼-monotonic.
To prove that the function t(b) is ≺-monotonic, we can use the fact that a ≺ b
implies the existence of a natural number i for which a ≺ xi ≺ b. For this i, we
have fxi (a) = 0 and fxi (b) > 0, hence fxi (a) < fxi (b). For all other j ≠ i, due
to a ≺ b ⇒ a ≼ b and ≼-monotonicity of fxj , we have fxj (a) ≤ fxj (b). Thus, by
adding these inequalities, we get t(a) < t(b).
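A toy instance of this construction (our assumptions: events are real numbers with the usual order, `xs` is a finite stand-in for the dense sequence xi , and `f(x, b) = max(0, min(1, b − x))` plays the role of fxi , monotone and positive exactly when x < b):

```python
# Toy model of the time coordinate t(b) = sum_i 2^(-i) * f_{x_i}(b).
xs = [0.0, 1.0, 0.5, 0.25, 0.75, 0.125, 0.375, 0.625, 0.875]

def f(x, b):
    """Monotone in b; f(x, b) > 0 iff x < b (the role of f_{x_i})."""
    return max(0.0, min(1.0, b - x))

def t(b):
    return sum(2.0 ** (-i) * f(x, b) for i, x in enumerate(xs, start=1))

# a < b forces t(a) < t(b): some x_i lies between a and b and contributes
# strictly more at b, while every other term is at least as large at b.
a, b = 0.3, 0.6
assert t(a) < t(b)
```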
a ≼ b ⇔ t(a) ≤ t(b).
One of the main discoveries that led Einstein to his Special Relativity theory is
the discovery that time is relative: a time coordinate corresponding to a moving
body is different from the time coordinate corresponding to the stationary one.
In general, there are many possible time coordinates t, each of which has the
same property:
a ≼ b ⇒ ∀t (t(a) ≤ t(b)).
For each of these time coordinates t, the mere fact that t(a) ≤ t(b) does not
necessarily mean that a causally precedes b: it may happen that in some other
time coordinate, we have t(a) > t(b). What is true is that if a is not causally
preceding b, then there exists a time coordinate for which t(a) > t(b):
a ⋠ b ⇒ ∃t (t(a) > t(b)).
All possible time coordinates determine the causality relation: constructive case.
Let us show that in the constructive case, under reasonable conditions, we also
have the implication
a ⋠ b ⇒ ∃t (t(a) > t(b)).
For that, we will need to impose an additional physically reasonable requirement.
For every event b, the past cone Pb =def {c : c ≼ b} is a closed set; thus,
classically, its complement −Pb = {c : c ⋠ b} is an open set. The point a belongs
to this set; thus, a whole open neighborhood of a belongs to this set as well. Since
the topology is the Alexandrov topology, with intervals as a base, this means
that there exist values a̲ and a̅ for which a̲ ≺ a ≺ a̅ and the whole interval
(a̲, a̅) belongs to the complement −Pb .
Since the sequence {xi } is everywhere dense in X, there is a point xi in the
interval (a̲, a), i.e., a point xi for which xi ≺ a and xi ⋠ b. By measuring the
event locations with higher and higher accuracy, we will be able to detect this
relation. Thus, it is reasonable to require that the following additional condition
constructively holds:
a ⋠ b ⇒ ∃i (xi ≺ a & xi ⋠ b).
Let us show that under this condition, the above implication holds.
Proof. Indeed, let i0 be an index for which xi0 ≺ a and xi0 ⋠ b. For this i0 , we
thus have fxi0 (a) > 0 and fxi0 (b) = 0. Let us now construct the following time
coordinate:

t(x) = (2/fxi0 (a)) · fxi0 (x) + Σ_{i≠i0} 2^{−i} · fxi (x).
Similar to the above formula, we can check that the function thus defined is
indeed a time coordinate. It is therefore sufficient to show that t(a) > t(b).
Indeed:
– For x = a, the first term in the sum is equal to (2/fxi0 (a)) · fxi0 (a) = 2, so
t(a) ≥ 2.
– For x = b, the first term is equal to 0. Since fxi (x) ≤ 1 for all i, we thus
conclude that

t(b) = Σ_{i≠i0} 2^{−i} · fxi (b) ≤ Σ_{i=1}^{∞} 2^{−i} = 1.
Here, t(a) ≥ 2 and t(b) ≤ 1, and hence indeed t(a) > t(b).
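The rescaling step can be sketched with toy monotone functions (`fs` below is hypothetical; all that matters is that fs[i0](a) > 0 and fs[i0](b) = 0):

```python
# Toy separating time coordinate: boost the i0-th term by 2 / f_{i0}(a) so
# that t(a) >= 2 while t(b) <= sum_i 2^(-i) <= 1.
fs = [lambda x: max(0.0, min(1.0, x - 0.5)),    # positive iff x > 0.5
      lambda x: max(0.0, min(1.0, x - 0.25))]   # positive iff x > 0.25

def separating_t(i0, a):
    scale = 2.0 / fs[i0](a)  # well defined because fs[i0](a) > 0
    def t(x):
        rest = sum(2.0 ** (-(i + 1)) * g(x)
                   for i, g in enumerate(fs) if i != i0)
        return scale * fs[i0](x) + rest
    return t

a, b = 0.8, 0.4              # fs[0](a) = 0.3 > 0, fs[0](b) = 0.0
t = separating_t(0, a)
assert t(a) >= 2.0 and t(b) <= 1.0   # hence t(a) > t(b)
```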
a ⋠ b ⇒ ¬¬∃t (t(a) > t(b)).
One can easily check that this function is computable, and that it is indeed a
metric – i.e., that it is symmetric and satisfies the triangle inequality.
7 Remaining Challenges
Need to take symmetries into account. In this paper, given space-time X with
the kinematic causality relation ≺, we designed a kinematic metric τ that is
consistent with this relation.
In physics, however, causality is not everything. One of the most important
notions of physics is symmetry. If space-time has symmetries – i.e., is invari-
ant with respect to some transformations – it is therefore desirable to find a
kinematic metric τ which is invariant with respect to these symmetries.
In the simplest case of a finite symmetry group G, we can explicitly define
such an invariant constructive kinematic metric as

τinv (a, b) =def Σ_{g∈G} τ (g(a), g(b)).
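A sketch of this averaging (our toy setting: 1+1-dimensional Minkowski events with c = 1 and a two-element group generated by the spatial reflection, which preserves causality):

```python
import math

def tau(a, b):
    """Kinematic metric of 1+1-dimensional Minkowski space-time (c = 1)."""
    dt, dx = b[0] - a[0], b[1] - a[1]
    return math.sqrt(dt * dt - dx * dx) if dt >= abs(dx) else 0.0

def make_invariant(tau, group):
    """Sum a kinematic metric over a finite symmetry group G."""
    return lambda a, b: sum(tau(g(a), g(b)) for g in group)

identity = lambda e: e
reflect = lambda e: (e[0], -e[1])      # spatial reflection x -> -x
tau_inv = make_invariant(tau, [identity, reflect])

a, b = (0.0, 0.2), (1.0, 0.5)
# tau_inv is invariant under every group element by construction:
assert math.isclose(tau_inv(a, b), tau_inv(reflect(a), reflect(b)))
```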
Need for feasible algorithms. In this paper, we have analyzed the existence of
algorithms for computing the kinematic metric. From the practical viewpoint,
it is important to make sure not only that such algorithms exist, but that they
are feasible (i.e., can be computed in polynomial time); see, e.g., [11].
A partial analysis of the feasibility of different computational problems related to
space-time models is given in [17]. It is desirable to extend this analysis to the
problem of computing the kinematic metric.
References
1. Aberth, O.: Introduction to Precise Numerical Methods. Academic Press, San
Diego (2007)
2. Beeson, M.: Foundations of Constructive Mathematics: Metamathematical Studies.
Springer, Heidelberg (1985)
3. Beeson, M.: Some relations between classical and constructive mathematics. Jour-
nal of Symbolic Logic 43, 228–246 (1987)
4. Bishop, E., Bridges, D.S.: Constructive Analysis. Springer, Heidelberg (1985)
5. Bridges, D.S.: The construction of a continuous demand function for uniformly
rotund preferences. J. Math. Economics 21, 217–227 (1992)
6. Bridges, D.S.: The constructive theory of preference orderings on a locally compact
space II. Math. Social Sciences 27, 1–9 (1994)
7. Bridges, D.S.: Constructive methods in mathematical economics. Mathematical
Utility Theory, J. Econ. (Zeitschrift für Nationalökonomie) (suppl. 8), 1–21 (1999)
8. Bridges, D.S., Mehta, G.B.: Representations of Preference Orderings. Lecture Notes
in Economics and Mathematical Systems, vol. 422. Springer, Heidelberg (1996)
9. Bridges, D.S., Vîta, L.: Techniques of Constructive Mathematics. Springer,
New York (2006)
10. Busemann, H.: Timelike Spaces. Warszawa, PWN (1967)
11. Gurevich, Y.: Platonism, constructivism, and computer proofs vs. proofs by hand.
Bulletin of EATCS (European Association for Theoretical Computer Science) 57,
145–166 (1995)
12. Kelley, J.L.: General Topology. Springer, New York (1975)
13. Kronheimer, E.H., Penrose, R.: On the structure of causal spaces. Proc. Cambr.
Phil. Soc. 63, 481–501 (1967)
14. Kreinovich, V.: On the metrization problem for spaces of kinematic type. Soviet
Mathematics Doklady 15, 1486–1490 (1974)
15. Kreinovich, V.: Categories of Space-Time Models. Ph.D. dissertation, Soviet
Academy of Sciences, Siberian Branch, Institute of Mathematics, Novosibirsk
(1979) (in Russian)
16. Kreinovich, V.: Symmetry characterization of Pimenov’s spacetime: a reformula-
tion of causality axioms. International J. Theoretical Physics 35, 341–346 (1996)
17. Kreinovich, V., Kosheleva, O.: Computational complexity of determining which
statements about causality hold in different space-time models. Theoretical Com-
puter Science 405, 50–63 (2008)
18. Kreinovich, V., Lakeyev, A., Rohn, J., Kahl, P.: Computational Complexity and
Feasibility of Data Processing and Interval Computations. Kluwer, Dordrecht
(1998)
19. Kushner, B.A.: Lectures on Constructive Mathematical Analysis. American Math-
ematical Society, Providence (1985)
20. Misner, C.W., Thorne, K.S., Wheeler, J.A.: Gravitation. W.H. Freeman, New York
(1973)
21. Nachbin, L.: Topology and Order. Van Nostrand, Princeton (1965); reprinted by
R. E. Krieger, Huntington (1976)
22. Pimenov, R.I.: Kinematic Spaces: Mathematical Theory of Space-Time. Consultants
Bureau, New York (1970)
Thirteen Definitions of a Stable Model
Vladimir Lifschitz
1 Introduction
Stable models of logic programs have been studied by many researchers, mainly
because of their role in the foundations of answer set programming (ASP) [30,37].
This programming paradigm provides a declarative approach to solving combi-
natorial search problems, and it has found applications in several areas of science
and technology [21]. In ASP, a search problem is reduced to computing stable
models, and programs for generating stable models (“answer set solvers”) are
used to perform search.
This paper is a review of some of the definitions, or characterizations, of
the concept of a stable model that have been proposed in the literature. These
definitions are equivalent to each other when applied to "traditional rules" –
with an atom in the head and a list of atoms, some possibly preceded with the
negation as failure symbol, in the body:

A0 ← A1 , . . . , Am , not Am+1 , . . . , not An .    (1)
But there are reasons why each of them is valuable and interesting. A new char-
acterization of stable models can suggest an alternative picture of the intuitive
meaning of logic programs; or it can lead to new algorithms for generating stable
models.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 488–503, 2010.
© Springer-Verlag Berlin Heidelberg 2010
But there are also other Herbrand interpretations satisfying the program's
completion – for instance,
{p(a, b), p(b, a), p(b, b)}. (4)
Another example of this kind, important in applications, is given by the re-
cursive definition of transitive closure:
q(X, Y ) ← p(X, Y ).
(5)
q(X, Z) ← q(X, Y ), q(Y, Z).
The completion of the union of this program with a definition of p has, in many
cases, unintended models, in which q is weaker than the transitive closure of p
that we want to define.
Should we say then that Herbrand minimal models provide a better seman-
tics for logic programming than program completion? Yes and no. The concept
of completion has a fundamental advantage: it is applicable to programs with
negation. Such a program, viewed as a set of clauses, usually has several minimal
Herbrand models, and some of them may not satisfy the program’s completion.
Such “bad” models reflect neither the intended meaning of the program nor the
behavior of Prolog. For instance, the program

p(a), p(b), q(a),
r(X) ← p(X), not q(X) (6)

has two minimal Herbrand models,

{p(a), p(b), q(a), r(b)} (7)

("good") and
{p(a), p(b), q(a), q(b)} (8)
("bad"). The completion of (6)
In the 1980s, the main challenge in the study of the semantics of logic program-
ming was to invent a semantics that
Such a semantics was proposed in two papers presented at the 1986 Work-
shop on Foundations of Deductive Databases and Logic Programming [1,48].
That approach was not without defects, however. First, it is limited to programs
in which recursion and negation “don’t mix.” Such programs are called strati-
fied. Unfortunately, some useful Prolog programs do not satisfy this condition.
For instance, we can say that a position in a two-person game is winning if
there exists a move from it to a non-winning position (cf. [46]). This rule is not
stratified: it recursively defines winning in terms of non-winning. A really good
semantics should be applicable to rules like this.
Second, the definition of the semantics of stratified programs is somewhat
complicated. It is based on the concept of the iterated least fixpoint of a program,
and to prove the soundness of this definition one needs to show that this fixpoint
doesn’t depend on the choice of a stratification. A really good semantics should
be a little easier to define.
The stable model semantics, as well as the well-founded semantics proposed
in [49,50], can be seen as an attempt to generalize and simplify the iterated
fixpoint semantics of stratified programs.
3 Nonmonotonic Reasoning
Many events in the history of research on stable models can only be understood
if we think of it as part of a broader research effort – the investigation
of nonmonotonic reasoning. Early work in this area was motivated primarily by
the desire to understand and automate the use of defaults by humans. When
commonsense reasoning exploits a default, there is a possibility that taking into
account new information may force us to retract the conclusions that we have
made. The same kind of nonmonotonicity is observed in the behavior of Prolog
programs with negation. For instance, rules (6) warrant the conclusion r(b), but
if the fact q(b) is added to the program then this conclusion will be retracted.
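This retraction can be reproduced with the reduct-based characterization of stable models (the Gelfond–Lifschitz definition, one of the thirteen reviewed in this paper); the ground program below is our stand-in for rules of the form r(X) ← p(X), not q(X) with facts p(a), p(b), q(a), instantiated for the constants a and b, and `stable_models` and `closure` are illustrative helper names:

```python
from itertools import chain, combinations

# Ground rules: (head, positive_body, negative_body).
rules = [
    ("p(a)", [], []), ("p(b)", [], []), ("q(a)", [], []),
    ("r(a)", ["p(a)"], ["q(a)"]),
    ("r(b)", ["p(b)"], ["q(b)"]),
]

def closure(positive_rules):
    """Least Herbrand model of a negation-free ground program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body, _ in positive_rules:
            if head not in model and all(x in model for x in body):
                model.add(head)
                changed = True
    return model

def stable_models(rules):
    atoms = sorted({h for h, _, _ in rules}
                   | {x for _, p, n in rules for x in p + n})
    for cand in chain.from_iterable(combinations(atoms, k)
                                    for k in range(len(atoms) + 1)):
        m = set(cand)
        # Gelfond-Lifschitz reduct: delete rules blocked by m, drop "not".
        reduct = [(h, p, []) for h, p, n in rules if not set(n) & m]
        if closure(reduct) == m:
            yield m

print(list(stable_models(rules)))                       # the unique stable model contains r(b)
print(list(stable_models(rules + [("q(b)", [], [])])))  # adding the fact q(b) retracts r(b)
```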
As to the stable model semantics, three nonmonotonic formalisms are partic-
ularly relevant.
3.1 Circumscription
F : M G1 , . . . , M Gn
———————————————— (9)
H
where F, G1 , . . . , Gn , H are first-order formulas. The letter M, according to Reiter,
is to be read as “it is consistent to assume.” Intuitively, default (9) is similar to the
inference rule allowing us to derive the conclusion H from the premise F , except
that the applicability of this rule is limited by the justifications G1 , . . . , Gn ; deriv-
ing H is allowed only if each of the justifications can be “consistently assumed.”
This informal description of the meaning of a default is circular: to decide
which formulas can be derived using one of the defaults from D we need to
know whether the justifications of that default are consistent with the formulas
that can be derived from W using the inference rules of classical logic and the
defaults from D – including the default that we are trying to understand! But
Reiter was able to turn his intuition about M into a precise semantics. His theory
of defaults tells us under what conditions a set E of sentences is an “extension”
for the default theory with axioms W and defaults D.
In Sect. 4 we will see that one of the earliest incarnations of the stable model
semantics was based on treating rules as defaults in the sense of Reiter.
(if it is consistent to assume that X does not have the property p, conclude that
it doesn’t). On the other hand, Moore observes that “a formula is consistent if
its negation is not believed”; accordingly, Reiter’s M is somewhat similar to the
combination ¬L¬ in autoepistemic logic, and default (9), in the propositional case,
is somewhat similar to the autoepistemic formula
F ∧ ¬L¬G1 ∧ · · · ∧ ¬L¬Gn → H.
However, the task of finding precise and general relationships between these
three formalisms turned out to be difficult. Discussing technical work on that
topic is beyond the scope of this paper.
The autoepistemic theory with these axioms has a unique stable expansion,
and the atoms from that stable expansion form the intended model (7) of the
program.
This epistemic interpretation of logic programs – what we will call Defini-
tion A – is more general than the iterated fixpoint semantics, and it is much
simpler. One other feature of Definition A that makes it attractive is the sim-
plicity of the underlying intuition: negation as failure expresses the absence of
belief.
The “default logic semantics” proposed in [3] is translational as well; it inter-
prets logic programs as default theories. The head A0 of a rule (1) turns into the
conclusion of the default, the conjunction A1 ∧ · · · ∧ Am of the positive members
of the body becomes the premise, and each negative member not Ai turns into
494 V. Lifschitz
the justification M¬Ai (“it is consistent to assume ¬Ai ”). For instance, the last
rule of program (6) corresponds to the default
p(X) : M ¬q(X)
. (12)
r(X)
There is no need for grounding, because defaults are allowed to contain variables.
This difference between the two translations is not essential though, because
Reiter’s semantics of defaults treats a default with variables as the set of its
ground instances. Grounding is simply “hidden” in the semantics of default
logic.
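Grounding, whether performed explicitly or hidden in the semantics of default logic, is a simple mechanical process. A minimal sketch (the tuple representation of rules below is hypothetical, chosen only for illustration):

```python
from itertools import product

def ground(rule, constants):
    """Replace the variables of a rule schema by constants in all possible ways.
    A rule is (head, positive_body, negative_body); an atom is (predicate, args);
    variables are capitalized names, constants are lowercase."""
    head, pos, neg = rule
    atoms = [head] + pos + neg
    variables = sorted({arg for _, args in atoms for arg in args if arg[0].isupper()})
    instances = []
    for values in product(constants, repeat=len(variables)):
        subst = dict(zip(variables, values))
        def apply(atom):
            pred, args = atom
            return (pred, tuple(subst.get(a, a) for a in args))
        instances.append((apply(head), [apply(a) for a in pos], [apply(a) for a in neg]))
    return instances

# the last rule of (6): r(X) <- p(X), not q(X), grounded over constants {a, b}
rule = (('r', ('X',)), [('p', ('X',))], [('q', ('X',))])
for inst in ground(rule, ['a', 'b']):
    print(inst)
```

For the two constants a and b, the rule yields exactly its two ground instances, mirroring how Reiter's semantics treats a default with variables as the set of its ground instances.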
This Definition B of the stable model semantics stresses an analogy between
rules in logic programming and inference rules in logic. Like Definition A, it
has an epistemic flavor, because of the relationship between the “consistency
operator” M in defaults and the autoepistemic “belief operator” L (Sect. 3.4).
The equivalence between these two approaches to semantics of traditional
programs follows from the fact that each of them is equivalent to Definition C
of a stable model reviewed in the next section. This was established in [14] for
the autoepistemic semantics and in [29] for the default logic approach.
A = A1 , A2 , . . . , Ak−1 , Ak = A (k > 1)
p(a, b).
p(a, a) ← p(a, a).
p(a, b) ← p(b, a). (14)
p(b, a) ← p(a, b).
p(b, b) ← p(b, b).
has 3 loops:
{p(a, a)}, {p(b, b)}, {p(a, b), p(b, a)}. (15)
Program (11) has no loops.
According to [17], if we require in condition (i) above that M satisfy the
completion of the program, rather than the program itself, then it will be pos-
sible to relax condition (ii) and require only that no loop contained in M be
unfounded; there will be no need then to refer to arbitrary non-empty subsets
in that condition.
In [27] loops are used in a different way. That paper associates with every
loop L of Π a certain propositional formula, called the loop formula for L.
According to Definition E, M is a stable model of Π iff M satisfies the completion
of Π conjoined with the loop formulas for all loops of Π. For instance, the loop
formulas for the loops (15) of program (14) are
p(a, a) → false,
p(b, b) → false,
(p(a, b) ∨ p(b, a)) → true.
The first two loop formulas eliminate all nonminimal models of the completion
of (14).
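This can be checked by brute force. The sketch below enumerates all interpretations of the four ground atoms and keeps those satisfying the completion of (14) together with the loop formulas; the encoding of the completion as Python equivalences is our own transcription:

```python
from itertools import product

atoms = ['p(a,a)', 'p(a,b)', 'p(b,a)', 'p(b,b)']

def satisfies_completion(v):
    # Clark completion of (14): each atom is equivalent to the disjunction of
    # the bodies of its rules (tautological conjuncts are kept for legibility)
    return (v['p(a,a)'] == v['p(a,a)'] and
            v['p(a,b)'] == (True or v['p(b,a)']) and   # fact p(a,b) plus rule body p(b,a)
            v['p(b,a)'] == v['p(a,b)'] and
            v['p(b,b)'] == v['p(b,b)'])

def satisfies_loop_formulas(v):
    # loop formulas for the loops (15); the third one is trivially true
    return (not v['p(a,a)']) and (not v['p(b,b)'])

models = [v for v in (dict(zip(atoms, bits))
                      for bits in product([False, True], repeat=len(atoms)))
          if satisfies_completion(v) and satisfies_loop_formulas(v)]
print([sorted(a for a in atoms if v[a]) for v in models])
# [['p(a,b)', 'p(b,a)']]
```

The completion alone has four models (p(a,a) and p(b,b) are unconstrained); the first two loop formulas cut these down to the single stable model {p(a,b), p(b,a)}.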
1 That is, M satisfies the propositional formulas (10) corresponding to the rules of Π.
2 To be precise, unfoundedness is defined with respect to a partial interpretation, not
a set of atoms. But we are only interested here in the special case when the partial
interpretation is complete, and assume that complete interpretations are represented
by sets of atoms in the usual way.
The invention of loop formulas has led to the creation of systems for gen-
erating stable models that use SAT solvers for search (“SAT-based answer set
programming”). Several systems of this kind performed well in a recent
competition [5].
A0 = A, A1 , . . . , Am ∈ M, Am+1 , . . . , An ∉ M. (17)
A stronger form of this condition, given in [6], characterizes the class of stable
models. According to his Definition G, an Herbrand model M of a grounded
program is stable if there exists a well-ordering ≤ of M such that every element A
of M is supported relative to this well-ordering, in the sense that the program
contains a rule (1) satisfying conditions (17) and
A1 , . . . , Am < A0 .
For instance, the stable model (3) of program (14) is supported relative to the
order p(a, b) < p(b, a). On the other hand, model (4) is supported, but it is easy
to see that it is not supported relative to any ordering of its elements.3
When M is finite, well-orderings of M can be described in terms of functions
from atoms to integers, and supportedness relative to an order relation can be
described by a formula of difference logic – the extension of propositional logic
that includes variables for integers and atomic formulas of the form x − y ≥ c.
This observation suggests the possibility of using solvers for difference logic to
generate stable models [38].
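Since models (3) and (4) are not reproduced in this excerpt, the sketch below applies Definition G to program (14) directly, searching for a level assignment (a finite stand-in for a well-ordering) for two candidate models; the candidate with the extra atom p(a,a) is supported, but, as the search confirms, not relative to any ordering:

```python
from itertools import permutations

# ground rules of program (14): (head, positive body); there are no negative bodies
rules = [('p(a,b)', []),
         ('p(a,a)', ['p(a,a)']),
         ('p(a,b)', ['p(b,a)']),
         ('p(b,a)', ['p(a,b)']),
         ('p(b,b)', ['p(b,b)'])]

def supported_by_ordering(model):
    """Search for an ordering of the model under which every atom is supported
    by a rule whose body atoms lie in the model and get strictly smaller levels."""
    model = list(model)
    for order in permutations(model):
        level = {atom: i for i, atom in enumerate(order)}
        if all(any(head == atom and set(body) <= set(model)
                   and all(level[b] < level[atom] for b in body)
                   for head, body in rules)
               for atom in model):
            return order
    return None

print(supported_by_ordering({'p(a,b)', 'p(b,a)'}))              # an ordering exists
print(supported_by_ordering({'p(a,b)', 'p(b,a)', 'p(a,a)'}))    # None
```

The second candidate fails because the only rule with head p(a,a) would require p(a,a) to get a level strictly smaller than its own.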
Rules in line 1 tell us that p(a) can be established in any number of steps that
is greater than 0; similarly for p(b) and q(a). According to line 2, r(X) can
be established in N + 1 steps if p(X) can be established in N steps and q(X)
cannot be established at all (note that an occurrence of a predicate does not get
an additional numeric argument if it is negated). Finally, an atom holds if it can
be established by some number N of rule applications.
Definition I [25] treats a rule in a logic program as an abbreviated description
of the effect of an action – the action of “applying” that rule – in the situation
calculus.4 For instance, if the action corresponding to the last rule of (6) is de-
noted by lastrule(X) then that rule can be viewed as shorthand for the situation
calculus formula
p(X, S) ∧ ¬∃S(q(X, S)) → r(X, do(lastrule(X), S))
(if p(X) holds in situation S and q(X) does not hold in any situation then r(X)
holds after executing action lastrule(X) in situation S).
In this approach to stable models, the situation calculus function do plays the
same role as adding 1 to N in Wallace’s theory. Instead of program completion,
Lin and Reiter use the process of turning effect axioms into successor state
axioms, which is standard in applications of the situation calculus.
Unlike the reduct (13), this modified reduct contains negation as failure in the
last rule. Generally, unlike the reduct in the sense of Sect. 5, the modified reduct
of a program has several minimal models.
According to Definition K, M is a stable model of Π if M is a minimal model
of the modified reduct of Π relative to M .
In [9] the definition of the reduct is modified in a different way. The reduct
of a program Π in the sense of Ferraris is obtained from the formulas (10)
corresponding to the grounded rules of Π by replacing, in each such formula,
every maximal subformula that is not satisfied by M with “false”. For instance,
the formulas corresponding to the grounded rules (11) of (6) are the formulas
CIRC[F ] at all. If F has “one level of implications” and no negations (as, for
instance, when F corresponds to a set of traditional rules without negation, such
as (2) and (5)), SM[F ] is equivalent to CIRC[F ]. But SM becomes essentially
different from CIRC as soon as we allow negation in the bodies of rules.
The difference between SM[F ] and the formulas used in Definition F is that
the former does not involve auxiliary predicates and consequently does not re-
quire additional conjunctive terms relating auxiliary predicates to the predicates
occurring in the program.
Definition M combines the main attractive feature of Definitions F, H, and I –
no need for grounding – with the main attractive feature of Definitions J and L –
applicability to formulas of an arbitrarily complex logical form. This fact makes
it possible to give a semantics for an ASP language with choice rules and the
count aggregate without any references to grounding [18].
Among the other definitions of a stable model discussed in this paper, Defini-
tion J, based on equilibrium logic, is the closest relative of Definition M. Indeed,
in [41] the semantics of equilibrium logic is expressed by quantified Boolean for-
mulas, and we can say that Definition M eliminated the need to ground the
program using the fact that the approach of that paper can be easily extended
from propositional formulas to first-order formulas.
A characterization of stable models that involves grounding but is otherwise
similar to Definition M is given in [28]. It has emerged from research on the
nonmonotonic logic of knowledge and justified assumptions [26].
13 Conclusion
computational ideas coming from research on the design of SAT solvers and to
give rise to a new knowledge representation paradigm, answer set programming.
Acknowledgements
I am grateful to the editors for the invitation to contribute this paper to a volume
in honor of my old friend and esteemed colleague Yuri Gurevich.
Many thanks to Michael Gelfond, Tomi Janhunen, Joohyung Lee, Nicola
Leone, Yuliya Lierler, Fangzhen Lin, Victor Marek, and Mirek Truszczyński for
their comments. This work was partially supported by the National Science
Foundation under Grant IIS-0712113.
References
1. Apt, K., Blair, H., Walker, A.: Towards a theory of declarative knowledge. In:
Minker, J. (ed.) Foundations of Deductive Databases and Logic Programming, pp.
89–148. Morgan Kaufmann, San Mateo (1988)
2. Balduccini, M., Gelfond, M.: Logic programs with consistency-restoring rules.5
In: Working Notes of the AAAI Spring Symposium on Logical Formalizations of
Commonsense Reasoning (2003)
3. Bidoit, N., Froidevaux, C.: Minimalism subsumes default logic and circumscription
in stratified logic programming. In: Proceedings LICS 1987, pp. 89–97 (1987)
4. Clark, K.: Negation as failure. In: Gallaire, H., Minker, J. (eds.) Logic and Data
Bases, pp. 293–322. Plenum Press, New York (1978)
5. Denecker, M., Vennekens, J., Bond, S., Gebser, M., Truszczynski, M.: The second
answer set programming system competition.6 In: Erdem, E., Lin, F., Schaub, T.
(eds.) LPNMR 2009. LNCS, vol. 5753, pp. 637–654. Springer, Heidelberg (2009)
6. Elkan, C.: A rational reconstruction of nonmonotonic truth maintenance systems.
Artificial Intelligence 43, 219–234 (1990)
7. Faber, W., Leone, N., Pfeifer, G.: Recursive aggregates in disjunctive logic pro-
grams: Semantics and complexity. In: Alferes, J.J., Leite, J. (eds.) JELIA 2004.
LNCS (LNAI), vol. 3229, pp. 200–212. Springer, Heidelberg (2004)
8. Fages, F.: A fixpoint semantics for general logic programs compared with the
well-supported and stable model semantics. New Generation Computing 9, 425–443
(1991)
9. Ferraris, P.: Answer sets for propositional theories. In: Baral, C., Greco, G., Leone,
N., Terracina, G. (eds.) LPNMR 2005. LNCS (LNAI), vol. 3662, pp. 119–131.
Springer, Heidelberg (2005)
10. Ferraris, P., Lee, J., Lifschitz, V.: A new perspective on stable models. In: Pro-
ceedings of International Joint Conference on Artificial Intelligence (IJCAI), pp.
372–379 (2007)
11. Ferraris, P., Lifschitz, V.: Weight constraints as nested expressions. Theory and
Practice of Logic Programming 5, 45–74 (2005)
12. Fine, K.: The justification of negation as failure. In: Proceedings of the Eighth
International Congress of Logic, Methodology and Philosophy of Science, pp. 263–
301. North Holland, Amsterdam (1989)
5 http://www.krlab.cs.ttu.edu/papers/download/bg03.pdf
6 http://www.cs.kuleuven.be/~dtai/events/asp-competition/paper.pdf
13. Gelfond, M.: On stratified autoepistemic theories. In: Proceedings of National Con-
ference on Artificial Intelligence (AAAI), pp. 207–211 (1987)
14. Gelfond, M., Lifschitz, V.: The stable model semantics for logic programming. In:
Kowalski, R., Bowen, K. (eds.) Proceedings of International Logic Programming
Conference and Symposium, pp. 1070–1080. MIT Press, Cambridge (1988)
15. Gelfond, M., Lifschitz, V.: Logic programs with classical negation. In: Warren, D.,
Szeredi, P. (eds.) Proceedings of International Conference on Logic Programming
(ICLP), pp. 579–597 (1990)
16. Heyting, A.: Die formalen Regeln der intuitionistischen Logik. Sitzungsberichte
der Preussischen Akademie der Wissenschaften, Physikalisch-mathematische Klasse,
pp. 42–56 (1930)
17. Lee, J.: A model-theoretic counterpart of loop formulas. In: Proceedings of Interna-
tional Joint Conference on Artificial Intelligence (IJCAI), pp. 503–508, Professional
Book Center (2005)
18. Lee, J., Lifschitz, V., Palla, R.: A reductive semantics for counting and choice in
answer set programming. In: Proceedings of the AAAI Conference on Artificial
Intelligence (AAAI), pp. 472–479 (2008)
19. Lifschitz, V.: On the declarative semantics of logic programs with negation. In:
Minker, J. (ed.) Foundations of Deductive Databases and Logic Programming, pp.
177–192. Morgan Kaufmann, San Mateo (1988)
20. Lifschitz, V.: Twelve definitions of a stable model. In: Garcia de la Banda, M.,
Pontelli, E. (eds.) ICLP 2008. LNCS, vol. 5366, pp. 37–51. Springer, Heidelberg
(2008)
21. Lifschitz, V.: What is answer set programming? In: Proceedings of the AAAI Con-
ference on Artificial Intelligence, pp. 1594–1597. MIT Press, Cambridge (2008)
22. Lifschitz, V., Pearce, D., Valverde, A.: Strongly equivalent logic programs. ACM
Transactions on Computational Logic 2, 526–541 (2001)
23. Lifschitz, V., Tang, L.R., Turner, H.: Nested expressions in logic programs. Annals
of Mathematics and Artificial Intelligence 25, 369–389 (1999)
24. Lin, F.: A Study of Nonmonotonic Reasoning. PhD thesis, Stanford University
(1991)
25. Lin, F., Reiter, R.: Rules as actions: A situation calculus semantics for logic pro-
grams. Journal of Logic Programming 31, 299–330 (1997)
26. Lin, F., Zhao, Y.: ASSAT: Computing answer sets of a logic program by SAT
solvers. In: Proceedings of National Conference on Artificial Intelligence (AAAI),
pp. 112–117. MIT Press, Cambridge (2002)
27. Lin, F., Zhao, Y.: ASSAT: Computing answer sets of a logic program by SAT
solvers. Artificial Intelligence 157, 115–137 (2004)
28. Lin, F., Zhou, Y.: From answer set logic programming to circumscription via logic
of GK. In: Proceedings of International Joint Conference on Artificial Intelligence
(IJCAI) (2007)
29. Marek, V., Truszczyński, M.: Stable semantics for logic programs and default the-
ories. In: Proceedings North American Conf. on Logic Programming, pp. 243–256
(1989)
30. Marek, V., Truszczyński, M.: Stable models and an alternative logic programming
paradigm. In: The Logic Programming Paradigm: a 25-Year Perspective, pp. 375–
398. Springer, Heidelberg (1999)
31. McCarthy, J.: Circumscription–a form of non-monotonic reasoning. Artificial In-
telligence 13, 27–39, 171–172 (1980)
32. McCarthy, J.: Applications of circumscription to formalizing common sense knowl-
edge. Artificial Intelligence 26(3), 89–116 (1986)
Thirteen Definitions of a Stable Model 503
33. McCarthy, J., Hayes, P.: Some philosophical problems from the standpoint of ar-
tificial intelligence. In: Meltzer, B., Michie, D. (eds.) Machine Intelligence, vol. 4,
pp. 463–502. Edinburgh University Press, Edinburgh (1969)
34. McDermott, D.: Nonmonotonic logic II: Nonmonotonic modal theories. Journal of
ACM 29(1), 33–57 (1982)
35. McDermott, D., Doyle, J.: Nonmonotonic logic I. Artificial Intelligence 13, 41–72
(1980)
36. Moore, R.: Semantical considerations on nonmonotonic logic. Artificial Intelli-
gence 25(1), 75–94 (1985)
37. Niemelä, I.: Logic programs with stable model semantics as a constraint program-
ming paradigm. Annals of Mathematics and Artificial Intelligence 25, 241–273
(1999)
38. Niemelä, I.: Stable models and difference logic. Annals of Mathematics and Artifi-
cial Intelligence 53, 313–329 (2008)
39. Niemelä, I., Simons, P.: Extending the Smodels system with cardinality and weight
constraints. In: Minker, J. (ed.) Logic-Based Artificial Intelligence, pp. 491–521.
Kluwer, Dordrecht (2000)
40. Pearce, D.: A new logical characterization of stable models and answer sets. In:
Dix, J., Przymusinski, T.C., Moniz Pereira, L. (eds.) NMELP 1996. LNCS (LNAI),
vol. 1216, pp. 57–70. Springer, Heidelberg (1997)
41. Pearce, D., Tompits, H., Woltran, S.: Encodings for equilibrium logic and logic
programs with nested expressions. In: Brazdil, P.B., Jorge, A.M. (eds.) EPIA 2001.
LNCS (LNAI), vol. 2258, pp. 306–320. Springer, Heidelberg (2001)
42. Reiter, R.: A logic for default reasoning. Artificial Intelligence 13, 81–132 (1980)
43. Reiter, R.: Circumscription implies predicate completion (sometimes). In: Pro-
ceedings of International Joint Conference on Artificial Intelligence (IJCAI), pp.
418–420 (1982)
44. Reiter, R.: Knowledge in Action: Logical Foundations for Specifying and Imple-
menting Dynamical Systems. MIT Press, Cambridge (2001)
45. Saccá, D., Zaniolo, C.: Stable models and non-determinism in logic programs with
negation. In: Proceedings of ACM Symposium on Principles of Database Systems
(PODS), pp. 205–217 (1990)
46. van Emden, M., Clark, K.: The logic of two-person games. In: Micro-PROLOG:
Programming in Logic, pp. 320–340. Prentice-Hall, Englewood Cliffs (1984)
47. van Emden, M., Kowalski, R.: The semantics of predicate logic as a programming
language. Journal of ACM 23(4), 733–742 (1976)
48. Van Gelder, A.: Negation as failure using tight derivations for general logic pro-
grams. In: Minker, J. (ed.) Foundations of Deductive Databases and Logic Pro-
gramming, pp. 149–176. Morgan Kaufmann, San Mateo (1988)
49. Van Gelder, A., Ross, K., Schlipf, J.: Unfounded sets and well-founded semantics for
general logic programs. In: Proceedings of the Seventh ACM SIGACT-SIGMOD-
SIGART Symposium on Principles of Database Systems, Austin, Texas, March
21-23, pp. 221–230. ACM Press, New York (1988)
50. Van Gelder, A., Ross, K., Schlipf, J.: The well-founded semantics for general logic
programs. Journal of ACM 38(3), 620–650 (1991)
51. Wallace, M.: Tight, consistent and computable completions for unrestricted logic
programs. Journal of Logic Programming 15, 243–273 (1993)
DKAL and Z3: A Logic Embedding Experiment
1 Introduction
DKAL, the Distributed Knowledge Authorization Language, has been developed
as a foundation for logic-based authorization mechanisms. The formulation of
DKAL used in this paper emerged through a sequence of adaptations of logic
for authorization, and from the realization that a combination of modal and
intuitionistic propositional logic provides an exceptional match for integrating
knowledge in a distributed system.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 504–528, 2010.
© Springer-Verlag Berlin Heidelberg 2010
2 An Introduction to DKAL
Nowadays, the most widespread method for dealing with security policies in
systems is the access control list (ACL). This method basically consists of a list of
permissions attached to each of the objects that may be accessed. ACLs allow
expressing detailed policies at a very granular level, but they have their limita-
tions when it comes to expressing access policies that depend on conditions and
that combine the policies of potentially multiple parties that don't necessarily
trust each other on anything other than a few objects. An approach to
authorization based on logic, on the other hand, offers the prospect of expressing
policies with expressive conditions and of properly combining the intent of
multiple parties. Logic-based approaches also allow policies to be the object of
formal verification, and
to thoroughly analyze properties such as security leaks.
In this context, the Distributed Knowledge Authorization Language (DKAL) and
its affiliated logic were proposed in [13], and DKAL was later given a foundation
based on a combination of modal and intuitionistic logic in [15].
At the core, this logic deals with pieces of information called infons, e.g. John
has the right to access the network Corporate. Infons are not required to be
true or false, but instead infons can be known or not by relevant principals. For
example, we can ask whether the administrator of the network Corporate knows
that John has the right to access the network. In this way, each principal has
some knowledge (that is, infons that are known to him). DKAL also provides
a way in which principals can share the information they have. For example, if
Admin knows that John has read access to Corporate, he can communicate this
to Patrick and (if Patrick accepts that communication), then Patrick will know
that Admin said that John has read access to Corporate.
506 S. Mera and N. Bjørner
One may study the logic of infons, and some natural operators arise. You
know a ∧ b if you know a and you know b, and the implication connective is also
natural: if you know a and a → b, then you know b. In addition to conjunction
and implication, infon logic has two unary connectives p said and p implied,
for any principal p. Both connectives model the knowledge that is passed from
one principal to another by means of communication assertions. Let us see the
intuitive difference between them. A principal p may
just say a to q, and then, if the communication is successful, q learns the infon
p said a. But p may condition his claim on q knowing b, so then q learns the
infon b → (p implied a). So implied is a weaker modality, and it is useful for
avoiding undesired delegation and probing attacks.
Infons are also partially ordered in terms of information, and the intuitive
idea is that a ≤ b when the information in a is contained in that of b. In this
sense, infons can be studied from the point of view of algebra. For example, a + b
is the least upper bound of a and b, and a → b = min{z : a + z ≥ b}.
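A concrete, if simplistic, model of this algebra takes an infon to be a set of primitive facts, with ≤ as set inclusion; this toy model is our own illustration (with made-up facts), not the semantics of [15]:

```python
from itertools import combinations

# toy model: an infon is a set of primitive facts; a <= b means the
# information in a is contained in that of b (set inclusion)
universe = ['john_can_read', 'john_can_write', 'admin_said_so']

def subsets(s):
    return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

a = {'john_can_read'}
b = {'john_can_read', 'john_can_write'}

lub = a | b     # a + b: the least upper bound under the information order
imp = b - a     # candidate for a -> b = min { z : a + z >= b }

# check by brute force that imp really is the least z with a + z >= b
witnesses = [z for z in subsets(universe) if a | z >= b]
assert imp in witnesses and all(imp <= z for z in witnesses)
print(sorted(lub), sorted(imp))
```

In this powerset model the union plays the role of +, and set difference realizes the minimum in the definition of →.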
There is also interest in studying the logic of infons itself. In this direction,
two fragments were identified: the full infon logic, and the primal infon logic.
The latter is weaker, but has nice computational properties and is still expressive
enough for modeling security issues. We will say infon logic to refer to either
of the two logics.
Knowledge assertions have the form:
A knows x
where A is a principal and x is an infon term. The intended meaning is that A
has the knowledge x. The way a principal communicates knowledge is through
communication assertions. They have the form:
A to q : [x ← y] ⇐ z
where A is a principal, q is either a principal or a variable (that ranges over
principals), and x, y and z are infon terms. We use the following shorthands
[x ← y] instead of [x ← y] ⇐ true, [x] ⇐ z instead of [x ← true] ⇐ z, and [x]
instead of [x ← true] ⇐ true. We are going to give the precise semantics later, but
intuitively, with this assertion the principal A communicates to q the knowledge
x, with y as a proviso. This communication takes place only if A already knows
z. Filter assertions are used to receive communications. They have the form:
B from p : [x ← y] ⇐ z
DKAL and Z3: A Logic Embedding Experiment 507
Proviso-present scenario:

B to p : [t1 ← t2 ] ⇐ t3
B knows t3 η, pη = A
A from q : [s1 ← s2 ] ⇐ s3
A knows s3 θ, qθ = B
s1 θ = t1 ηθ
s2 θ = t2 ηθ
------------------------------- (Com1)
A knows (s2 θ → B implied s1 θ)

Proviso-free scenario:

B to p : [t1 ] ⇐ t3
B knows t3 η, pη = A
A from q : [s1 ] ⇐ s3
A knows s3 θ, qθ = B
s1 θ = t1 ηθ
------------------------------- (Com2)
A knows B said s1 θ
where η and θ are appropriate substitutions. The knowledge that a principal gets
from his knowledge assertions and the communications from other principals
gives rise to more knowledge. Here is where the infon logic plays its part.
A knows Γ      Γ ⊢ x
-------------------- (Ensue)
A knows x
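For the ground, variable-free case the interplay of (Com2) and (Ensue) can be sketched in a few lines; the principals, infons, and the tuple encodings of said and tdonS below are made up for illustration:

```python
# a ground (variable-free) sketch of (Com2) followed by a single (Ensue) step
knows = {
    'Admin':   {'john_has_read_access'},
    'Patrick': {('tdonS', 'Admin', 'john_has_read_access')},   # trust assertion
}
# (sender, receiver, infon x, proviso z) for "sender to receiver : [x] <= z"
communications = [('Admin', 'Patrick', 'john_has_read_access', 'john_has_read_access')]
# (receiver, sender, infon x) for "receiver from sender : [x]"
filters = [('Patrick', 'Admin', 'john_has_read_access')]

# (Com2): if the sender knows the proviso and the receiver has a matching
# filter, the receiver learns "sender said x"
for sender, receiver, x, z in communications:
    if z in knows[sender] and (receiver, sender, x) in filters:
        knows[receiver].add(('said', sender, x))

# (Ensue): close each principal's knowledge under the trust rule
# p tdonS x, i.e. (p said x) -> x
for principal, items in knows.items():
    for item in list(items):
        if isinstance(item, tuple) and item[0] == 'said':
            _, p, x = item
            if ('tdonS', p, x) in items:
                items.add(x)

print('john_has_read_access' in knows['Patrick'])   # True
```

Patrick first learns Admin said john_has_read_access via (Com2), and then, because he trusts Admin on that infon, learns the infon itself via (Ensue).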
The Infon Logic. We now introduce the syntax and semantics of the infon
logic. The syntax is the same for both the full and the primal fragment. Given
an infinite set of principals, and a vocabulary of functions names, the set of
formulas is defined as:
ϕ ::= true | A(t1 , . . . , tk ) | ϕ1 + ϕ2 | ϕ1 → ϕ2 | p said ϕ | p implied ϕ (1)
where true and A(t1 , . . . , tk ) are primitive formulas, A is a function name,
t1 , . . . , tk are terms, ϕ, ϕ1 and ϕ2 range over formulas and p ranges over princi-
pals. For the purpose of this article the structure of primitive formulas is of no
real importance.
For practical reasons, it is useful to have some fixed built-in functions, such as
the function asInfon, which acts as a casting operator converting Boolean values
into infons; asInfon(true) is the uninformative infon true. We will also
introduce the shortcuts tdonS and tdonI used in DKAL for expressing that
principals are trusted on saying or implying certain infons:
p tdonS x abbreviates (p said x) → x,
p tdonI x abbreviates (p implied x) → x.
The infon logic is basically an extension of the intuitionistic propositional
system NJp [18], and was introduced in [15]. Here we present the sequent calculus
for the full infon logic.
Γ ⊢ y
--------- (PI)
Γ, x ⊢ y

Γ ⊢ x + y            Γ ⊢ x + y            Γ ⊢ x    Γ ⊢ y
--------- (+E)       --------- (+E)       -------------- (+I)
Γ ⊢ x                Γ ⊢ y                Γ ⊢ x + y

Γ ⊢ x    Γ ⊢ x → y            Γ, x ⊢ y
------------------ (→E)       --------- (→I)
Γ ⊢ y                         Γ ⊢ x → y

Γ ⊢ y                         Γ ⊢ y
--------------------- (S)     ----------------------- (I)
q said Γ ⊢ q said y           q told Γ ⊢ q implied y

Γ ⊢ y                         Γ ⊢ x    Γ, x ⊢ y
--------- (→IW)               ----------------- (Trans)
Γ ⊢ x → y                     Γ ⊢ y
K1. Iq ⊆ Sq .
K2. If u ≤ w and wIq v, then uIq v, and the same for Sq .
Let C(Γ ) = ⋂x∈Γ C(x). A Kripke structure M models a sequent Γ ⊢ x iff
C(Γ ) ⊆ C(x) in M . A sequent s is valid if every structure models s.
The definition of the satisfaction relation for primal infon logic is similar
to the above, except that K4 is replaced with the following weaker requirement:
Theorem 2 ([15]). The validity problem for the full infon logic is polynomial-
space complete.
For the primal infon logic, the complexity bound can be improved when the
maximum depth of a formula is fixed:
– δ(x) = 0 if x is a variable.
– δ(p told x) = 1 + δ(x).
– δ(x + y) = max{δ(x), δ(y)}.
– δ(x → y) = δ(y).
Theorem 3 ([15]). For every natural number d, there is a linear time algorithm
for the multiple derivability problem for primal infon logic restricted to formulas
x, with δ(x) ≤ d. Given hypothesis Γ and queries Q, the algorithm determines
which of the queries in Q follow from the hypothesis Γ .
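A naive fixpoint already conveys the flavor of derivability in the propositional, quotation-free part of primal infon logic. The sketch below implements the rules (+I), (+E), (→E) and (→IW) over the subformulas of the input; it makes no attempt at the linear-time algorithm of [15] or at the said/implied rules, and the tuple encoding of formulas is made up:

```python
# formulas: ('atom', name), ('and', x, y), ('imp', x, y)
def subformulas(f):
    yield f
    if f[0] in ('and', 'imp'):
        yield from subformulas(f[1])
        yield from subformulas(f[2])

def derivable(hypotheses, queries):
    """Bottom-up closure of the hypotheses under the primal rules, restricted
    to subformulas of the input; returns one Boolean per query."""
    relevant = {s for f in list(hypotheses) + list(queries) for s in subformulas(f)}
    known = set(hypotheses)
    changed = True
    while changed:
        changed = False
        for f in relevant:
            if f in known:
                continue
            if f[0] == 'and' and f[1] in known and f[2] in known:       # (+I)
                known.add(f); changed = True
            if f[0] == 'imp' and f[2] in known:                         # (->IW)
                known.add(f); changed = True
        for f in list(known):
            if f[0] == 'and':                                           # (+E)
                for g in (f[1], f[2]):
                    if g not in known:
                        known.add(g); changed = True
            if f[0] == 'imp' and f[1] in known and f[2] not in known:   # (->E)
                known.add(f[2]); changed = True
    return [q in known for q in queries]

x, y, z = ('atom', 'x'), ('atom', 'y'), ('atom', 'z')
gamma = [('and', x, y), ('imp', x, z)]
print(derivable(gamma, [z, ('imp', y, z), ('atom', 'w')]))  # [True, True, False]
```

From x + y and x → z the closure derives z, and then the weak rule (→IW) yields y → z as well, while the unrelated atom w remains underivable.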
3 DKAL@Starbucks
We will now develop an example using DKAL. The example has been processed
mechanically using our Z3 based prototype described in Section 5. We present
the example using the syntax used by the front-end to the prototype. The ex-
ample is extracted from a document that was not specifically aimed at studying
Login Authorization. The entities involved in this part of the process are:
Patrick, the user
OSLogon, the module in charge of the login process
AuthProvider, the module that provides the authentication protocol
LoginGUI, the graphical interface for the login process
KeyboardAPI, the interface that listens to the keyboard events
The primitive functions all take a principal p as argument. They are:
p hasEnteredUsername, p hasValidCredentials, p couldLogin,
p hasEnteredPassword, p hasCredentialsCached, p isLoggedIn
We will also use the substrate function p passesSecurityChecks, which
returns a truth value.
Patrick begins by typing his user-name and password. This action is informed
by the keyboard API to the login graphical interface:
KeyboardAPI to LoginGUI: [Patrick hasEnteredUsername]
KeyboardAPI to LoginGUI: [Patrick hasEnteredPassword]
Recall that [x] is an abbreviation for [x ← true] ⇐ true.
The graphical interface trusts any information, encoded using a variable X,
provided by the keyboard API.
LoginGUI knows KeyboardAPI tdonS X
and the graphical interface also accepts communications Y from any participant
X:
LoginGUI from X: [Y ]
It therefore follows that the graphical interface learns that Patrick has entered
his user-name and password:
? LoginGUI knows Patrick hasEnteredUsername
? LoginGUI knows Patrick hasEnteredPassword
The graphical interface caches someone's credentials when it knows that this
person has entered his user-name and password. When that happens, it informs
the authorization provider. This is encoded using the rule:
LoginGUI to AuthProvider: [X hasCredentialsCached] <=
X hasEnteredUsername + X hasEnteredPassword
and the authorization provider accepts all communications from the graphical
interface, as the following rule encodes:
AuthProvider from LoginGUI: [X]
Therefore, we can deduce that the provider learns that the interface has cached
Patrick’s credentials:
(A1) ? AuthProvider knows LoginGUI said
Patrick hasCredentialsCached
On the other hand, the authorization provider allows someone to log in when the
graphical interface has cached his credentials and those credentials are valid. If
that is the case, it informs the logon module:
AuthProvider to OSLogon: [X couldLogin] <=
(LoginGUI said X hasCredentialsCached) +
X hasValidCredentials
The authorization provider checks Patrick’s credentials with its internal
database, and concludes that they are valid, so it learns:
(A2) ? AuthProvider knows Patrick hasValidCredentials
Given (A1) and (A2), the authorization provider communicates to the logon
module that Patrick can be logged in. Since the logon module accepts commu-
nications from the authorization provider regarding a successful login:
OSLogon from AuthProvider: [X couldLogin]
the logon module learns that:
(A3) ? OSLogon knows AuthProvider said Patrick couldLogin
The logon module will allow a user X to log in when the authorization provider
says so, and when it can verify that X passes all the security checks:
OSLogon to X: [X isLoggedIn] <= AuthProvider said
X couldLogin + asInfon(X passesSecurityChecks)
Given (A3) and the fact that Patrick effectively passes all the security checks,
the logon module informs Patrick that he is logged in. Patrick accepts any
communication from the logon module:
Patrick from OSLogon:[X]
So Patrick learns:
? Patrick knows OSLogon said Patrick isLoggedIn
Patrick trusts the logon module with respect to login issues, so we assert the
following:
Patrick knows OSLogon tdonS Patrick isLoggedIn
Consequently Patrick learns that he is logged in:
? Patrick knows Patrick isLoggedIn
so it learns:
? Windows knows IE shouldShowLock
4.2 Quantifiers in Z3
There are several motivations for integrating strong quantifier support in the
context of SMT solvers. The main motivation stems from program verification
applications. In this context, quantified formulas that come from program ver-
ification can typically be instantiated based on a local analysis of the ground
terms occurring in the input formula. A main current approach to integrating
quantifiers with SMT solving is therefore to produce quantifier instantiations.
The resulting instances are quantifier-free formulas that are handled by the
main ground SMT solving engine. It is an art and a craft to control quantifier
instantiation so that just the useful instances are produced.
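The core idea of trigger-based instantiation can be conveyed with a toy, string-based sketch; this is an illustration of the idea only, not of Z3's E-matching machinery, and the axiom and ground terms are made up:

```python
# a quantified axiom annotated with the trigger f(x) is instantiated only for
# those ground terms t such that f(t) already occurs in the input
axiom_pattern = ('f', 'x')          # trigger: instantiate when a term f(...) is seen

def axiom_instance(t):
    # the (hypothetical) axiom body, instantiated at term t
    return f'f({t}) >= 0'

ground_terms = ['a', 'g(b)', 'f(a)', 'f(g(b))', 'h(c)']

def matches(pattern, term):
    """The pattern f(x) matches any ground term of the form f(...)."""
    return term.startswith(pattern[0] + '(') and term.endswith(')')

instances = [axiom_instance(term[2:-1])
             for term in ground_terms if matches(axiom_pattern, term)]
print(instances)   # ['f(a) >= 0', 'f(g(b)) >= 0']
```

Only the two ground terms headed by f fire the trigger, so only two instances of the axiom are produced; terms such as g(b) and h(c) generate no instantiations at all.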
To control which instantiations of the axioms are produced, the axioms are
annotated with triggers. A trigger-annotated universal formula is a formula of
the form

    ψannot := ∀x {p1 , . . . , pk } ϕ

where k ≥ 1 and p1 , . . . , pk are terms that together contain all the variables
from x. Given a model M of a quantifier-free formula ϕ, we say that ϕ is
saturated with respect to ψannot if
5 DKAL and Z3
We will now describe an embedding of DKAL into Z3. The embedding encodes
inference rules from DKAL as first-order axioms presented to Z3. The encoding
furthermore prescribes using patterns how the axioms should be instantiated
based on ground terms that have already been created. We later show that the
encoding preserves decidability of ground DKAL queries as the set of instantia-
tions that can be created based on the patterns is finite.
Terms in Z3 are sorted according to a simple theory of sorts. The universe is
assumed partitioned into a set of disjoint sorts and a sort can be introduced by
declaring a name identifying the sort. The sorts that are used for DKAL are:
– Infon - the sort of infons.
– InfonSet - for a set of Infons.
– Principal - the sort of principals.
The main functions and predicates on these sorts follow the presentation from
Section 2 closely.
DKAL and Z3: A Logic Embedding Experiment 517
The Infon Inference Rules. The infons are built according to the BNF gram-
mar outlined in (1). For the embedding we introduce functions that build ab-
stract terms corresponding to each case in the grammar: true : Infon, asInfon :
B → Infon, +, imp : Infon × Infon → Infon, and said, implied :
Principal × Infon → Infon.
The entailment relation between infon sets and infons is a binary predicate
⊢ : InfonSet × Infon → B. It encodes the entailment relation for sequents over
infons. To encode the entailment relation it suffices to follow the inference rules
for the infon logic. The subformula property of the infon logic is critical in
this context. Rule axioms are only instantiated if there are existing subterms
matching the maximal subterms in either the premise or conclusion.
Encoding Rules for said and implied. The rules S and I require some
additional encoding. We will use auxiliary functions saidOf(·, ·), toldOf(·, ·) :
Principal × InfonSet → InfonSet to extract the subset of an infon set Γ where
principal p either said or at least implied some infon.
Extracting implied infons is similar, except that we take into account that if
p said x then p implied x.
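The intent of the two extraction functions can be illustrated with a small sketch (ours, not the actual Z3 axiomatization; the tuple encoding of infons is an assumption):

```python
# Sketch of saidOf/toldOf on a concrete set of infons. An infon such as
# "p said a" is encoded as the tuple ("said", "p", "a").

def said_of(p, infons):
    """The infons that principal p said."""
    return {x for (tag, q, x) in infons if tag == "said" and q == p}

def told_of(p, infons):
    """The infons that p at least implied; saying counts as implying."""
    return {x for (tag, q, x) in infons
            if tag in ("said", "implied") and q == p}

gamma = {("said", "p", "a"), ("implied", "p", "b"), ("said", "q", "c")}
print(sorted(said_of("p", gamma)))  # ['a']
print(sorted(told_of("p", gamma)))  # ['a', 'b']
```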
Proof: It is straightforward to see that the encoding of the infon logic is sound,
so we need only consider the case where Γ ⊢ ϕ is derivable, and we wish to
show that the saturation of A ∧ ¬(Γ ⊢ ϕ) contains an inconsistent subset of
ground formulas. If Γ ⊢ ϕ is derivable, then there is a proof in the infon logic
satisfying the subformula property: only subformulas of the original sequent are
used in the proof. In particular, consider each proof step. We outline a
justification that a corresponding axiom from A gets instantiated.
The rules +I, +E1, +E2, → E, → I are similar in that if the last step in
the subformula preserving proof is one of these, then the pattern annotation
matches the conclusion and one of the existing subformulas. The corresponding
axiom is instantiated. Since we assume that the current saturation satisfies the
negation of the conclusion of the rule, it will also have to falsify at least one of
the premises.
The axioms for x2x, true and asInfon are more liberal than their inference-
rule counterparts, as they apply in an arbitrary context Γ , but the rule PI
ensures that this difference is benign.
Finally, the rules I and S are simulated using the axioms S1–S2 and I1–I3, since
they encode a sequence of PI rules followed by either S or I. These axioms
introduce the auxiliary sets saidOf(p, Γ ) and toldOf(p, Γ ); the rule x2x together
with (2) and (3) takes care of extracting premises when they become relevant.
For good order we should note that axiom instantiation also terminates: it is
not possible to keep instantiating the axioms without reproducing an already
existing instance.
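The termination argument has the shape of a familiar fixpoint computation: closing a finite set of ground facts under rules that never create new terms must stop, because the space of candidate facts is finite. A schematic sketch (our illustration; the rule shown is a stand-in):

```python
# Saturation as a fixpoint loop: apply rules until no new facts appear.
# Because the rules only rearrange existing ground terms, the candidate
# space is finite and the loop terminates.

def saturate(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new = rule(facts) - facts
            if new:
                facts |= new
                changed = True
    return facts

# Stand-in rule: from ("said", p, x) derive ("implied", p, x).
def said_to_implied(facts):
    return {("implied", p, x) for (tag, p, x) in facts if tag == "said"}

result = saturate({("said", "p", "a")}, [said_to_implied])
print(sorted(result))
# [('implied', 'p', 'a'), ('said', 'p', 'a')]
```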
The communication rules are encoded separately for the proviso-free and the
proviso-present scenarios.

Proviso-free scenario (Com1). The rules

    p1 to p2 : [x] ⇐ y,    p2 from p1 : [x] ⇐ z

are annotated with the patterns {p2 knows (p1 said x), T(y), T(z)},
{p1 to p2 [x] ⇐ y, T(z)}, {p2 from p1 [x] ⇐ z, T(y)} and encoded by the axiom

    p1 to p2 [x] ⇐ y ∧ p1 knows y ∧ p2 from p1 [x] ⇐ z ∧ p2 knows z
        → p2 knows (p1 said x)

Proviso-present scenario (Com2). The rules

    p1 to p2 [x ← y] ⇐ z,    p2 from p1 [x ← y] ⇐ k

are annotated with the patterns {p2 knows (y imp p1 implied x), T(z), T(k)},
{p1 to p2 [x ← y] ⇐ z, T(k)}, {p2 from p1 [x ← y] ⇐ k, T(z)} and encoded by
the axiom

    p1 to p2 [x ← y] ⇐ z ∧ p1 knows z ∧ p2 from p1 [x ← y] ⇐ k ∧ p2 knows k
        → p2 knows (y imp p1 implied x)
We split the Ensue rule into two parts. The first part applies the Ensue rule
assuming x is derivable from a set knowsOf(p) comprising the infons known
to p. The second defines the set knowsOf(p).
Ensue1: patterns {p knows x}, {knowsOf(p) ⊢ x};
    axiom knowsOf(p) ⊢ x → (p knows x).
Ensue2: patterns {p knows x}, {x ∈ knowsOf(p)};
    axiom (p knows x) ↔ x ∈ knowsOf(p).
restricts rule applications to a subset of the general formulation from Section 2.1.
The main sanity check of this limited encoding has been through experimental
evaluation through case studies, including the one presented in Section 3.
the set of all underivable sequents in M with the same antecedent (wΓ =
{Γ ⊢ ϕ | M ̸|= Γ ⊢ ϕ, for fixed Γ and arbitrary ϕ})
We will now justify the model construction with more rigorous detail.
1) Note that if

    Γ ⊢ ψ1 and Γ ⊢ ψ2    (18)

are derivable, then Γ ⊢ ψ1 + ψ2 is derivable by the (+I) rule, contradicting
completeness of ΩΓ . Hence one of ψ1 , ψ2 (say ψ1 ) is not in Γ , and one of
ψ1 , ψ2 does not clash with Γ ⊢ ψ1 + ψ2 . If Γ ⊢ ψ1 ∈ ΩΓ , then ψ1 clashes with
Γ ⊢ ψ1 + ψ2 by completeness. Hence the sequent Γ ⊢ ψ1 is derivable, and ψ2
does not clash with Γ ⊢ ψ1 + ψ2 . Also ψ2 ̸∈ Γ , since otherwise both sequents in
(18) would be derivable. Hence Γ ⊢ ψ2 ∈ ΩΓ as required.
Let’s see that the model K is a Kripke model for the infon logic.
– ≤ is reflexive and transitive given that ⊆ is reflexive and transitive.
– We have to check that, if ΩΓ ≤ ΩΓ and Sp (ΩΓ , ΩΓ ), then Sp (ΩΓ , ΩΓ )
(and the same for Ip ). If Γ ⊆ Γ , and {ϕ | p said ϕ ∈ Γ } ⊆ Γ , then
    ϕ ∈ Γ implies K, w |= ϕ    (19)
    Γ ⊢ ϕ ∈ ΩΓ iff K, w ̸|= ϕ    (20)
For the case ϕ = p said ψ. If p said ψ ∈ Γ , then for every w′ = ΩΓ ′ such that
Sp (ΩΓ , ΩΓ ′ ) we have ψ ∈ Γ ′ . By the inductive hypothesis, K, w′ |= ψ. Using the
truth definition, this implies K, w |= p said ψ as desired. For the second claim,
if Γ ⊢ p said ψ ∈ ΩΓ , then by the saturation condition there is an ΩΓ ′ ∈ M such
that Sp (ΩΓ , ΩΓ ′ ) and Γ ′ ⊢ ψ ∈ ΩΓ ′ . By the inductive hypothesis, K, w′ ̸|= ψ.
Therefore K, w ̸|= p said ψ. For the other direction, if K, w ̸|= p said ψ, there is
a w′ = ΩΓ ′ ∈ M such that Sp (w, w′ ) and K, w′ ̸|= ψ. By the inductive hypothesis,
we know that Γ ′ ⊢ ψ ∈ ΩΓ ′ . Because Sp (ΩΓ , ΩΓ ′ ) and Γ ′ ⊢ ψ ∈ ΩΓ ′ , Γ ⊢
p said ψ ∈ ΩΓ .
The case for ϕ = p implied ψ is analogous to the previous one.
Corollary 1. K, w ̸|= ϕ for every Γ ⊢ ϕ ∈ ΩΓ .
Here we show examples of some models that have been extracted using our pro-
totype. In order to force the desired query to be underivable, we feed the tool
an incomplete specification. Recall the Starbucks example presented in Sec-
tion 3, in which an SSL connection is established. When the trusted root informs
Windows that the certificate is valid (and Windows accepts the communication),
it learns:
? Windows knows TrustedRoot implied ThisCert isValid
But in this example we will remove the fact that says that Windows trusts
the trusted root:
(Figure: the extracted counter-model, with worlds w0 , w1 , and w2 .)
In the picture, the dashed edges represent the ≤ relationship. All the other
relations (saidpatrick , saidwindows , etc.) are the same and they are represented
by the solid edges. The valuation function V is the following:
V (ThisCert isValid) = ∅,
V (IECurrSite supportsSSL) = {w0 , w1 , w2 }
V (ThisCert isProperlySigned) = {w0 , w1 , w2 }
V (IE shouldShowLock) = {w1 , w2 }.
We can see again that K, w0 ̸|= ThisCert isValid.
7 Conclusion
We experimented with embedding the DKAL logic into the theorem prover Z3
for classical first-order logic and interpreted theories. More specifically, we used
the instantiation-based support for quantifiers in Z3. We also established how
ground counter-models produced by Z3 can be mapped back to Kripke models
for the infon logic. A prototype embedding into Z3 was implemented and checked
on smaller illustrative case studies. This paper includes one such case study that
exercised various features available in DKAL. We do not claim to have produced
any dedicated, high-performance, DKAL engine, but the embedding into Z3
allows re-using the extensible support for theories in Z3 to handle auxiliary
logical constraints, such as comparing timeouts symbolically.
Decidability of the Class E
by Maslov’s Inverse Method
Grigori Mints
1 Introduction
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 529–537, 2010.
© Springer-Verlag Berlin Heidelberg 2010
530 G. Mints
resolution-like methods and explain the reason for the switch at the beginning
of their Sect. 4.
In Sect. 1 we present S. Maslov's formulation of his inverse method for skolem-
ized formulas, which avoids many technical details, and give a short proof of its
completeness, which shows a close connection with Gentzen-type derivations.
Then we discuss two notions most prominent in S. Maslov’s approach to de-
cidable classes and automated deduction: those of a favorable free clause and
of factorization. They make it possible to simplify a formula to be tested for
validity “on the fly”, in the process of testing.
S. Maslov proved in [7] decidability of a class K containing essentially all de-
cidable classes of predicate formulas known by the time [7] was written. That
proof used a combination of Lemmas 1 and 2 with much more careful analysis of
tautological disjunction than in the present paper. It would be interesting to give
decision procedures using the inverse method for decidable classes that were
discovered after that, for example guarded formulas [2]. Another possible application
that requires development of the inverse method for intuitionistic logic may be a
new more efficient decision algorithm for the formulas of intuitionistic predicate
logic that do not contain negative quantifiers [13]. Negative quantifiers are ones
that become ∃ in the prenex normal form. The need for such an algorithm in
computer science is noticed in [5] where an algorithm close to [13] is provided
for an extension of this class.
I thank Elena Maslova for help in editing this paper.
Di1 σ1 ∨ . . . ∨ Dim σm    (m ≥ 0; 1 ≤ i1 , . . . , im ≤ δ)
provided σ1 σ̄1 = . . . = σδ σ̄δ .
A clause is F -favorable iff it is derivable in UF .
Comment. A simple (but sufficiently general) instance of the rule B is

    C ∨ D1 σ   . . .   C ∨ Dδ σ
    ───────────────────────────  B
                C
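On ground clauses this simple instance of the rule B can be sketched as follows (our illustration; clauses are sets of literal strings, and unification is ignored entirely):

```python
# Ground sketch of the simple instance of rule B: if C ∨ D_i σ has been
# derived for every i = 1..δ (same C, same σ), then C is derivable.

def rule_b(derived, clause_names, sigma):
    """Return the side clauses C obtainable by one application of B."""
    results = set()
    instances = [frozenset({f"{n}({sigma})"}) for n in clause_names]
    for clause in derived:
        for d in instances:
            if d <= clause:
                c = clause - d
                # B fires only if C ∨ D_i(sigma) is derived for every i.
                if all(c | inst in derived for inst in instances):
                    results.add(c)
    return results

derived = {frozenset({"C", "D1(t)"}), frozenset({"C", "D2(t)"})}
print(rule_b(derived, ["D1", "D2"], "t"))  # {frozenset({'C'})}
```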
2.2 Completeness of UF
Theorem 1. A clause C is F -favorable iff C ∨ F is valid, that is, derivable in
predicate logic.
C ∨ M τ1 ∨ . . . ∨ M τp
M θ ≡ & i Di θ
(C ∨ D1 θ) ∨ M τ1 ∨ . . . ∨ M τp , . . . (C ∨ Dδ θ) ∨ M τ1 ∨ . . . ∨ M τp .
C ∨ D1 θ, . . . , C ∨ Dδ θ,
(C1 ∨ . . . ∨ Cδ ) ∨ (D1 & . . . & Dδ )σ ∨ F ≡
(C1 ∨ . . . ∨ Cδ ) ∨ M σ ∨ F →
(C1 ∨ . . . ∨ Cδ ) ∨ F ∨ F → (C1 ∨ . . . ∨ Cδ ) ∨ F

where σ = σ1 σ̄1 = . . . = σδ σ̄δ .
Corollary 1. A formula F is derivable iff the empty clause ∅ is F -favorable.
F ⇐⇒ ∃x &i≠α Di
(abbreviated F ⇐⇒ F − ).
Proof. F → F − is obvious. By Theorem 1, Dα ∨ F is derivable in predicate
logic, and hence so is
∀xDα ∨ F.
Hence in the proof of F − → F one can use ∀xDα , but then the task is trivial.
A general proof-search algorithm for first-order logic based on the inverse
method [3] generates favorable clauses for the goal formula F from the clauses
given by the rule A, by applications of the rule B combined with unification,
as is done in the resolution method. The proof search terminates when an
empty clause is generated.
If at some moment a free unit clause is derived, it is deleted from the formula
(cf. Lemma 1) and the proof search is continued with the simplified formula. If a
factorable clause C ∨ D is generated, the search branches according to Lemma
2. If in addition one of the clauses C, D is a free unit clause, the formula F can
be simplified in the corresponding branch.
Example. F ≡ ∃x∃y[(¬P x ∨ P f (x))& (¬Qy ∨ Qg(y))]
≡ ∃x∃y[D1 (x)&D2 (y)].
Rule A gives two favorable clauses:
D1 (x) ∨ D1 (f (x)); D2 (y) ∨ D2 (g(y)).
Rule B gives (with substitutions σ1 = (x/f (x)), σ2 = (y/g(y)) )
D1 (x) ∨ D2 (y).
This clause is factored into two free unit clauses D1 (x) and D2 (y). Taking first
the factor D2 (y) (to save notation), we simplify F to D1 (x) and obtain ∅ by
Example 1. The factor D1 (x) is treated similarly.
where {t1 , . . . , tk } is the list of all terms of depth ≤ m from the Herbrand
universe of F .
F -terms are terms of depth ≤ d(F ) constructed from constants and the vari-
able x by means of the function symbols in F .
F -atoms are constructed from F -terms by means of predicates in F .
Formula F is saturated if each clause Di contains each F -atom (possibly with
negation).
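The saturation step D ⇔ (D ∨ A) & (D ∨ ¬A) can be sketched as a small program (our illustration) that splits a clause on every atom it does not yet mention:

```python
# Saturate a single clause: repeatedly split on missing atoms until the
# clause mentions every atom, positively or negatively. A signed atom
# ("P", True) stands for the literal P; ("P", False) for its negation.

def saturate_clause(clause, atoms):
    clauses = [frozenset(clause)]
    for a in atoms:
        nxt = []
        for c in clauses:
            if any(atom == a for (atom, _) in c):
                nxt.append(c)                    # already mentions a
            else:
                nxt.append(c | {(a, True)})      # D ∨ A
                nxt.append(c | {(a, False)})     # D ∨ ¬A
        clauses = nxt
    return clauses

result = saturate_clause({("P", True)}, ["P", "Q"])
print(sorted(len(c) for c in result))  # [2, 2]
```

Starting from the clause P over the atoms {P, Q}, the split produces the two saturated clauses P ∨ Q and P ∨ ¬Q.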
D ⇐⇒ (D ∨ A)&(D ∨ ¬A).
The next four lemmas are used in the proof of the main Theorem 2.
Lemma 3. Let E, E ′ be terms or literals, and let r be a term not occurring in
E, E ′ such that E(r) ≡ E ′ (r). Then E ≡ E ′ .
Proof. Induction on E, E ′ .
Proof. Induction on m.
Now we prove the basic result.
Theorem 2. Let F be a derivable formula of the form (3) and let H^{2d(F)}(F ) be
underivable. Then one of the clauses Di (1 ≤ i ≤ δ) is derivable in UF by at
most one application of the rule B.
Decidability of the Class E by Maslov’s Inverse Method 535
for some F -term t. In other words, from (9), (10) it follows that for some F -terms
r and t, tj is contained in tl ; in particular d(tl ) ≥ d(tj ). By (8) and Lemma 5
this implies
d(r) ≥ d(t). (11)
Now we shall prove that it is the clause Dαl that is derivable in UF by one
application of the rule B. For this it is sufficient to establish that
Now the decision algorithm for the class E is given by Lemma 1 and Theorem 2.
References
1. Chang, C., Lee, R.: Symbolic Logic and Mechanical Theorem Proving. Academic
Press, New York (1973)
2. Andréka, H., van Benthem, J., Németi, I.: Modal Logics and Bounded Fragments
of Predicate Logic. J. Philos. Log. 27, 217–230 (1998)
3. Davydov, G., Maslov, S., Mints, G., Orevkov, V., Slisenko, A.: A Computer Al-
gorithm for Establishing Deducibility Based on Inverse Method. In: Seminars in
Math., V.A. Steklov Math. Inst., vol. 16, pp. 1–16 (1971)
4. Degtyarev, A., Voronkov, A.: The Inverse Method. In: Robinson, A., Voronkov,
A. (eds.) Handbook of Automated Reasoning, pp. 179–272. Elsevier, Amsterdam
(2001)
5. Dowek, G., Jiang, Y.: Eigenvariables, Bracketing and the Decidability of Positive
Minimal Predicate Logic. Theor. Comput. Sci. 360, 193–208 (2006)
6. Gurevich, Y.: Decision Problem for the Logic of Predicates and Operations. Algebra
and Log. 8(3), 284–308 (1968)
7. Maslov, S.: The Inverse Method for Logical Calculi. Trudy Mat. Inst. Steklov 98,
26–87 (1968)
8. Maslov, S., Mints, G.: Proof Search Theory and the Inverse Method (Russian).
Supplement to the Russian Translation of the book by Chang and Lee [1], Nauka,
Moscow, pp. 310–340 (1983)
9. Maslov, S., Orevkov, V.: Decidable Classes Reducible to the One-quantifier Class.
Trudy Inst. Steklov 121, 57–66 (1972)
10. Maslov, S.: Connection between the Strategies of the Inverse Method and
the Resolution Method. In: Seminars in Math., vol. 16, pp. 48–54. Plenum Press,
New York (1971)
11. Mints, G.: Gentzen-type Systems and Resolution Rules. Part I. Propositional Logic.
In: Martin-Löf, P., Mints, G. (eds.) COLOG 1988. LNCS, vol. 417, pp. 198–231.
Springer, Heidelberg (1990)
12. Mints, G.: Gentzen-type Systems and Resolution Rule. Part II. In: Logic Collo-
quium 1990. Lecture Notes in Logic, vol. 2, pp. 163–190 (1994)
13. Mints, G.: Solvability of the Problem of Deducibility in LJ for the Class of For-
mulas not Containing Negative Occurrences of Quantifiers. Proc. Steklov Inst. of
Mathematics 98, 135–145 (1971)
14. Orevkov, V.: One Decidable Class of Formulas of the Predicate Calculus with
Function Symbols. In: Proceedings of the II-nd Symposium in Cybernetics, Tbilisi,
p. 176 (1965)
Logics for Two Fragments beyond the Syllogistic
Boundary
Lawrence S. Moss
1 Introduction
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 538–564, 2010.
© Springer-Verlag Berlin Heidelberg 2010
Logics for Two Fragments beyond the Syllogistic Boundary 539
programs to carry out the inferences from text that people do, or usually do, when
they read some text. This is an active area with connections to Information
Extraction. But there is not much logic there, at least not yet. Another area,
one probably closer to your interests, has been an exploration of fragments of
natural language that correspond to complexity classes. This work has been
carried out by Ian Pratt-Hartmann [13,14]. I think it could be interesting to
complexity theorists. And then there is the matter of giving logical systems in
which one can carry out as much simple reasoning in language as possible. This
has been going on for some time (see van Benthem [1] for one early reference),
and I could tell you more about it.
Q: But surely the general problem is undecidable. Actually, when I think about
matters like vagueness, incomplete sentences, failures of reference, figurative lan-
guage, and more and more problems . . ., I’m not even sure it makes much sense
to talk about what you call “simple reasoning in language”.
A: To quote your teacher [8] on the Entscheidungsproblem,
A: Sure, I’d be happy to. Let’s say we’re talking about a big table full of fruit.
(He writes on the board.)
I say that if we assume (or believe, or know) all the sentences above the line,
then we should do the same for the sentence below the line.
Q: “non-pineapple”?! I thought this was supposed to be natural language.
A: Take it as a shorthand for “piece of fruit which is not a pineapple”.
540 L.S. Moss
Q: Aristotle, . . ., hmm. You know, I once heard that the three greatest logicians
of all time are Aristotle, Frege, and Gödel. I know a bit about the last two, but
I have almost no idea what Aristotle did. Something about all men are mortal ?
A: Aristotle raised the matter of inference in the first place, with no precedent.
And then for the fragment he was concerned with he provided a complete answer.
I can’t imagine a bigger project.
Q: I can see that you’re really taken with Aristotle. But okay, let’s go back
to what’s on the blackboard (1). Do you have a logical system in which we can
prove (1), and which is still decidable?
A: Yes. Want to see it?
Q: Sure, but before you start in, I have a last question. You started out men-
tioning that there is some interest in natural logic from people in much more
applied areas of computer science. Are they going to be interested in logical
systems for cases like (1), the kind of thing that you are going to show me?
A: Probably not. Presumably they would be much more interested in an algo-
rithm that worked quickly and correctly on 90% of the real-world inferences that
come up in practice than in a logical system that was complete in the logician's
sense but was only good for a small amount of real-world inference. On the other
hand, knowing what a complete system looked like could be an inspiration, or
at least a comfort.
Q: Can you give me an example? Something that a logical system is not likely
to get, but which is an inference from text in the sense of current work in natural
language processing?
A: (Again writing on the board.)
There are only two languages in this paper (actually they are families of lan-
guages parameterized by sets of basic symbols): the language L of this section,
and the extension L(adj) studied in Section 3. L is based on three pairwise dis-
joint sets called P, R, and K. These are called unary atoms, binary atoms, and
constant symbols. (The reason we use the letters P and R is that they remind
us of predicate symbols and relation symbols.)
We present the syntax of L in Figure 1. Sentences are built from constant sym-
bols, unary and binary atoms using an involutive symbol for negation, a forma-
tion of set terms, and also a form of quantification. The second column indicates
the variables that we shall use in order to refer to the objects of the various
syntactic categories. Because the syntax is not standard, it will be worthwhile to
go through it slowly and to provide glosses in English for expressions of various
types.
One might think of the constant symbols as proper names such as John and
Mary. The unary atoms may be glossed as one-place predicates such as boys,
girls, etc. And the relation symbols correspond to transitive verbs (that is, verbs
which take a direct object) such as likes, sees, etc. They also correspond to com-
parative adjective phrases such as is bigger than. (However, later on in Section 3,
we introduce a new syntactic primitive for the adjectives.)
Unary atoms appear to be one-place relation symbols, especially because we
shall form sentences of the form p(j). However, we do not have sentences p(x),
since we have no variables at this point in the first place. Similar remarks apply
to binary atoms and two-place relation symbols. So we chose to change the
terminology from relation symbols to atoms.
We form unary and binary literals using the bar notation. We think of this as
expressing classical negation, and we take it to be involutive: the bar of p̄ is p,
and the bar of s̄ is s.
The set terms in this language are the only recursive construct. If b is read
as boys and s as sees, then one should read ∀(b, s) as sees all boys, and ∃(b, s)
as sees some boys. Hence these set terms correspond to simple verb phrases. We
also allow negation on the atoms, so we have ∀(b, s̄); this can be read as fails to
see all boys, or (better) sees no boys or doesn't see any boys. We also have ∃(b, s̄),
fails to see some boys. But the recursion allows us to embed set terms, and so we
have set terms like
    ∃(∀(∀(b, s̄), h), a)
which may be taken to symbolize a verb phrase such as admires someone who
hates everyone who does not see any boy.
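The glosses above determine a natural semantics for set terms over a structure. Here is a small interpreter written from those glosses (our sketch; the encoding of terms as tuples is an assumption, not the paper's notation):

```python
# Evaluate set terms in a finite model. A model is (universe, unary
# extensions, binary extensions); ("all", c, s) stands for ∀(c, s),
# ("some", c, s) for ∃(c, s), and ("not", a) for a barred atom.

def eval_term(t, model):
    univ, unary, binary = model
    if isinstance(t, str):                       # unary atom p
        return unary[t]
    if t[0] == "not":                            # negated unary atom
        return univ - eval_term(t[1], model)
    kind, c, s = t
    cs = eval_term(c, model)
    if isinstance(s, str):                       # binary atom
        rel = binary[s]
    else:                                        # barred binary atom
        rel = {(x, y) for x in univ for y in univ} - binary[s[1]]
    if kind == "all":                            # e.g. "sees all c's"
        return {x for x in univ if all((x, y) in rel for y in cs)}
    return {x for x in univ if any((x, y) in rel for y in cs)}

univ = {1, 2, 3}
model = (univ, {"b": {1, 2}}, {"s": {(3, 1), (3, 2), (2, 1)}})
print(eval_term(("all", "b", "s"), model))   # {3}: only 3 sees all b's
print(eval_term(("some", "b", "s"), model))  # {2, 3}
```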
We should note that the relative clauses which can be obtained in this way
are all “missing the subject”, never “missing the object”. The language is too
poor to express predicates like λx.all boys see x.
The main sentences in the language are of the form ∀(b, c) and ∃(b, c); they
can be read as statements of the inclusion of one set term extension in another,
and of the non-empty intersection. We also have sentences using the constants,
such as ∀(g, s)(m), corresponding to Mary sees all girls. But we are not able to say
all girls see Mary; the syntax again is too weak. (However, in our Conclusion we
shall see how to extend our system to handle this.) This weakness in expressive
power corresponds to a decidability result of lower complexity, as we shall see.
The bar notation. We have already seen that our unary and binary atoms come
with negative forms. We extend this notation to all sentences: the bar is involutive
(the bar of p̄ is p, of s̄ is s); the bar of ∃(l, r) is ∀(l, r̄); the bar of ∀(l, r) is ∃(l, r̄);
the bar of ∀(c, d) is ∃(c, d̄); the bar of ∃(c, d) is ∀(c, d̄); the bar of c(j) is c̄(j);
and the bar of r(j, k) is r̄(j, k).
The same is true of the relational syllogistic; cf. [15]. In the other direction, we
translate L to FO2 , the fragment of first-order logic using only the variables x
and y. We do this by mapping the set terms two ways, called c → ϕc,x and
c → ϕc,y . Here are the recursion equations for c → ϕc,x :
The equations for c → ϕc,y are similar. Then the translation of the sentences
into FO2 follows easily.
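The displayed recursion equations did not survive extraction. A standard way to write such a translation, consistent with the glosses of ∀(c, s) and ∃(c, s) given above (a reconstruction, not necessarily the paper's exact equations), is:

```latex
% Reconstruction of the recursion c -> \varphi_{c,x}; the clauses for
% c -> \varphi_{c,y} swap the roles of x and y.
\begin{align*}
\varphi_{p,x} &= p(x), &
\varphi_{\bar{p},x} &= \neg p(x),\\
\varphi_{\forall(c,s),x} &= \forall y\,(\varphi_{c,y} \to s(x,y)), &
\varphi_{\forall(c,\bar{s}),x} &= \forall y\,(\varphi_{c,y} \to \neg s(x,y)),\\
\varphi_{\exists(c,s),x} &= \exists y\,(\varphi_{c,y} \wedge s(x,y)), &
\varphi_{\exists(c,\bar{s}),x} &= \exists y\,(\varphi_{c,y} \wedge \neg s(x,y)).
\end{align*}
```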
Translation of L into Boolean modal logic. We shall write L̂ for the following
version of Boolean modal logic. L̂ has each p ∈ P as an atomic proposition, and
it has two modal operators, ⟨s⟩ and ⟨s̄⟩, for each s ∈ R. The syntax of L̂ is
given by
    ϕ := p | ¬ϕ | ϕ ∧ ψ | ⟨s⟩ϕ | ⟨s̄⟩ϕ
The language is interpreted on the same kind of structures that we have been
using for L. Then [[p]] is given for all atoms p, and we also set [[¬ϕ]] = M \ [[ϕ]],
[[ϕ ∧ ψ]] = [[ϕ]] ∩ [[ψ]], and
We write Γ |= ϕ to mean that for all structures M and all x ∈ M , if x ∈ [[ψ]] for
all ψ ∈ Γ , then again x ∈ [[ϕ]].
Let L0 be the set of sentences of L which do not involve constants. We trans-
late L0 into L̂. First, for each set term c, we define a sentence c∗ of L̂. The
definition is: p∗ = p, p̄∗ = ¬p,
An easy induction shows that [[c]] = [[c∗ ]] for all set terms c. Then we translate
∀(c, d) to ∀(c, d)∗ and ∃(c, d) to ∃(c, d)∗ :
Pratt-Hartmann and Moss [15] investigated several logical systems for the rela-
tional syllogistic and asked whether they were axiomatizable in a purely syllo-
gistic fashion. We shall not enter into their definition of syllogistic proof system
except to say that it is an adequate formalization of the concept. It comes in
two flavors, depending on whether one permits reductio ad absurdum or not.
It turns out that the consequence relation for some logical languages can
be captured by syllogistic systems even without reductio, some can be captured
with reductio but (provably) not without it, and some are so strong that they
cannot be captured even with reductio. The language of this paper would be one
example of this latter phenomenon. Theorem 6.12 of [15] shows that for the lan-
guage L of this paper (but without constant symbols), there indeed is no finite
complete syllogistic system. It is therefore of interest to build a proof system
which goes beyond syllogistic logic. This is the main technical goal of this paper.
We present our system in natural-deduction style in Figure 3. It makes use
of introduction and elimination rules, and more critically of variables. For a
textbook account of a proof system for first-order logic presented in this way,
see van Dalen [3].
General sentences in this fragment are what are usually called formulas. We
prefer to change the standard terminology to make the point that here, sentences
are not built from formulas by quantification. In fact, sentences in our sense do
not have variable occurrences. But general sentences do include variables. They
are only used in our proof theory.
The bar notation, again. We have already seen the bar notation c̄ for set terms c,
and ϕ̄ for sentences ϕ. We extend this to formulas: the bar of b(x) is b̄(x), and
the bar of r(x, y) is r̄(x, y). We technically have a general sentence ⊥, but this
plays no role in the proof theory.
We write Γ ϕ if there is a proof tree conforming to the rules of the system
with root labeled ϕ and whose axioms are labeled by elements of Γ . (Frequently
we shall be sloppy about the labeling and just speak, e.g., of the root as if it were
a sentence instead of being labeled by one.) Instead of giving a precise definition
here, we shall content ourselves with a series of examples in Section 2.3 just
below.
The system has two rules called (∀E), one for deriving general sentences of
the form c(x) or c(j), and one for deriving general sentences r(x, y) or r(j, k).
(Other rules are doubled as well, of course.) It surely looks like these should be
unified, and the system would of course be more elegant if they were. But given
the way we are presenting the syntax, there is no way to do this. That is, we do
not have a concept of substitution, and so rules like (∀E) cannot be formulated
in the usual way. Returning to the two rules with the same name, we could have
chosen to use different names, say (∀E1) and (∀E2). But the result would have
been a more cluttered notation, and it is always clear from context which rule
is being used.
Although we are speaking of trees, we don’t distinguish left from right. This
is especially the case with the (∃E) rules, where the canceled hypotheses may
occur in either order.
Side Conditions. As with every natural deduction system using variables, there
are some side conditions which are needed in order to have a sound system.
In (∀I), x must not occur free in any uncanceled hypothesis. For example, in
the version whose root is ∀(c, d), one must cancel all occurrences of c(x) in the
leaves, and x must not appear free in any other leaf.
\[
\frac{\begin{array}{c}[c(x)]\\ \vdots\\ d(x)\end{array}}{\forall(c,d)}\;\forall I
\qquad
\frac{\begin{array}{c}[c(x)]\\ \vdots\\ r(t,x)\end{array}}{\forall(c,r)(t)}\;\forall I
\qquad
\frac{\alpha\qquad \bar{\alpha}}{\bot}\;\bot I
\qquad
\frac{\begin{array}{c}[\bar{\varphi}]\\ \vdots\\ \bot\end{array}}{\varphi}\;\mathrm{RAA}
\]
Fig. 3. Proof rules. See the text for the side conditions in the (∀I) and (∃E) rules.
In (∃E), the variable x must not occur free in the conclusion α or in any
uncanceled hypothesis in the subderivation of α.
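These side conditions are easy to check mechanically. The following is a minimal sketch (in Python; the representation of formulas and the names fv, ok_forall_I, ok_exists_E are our own illustrative assumptions, not the paper's formalism) of the eigenvariable checks for (∀I) and (∃E):

```python
# Toy check of the variable side conditions for (forall-I) and (exists-E).
# Atomic general formulas like c(x) or r(x, y) are ("pred", args) pairs.

VARS = {"x", "y", "z"}          # individual variables; j, k, l are constants

def fv(formula):
    """Free variables of an atomic formula."""
    _, args = formula
    return {a for a in args if a in VARS}

def ok_forall_I(x, uncanceled):
    """(forall-I) with eigenvariable x: x must not occur free in any
    uncanceled hypothesis."""
    return all(x not in fv(h) for h in uncanceled)

def ok_exists_E(x, conclusion, uncanceled):
    """(exists-E): in addition, x must not occur free in the conclusion."""
    return x not in fv(conclusion) and ok_forall_I(x, uncanceled)

print(ok_forall_I("x", [("e", ("y",))]))   # True: only y occurs free
print(ok_forall_I("x", [("m", ("x",))]))   # False: x is free in a leaf m(x)
```

The second call fails for exactly the reason discussed with the improper derivation in Section 2.4: an uncanceled leaf still mentions the eigenvariable.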
In contrast to usual first-order natural deduction systems, there are no side
conditions on the rules (∀E) and (∃I). The usual side conditions are phrased in
terms of concepts such as free substitution, and the syntax here has no substi-
tution to begin with. To be sure on this point, one should check the soundness
result of Lemma 1.
Fitch-style proof (left of Figure 4):

1. ∀(c, d)                 hyp
2. x  ∃(c, r)(x)           hyp
3.    c(y)                 ∃E, 2
4.    r(x, y)              ∃E, 2
5.    d(y)                 ∀E, 1, 3
6.    ∃(d, r)(x)           ∃I, 4, 5
7. ∀(∃(c, r), ∃(d, r))     ∀I, 1–6

Natural-deduction tree (right of Figure 4):

\[
\dfrac{\dfrac{[\exists(c,r)(x)]^{2}\qquad
\dfrac{\dfrac{[c(y)]^{1}\quad \forall(c,d)}{d(y)}\,\forall E\qquad [r(x,y)]^{1}}
{\exists(d,r)(x)}\,\exists I}
{\exists(d,r)(x)}\,\exists E\,1}
{\forall(\exists(c,r),\exists(d,r))}\,\forall I\,2
\]
2.3 Examples
We present a few examples of the proof system at work, along with comments
pertaining to the side conditions. Many of these are taken from the proof system
R∗ for the language R∗ of [15]. That system R∗ is among the strongest of the
known syllogistic systems, and so it is of interest to check that the current proof
system is at least as strong.
Example 2. Here is a proof of the classical syllogism Darii: ∀(b, d), ∃(c, b) ⊢
∃(c, d):
\[
\dfrac{\exists(c,b)\qquad
\dfrac{\dfrac{[b(x)]^{1}\quad \forall(b,d)}{d(x)}\,\forall E\qquad [c(x)]^{1}}
{\exists(c,d)}\,\exists I}
{\exists(c,d)}\,\exists E\,1
\]
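The semantic side of Darii can be verified exhaustively. The sketch below (Python; the encoding of set terms as subsets of a three-element carrier is our own illustration, not the paper's notation) interprets ∀(b, d) as inclusion and ∃(c, b) as nonempty intersection:

```python
# Brute-force semantic check of Darii over every interpretation on a
# 3-element carrier: if [[b]] ⊆ [[d]] and [[c]] meets [[b]],
# then [[c]] meets [[d]].
from itertools import product

carrier = range(3)
subsets = [frozenset(c for c, bit in zip(carrier, bits) if bit)
           for bits in product([0, 1], repeat=3)]

def holds_all(b, d):   # semantics of forall(b, d): [[b]] subset of [[d]]
    return b <= d

def holds_some(c, b):  # semantics of exists(c, b): [[c]] meets [[b]]
    return bool(c & b)

valid = all(holds_some(c, d)
            for b, c, d in product(subsets, repeat=3)
            if holds_all(b, d) and holds_some(c, b))
print(valid)  # True: Darii holds in all 8**3 interpretations
```

A three-element carrier is of course no proof of validity in general, but the inclusion argument (c ∩ b ⊆ c ∩ d when b ⊆ d) is visible in the code.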
Example 3. Next we study a principle called (K) in [15]. Intuitively, if all watches
are expensive items, then everyone who owns a watch owns an expensive item. The
formal statement in our language is ∀(c, d) ∀(∃(c, r), ∃(d, r)). See Figure 4.
We present a Fitch-style proof on the left and the corresponding one in our
formalism on the right. One aspect of the Fitch-style system is that (∃E) gives
two lines; see lines 3 and 4 on the left in Figure 4.
\[
\dfrac{[d(x)]^{2}\qquad
\dfrac{\dfrac{\dfrac{\dfrac{[c(y)]^{1}\quad \forall(c,\bar{c})}{\bar{c}(y)}\,\forall E\qquad [c(y)]^{1}}{\bot}\,\bot I}
{r(x,y)}\,\mathrm{RAA}}
{\forall(c,r)(x)}\,\forall I\,1}
{\forall(d,\forall(c,r))}\,\forall I\,2
\]
2.4 Soundness
Before presenting a soundness result, it might be good to see an improper deriva-
tion. Here is one, purporting to infer some men see some men from some men see
some women:
\[
\dfrac{\exists(m,\exists(w,s))\qquad
\dfrac{[\exists(w,s)(x)]^{2}\qquad
\dfrac{\dfrac{[s(x,x)]^{1}\quad [m(x)]^{2}}{\exists(m,s)(x)}\,\exists I\qquad [m(x)]^{2}}
{\exists(m,\exists(m,s))}\,\exists I}
{\exists(m,\exists(m,s))}\,\exists E\,1}
{\exists(m,\exists(m,s))}\,\exists E\,2
\]
The specific problem here is that when [s(x, x)]¹ is withdrawn in the application
of ∃E 1, the variable x is free in the as-yet-uncanceled leaves labeled m(x).
550 L.S. Moss
Lemma 1. Let Π be any proof tree for this fragment all of whose nodes are
labeled with L-formulas, let ϕ be the root of Π, let M be a structure, let v : X →
M be a variable assignment, and assume that for all uncanceled leaves ψ of Π,
M |= ψ[v]. Then also M |= ϕ[v].
(That is, we are considering an instance of (∀E) when the terms t and u are
variables.) The variables x and y might well be the same. Let M be a structure,
and v be a variable assignment making true the leaves of the tree. By induction
hypothesis, [[c]](v(y)) and also [[r]](v(x), m) for all m ∈ [[c]]. In particular,
[[r]](v(x), v(y)).
The remaining cases are similar.
The completeness of the logic parallels the Henkin-style completeness result for
first-order logic. Given a consistent theory Γ , we get a model of Γ in the following
way: (1) take the underlying language L, add constant symbols to the language
to witness existential sentences; (2) extend Γ to a maximal consistent set in the
larger language; and then (3) use the set of constant symbols as the carrier of a
model in a canonical way. In the setting of this paper, the work is in some ways
easier than in the standard setting, and in some ways harder. There are more
details to check, since the language has more basic constructs. But one doesn’t
need to take a quotient by equivalence classes, and in other ways the work here
is easier.
Given two languages L and L′, we say that L′ ⊇ L if every symbol (of any
type) in L is also a symbol (of the same type) in L′. In this paper, the main case
is when P(L) = P(L′), R(L) = R(L′), and K(L) ⊆ K(L′); that is, L′ arises by
adding constants to L.
A theory in a language is just a set of sentences in it. Given a theory Γ in
a language L, and a theory Γ∗ in an extension L′ ⊇ L, we say that Γ∗ is a
conservative extension of Γ if for every ϕ ∈ L, if Γ∗ ⊢ ϕ, then Γ ⊢ ϕ.
Proof. For (1), suppose that Γ contains ∃(c, d) and that Γ + c(j) + d(j) ⊢ ϕ.
Let Π be a derivation tree. Replace the constant j by an individual variable x
which does not occur in Π. The result is still a derivation tree, except that the
leaves are not labeled by sentences. (The reason is that our proof system has
no rules specifically for constants, only for terms which might be constants and
also might be individual variables.) Call the resulting tree Π′. Now the following
proof tree shows that Γ ⊢ ϕ:
\[
\dfrac{\exists(c,d)\qquad \begin{array}{c}[c(x)]\quad [d(x)]\\ \vdots\\ \varphi\end{array}}{\varphi}\,\exists E
\]
The subtree on the right is Π′. The point is that the occurrences of c(x) and
d(x) have been canceled by the use of ∃E at the root.
This completes the proof of the first assertion, and the proof of the second is
similar.
Proof. This is a routine argument, using Lemma 2. One dovetails the addition
of constants needed for the Henkin property with the addition of sentences
needed to ensure maximal consistency. The formal details would use Lemma 2
for steps of the first kind; for steps of the second kind we need to know that
if Γ is consistent, then for all ϕ, either Γ + ϕ or Γ + ϕ̄ is consistent. This follows
from the derivable rule of proof by cases; see Example 5 in Section 2.3.
The last point in Lemma 3 states a technical property that will be useful in
Section 3.1.
It might be worthwhile noting that the extensions produced by Lemma 3 add
infinitely many constants to the language.
Let N = {[k] : k ∈ K(L)} × {∀, ∃}. (We use ∀ and ∃ as tags to give two copies
of the quotient K/≡.) We endow N with an L-structure as follows:
Before going on, we note that the first of the two alternatives in the definition
of [[s]](([j], Q), ([k], Q′)) is independent of the choice of representatives of
equivalence classes. And clearly so is the second alternative.
We shall write N for the resulting L-structure, hiding the dependence on Γ
and Γ∗.
Lemma 6. For all c ∈ Sub(Γ), [[c]] = {([j], Q) : c(j) ∈ Γ∗ and Q ∈ {∀, ∃}}.
Proof. By induction on set terms c. We are not going to present any of the
details here because in Lemma 10 below, we shall see all the details on a more
involved result.
Lemma 7. N |= Γ .
Proof. Again we are only highlighting a few details, since the full account is
similar to what we saw in Lemma 5, and to what we shall see in Lemma 11. One
would check the sentence types in turn, using Lemma 6 frequently. We want to
go into details concerning sentences in Γ of the form s(j, k) or s̄(j, k). Recall that
we are dealing in this result with sentences of L, and so j and k are constant
symbols of that language. Also recall that [[j]] = ([j], ∃), and similarly for k.
First, consider sentences in Γ of the form s(j, k). By the definition of [[s]], we
have
[[s]](([j], ∃), ([k], ∃)).
By the way binary atoms and constants are interpreted in N, we have N |= s(j, k),
as desired.
We conclude with the consideration of a sentence in Γ of the form s̄(j, k). We
wish to show that N |= s̄(j, k). Suppose towards a contradiction that N |= s(j, k).
Then we have [[s]](([j], ∃), ([k], ∃)). There are two possibilities, corresponding to
the alternatives in the semantics of s. The first is when there is a set term c such
that Γ∗ contains c(k) and ∀(c, s)(j). By (∀E), Γ∗ then contains s(j, k). But
recall that Γ contains s̄(j, k). So in this alternative, Γ∗ ⊇ Γ is inconsistent. In
the second alternative, there are j∗ ≡ j and k∗ ≡ k such that s(j∗ , k∗ ) ∈ Γ ∗ . But
recall that the equivalence classes of constant symbols from the base language L
are singletons. Thus in this alternative, j∗ = j and k∗ = k; hence s(j, k) ∈ Γ ∗ .
But then again Γ ∗ is inconsistent, a contradiction.
Complexity notes. Theorem 2 implies that the satisfiability problem for our lan-
guage is in NExpTime. We can improve this to an ExpTime-completeness result
by quoting the work of others. Pratt-Hartmann [14] defined a certain logic E2 and
showed that the complexity of its satisfiability problem is ExpTime-complete.
E2 corresponds to a fragment of first-order logic, and it is somewhat bigger than
the language L. (It would correspond to adding converses to the binary atoms
in L, as we mention at the very end of this paper.) Since satisfiability for E2 is
ExpTime-complete, the same problem for L is in ExpTime.
A different way to obtain this upper bound is via the embedding into Boolean
modal logic which we saw in Section 2.1. For this, see Theorem 7 of Lutz and
Sattler [9]. We shall use an extension of that result below in connection with an
extension L(adj) of L.
The ExpTime-hardness for L follows from Lemma 6.1 in [15]. That result
dealt with a language called R† , and R† is a sub-language of L.
Syntax and semantics. We start with four pairwise disjoint sets A (for compar-
ative adjective phrases) and the three that we saw before: P, R, and K. We use
a as a variable to range over A in our statement of the syntax and the rules.
Proof system. We adopt the same proof system as in Figure 3, but with one
addition. This addition is the rule for transitivity:
\[
\dfrac{a(t_1,t_2)\qquad a(t_2,t_3)}{a(t_1,t_3)}\,\mathrm{trans}
\]
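Operationally, (trans) just saturates the derivable a-facts under composition. A small sketch (Python; the triple encoding and the name transitive_consequences are our own illustration):

```python
# Saturating a set of comparative-adjective facts under the (trans) rule:
# from a(t1, t2) and a(t2, t3), add a(t1, t3) until a fixed point.
# Facts are (adjective, s, t) triples.
from itertools import product

def transitive_consequences(facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for (a1, s, t), (a2, u, v) in product(list(facts), list(facts)):
            if a1 == a2 and t == u and (a1, s, v) not in facts:
                facts.add((a1, s, v))
                changed = True
    return facts

facts = {("bigger", "x", "y"), ("bigger", "y", "z")}
print(("bigger", "x", "z") in transitive_consequences(facts))  # True
```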
and ϕ is
∀(∃(sw, bigger), ∀(kq, bigger)).
(We are going to use kq as an abbreviation of kumquat for typographical
convenience, and similarly for sw and sweet.) In Example 6, we saw that Γ ⊢
∀(sweet, ∀(kq, bigger)). (Recall that we saw this with a derivation in a different
format, in Figure 5. This could be converted to our official format of natural
deduction trees.) That work used R = {bigger}, but here we want R = ∅ and
A = {bigger}. The same derivation works, of course. Transitivity enables us to
obtain a derivation for (1):
\[
\dfrac{\dfrac{[\exists(sw,bigger)(x)]^{3}\qquad
\dfrac{\dfrac{[bigger(x,y)]^{2}\qquad
\dfrac{[kq(z)]^{1}\qquad
\dfrac{[sw(y)]^{2}\qquad \begin{array}{c}\vdots\\ \forall(sw,\forall(kq,bigger))\end{array}}
{\forall(kq,bigger)(y)}\,\forall E}
{bigger(y,z)}\,\forall E}
{bigger(x,z)}\,\mathrm{trans}}
{\forall(kq,bigger)(x)}\,\forall I\,1}
{\forall(kq,bigger)(x)}\,\exists E\,2}
{\forall(\exists(sw,bigger),\forall(kq,bigger))}\,\forall I\,3
\]
Adding the transitivity rule gives a sound and complete proof system for the
semantic consequence relation Γ |= ϕ. The soundness is easy, and so we only
j = j0 ≡ k0 , j1 ≡ k1 , ..., jn ≡ kn = k (4)
and so we have ∃(c, a)(j1 ). Since a(k0 , j1 ), we easily have ∃(c, a)(k0 ) by transi-
tivity. And as j0 ≡ k0 , we have ∃(c, a)(j0 ).
The second assertion is also proved by induction on n ≥ 1. For n = 1, we
have j = j0 ≡ k0, Γ∗ contains a(k0, j1); and j1 ≡ k1 = k. Then since ≡ is
the identity on K(L), j = j0 = k0, and j1 = k1 = k. Hence Γ∗ contains a(j, k).
Assuming our result for n, we again consider a chain as in (4) of length n + 1.
Just as before, j = j0 = k0 , and so Γ ∗ contains a(j, j1 ). By induction hypothesis,
Γ ∗ contains a(j1 , k). By transitivity, Γ ∗ contains a(j, k).
Once again, we suppress Γ and Γ ∗ and simply write N for the resulting L-
structure.
Proof. In this proof and the next, we are going to use l to stand for a constant
symbol, even though earlier in the paper we used it for a literal. Assume that
(5) holds. Clearly we have the first requirement concerning [[a]]: if ∀(c, a)(l) ∈ Γ∗, then also
∀(c, a)(j) ∈ Γ∗.
We have four cases, depending on the reasons for the two assertions in (5).
Case 1. There is a set term b such that Γ∗ contains b(k) and ∀(b, a)(j), and
there is also a set term c such that Γ∗ contains c(l) and ∀(c, a)(k). By (1), Γ∗
contains c(l) and ∀(c, a)(j). And so we have requirement (2a) concerning [[a]] for
([j], Q) and ([l], Q′).
Case 2. There is a set term b such that Γ∗ contains b(k) and ∀(b, a)(j), and k
reaches l. Note that a(j, k). So j reaches l.
Case 3. j reaches k by a chain of ≡ and a statements, and there is a set term
c such that Γ∗ contains c(l) and ∀(c, a)(k). Then a(k, l). And so j reaches l.
Case 4. j reaches k, and k reaches l. Then concatenating the chains shows that
j reaches l.
Lemma 10. For all c ∈ Sub(Γ), [[c]] = {([j], Q) : c(j) ∈ Γ∗ and Q ∈ {∀, ∃}}.
∀(c, s): Suppose that ∀(c, s) ∈ Sub(Γ), so that c ∈ Sub(Γ) as well. We prove that
[[∀(c, s)]] = {([j], Q) : ∀(c, s)(j) ∈ Γ∗ and Q ∈ {∀, ∃}}.
Let ([j], Q) ∈ [[∀(c, s)]]. We shall show that ∀(c, s)(j) ∈ Γ∗. If not, then by
maximal consistency, ∃(c, s̄)(j) ∈ Γ∗. By the Henkin property, let k be such
that Γ∗ contains c(k) and s̄(j, k). By induction hypothesis, ([k], ∀) ∈ [[c]]. And
so [[s]](([j], Q), ([k], ∀)). Thus there is a set term b such that Γ∗ contains b(k) and
∀(b, s)(j). From these, Γ∗ contains s(j, k). And thus Γ∗ is inconsistent. This
contradiction shows that indeed ∀(c, s)(j) ∈ Γ∗.
In the other direction, suppose that ([j], Q) is such that ∀(c, s)(j) ∈ Γ∗. Let
([k], Q′) ∈ [[c]], so by induction hypothesis, c(k) ∈ Γ∗. By the way we interpret
binary relations in N, [[s]](([j], Q), ([k], Q′)). Since this holds for all ([k], Q′) ∈ [[c]],
we have ([j], Q) ∈ [[∀(c, s)]].
∃(c, s): Suppose that ∃(c, s) ∈ Sub(Γ), so that c ∈ Sub(Γ) as well. Let ([j], Q) ∈
[[∃(c, s)]]. Let k and Q′ be such that [[c]]([k], Q′) and [[s]](([j], Q), ([k], Q′)). By
induction hypothesis, c(k) ∈ Γ∗. First, let us consider the case when Q′ = ∀. Let b
be such that Γ∗ contains b(k) and ∀(b, s)(j). Using (∀E), we have Γ∗ ⊢ ∃(c, s)(j).
And as Γ∗ is closed under deduction, ∃(c, s)(j) ∈ Γ∗ as desired. The more
interesting case is when Q′ = ∃, so that for some j∗ ≡ j and k∗ ≡ k, Γ∗ contains
s(j∗, k∗). Since c(k) ∈ Γ∗ and k ≡ k∗, we have c(k∗) ∈ Γ∗. Then using (∃I), we see
that ∃(c, s)(j∗) ∈ Γ∗. Since j ≡ j∗, once again we have ∃(c, s)(j) ∈ Γ∗.
Conversely, suppose that ∃(c, s)(j) ∈ Γ∗. By the Henkin property, let k be such
that c(k) and s(j, k) belong to Γ∗. Then [[s]](([j], Q), ([k], ∃)), and by induction
hypothesis, ([k], ∃) ∈ [[c]]. Hence ([j], Q) ∈ [[∃(c, s)]].
Since Γ∗ is closed under deduction, we see that indeed ∀(b, a)(j) ∈ Γ∗. Going on,
we see from the structure of N that [[a]](([j], Q), ([k], Q′)). Since this holds for all
([k], Q′) ∈ [[c]], we conclude that ([j], Q) ∈ [[∀(c, a)]].
Once again, this gives us the finite model property for L(adj). The result is not
interesting from a complexity-theoretic point of view, since we could already see
from Lutz and Sattler [9] that the logic has an ExpTime satisfiability problem.
This paper has provided two logical systems, L and L(adj), along with semantics.
We presented proof systems in the format of natural deduction, and in both cases
we have completeness theorems and the finite model property. The semantics of
the language allows us to translate some natural language sentences into the
languages faithfully.
Set terms in the sense of this paper come from McAllester and Givan [10],
where they are called class terms. That paper was probably the first to present an
infinite fragment relevant to natural language and to study its logical complexity.
The language of [10] did not have negation, and they showed that satisfiability
is NP-complete. The language of [10] is included in the language R∗ of Pratt-
Hartmann and Moss [15]; the difference is that R∗ has “a small amount” of
negation. Yet more negation is found in the language R∗† of [15]. This fragment
has binary and unary atoms and negation. It is equivalent in most respects to the
language L of this paper, but there are two small differences. First, here we have
added constant symbols. In addition to making the system more expressive, the
reason for adding constants is in order to present the Henkin-style completeness
proof in Section 2.5. The other change is that R∗† does not allow recursively
defined set terms, only “flat” terms. However, from the point of view of decid-
ability and complexity, this change is really minor: one may add new symbols to
flatten a sentence, at the small cost of adding new sentences. The flat version is
also essentially the same as the language E2 of Pratt-Hartmann [14].
The decidability of L(adj) follows from known results on Boolean modal log-
ics [9], but the finite model property appears to be new here.
Proof systems for fragments that are weaker than L appear in [15]. These proof
systems are syllogistic; there are no variables or what we have in this paper
called general sentences. Modulo complexity hypotheses, the proof systems of
this paper are the first ones which are complete and go beyond the capabilities
of syllogistic proof systems. At the same time, they are decidable and useable.
For example, we have seen how the inference in (1) in the Introduction is handled
in our system; see Examples 6 and 7.
The use of natural deduction proofs in connection with natural language is
very old, going back to Fitch [4]. Fitch’s paper does not deal with a formalized
fragment, and so it is not possible to even ask about questions like completeness
and decidability. Also, the phenomena of interest in the paper went beyond what
we covered here. We would like to think that the methods of this paper could
eventually revive interest in Fitch’s proposal by giving it a foundation.
Francez and Dyckhoff [5] propose a proof-theoretic semantics for natural lan-
guage. Their far-ranging proposal goes beyond what we can discuss in this paper.
We only want to mention that our proof rules bear some similarity to theirs.
Their system had no recursive constructs and also no negative determiners, but
it went beyond ours in covering both readings of scope-ambiguous simple sen-
tences. Since our motivation was not proof-theoretic in this paper, we did not
investigate proof-theoretic properties of our system. But it would be interesting
to do so.
It is of interest to go further in order to render more of natural language infer-
ence in complete and decidable logical systems. One next step would be to add
converses to the binary atoms in order to express simple sentences beyond what
we have seen. For example, writing see⁻¹ for the inverse of see, we could render
Every girl sees Mary as ∀(girl, see⁻¹)(Mary). It is possible to extend our work in
such a way as to incorporate these converses. The logic would be axiomatized on
top of L by adding the rule deriving r⁻¹(t, u) from r(u, t). But this is only one
of the many things to do.
Acknowledgments
References
Choiceless Computation and Symmetry
Benjamin Rossman
1 Introduction
Is there a “logic” capturing exactly the polynomial-time computable properties
of finite structures? This question was raised by Gurevich [9] in the mid 80s,
Supported by the NSF Graduate Research Fellowship.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 565–580, 2010.
© Springer-Verlag Berlin Heidelberg 2010
566 B. Rossman
nearly a decade after Fagin [8] showed that the NP properties of finite structures
are precisely what can be defined in existential second-order logic. Today this
question remains a central open problem in finite model theory.
Addressing this question, Blass, Gurevich and Shelah [3,4] introduced a
logic/complexity class known as CPT+C or Choiceless Polynomial Time with
Counting. CPT+C is based on a model of computation known as BGS machines
(after the inventors). BGS machines operate directly on structures in a manner
that preserves symmetry at every step of a computation. By contrast, Turing
machines encode structures as strings. This encoding violates the symmetry of
structures like graphs (which might possess nontrivial automorphisms) by im-
posing a linear order on elements. Note that Turing machines are able to exploit
this linear order to efficiently choose an element from any set constructed in the
course of a computation. Thus, it is not uncommon in the high-level description
of an algorithm (say, the well-known Augmenting Paths algorithm for Bipar-
tite Matching) to read something along the lines of “let w be any unmatched
neighbor of the vertex v”. A description like this implicitly carries a claim that
the ultimate outcome of the computation will not depend on the choice of
w. However, the validity of such claims cannot be taken for granted: by Rice’s
Theorem, encoding-invariance is an undecidable property of Turing machines.
The BGS machine model of computation is said to be “choiceless” because
it disallows choices which violate the inherent symmetry of the input structure.
Pseudo-instructions of the form “let i be an arbitrary element of the set I” are
forbidden. Similarly, “let w be the first neighbor of v” is meaningless (unless
referring to an explicitly constructed linear order on vertices). The inability of
BGS machines to choose is compensated by parallelism (the power to explore
all choices in parallel) and the machinery of set theory (the power to build sets
using comprehension).
BGS machines may in fact be viewed as the syntactic elements (i.e., formulas)
of a logic, whose semantics is well-defined on any structure. One rough descrip-
tion BGS logic is:
2 Definitions
We begin by defining hereditarily finite expansions of structures in §2.1. We
then define BGS logic and classes CPT and CPT+C in §2.2. The definition of
BGS logic presented here differs from the BGS machines of Blass, et al. [3], but
classes CPT and CPT+C are exactly the same. BGS logic has a bare-bones
syntax that is well-suited for induction, whereas the BGS machines of [3] have
an intuitive and attractive syntax (borrowing from Abstract State Machines)
that is recommended for the actual description of CPT+C algorithms (see [3]
for examples). Let us also mention that CPT(+C) is elsewhere [3,4,7] written
as C̃PT(+C), the tilde over the C evoking the “less” in “choiceless”.
(i.e., elements of HF(A) which are sets) are called hereditarily finite sets
(h.f. sets) over A. (Note that if A is finite, then HF(A) is the countable
union A ∪ ℘(A) ∪ ℘(A ∪ ℘(A)) ∪ ℘(A ∪ ℘(A) ∪ ℘(A ∪ ℘(A))) ∪ · · · .)
2. The rank of a h.f. object x is defined as 0 if x ∈ A ∪ {∅}, and as
1 + max_{y∈x} rank(y) otherwise.
3. A h.f. set x is transitive if y ⊆ x for all y ∈ x such that y is a set. The
transitive closure TC(x) of a h.f. object x is the (unique) smallest transitive
set containing x as an element.
4. The finite von Neumann ordinals κi for i < ω are elements of HF(∅) defined
by κ0 = ∅ and κi+1 = {κ0 , . . . , κi }.
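The definitions above can be mirrored directly with nested frozensets. The following Python sketch (our own modeling choice, not from the paper) computes rank, transitive closure, and the von Neumann ordinals:

```python
# Hereditarily finite objects over a set of atoms, modeled as strings
# (atoms) and frozensets (h.f. sets). Names are illustrative.

def rank(x):
    """0 for atoms and for the empty set; 1 + max rank of members otherwise."""
    if isinstance(x, str) or x == frozenset():
        return 0
    return 1 + max(rank(y) for y in x)

def tc(x):
    """Transitive closure: the smallest transitive set with x as an element."""
    out, frontier = {x}, [x]
    while frontier:
        y = frontier.pop()
        if isinstance(y, frozenset):
            for z in y:
                if z not in out:
                    out.add(z)
                    frontier.append(z)
    return out

def ordinal(i):
    """The von Neumann ordinal kappa_i: kappa_0 = {} and
    kappa_{i+1} = {kappa_0, ..., kappa_i}."""
    ords = []
    for _ in range(i + 1):
        ords.append(frozenset(ords))
    return ords[-1]

print(rank(ordinal(3)))     # 3
print(len(tc(ordinal(2))))  # 3: the ordinal itself, {∅}, and ∅
```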
Just like first-order logic, BGS logic (and its weaker cousin BGS− logic) are
defined with respect to a fixed signature σ. Similarly, BGS logic has terms and
formulas. However, BGS logic has an additional syntactic element called pro-
grams (which compute a term t(t(t(...t(∅) . . . ))) iteratively using a “step” term
t(·) until some “halting” formula is satisfied, whereupon an “output” term is
computed).
We remark that BGS logic has the ability to carry out bounded existential and
universal quantification in HF(A) (and thus subsumes first-order logic over the
base structure A). To see this, note that (∃v ∈ t) φ(v) is equivalent to the formula
{∅ : v ∈ t : φ(v)} = {∅} (pedantically, we should write EmptySet instead of ∅ and
Pair(EmptySet, EmptySet) instead of {∅}). Similarly, (∀v ∈ t) φ(v) is equivalent
to the formula {∅ : v ∈ t : ¬φ(v)} = ∅.
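The two equivalences can be checked concretely over frozensets; the helper below (an illustration of ours, with compr standing in for the comprehension term {∅ : v ∈ t : φ(v)}):

```python
# Bounded quantifiers via comprehension, over frozensets.

EMPTY = frozenset()

def compr(t, phi):
    """The comprehension term {EmptySet : v in t : phi(v)}."""
    return frozenset(EMPTY for v in t if phi(v))

def exists_in(t, phi):          # (exists v in t) phi(v)
    return compr(t, phi) == frozenset({EMPTY})

def forall_in(t, phi):          # (forall v in t) phi(v)
    return compr(t, lambda v: not phi(v)) == EMPTY

t = frozenset({EMPTY, frozenset({EMPTY})})   # the set {∅, {∅}}
print(exists_in(t, lambda v: v == EMPTY))    # True: ∅ is in t
print(forall_in(t, lambda v: v == EMPTY))    # False: {∅} is also in t
```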
We now define the crucial resource by which we measure the complexity of
BGS programs. Informally, a h.f. object x ∈ HF(A) is active for the operation
of a program Π on a structure A if x is the value of any term involved in the
computation of Π on A (until a halt state is reached). By setting a polynomial
bound on the number of active objects, as well as requiring programs to halt on
every input, we arrive at classes CPT and CPT+C.
Definition 6. [Active Objects and Classes CPT and CPT+C] As in the definition of [[·]]^A, we omit the superscript from the active-element operator ⟨·⟩^A
(defined below) when the structure A is clear from context (as below).

⟨¬φ⟩ = ⟨φ⟩,  ⟨φ ∧ ψ⟩ = ⟨φ ∨ ψ⟩ = ⟨φ⟩ ∪ ⟨ψ⟩,  ⟨t1 = t2⟩ = ⟨t1⟩ ∪ ⟨t2⟩;
Choiceless Computation and Symmetry 571
⟨{s(v) : v ∈ t : φ(v)}⟩ = [[{s(v) : v ∈ t : φ(v)}]] ∪ ⟨t⟩ ∪ ⋃_{x∈[[t]]} (⟨φ(x)⟩ ∪ ⟨s(x)⟩).

⟨{s(v) : v ∈ t : φ(v)}⟩ = [[{s(v) : v ∈ t : φ(v)}]] ∪ ⟨t⟩ ∪ ⋃_{x∈[[t]]} ⟨φ(x)⟩ ∪ ⋃_{x∈[[t]] : [[φ(x)]]=True} ⟨s(x)⟩.
For the purposes of this paper, either definition is fine (i.e., all results hold just the
same).
Theorem 2 uses the fact that least-fixed-point logic LFP is a subclass of CPT
(this is shown in [3]) and that LFP = P on structures with a built-in linear
order [10,12].
Theorem 3. The class PARITY of finite sets of even cardinality is not definable
in CPT.
The following lemma gives a condition equivalent to the (k, r)-support property.
Lemma 1. G has the (k, r)-support property if, and only if, every kr-supported
subgroup with index ≤ n^k is k-supported.

Proof. (⇒) Suppose G has the (k, r)-support property and H is kr-supported
and [G : H] ≤ n^k. Let B ⊆ A be a support for H of size |B| ≤ kr. Fix an arbitrary
partition B = B1 ∪ · · · ∪ Br into sets of size |Bi| ≤ k. For i ∈ {1, . . . , r}, let
Hi = Stab•(Bi) and note that Hi is (k, r)-constructible (since every k-supported
subgroup is (k, r)-constructible). Since H1 ∩ · · · ∩ Hr = Stab•(B) ⊆ H and
[G : H] ≤ n^k, it follows that H is (k, r)-constructible, and hence k-supported
by the (k, r)-support property.
(⇐) Suppose that every kr-supported subgroup with index ≤ n^k is k-supported.
To show that every (k, r)-constructible subgroup is k-supported, assume
H1, . . . , Hr are k-supported and H1 ∩ · · · ∩ Hr ⊆ H and [G : H] ≤ n^k. It
suffices to show that H is k-supported. But note that H is kr-supported (the
union of the supports for H1, . . . , Hr of size ≤ k is a support for H). So H is
k-supported by assumption.
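For intuition, pointwise stabilizers and supports can be brute-forced in a tiny symmetric group. The sketch below (Python; the setup in S_4 and the names pt_stab, smallest_support are our own illustration) finds the smallest support of the subgroup fixing one atom:

```python
# Pointwise stabilizers and supports, brute-forced in S_4 acting on
# A = {0, 1, 2, 3}. A set B supports H when Stab•(B) ⊆ H.
from itertools import permutations, combinations

A = (0, 1, 2, 3)
G = list(permutations(A))

def pt_stab(B):
    """Pointwise stabilizer Stab•(B): permutations fixing B elementwise."""
    return {g for g in G if all(g[b] == b for b in B)}

def smallest_support(H):
    for size in range(len(A) + 1):
        for B in combinations(A, size):
            if pt_stab(B) <= H:
                return set(B)

H = pt_stab({0})            # the subgroup fixing the atom 0
print(smallest_support(H))  # {0}
```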
The proof follows after two lemmas. The first lemma states that the semantics of
terms and formulas respects automorphisms of A in the expected way.
² Recall that the index of H in G is defined by [G : H] = |G|/|H|.
Lemma 2. Let γ(v1, . . . , vℓ) be any term or formula of BGS logic. For every
structure A, automorphism α ∈ Aut(A), and elements x1, . . . , xℓ ∈ HF(A),

\[ [[\gamma(\alpha x_1, \ldots, \alpha x_\ell)]] = \alpha\,[[\gamma(x_1, \ldots, x_\ell)]]. \]
Proof. The proof is by induction on terms. The base cases are when t is a
constant symbol or a variable; both cases are trivial. For the induction step, we
consider the various types of term constructs (see Definition 5(1)), namely when
t is:
(i) f(t1(v̄), . . . , tm(v̄)) where f is an m-ary function symbol in the signature
of A,
(ii) Pair(t1 (v̄), t2 (v̄)),
(iii) TheUnique(t1 (v̄)),
(iv) Union(t1 (v̄)), or
(v) {s(v̄, w) : w ∈ t1 (v̄) : ϕ(v̄, w)} (i.e., a comprehension term with subterms
t1 (v̄) and s(v̄, w) and subformula ϕ(v̄, w)).
That is, in each case we assume that the lemma holds for subterms t1 , t2 , . . .
(as well as s in case (v)) and prove that [[t(x̄)]] is (k, r)-constructible. For this,
it is sufficient to show: first, that every element of [[t(x̄)]] is (k, r)-constructible;
and second, that Stab([[t(x̄)]]) is a (k, r)-constructible subgroup of Aut(A) (using
Lemma 2).
As for the first claim that every element of [[t(x̄)]] is (k, r)-constructible, we
consider cases (i)–(v) separately. Note that every subterm ti of t has variable rank
≤ r and satisfies |{αy : y ∈ ⟨ti(x̄)⟩, α ∈ Aut(A)}| ≤ n^k since ⟨ti(x̄)⟩ ⊆ ⟨t(x̄)⟩;
therefore, [[ti(x̄)]] is (k, r)-constructible by the induction hypothesis.
◦ Case (i): [[f (t1 (x̄), . . . , tm (x̄))]] is either an atom (if [[t1 (x̄)]], . . . , [[tm (x̄)]] are
all atoms) or ∅ (otherwise). In either case, [[f (t1 (x̄), . . . , tm (x̄))]] is (k, r)-
constructible.
◦ Case (ii): By the induction hypothesis, [[t1 (x̄)]] and [[t2 (x̄)]] are both (k, r)-
constructible.
◦ Case (iii): If [[t1 (x̄)]] is not a singleton, then [[TheUnique(t1 (x̄))]] = ∅ and
hence is (k, r)-constructible. So assume [[t1 (x̄)]] is a singleton {y}. By the
induction hypothesis, [[t1 (x̄)]] is (k, r)-constructible. By transitivity of (k, r)-
constructibility, y is (k, r)-constructible.
for some finite m. Let t denote this term Πout(Πstep(. . . Πstep(∅) . . . )). Whatever
m happens to be, t is a ground term with variable rank ≤ r. By Lemma 2, t
is fixed by all automorphisms of A (i.e., Stab(t) = Aut(A)). Thus,

{αy : y ∈ ⟨t⟩, α ∈ Aut(A)} = {y : y ∈ ⟨t⟩} ⊆ Active(Π, A).

Since |Active(Π, A)| ≤ n^k (by definition of BGS(n^k)), we have |{αy : y ∈
⟨t⟩, α ∈ Aut(A)}| ≤ n^k. Therefore, Π(A) is (k, r)-constructible by Lemma 3.
In the next two sections, we will use Proposition 1 to prove that CPT+C pro-
grams cannot activate certain h.f. objects over “naked” sets and vector spaces.
5 PARITY ∉ CPT
We denote by [n] the “naked” set {1, . . . , n} viewed as structure in the empty
signature. Let PARITY denote the class of naked sets with even cardinality (i.e.,
the “language” of empty sets). Earlier we stated the result of Blass et al. from
[3] that PARITY ∉ CPT (Theorem 3). A key step in the proof is the following
so-called Support Theorem (Theorem 24 of [3]).
Theorem 5. For every Π ∈ CPT, there is a constant c such that for all sufficiently
large n, every object in Active(Π, [n]) has a support of cardinality ≤ c.
Proposition 2. For n > 2kr, the symmetric group Sn has the (k, r)-support
property.
We remark that Skr+1 fails to have the (k, r)-support property, as the alternating
subgroup is (k, r)-constructible but not k-supported (its smallest support has
size kr). The following lemma and corollary from [3] are also used in the original
proof of Theorem 5. We include proofs for completeness.
We now give our proof of Proposition 2 (which bypasses the lengthy combinato-
rial argument in [3]).
Let V be a finite vector space over a fixed finite field F . We view V as a structure
with binary operation + and unary operations for scalar multiplication by each
element of F . Let H(V ) denote the set of hyperplanes in V . Note that H(V ) is
an element of HF(V ) (in particular, H(V ) is a set of subsets of V ).
The task of computing H(V) given V is a polynomial-time function problem
(as opposed to a decision problem) in the usual sense of complexity theory (H(V)
has a polynomial-size description as a hereditarily finite object, i.e., |TC(H(V))| =
O(poly(|V|))). Moreover, H(V) is an invariant of V (i.e., not depending on any
extrinsic linear order on V). It is thus reasonable to ask whether any CPT+C
program computes the operation V −→ H(V).
We remark that the results of this section hold just the same for the operation
V −→ V∗ of computing from V the dual space V∗ of linear functions V −→ F
(suitably represented as an element of HF(V ∪ F)).
\[
\#\{d\text{-dimensional subspaces of }V\}
= \prod_{i=0}^{d-1}\frac{q^{n-i}-1}{q^{i+1}-1}
\;\geq\; \prod_{i=0}^{d-1} q^{n-2i-2}
= q^{dn-2\binom{d}{2}-2d}
\;\geq\; q^{dn-(\sqrt{n}-1)(\sqrt{n}-2)-2\sqrt{n}}
= q^{(d-1)n+\sqrt{n}-2}
> |V|^{d-1}.
\]
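The product formula at the start of this chain can be sanity-checked by brute force for q = 2 and small n (Python; the bit-mask encoding of vectors and the names span, count_subspaces, gaussian are our own):

```python
# Count the d-dimensional subspaces of F_2^n by brute force (vectors as
# n-bit masks) and compare with prod_{i<d} (q^(n-i)-1)/(q^(i+1)-1).
from itertools import combinations

def span(vecs):
    """All F_2-linear combinations of the given bit-mask vectors."""
    s = {0}
    for v in vecs:
        s |= {x ^ v for x in s}
    return frozenset(s)

def count_subspaces(n, d):
    nonzero = range(1, 2 ** n)
    return len({span(vs) for vs in combinations(nonzero, d)
                if len(span(vs)) == 2 ** d})

def gaussian(n, d, q=2):
    out = 1
    for i in range(d):
        out = out * (q ** (n - i) - 1) // (q ** (i + 1) - 1)
    return out

print(count_subspaces(4, 2), gaussian(4, 2))  # 35 35
```

The sequential multiply-then-divide in gaussian stays in exact integers because each partial product is itself a Gaussian binomial coefficient.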
References
1. Blass, A., Gurevich, Y.: Strong extension axioms and Shelah’s zero-one law for
choiceless polynomial time. Journal of Symbolic Logic 68(1), 65–131 (2003)
2. Blass, A., Gurevich, Y.: A quick update on the open problems in Blass-Gurevich-
Shelah’s article. On polynomial time computations over unordered structures (December
2005),
http://research.microsoft.com/en-us/um/people/gurevich/Opera/150a.pdf
3. Blass, A., Gurevich, Y., Shelah, S.: Choiceless polynomial time. Annals of Pure
and Applied Logic 100(1–3), 141–187 (1999)
4. Blass, A., Gurevich, Y., Shelah, S.: On polynomial time computation over un-
ordered structures. Journal of Symbolic Logic 67(3), 1093–1125 (2002)
5. Cai, J.-Y., Fürer, M., Immerman, N.: An optimal lower bound on the number of
variables for graph identification. Combinatorica 12(4), 389–410 (1992)
6. Chandra, A., Harel, D.: Structure and complexity of relational queries. Journal of
Computer and System Sciences 25, 99–128 (1982)
7. Dawar, A., Richerby, D., Rossman, B.: Choiceless polynomial time, counting and
the Cai-Fürer-Immerman graphs. Annals of Pure and Applied Logic 152, 31–50
(2008)
8. Fagin, R.: Generalized first-order spectra and polynomial-time recognizable sets.
In: Karp, R.M. (ed.) Complexity of Computation. SIAM-AMS Proceedings, vol. 7,
pp. 43–73 (1974)
9. Gurevich, Y.: Toward logic tailored for computational complexity. In: Richter,
M.M., et al. (eds.) Computation and Proof Theory. Springer Lecture Notes in
Mathematics, pp. 175–216. Springer, Heidelberg (1984)
10. Immerman, N.: Relational queries computable in polynomial time. Information and
Control 68(1–3), 86–104 (1986)
11. Shelah, S.: Choiceless polynomial time logic: Inability to express. In: Clote, P.G.,
Schwichtenberg, H. (eds.) CSL 2000. LNCS, vol. 1862, pp. 72–125. Springer, Hei-
delberg (2000)
12. Vardi, M.Y.: The complexity of relational query languages. In: Proc. 14th ACM
Symp. on Theory of Computing, pp. 137–146 (1982)
Hereditary Zero-One Laws for Graphs
S. Shelah and M. Doron
Abstract. We consider the random graph Mp̄n on the set [n], where
the probability of {x, y} being an edge is p|x−y|, and p̄ = (p1 , p2 , p3 , ...)
is a series of probabilities. We consider the set of all q̄ derived from p̄ by
inserting 0 probabilities into p̄, or alternatively by decreasing some of
the pi . We say that p̄ hereditarily satisfies the 0-1 law if the 0-1 law (for
first order logic) holds in Mq̄n for every q̄ derived from p̄ in the relevant
way described above. We give a necessary and sufficient condition on p̄
for it to hereditarily satisfy the 0-1 law.
1 Introduction
In this paper we will investigate the random graph on the set [n] = {1, 2, ..., n}
where the probability of a pair i ≠ j ∈ [n] being connected by an edge depends
only on their distance |i − j|. Let us define:
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 581–614, 2010.
c Springer-Verlag Berlin Heidelberg 2010
582 S. Shelah and M. Doron
If L is some logic, we say that Mp̄n satisfies the 0-1 law for the logic L if for each
sentence ψ ∈ L the probability that ψ holds in Mp̄n tends to 0 or 1, as n ap-
proaches ∞. The relations between properties of p̄ and the asymptotic behavior
of Mp̄n were investigated in [2]. It was proved there that for L, the first order
logic in the vocabulary with only the adjacency relation, we have:
Theorem 2. 1. Assume p̄ = (p1 , p2 , ...) is such that 0 ≤ pi < 1 for all i > 0,
and let fp̄ (n) := log( ∏_{i=1}^{n} (1 − pi ))/ log(n). If lim_{n→∞} fp̄ (n) = 0 then Mp̄n
satisfies the 0-1 law for L.
2. The demand above on fp̄ is the best possible. Formally, for each ε > 0 there
exists some p̄ with 0 ≤ pi < 1 for all i > 0 such that |fp̄ (n)| < ε but the 0-1
law fails for Mp̄n .
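For intuition, the model is easy to simulate directly; below is a minimal sketch (the function name and representation are ours, not the paper's): each pair {x, y} of [n] is included as an edge independently with probability p_{|x−y|}, taking pl = 0 beyond the given finite prefix of p̄.

```python
import random

def sample_graph(n, p, rng=random):
    """Sample the random graph on [n] = {1,...,n}: the pair {x, y} is an
    edge with probability p[|x - y| - 1]; distances beyond len(p) get
    probability 0.  Returns the edge set as pairs (x, y) with x < y."""
    edges = set()
    for x in range(1, n + 1):
        for y in range(x + 1, n + 1):
            d = y - x
            prob = p[d - 1] if d <= len(p) else 0.0
            if rng.random() < prob:
                edges.add((x, y))
    return edges

# With p = (1), every pair at distance 1 is an edge and nothing else:
print(sample_graph(5, [1.0]))
```

With the degenerate sequence p̄ = (1, 0, 0, ...) the sample is deterministically the path 1–2–3–4–5, which makes the deterministic edges discussed later in the paper (the case pl = 1) easy to experiment with.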
Part (1) above gives a sufficient condition on p̄ for the 0-1 law to hold in Mp̄n , but
the condition is not necessary and a full characterization of p̄ seems to be harder.
However we give below a complete characterization of p̄ in terms of the 0-1 law in
Mq̄n for all q̄ “dominated by p̄”, in the appropriate sense. Alternatively one may
ask which of the asymptotic properties of Mp̄n are kept under some operations
on p̄. The notion of “domination” or the “operations” are taken from examples
of the failure of the 0-1 law, and specifically the construction for part (2) above.
Those are given in [2] by either adding zeros to a given sequence or decreasing
some of the members of a given sequence. Formally define:
Theorem 5. Let p̄ = (p1 , p2 , ...) be such that 0 ≤ pi < 1 for all i > 0, and
j ∈ {1, 2, 3}. Then p̄ j-hereditarily satisfies the 0-1 law for L iff

(∗)  lim_{n→∞} log( ∏_{i=1}^{n} (1 − pi ))/ log n = 0.
Moreover we may replace above the “0-1 law” by the “convergence law” or “weak
convergence law”.
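Condition (∗) can be probed numerically. The sketch below (the example sequences are our own, not from the paper) computes fp̄(n) = log(∏_{i≤n}(1 − pi))/log n for a sequence with summable tail, where fp̄(n) → 0 and (∗) holds, and for the constant sequence pi = 1/2, where fp̄(n) = −n·log 2/log n → −∞ and (∗) fails.

```python
import math

def f(n, p_of):
    """f_pbar(n) = log(prod_{i=1}^n (1 - p_i)) / log(n), for n >= 2."""
    log_prod = sum(math.log1p(-p_of(i)) for i in range(1, n + 1))
    return log_prod / math.log(n)

# p_i = 1/(i+1)^2: the infinite product telescopes to 1/2, so f -> 0.
print(f(10**5, lambda i: 1.0 / (i + 1) ** 2))
# p_i = 1/2: the product is 2^{-n}, so f(n) -> -infinity.
print(f(10**5, lambda i: 0.5))
```

Using `math.log1p` on each factor avoids underflow of the product itself, which is astronomically small in the second example.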
Note that the 0-1 law implies the convergence law which in turn implies the
weak convergence law. Hence it is enough to prove the “if” direction for the 0-1
law and the “only if” direction for the weak convergence law. Also note that the
“if” direction is an immediate consequence of Theorem 2 (in the case j = 1 it
is stated in [2] as a corollary at the end of section 3). The case j = 1 is proved
in section 2, and the case j ∈ {2, 3} is proved in section 3. In section 4 we deal
with the case U ∗ (p̄) := {i : pi = 1} is not empty. We give an almost full analysis
of the hereditary 0-1 law in this case as well. The only case which is not fully
characterized is the case j = 1 and |U ∗ (p̄)| = 1. We give some results regarding
this case in section 5. The case j = 1 and |U ∗ (p̄)| = 1 and the case that the
successor relation belongs to the vocabulary will be dealt with in [3]. Table 1
summarizes the results in this article regarding the j-hereditary laws.
2 Adding Zeros
Obviously the 0-1 law strongly fails in some Mq̄n iff Mq̄n does not satisfy the weak
convergence law. Hence in order to prove Theorem 5 for j = 1 it is enough if we
prove:
Lemma 7. Let p̄ = (p1 , p2 , ...) be such that 0 ≤ pi < 1 for all i > 0, and assume
that (∗) of 5 fails. Then for some q̄ ∈ Gen1 (p̄) the 0-1 law for L strongly fails in
Mq̄n .
We would like our graphs Mq̄n to have a certain structure, namely that the
number of triangles in Mq̄n is O(n) rather than, say, Θ(n3 ). We can impose this
structure by making demands on q̄. This is made precise by the following:

Definition 9. We say that q̄ ∈ P is proper (for l∗ ) if:
1. l∗ and 2l∗ are the first and second members of {0 < l < nq̄ : ql > 0}.
2. Let l∗∗ = 4l∗ + 2. If l < nq̄ , l ∉ {l∗ , 2l∗ } and ql > 0, then l ≡ 1 (mod l∗∗ ).

For q̄, q̄′ ∈ P we write q̄ ⊑prop q̄′ if q̄ ⊑ q̄′ and both q̄ and q̄′ are proper.
Observation 2. 1. If ⟨p̄i : i ∈ N⟩ is such that each p̄i ∈ P, and i < j ∈ N ⇒
p̄i ⊑prop p̄j , then p̄ = ∪i∈N p̄i is proper.
2. Assume that q̄ ∈ P is proper for l∗ and n ∈ N. Then the following event
holds in Mq̄n with probability 1:
(∗)q̄,l∗ If m1 , m2 , m3 ∈ [n] and {m1 , m2 , m3 } is a triangle in Mq̄n , then
{m1 , m2 , m3 } = {l, l + l∗ , l + 2l∗ } for some l > 0.
We can now define the sentence ψ for which we have failure of the 0-1 law.
Definition 10. Let k be an even natural number. Let ψk be the L sentence
“saying”: There exist x0 , x1 , ..., xk such that:
– (x0 , x1 , ..., xk ) is without repetitions.
– For each even 0 ≤ i < k, {xi , xi+1 , xi+2 } is a triangle.
– The valency of x0 and xk is 2.
– For each even 0 < i < k the valency of xi is 4.
– For each odd 0 < i < k the valency of xi is 2.
If the above holds (in a graph G) we say that (x0 , x1 , ..., xk ) is a chain of triangles
(in G).
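Definition 10 is mechanical enough to check on a finite graph; a sketch follows (the adjacency-dict representation and the function name are ours):

```python
def is_chain_of_triangles(adj, seq):
    """Check Definition 10: seq = (x_0, ..., x_k) with k even is a chain
    of triangles in the graph given by adj (dict: vertex -> neighbor set)."""
    k = len(seq) - 1
    if k % 2 != 0 or len(set(seq)) != len(seq):
        return False  # k must be even and the sequence repetition-free
    # every consecutive triple starting at an even index is a triangle
    for i in range(0, k, 2):
        a, b, c = seq[i], seq[i + 1], seq[i + 2]
        if not (b in adj[a] and c in adj[b] and c in adj[a]):
            return False
    # valency conditions: endpoints 2; inner even indices 4; odd indices 2
    if len(adj[seq[0]]) != 2 or len(adj[seq[k]]) != 2:
        return False
    for i in range(1, k):
        want = 4 if i % 2 == 0 else 2
        if len(adj[seq[i]]) != want:
            return False
    return True

# Two triangles {1,2,3} and {3,4,5} sharing the vertex 3 form a chain:
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(is_chain_of_triangles(adj, (1, 2, 3, 4, 5)))
```

The shared vertex 3 is the only one of valency 4, matching the "even inner index" clause of the definition.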
Definition 11. Let n ∈ N, k ∈ N be even and l∗ ∈ [n]. For 1 ≤ m < n − k · l∗ a
sequence (m0 , m1 , ..., mk ) is called a candidate of type (n, l∗ , k, m) if it is without
repetitions, m0 = m and for each even 0 ≤ i < k, {mi , mi+1 , mi+2 } = {l, l +
l∗ , l + 2l∗ } for some l > 0. Note that for given (n, l∗ , k, m), there are at most 4
candidates of type (n, l∗ , k, m) (and at most 2 if k > 2).
Claim 1. Let n ∈ N, k ∈ N be even, and q̄ ∈ P be proper for l∗ . For 1 ≤ m <
n − k · l∗ let E^n_{q̄,m} be the following event (on the probability space Mq̄n ): “No
candidate of type (n, l∗ , k, m) is a chain of triangles.” Then Mq̄n satisfies with
probability 1: Mq̄n |= ¬ψk iff Mq̄n |= ∧_{1≤m<n−k·l∗} E^n_{q̄,m} .
Proof. The “only if” direction is immediate. For the “if” direction note that by
Observation 2(2), with probability 1, only a candidate can be a chain of triangles,
and the claim follows immediately.
The following claim shows that by adding enough zeros at the end of q̄ we can
make sure that ψk holds in Mq̄n with probability close to 1. Note that we do
not make a “strong” use of the properness of q̄, i.e. we do not use item (2) of
Definition 9.
Claim 2. Let q̄ ∈ Pf in be proper for l∗ , k ∈ N be even, and ζ > 0 be some
rational. Then there exists q̄′ ∈ Pf in such that q̄ ⊑prop q̄′ and P r[M^{nq̄′}_{q̄′} |= ψk ] ≥
1 − ζ.
Proof. For n > nq̄ denote by q̄^n the member of P with n_{q̄^n} = n and (q̄^n )l equal to
ql if l < nq̄ and 0 otherwise. Note that q̄ ⊑prop q̄^n ; hence if we show that for n
Denote the expression on the right by p∗q̄ and note that it is positive and depends
only on k and q̄ (but not on n). Now assume that n > 6 · n∗ and that 1 ≤ m <
m′ ≤ n − n∗ are such that m′ − m > 2 · n∗ . Then the distance between the
sequences s(m) and s(m′ ) is larger than nq̄ and hence the events Em and Em′
are independent. We conclude that P r[Mq̄n |= ¬ψk ] ≤ (1 − p∗q̄ )^{⌊n/(2·n∗ +1)⌋} →_{n→∞} 0,
and hence by choosing n large enough we are done.
The following claim shows that under our assumptions we can always find a long
initial segment q̄ of some member of Gen1 (p̄) such that ψk holds in Mq̄n with
probability close to 0. This is where we make use of our assumptions on p̄ and
the properness of q̄.
Claim 3. Let p̄ ∈ Pinf , ε > 0 and assume that for an unbounded set of n ∈ N we
have ∏_{l=1}^{n} (1 − pl ) ≤ n^{−ε} . Let k ∈ N be even such that k · ε > 2. Let q̄ ∈ Gen^r_1 (p̄)
be proper for l∗ , and ζ > 0 be some rational. Then there exist r′ > r and
q̄′ ∈ Gen^{r′}_1 (p̄) such that q̄ ⊑prop q̄′ and P r[M^{nq̄′}_{q̄′} |= ¬ψk ] ≥ 1 − ζ.
Proof. First, recalling Definition 9, let l∗∗ = 3l∗ + 2, and for l ≥ nq̄ define r(l) :=
⌈(l − nq̄ + 1)/l∗∗ ⌉. Now for each n > nq̄ + l∗∗ denote by q̄n the member of P
defined by:

(qn )l = ql        if 0 < l < nq̄ ,
        0         if nq̄ ≤ l < n and l ≢ 1 (mod l∗∗ ),
        p_{r+r(l)}   if nq̄ ≤ l < n and l ≡ 1 (mod l∗∗ ).

Note that n_{q̄n} = n, q̄n ∈ Gen^{r′}_1 (p̄) where r′ = r + r(n − 1) > r, and q̄ ⊑prop q̄n .
Hence if we show that for some n large enough we have P r[M^n_{q̄n} |= ¬ψk ] ≥ 1 − ζ
then we will be done by putting q̄′ = q̄n . As before let n∗ := max{kl∗ , nq̄ + l∗ }.
Now fix some n > n∗ and for 1 ≤ m < n − k · l∗ let s(m) be some candidate
of type (n, l∗ , k, m). Denote by E = E(s(m)) the event that s(m) is a chain of
triangles in M^n_{q̄n} . We then have:
P r[M^n_{q̄n} |= E] ≤ (ql∗ )^k · (q2l∗ )^{k/2} · ( ∏_{l=n∗+1}^{(n−n∗)/2} (1 − (qn )l ) )^k .

Now denote:

p∗q̄ := (ql∗ )^k · (q2l∗ )^{k/2} · ( ∏_{l=1}^{n∗} (1 − (qn )l ) )^{−k}
and note that it is positive and does not depend on n. Together we get:

P r[M^n_{q̄n} |= E] ≤ p∗q̄ · ( ∏_{l=1}^{(n−n∗)/2} (1 − (qn )l ) )^k ≤ p∗q̄ · ( ∏_{l=1}^{(n−n∗)/(2l∗∗)} (1 − pl ) )^k .
Hence the expected number of candidates (over all m)
which are a chain of triangles is at most p∗q̄ · ( ∏_{l=1}^{(n−n∗)/(2l∗∗)} (1 − pl ) )^k · 4n. Let
E∗ be the following event: “No candidate is a chain of triangles”. Then using
Claim 1 and Markov’s inequality we get:

P r[M^n_{q̄n} |= ψk ] = P r[M^n_{q̄n} |= ¬E∗ ] ≤ p∗q̄ · ( ∏_{l=1}^{(n−n∗)/(2l∗∗)} (1 − pl ) )^k · 4n.
By our assumption, for an unbounded set of n ∈ N we have ∏_{l=1}^{(n−n∗)/(2l∗∗)} (1 − pl ) ≤ ((n − n∗ )/(2l∗∗ ))^{−ε} , and note that for n large enough
we have ((n − n∗ )/(2l∗∗ ))^{−ε} ≤ n^{−ε/2} . Hence for infinitely many n ∈ N we have
P r[M^n_{q̄n} |= ψk ] ≤ p∗q̄ · 4 · n^{1−ε·k/2} , and as ε · k > 2 this tends to 0 as n tends to
∞, so we are done.
We are now ready to prove Lemma 7. First, as (∗) of 5 does not hold, we have
some ε > 0 such that for an unbounded set of n ∈ N we have ∏_{l=1}^{n} (1 − pl ) ≤ n^{−ε} .
Let k ∈ N be even such that k · ε > 2. Now for each i ∈ N we will construct a
pair (q̄i , ri ) such that the following holds:
1. For i ∈ N, q̄i ∈ Gen^{ri}_1 (p̄) and put ni := nq̄i .
2. For i ∈ N, q̄i ⊑prop q̄i+1 .
3. For each odd i > 0, P r[M^{ni}_{q̄i} |= ψk ] ≥ 1 − 1/i and ri = ri−1 .
4. For each even i > 0, P r[M^{ni}_{q̄i} |= ¬ψk ] ≥ 1 − 1/i and ri > ri−1 .
Clearly if we construct such ⟨(q̄i , ri ) : i ∈ N⟩ then by taking q̄ = ∪i∈N q̄i (recall
Observation 1), we have q̄ ∈ Gen1 (p̄), and each of ψk and ¬ψk holds with probability
approaching 1 in Mq̄n along a suitable subsequence of n, thus finishing the proof.
We turn to the construction of ⟨(q̄i , ri ) : i ∈ N⟩, and naturally we use induction
on i ∈ N.
Case 1: i = 0. Let l1 < l2 be the first and second indexes such that pli > 0. Put
r0 := l2 . If l2 ≤ 2l1 define q̄0 by:

(q0 )l = pl     if l ≤ l1 ,
        0      if l1 < l < 2l1 ,
        pl2    if l = 2l1 .

Otherwise, if l2 > 2l1 , define q̄0 by:

(q0 )l = 0      if l < ⌈l2 /2⌉,
        pl1    if l = ⌈l2 /2⌉,
        0      if ⌈l2 /2⌉ < l < 2⌈l2 /2⌉,
        pl2    if l = 2⌈l2 /2⌉.
Clearly q̄0 ∈ Gen^{r0}_1 (p̄) as desired, and note that q̄0 is proper (for either l1 or
⌈l2 /2⌉).
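Case 1 above is concrete enough to code. The sketch below (1-based indexing via Python lists is our choice, and reading the split point as ⌈l2/2⌉ is our assumption where the text is ambiguous) builds q̄0 so that its first two nonzero indices are l and 2l for some l, as properness requires:

```python
def build_q0(p):
    """Case 1 of the construction in the proof of Lemma 7 (a sketch).
    p is the sequence (p_1, p_2, ...) as a list; its first two nonzero
    entries sit at 1-based indices l1 < l2.  Returns q0 as a list with
    q0[l-1] = (q0)_l, whose nonzero indices are l and 2l for some l."""
    nz = [l for l, pl in enumerate(p, start=1) if pl > 0]
    l1, l2 = nz[0], nz[1]
    if l2 <= 2 * l1:
        q0 = [0.0] * (2 * l1)
        for l in range(1, l1 + 1):
            q0[l - 1] = p[l - 1]
        q0[2 * l1 - 1] = p[l2 - 1]       # p_{l2} moved out to index 2*l1
    else:
        half = -(-l2 // 2)               # ceil(l2 / 2)
        q0 = [0.0] * (2 * half)
        q0[half - 1] = p[l1 - 1]         # p_{l1} moved to index ceil(l2/2)
        q0[2 * half - 1] = p[l2 - 1]     # p_{l2} moved to twice that index
    return q0

print(build_q0([0.0, 0.3, 0.4]))
```

In both branches the entries of p̄ are only shifted (zeros inserted), never decreased, matching the Gen1 operation of adding zeros.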
Case 2: i > 0 is odd. First set ri = ri−1 . Next we use Claim 2 where we set:
q̄i−1 for q̄, 1/i for ζ, and q̄i is the q̄′ promised by the claim. Note that indeed
q̄i−1 ⊑prop q̄i , q̄i ∈ Gen^{ri}_1 (p̄) and P r[M^{ni}_{q̄i} |= ψk ] ≥ 1 − 1/i.
Case 3: i > 0 is even. We use Claim 3 where we set: q̄i−1 for q̄, 1/i for ζ, and (ri , q̄i )
are the (r′ , q̄′ ) promised by the claim. Note that indeed q̄i−1 ⊑prop q̄i , q̄i ∈ Gen^{ri}_1 (p̄)
and P r[M^{ni}_{q̄i} |= ¬ψk ] ≥ 1 − 1/i. This completes the proof of Lemma 7.
3 Decreasing Coordinates
In this section we prove Theorem 5 for j ∈ {2, 3}. As before, the “if” direction
is an immediate consequence of Theorem 2. Moreover, as Gen3 (p̄) ⊆ Gen2 (p̄), it
remains to prove that if (∗) of 5 fails then the 0-1 law strongly fails for some
q̄ ∈ Gen3 (p̄). We divide the proof into two cases according to the behavior of
∑_{l=1}^{n} pl , which is an approximation of the expected number of neighbors of a
given node in Mp̄n . Define:

(∗∗) ⇐⇒ lim_{n→∞} log( ∑_{i=1}^{n} pi )/ log n = 0.

Assume that (∗∗) above fails. Then for some ε > 0 the set {n ∈ N : ∑_{i=1}^{n} pi ≥
n^ε } is unbounded, hence we finish by Lemma 12. On the other hand, if (∗∗)
holds then ∑_{i=1}^{n} pi increases slower than any positive power of n; formally, for
all δ > 0 there is some nδ ∈ N such that n > nδ implies ∑_{i=1}^{n} pi ≤ n^δ . As
we assume that (∗) of Theorem 5 fails, we have that for some ε > 0 the set
{n ∈ N : ∏_{i=1}^{n} (1 − pi ) ≤ n^{−ε} } is unbounded. Together (with ε/6 as δ) we have
that the assumptions of Lemma 13 hold, hence we finish the proof.
Lemma 12. Let p̄ ∈ Pinf be such that pl < 1 for all l > 0. Assume that for some
ε > 0 we have for an unbounded set of n ∈ N: ∑_{l≤n} pl ≥ n^ε . Then for some
q̄ ∈ Gen3 (p̄) and ψ = ψisolated := ∃x∀y ¬x ∼ y, each of ψ and ¬ψ holds infinitely
often in Mq̄n .
Proof. We construct a series (q̄1 , q̄2 , ...) such that for i > 0: q̄i ∈ Pf in , q̄i ⊑ q̄i+1
and ∪i>0 q̄i ∈ Gen3 (p̄). For i ≥ 1 denote ni := nq̄i . We will show that:
∗even For even i > 1: P r[M^{ni}_{q̄i} |= ψ] ≥ 1 − 1/i.
∗odd For odd i > 1: P r[M^{ni}_{q̄i} |= ¬ψ] ≥ 1 − 1/i.
Taking q̄ = ∪i>0 q̄i will then complete the proof. We construct q̄i by induction
on i > 0:
Case 1: i = 1: Let n1 = 2 and (q1 )1 = p1 .
Case 2: even i > 1: As (q̄i−1 , ni−1 ) is given, let us define q̄i where ni > ni−1
is to be determined later: (qi )l = (qi−1 )l for l < ni−1 and (qi )l = 0 for ni−1 ≤
l < ni . For x ∈ [ni ] let Ex be the event: “x is an isolated point”. Denote
p′ := ( ∏_{0<l<ni−1} (1 − (qi−1 )l ) )^2 and note that p′ > 0 and does not depend on
ni . Now for x ∈ [ni ], P r[M^{ni}_{q̄i} |= Ex ] ≥ p′ ; furthermore if x, x′ ∈ [ni ] and
|x − x′ | > ni−1 then Ex and Ex′ are independent in M^{ni}_{q̄i} . We conclude that
P r[M^{ni}_{q̄i} |= ¬ψ] ≤ (1 − p′ )^{⌊ni /(ni−1 +1)⌋} , which approaches 0 as ni → ∞. So by
choosing ni large enough we have ∗even .
Case 3: odd i > 1: As in case 2 let us define q̄i where ni > ni−1 is to be
determined later: (qi )l = (qi−1 )l for l < ni−1 and (qi )l = pl for ni−1 ≤ l < ni .
Let n′ = max{n < ni /2 : n = 2^m for some m ∈ N}, so ni /4 ≤ n′ < ni /2. Denote
a = ∑_{0<l≤n′} (qi )l and a′ = ∑_{0<l≤⌈ni /4⌉} (qi )l . Again let Ex be the event: “x is
isolated”. Now as n′ < ni /2, P r[M^{ni}_{q̄i} |= Ex ] ≤ ∏_{0<l≤n′} (1 − (qi )l ). By a repeated
use of (1 − x)(1 − y) ≤ (1 − (x + y)/2)^2 we get P r[M^{ni}_{q̄i} |= Ex ] ≤ (1 − a/n′ )^{n′} ,
which for n′ large enough is smaller than 2 · e^{−a} , and as a′ ≤ a, we get
P r[M^{ni}_{q̄i} |= Ex ] ≤ 2 · e^{−a′} .
Lemma 13. Let p̄ ∈ Pinf and ε > 0 be such that for an unbounded set of n ∈ N
both of the following hold:
(α) ∑_{l≤n} pl ≤ n^{ε/6} .
(β) ∏_{l≤n} (1 − pl ) ≤ n^{−ε} .
Let k = ⌈6/ε⌉ + 1 and ψ = ψk be the sentence “saying” there exists a connected
component which includes a path of length k; formally:

ψk := ∃x1 ...∃xk [ ∧_{1≤i≠j≤k} xi ≠ xj ∧ ∧_{1≤i<k} xi ∼ xi+1 ∧ ∀y(( ∧_{1≤i≤k} xi ≠ y) → ( ∧_{1≤i≤k} ¬xi ∼ y))].

Then for some q̄ ∈ Gen3 (p̄), each of ψ and ¬ψ holds infinitely often in Mq̄n .
Proof. The proof follows the same lines as the proof of 12. We construct an
increasing series (q̄1 , q̄2 , ...), and demand ∗even and ∗odd as in 12. Taking q̄ =
∪i>0 q̄i will then complete the proof. We construct q̄i by induction on i > 0:
Case 1: i = 1: Let l(∗) := min{l > 0 : pl > 0} and define n1 = l(∗) + 1 and
(q1 )l = pl for l < n1 .
Case 2: even i > 1: As before, for ni > ni−1 define: (qi )l = (qi−1 )l for l < ni−1
and (qi )l = 0 for ni−1 ≤ l < ni . For 1 ≤ x < ni − k · l(∗) let E x be the
event: “(x, x + l(∗), ..., x + l(∗)(k − 1)) exemplifies ψ.” Formally, E x holds in
M^{ni}_{q̄i} iff the set {x, x + l(∗), ..., x + l(∗)(k − 1)} is isolated and for 0 ≤ j < k − 1,
{x + jl(∗), x + (j + 1)l(∗)} is an edge of M^{ni}_{q̄i} . The remainder of this case is similar
to case 2 of Lemma 12 so we will not go into details. Note that P r[Mq̄nii |= E x ] > 0
and does not depend on ni , and if |x − x | is large enough (again not depending
on ni ) then E x and E x are independent in Mq̄nii . We conclude that by choosing
ni large enough we have ∗even .
Case 3: odd i > 1: In this case we make use of the fact that almost always, no
x ∈ [n] has too many neighbors. Formally:
Claim 4. Let q̄ ∈ Pinf be such that ql <
Proof. First note that the size of the set {l > 0 : ql > n^{−δ} } is at most n^{2δ} .
Hence by ignoring at most 2n^{2δ} neighbors of each x ∈ [n], and changing the
number of neighbors in the definition of E^n_δ to 6n^{2δ} , we may assume that for
all l > 0, ql ≤ n^{−δ} . The idea is that the number of neighbors of each x ∈ [n]
can be approximated (or in our case only bounded from above) by a Poisson
random variable with parameter close to ∑_{l=1}^{n} ql . Formally, for each l > 0 let
Bl be a Bernoulli random variable with P r[Bl = 1] = ql . For n ∈ N let X^n
be the random variable defined by X^n := ∑_{l=1}^{n} Bl . For l > 0 let P ol be a
Poisson random variable with parameter λl := − log(1 − ql ); that is, for i =
0, 1, 2, ..., P r[P ol = i] = e^{−λl} (λl )^i /i!. The total parameter satisfies:

λ = ∑_{l=1}^{n} λl = − ∑_{l=1}^{n} log(1 − ql ) = − log( ∏_{l=1}^{n} (1 − ql ))
≤ − log[ (1 − (∑_{l=1}^{n} ql )/n)^n · (1/n) ] ≈ − log[ e^{−∑_{l=1}^{n} ql} · (1/n) ]
≤ − log[ e^{−n^δ} · (1/2n) ] ≤ − log[ e^{−n^{2δ}} ] = n^{2δ} .
P r[(x1 , ..., xk ) is isolated in M^{ni}_{q̄i} ] = ∏_{j=1}^{k} ∏_{y≠xj} (1 − (qi )_{|xj −y|} ) ≤ ( ∏_{l=1}^{⌊ni /2⌋} (1 − (qi )l ) )^k ,

and

∏_{l=1}^{⌊ni /2⌋} (1 − (qi )l ) ≤ ∏_{l=ni−1}^{⌊ni /2⌋} (1 − pl ) ≤ ( ∏_{l<ni−1} (1 − ql ) )^{−1} · (ni /2)^{−ε} ≤ (ni )^{−ε/2} .
In this section we analyze the hereditary 0-1 law for p̄ where some of the pi -s may
equal 1. For p̄ ∈ Pinf let U ∗ (p̄) := {l > 0 : pl = 1}. The situation U ∗ (p̄) ≠ ∅ was
discussed briefly at the end of section 4 of [2], and an example was given there of
some p̄ consisting of only ones and zeros with |U ∗ (p̄)| = ∞ such that the 0-1 law
fails for Mp̄n . We follow the lines of that example and prove that if |U ∗ (p̄)| = ∞
and j ∈ {1, 2, 3}, then the j-hereditary 0-1 law for L fails for p̄. This is done in
14. The case 0 < |U ∗ (p̄)| < ∞ is also studied and a full characterization of the
j-hereditary 0-1 law for L is given in Conclusion 1 for j ∈ {2, 3}, and for j = 1,
1 < |U ∗ (p̄)|. The case j = 1 and 1 = |U ∗ (p̄)| is discussed in section 5.
Theorem 14. Let p̄ ∈ Pinf be such that U ∗ (p̄) is infinite, and j be in {1, 2, 3}.
Then Mp̄n does not satisfy the j-hereditary weak convergence law for L.
Proof. We start with the case j = 1. The idea here is similar to that of section
2. We show that some q̄ ∈ Gen1 (p̄) has a structure (similar to the “proper”
structure defined in 9) that allows us to identify the sections “close” to 1 or n in
Mq̄n . It is then easy to see that if q̄ has infinitely many ones and infinitely many
“long” sections of consecutive zeros, then the sentence saying: “there exists an
edge connecting vertices close to the two ends”, will exemplify the failure of the
0-1 law for Mq̄n . This is formulated below. Consider the following demands on
q̄ ∈ Pinf :
1. Let l∗ < l∗∗ be the first two members of U ∗ (q̄); then l∗ is odd and l∗∗ = 2 · l∗ .
2. If l1 , l2 , l3 all belong to {l > 0 : ql > 0} and l1 + l2 = l3 then l1 = l2 = l∗ .
3. The set {n ∈ N : n − 2l∗ < l < n ⇒ ql = 0} is infinite.
4. The set U ∗ (q̄) is infinite.
We first claim that some q̄ ∈ Gen1 (p̄) satisfies the demands (1)-(4) above. This is
straightforward. We inductively add enough zeros before each nonzero member
of p̄ guaranteeing that it is larger than the sum of any two (not necessarily
different) nonzero members preceding it. We continue until we reach l∗ , then
by adding zeros either before l∗ or before l∗∗ we can guarantee that l∗ is odd
and that l∗∗ = 2 · l∗ , and hence (1) holds. We then continue the same process
from l∗∗ , adding at least 2l∗ zeros at each step. This guarantees (2) and (3). (4)
follows immediately from our assumption that U ∗ (p̄) is infinite. Assume that q̄
satisfies (1)-(4) and n ∈ N. With probability 1 we have:
To see this use (1) for the “if” direction and (2) for the “only if” direction. We
conclude that letting ψext (x) be the L formula saying that x belongs to exactly
one triangle, for each n ∈ N and m ∈ [n] with probability 1 we have:
We are now ready to prove the failure of the weak convergence law in Mq̄n , but
in the first stage let us only show the failure of the convergence law. This will
be useful for other cases (see Remark 15 below). Define
Recall that l∗ is the first member of U ∗ (q̄), and hence for some p > 0 (not
depending on n) for any x, y ∈ [1, l∗ ] we have P r[Mq̄n |= ¬x ∼ y] ≥ p and
similarly for any x, y ∈ (n − l∗ , n]. We conclude that:
P r[(∃x∃y)((x, y ∈ [1, l∗ ] or x, y ∈ (n − l∗ , n]) and x ∼ y)] ≤ 1 − p^{2\binom{l∗}{2}} < 1.
By all the above, for each l such that ql = 1 we have P r[M^{l+1}_{q̄} |= ψ] = 1, as the
pair (1, l + 1) exemplifies ψ in M^{l+1}_{q̄} with probability 1. On the other hand, if n
is such that n − 2l∗ < l < n ⇒ ql = 0, then P r[Mq̄n |= ψ] ≤ 1 − p^{2\binom{l∗}{2}} . Hence
by (3) and (4) above, ψ exemplifies the failure of the convergence law for Mq̄n as
required.
We return to the proof of the failure of the weak convergence law. Define:
ψ′ = ∃x0 ...∃x_{2l∗−1} [ ∧_{0≤i<i′<2l∗} xi ≠ xi′ ∧ ∀y(( ∧_{0≤i<2l∗} y ≠ xi ) → ¬ψext (y))
∧ ∧_{0≤i<2l∗} ψext (xi ) ∧ ∧_{0≤i<l∗} x2i ∼ x2i+1 ].
We will show that each of ψ′ and ¬ψ′ holds infinitely often in Mq̄n . First let n ∈ N
be such that q_{n−l∗} = 1. Then by choosing for each i in the range 0 ≤ i < l∗ ,
x2i := i + 1 and x2i+1 := n − l∗ + 1 + i, we will get that the sequence (x0 , ..., x_{2l∗−1} )
exemplifies ψ′ in Mq̄n (with probability 1). As by assumption (4) above the set
{n ∈ N : q_{n−l∗} = 1} is unbounded, we have lim sup_{n→∞} P r[Mq̄n |= ψ′ ] = 1. For the
other direction let n ∈ N be such that for each l in the range n − 2l∗ < l < n, ql =
0. Then Mq̄n satisfies (again with probability 1) for each x, y ∈ [1, l∗ ] ∪ (n − l∗ , n]
such that x ∼ y: x ∈ [1, l∗ ] iff y ∈ [1, l∗ ]. Now assume that (x0 , ..., x_{2l∗−1} )
exemplifies ψ′ in Mq̄n . Then for each i in the range 0 ≤ i < l∗ , x2i ∈ [1, l∗ ] iff
x2i+1 ∈ [1, l∗ ]. We conclude that the set [1, l∗ ] is of even size, thus contradicting
(1). So we have P r[Mq̄n |= ψ′ ] = 0 for such n. But by assumption (3) above the set
of natural numbers n for which n − 2l∗ < l < n implies ql = 0 is unbounded, and
hence we have lim sup_{n→∞} P r[Mq̄n |= ¬ψ′ ] = 1 as desired.
We turn to the proof of the case j ∈ {2, 3}, and as Gen3 (p̄) ⊆ Gen2 (p̄) it is
enough to prove that for some q̄ ∈ Gen3 (p̄) the 0-1 law for L strongly fails in
Mq̄n . Motivated by the example mentioned above, appearing at the end of section
4 of [2], we let ψ be the sentence in L implying that each edge of the graph is
contained in a cycle of length 4. Once again we use an inductive construction
of (q̄1 , q̄2 , q̄3 , ...) in Pf in such that q̄ = ∪i>0 q̄i ∈ Gen3 (p̄) and both ψ and ¬ψ
hold infinitely often in Mq̄n . For i = 1 let nq̄1 = n1 := min{l : pl = 1} + 1
and define (q1 )l = 0 if 0 < l < n1 − 1 and (q1 )_{n1−1} = 1. For even i > 1
let nq̄i = ni := min{l > 4ni−1 : pl = 1} + 1 and define (qi )l = (qi−1 )l if
0 < l < ni−1 , (qi )l = 0 if ni−1 ≤ l < ni − 1 and (qi )_{ni−1} = 1. For odd i > 1
recall n1 = min{l : pl = 1} + 1 and let nq̄i = ni := ni−1 + n1 . Now define
(qi )l = (qi−1 )l if 0 < l < ni−1 and (qi )l = 0 if ni−1 ≤ l < ni . Clearly we have,
for even i > 1, P r[M^{ni}_{q̄i} |= ψ] = 0 and, for odd i > 1, P r[M^{ni}_{q̄i} |= ψ] = 1. Note
that indeed ∪i>0 q̄i ∈ Gen3 (p̄), and hence we are done.
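The sentence ψ used here, "every edge is contained in a cycle of length 4", is easy to evaluate on a finite graph; a sketch with our adjacency-dict representation:

```python
def every_edge_in_4cycle(adj):
    """Check that each edge {x, y} lies on a 4-cycle x ~ y ~ u ~ v ~ x
    with x, y, u, v pairwise distinct (adj: vertex -> set of neighbors)."""
    for x in adj:
        for y in adj[x]:
            if x < y:  # visit each undirected edge once
                ok = any(u not in (x, y) and v not in (x, y) and u != v
                         and v in adj[u]
                         for u in adj[y] for v in adj[x])
                if not ok:
                    return False
    return True

# The 4-cycle 1-2-3-4-1 satisfies psi; a single edge does not.
c4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
print(every_edge_in_4cycle(c4))
```

The construction above alternates between prefixes where an isolated long edge falsifies ψ (the even steps) and prefixes where ψ holds, which is exactly what this predicate would report on the sampled graphs.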
Remark 15. In the proof of the failure of the convergence law in the case j = 1,
the assumption |U ∗ (p̄)| = ∞ is not needed; our proof works under the weaker
assumption that |U ∗ (p̄)| ≥ 2 and for some p > 0 the set {l > 0 : pl > p} is infinite.
See more below on the case j = 1 and 1 < |U ∗ (p̄)| < ∞.
Lemma 16. Let q̄ ∈ Pinf and assume:
1. Let l∗ < l∗∗ be the first two members of U ∗ (q̄) (in particular assume
|U ∗ (q̄)| ≥ 2); then l∗∗ = 2 · l∗ .
2. If l1 , l2 , l3 all belong to {l > 0 : ql > 0} and l1 + l2 = l3 then {l1 , l2 , l3 } =
{l, l + l∗ , l + l∗∗ } for some l ≥ 0.
3. Let l∗∗∗ be the first member of {l > 0 : 0 < ql < 1} (in particular assume
|{l > 0 : 0 < ql < 1}| ≥ 1); then the set {n ∈ N : n ≤ l ≤ n + l∗∗ + l∗∗∗ ⇒
ql = 0} is infinite.
Then the 0-1 law for L fails for Mq̄n .
Proof. The proof is similar to the case j = 1 in the proof of Theorem 14, so
we will not go into detail. Below n is some large enough natural number (say
larger than 3 · l∗∗ · l∗∗∗ ) such that (3) above holds, and if we say that some
property holds in Mq̄n we mean it holds there with probability 1. Let ψ^1_{ext} (x) be
the formula in L implying that x belongs to at most two distinct triangles. Then
for all m ∈ [n]:

Mq̄n |= ψ^1_{ext} [m] iff m ∈ [1, l∗∗ ] ∪ (n − l∗∗ , n].
Similarly for any natural t < n/3l∗∗ define (using induction on t):

ψ^t_{ext} (x) := (∃y∃z) x ∼ y ∧ x ∼ z ∧ y ∼ z ∧ (ψ^{t−1}_{ext} (y) ∨ ψ^{t−1}_{ext} (z)).

Then

Mq̄n |= ψ^t_{ext} [m] iff m ∈ [1, tl∗∗ ] ∪ (n − tl∗∗ , n].
Now for 1 ≤ t < n/3l∗∗ let m∗ (t) be the minimal number of edges in
Mq̄n |_{[1,t·l∗∗ ]∪(n−t·l∗∗ ,n]} , i.e., only edges with probability one and within one of the
intervals are counted.
Let 1 ≤ t∗ < n/3l∗∗ be such that l∗∗∗ < l∗∗ · t∗ (it exists as n is large enough).
Note that m∗ (t∗ ) depends only on q̄ and not on n and hence we can define

ψ := “There exist exactly m∗ (t∗ ) couples {x, y} s.t. ψ^{t∗}_{ext} (x) ∧ ψ^{t∗}_{ext} (y) ∧ x ∼ y.”
and note that p′ does not depend on n; then (recalling assumption (3) above)
we have P r[Mq̄n |= ψ] ≥ (p′ )^2 > 0, thus completing the proof.
Lemma 17. Let q̄ ∈ Pinf be such that for some l1 < l2 ∈ N \ {0} we have:
0 < ql1 < 1, ql2 = 1 and ql = 0 for all l ∉ {l1 , l2 }. Then the 0-1 law for L fails
for Mq̄n .
Proof. Let ψ be the sentence in L “saying” that some vertex has exactly one
neighbor and this neighbor has at least three neighbors. Formally:
ψ := (∃x)[(∃!y) x ∼ y ∧ (∀z)(x ∼ z → (∃u1 ∃u2 ∃u3 )( ∧_{0<i<j≤3} ui ≠ uj ∧ ∧_{0<i≤3} z ∼ ui ))].
We first show that for some p > 0 and n0 ∈ N, for all n > n0 we have P r[Mq̄n |=
ψ] > p. To see this simply take n0 = l1 + l2 + 1 and p = (1 − ql1 )(ql1 ). Now for
n > n0 in Mq̄n , with probability 1 − ql1 the node 1 ∈ [n] has exactly one neighbor
(namely 1 + l2 ∈ [n]), and with probability at least ql1 , 1 + l2 is connected to
1 + l1 + l2 , and hence has three neighbors (1, 1 + 2l2 and 1 + l1 + l2 ). This
yields the desired result. On the other hand, for some p′ > 0 we have for all
n ∈ N, P r[Mq̄n |= ¬ψ] > p′ . To see this note that for all n, only members of
[1, l2 ] ∪ (n − l2 , n] can possibly exemplify ψ, as all members of (l2 , n − l2 ] have at
least two neighbors with probability one. For each x ∈ [1, l2 ] ∪ (n − l2 , n], with
probability at least (1 − ql1 )^2 , x does not exemplify ψ (since the unique neighbor
of x has fewer than three neighbors). As the size of [1, l2 ] ∪ (n − l2 , n] is 2 · l2 we
get P r[Mq̄n |= ¬ψ] > (1 − ql1 )^{2l2} =: p′ > 0. Together we are done.
Lemma 18. Let p̄ ∈ Pinf be such that |U ∗ (p̄)| < ∞ and pi ∈ {0, 1} for i > 0.
Then Mp̄n satisfies the 0-1 law for L.
Proof. Let S^n be the (non-random) structure in the vocabulary {Suc}, with universe
[n], where Suc is the successor relation on [n]. It is straightforward to see that any
sentence ψ ∈ L has a sentence ψ^S in the vocabulary {Suc} such that

P r[Mp̄n |= ψ] = 1 if S^n |= ψ^S , and P r[Mp̄n |= ψ] = 0 if S^n |= ¬ψ^S .

Also by a special case of Gaifman’s result from [1] we have: for each k ∈ N there
exists some nk ∈ N such that if n, n′ > nk then S^n and S^{n′} have the same first
order theory of quantifier depth k. Together we are done.
Conclusion 1. Let p̄ ∈ Pinf be such that 0 < |U ∗ (p̄)| < ∞. Then:
1. The 2-hereditary 0-1 law holds for p̄ iff |{l > 0 : pl > 0}| ≤ 1.
2. The 3-hereditary 0-1 law holds for p̄ iff {l > 0 : 0 < pl < 1} = ∅.
3. If furthermore 1 < |U ∗ (p̄)| then the 1-hereditary 0-1 law holds for p̄ iff {l >
0 : 0 < pl < 1} = ∅.
Proof. For (1) note that if indeed |{l > 0 : pl > 0}| > 1 then some q̄ ∈ Gen2 (p̄)
is as in the assumption of Lemma 17; otherwise any q̄ ∈ Gen2 (p̄) has at most 1
nonzero member and hence Mq̄n satisfies the 0-1 law by either 18 or 2.
For (2) note that if {l > 0 : 0 < pl < 1} ≠ ∅ then some q̄ ∈ Gen3 (p̄) is as in
the assumption of Lemma 17; otherwise any q̄ ∈ Gen3 (p̄) is as in the assumption
of Lemma 18 and we are done.
Similarly for (3) note that if 1 < |U ∗ (p̄)| and {l > 0 : 0 < pl < 1} = ∅
then some q̄ ∈ Gen1 (p̄) satisfies assumptions (1)-(3) of Lemma 16; otherwise
any q̄ ∈ Gen1 (p̄) is as in the assumption of Lemma 18 and we are done.
We try to determine when the 1-hereditary 0-1 law holds. The assumption of
(∗) is justified, as the proof in section 2 works also in this case, and in fact in any
case that U ∗ (p̄) is finite. To see this replace in section 2 products of the form
∏_{l<n} (1 − pl ) by ∏_{l<n, l∉U ∗ (p̄)} (1 − pl ), sentences of the form “x has valency m”
by “x has valency m + 2|U ∗ (p̄)|”, and similar simple changes. So if (∗) fails then
the 1-hereditary weak convergence law fails, and we are done. It seems that our
ability to “identify” the l∗ -boundary (i.e. the set [1, l∗ ] ∪ (n − l∗ , n]) in Mp̄n is
closely related to the holding of the 0-1 law. In Conclusion 2 we use this idea and
give a necessary condition on p̄ for the 1-hereditary weak convergence law. The
proof uses methods similar to those of the previous sections. Finding a sufficient
condition for the 1-hereditary 0-1 law seems to be harder. It turns out that the
analysis of this case is, in a way, similar to the analysis when we add the successor
relation to our vocabulary. This is because the edges of the form {l, l+l∗} appear
with probability 1 similarly to the successor relation. There are, however, some
obvious differences. Let L+ be the vocabulary {∼, S}, and let (M + )np̄ be the
random L+ structure with universe [n], where ∼ is interpreted as in Mp̄n and S is
interpreted as the successor relation on [n]. Now if for some l∗∗ > 0, 0 < pl∗∗ < 1, then (M + )np̄
does not satisfy the 0-1 law for L+ . This is because the elements 1 and l∗∗ + 1
are definable in L+ , and hence some L+ sentence holds in (M + )np̄ iff {1, l∗∗ + 1}
is an edge of (M + )np̄ , which holds with probability pl∗∗ . In our case, as in L we
cannot distinguish edges of the form {l, l + l∗ } from the rest of the edges, the
0-1 law may hold even if such l∗∗ exists. In Lemma 24 below we show that if, in
fact, we can not “identify the edges” in Mp̄n then the 0-1 law holds in Mp̄n . This
is translated in Theorem 27 to a sufficient condition on p̄ for the 0-1 law holding
in Mp̄n , but not necessarily for the 1-hereditary 0-1 law. The proof uses “local”
properties of graphs. It seems that some form of “1-hereditary” version of 27 is
possible. In any case we could not find a necessary and sufficient condition for
the 1-hereditary 0-1 law, and the analysis of this case is not complete.
We first find a necessary condition on p̄ for the 1-hereditary weak convergence
law. Let us start with a definition of a structure on a sequence q̄ ∈ P that enables
us to “identify” the l∗ -boundary in Mq̄n .
φ1 (y1 , z1 , y2 , z2 ) := y1 ∼ z1 ∧ z1 ∼ z2 ∧ z2 ∼ y2 ∧ y2 ∼ y1 ∧ y1 ≠ z2 ∧ z1 ≠ y2 .
Observation 3. Let q̄ ∈ P be nice and n ∈ N be such that n < nq̄ . Then the
following holds in Mq̄n with probability 1:
The following claim shows that if q̄ is nice (and has a certain structure) then,
with probability close to 1, φ^3_{3,0} [y] holds in Mq̄n for all y ∈ [1, l∗ ] ∪ (n − l∗ , n].
This, together with (4) in the observation above, gives us a “definition” of the
l∗ -boundary in Mq̄n .
Claim 5. Let q̄ ∈ Pf in be nice and denote n = nq̄ . Assume that for all l > 0,
ql > 0 implies l < n/3. Assume further that for some ε > 0, 0 < ql < 1 ⇒ ε <
ql < 1 − ε. Let y0 ∈ [1, l∗ ] ∪ (n − l∗ , n]. Denote m := |{0 < l < nq̄ : 0 < ql < 1}|.
Then:

P r[Mq̄n |= ¬φ^3_{3,0} [y0 ]] ≤ ( ∑_{y∈[n]: |y0−y|≠l∗} q_{|y0−y|} )(1 − ε^{11} )^{m/2−1} .
Fig. 1. (Diagram: vertices y0 , y1 , y2 , y3 and z0 , z1 , z2 , z3 , joined by edges at the
distances l0 , l1 , l2 .)
The following holds in M^n_q̄ with probability 1: If for some l1, l2 < n/3 such
that (l0, l1, l2) is without repetitions, we have:
For (l1, l2) ∈ L_{y0,z0}, the probability that (∗)1 and (∗)2 hold is
(1 − q_{l0})(q_{l0})^2 (q_{l1})^4 (q_{l2})^4. Denote the event that (∗)1 and (∗)2 hold by E^{y0,z0}(l1, l2).
Note that if (l1, l2), (l1′, l2′) ∈ L_{y0,z0} are such that (l1, l2, l1′, l2′) is without repetitions
and l1 + l2 = l1′ + l2′, then the events E^{y0,z0}(l1, l2) and E^{y0,z0}(l1′, l2′) are
independent. Now recall that m := |{l > 0 : ε < q_l < 1 − ε}|. Hence we have
some L′ ⊆ L_{y0,z0} such that |L′| = m/2 − 1, and if (l1, l2), (l1′, l2′) ∈ L′ then the
events E^{y0,z0}(l1, l2) and E^{y0,z0}(l1′, l2′) are independent. We conclude that the probability in question is
at most (Σ_{y∈[n]: |y0−y|≠l*} q_{|y0−y|}) · (1 − ε^11)^{m/2−1}. Now by (3) in Observation 3,
M^n_q̄ |= φ^2_{0,3}[y0, y0 + l*]. By Markov's inequality and the definition of φ^3_{0,3}(x) we
are done.
We now prove two lemmas which allow us to construct a sequence q̄ such that,
for ϕ := ∃x φ^3_{0,3}(x), each of ϕ and ¬ϕ will hold infinitely often in M^n_q̄.
Lemma 20. Assume p̄ satisfies Σ_{l>0} p_l = ∞, and let q̄ ∈ Gen^r_1(p̄) be nice. Let
ζ > 0 be some rational number. Then there exists some r′ > r and q̄′ ∈ Gen^{r′}_1(p̄)
such that: q̄′ is nice, q̄ ≤ q̄′ and Pr[M^{n_{q̄′}}_{q̄′} |= ϕ] ≤ ζ.

Proof. Define p1 := (Π_{l∈[n_q̄]\{l*}} (1 − p_l))^2, and choose r′ > r large enough so
that Σ_{r<l≤r′} p_l ≥ 2l* · p1/ζ. Now define q̄′ ∈ Gen^{r′}_1(p̄) in the following way:

  q′_l = q_l        if 0 < l < n_q̄,
  q′_l = 0          if n_q̄ ≤ l < (r′ − r) · n_q̄,
  q′_l = p_{r+i}    if l = (r′ − r + i) · n_q̄ for some 0 < i ≤ (r′ − r),
  q′_l = 0          if (r′ − r) · n_q̄ ≤ l < 2(r′ − r) · n_q̄ and l ≢ 0 (mod n_q̄).
Note that indeed q̄′ is nice and q̄ ≤ q̄′. Denote n := n_{q̄′} = 2(r′ − r) · n_q̄. Note
further that every member of M^n_{q̄′} has at most one neighbor of distance more
than n/2, and all the rest of its neighbors are of distance at most n_q̄. We
now bound from above the probability of M^n_{q̄′} |= ∃x φ^3_{0,3}(x). Let x be in [1, l*].
For each i in the range 0 < i ≤ (r′ − r) denote y_i := x + (r′ − r + i) · n_q̄
(hence y_i ∈ [n/2, n]) and let E_i be the following event: "M^n_{q̄′} |= y_i ∼ z iff
z ∈ {x, y_i + l*, y_i − l*}". By the definition of q̄′, each y_i can only be connected
either to x or to members of [y_i − n_q̄, y_i + n_q̄], and hence we have

  Pr[E_i] = q′_{(r′−r+i)·n_q̄} · p1 = p_{r+i} · p1.

As i ≠ j ⇒ n/2 > |y_i − y_j| > n_q̄, we have that the E_i-s are independent events.
Now if E_i holds then by the definition of φ^2_{0,3} we have M^n_{q̄′} |= ¬φ^2_{0,3}[x, y_i], and
600 S. Shelah and M. Doron
as M^n_{q̄′} |= ¬φ^2_{0,3}[x, x + l*] this implies M^n_{q̄′} |= ¬φ^3_{0,3}[x]. Let the random variable
X denote the number of i in the range 0 < i ≤ (r′ − r) such that E_i holds in
M^n_{q̄′}. Then by Chebyshev's inequality we have:
This is true for each x ∈ [1, l*], and the symmetric argument gives the same
bound for each x ∈ (n − l*, n]. Finally note that if x, x + l* both belong to [n]
then M^n_{q̄′} |= ¬φ^2_{0,3}[x, x + l*] (see Observation 3(4)). Hence if x ∈ (l*, n − l*] then
M^n_{q̄′} |= ¬φ^3_{0,3}[x]. We conclude that:
Lemma 21. Assume p̄ satisfies: 0 < p_l < 1 ⇒ ε < p_l < 1 − ε for some ε > 0,
and Σ^∞_{n=1} p_n = ∞. Let q̄ ∈ Gen^r_1(p̄) be nice, and ζ > 0 be some rational number.
Then there exist some r′ > r and q̄′ ∈ Gen^{r′}_1(p̄) such that: q̄′ is nice, q̄ ≤ q̄′ and
Pr[M^{n_{q̄′}}_{q̄′} |= ϕ] ≥ 1 − ζ.
Proof. This is a direct consequence of Claim 5. For each r′ > r denote m(r′) :=
|{0 < l ≤ r′ : 0 < p_l < 1}|. Trivially we can choose r′ > r such that
m(r′)(1 − ε^11)^{m(r′)/2−1} ≤ ζ. As q̄ is nice there exists some nice q̄′ ∈ Gen^{r′}_1(p̄) such that
q̄ ≤ q̄′. Note that

  Σ_{y∈[n]: |1−y|≠l*} q′_{|1−y|} ≤ Σ_{0<l<n_{q̄′}: l≠l*} q′_l ≤ m(r′)
We skip the proof of this claim because an almost identical lemma is proved in
[2] (see the lemma on page 8 there).
We can now finish the proof of Lemma 24. Recall that φ(x) is an r-local
formula. We consider two possibilities. First assume that for some r-proper
(l, U, u0, H) ∈ H we have H |= φ[u0]. Let ζ > 0 be some real. Then by the claim
above, for n large enough, with probability at least 1 − ζ there exist f1, ..., f_m
strong embeddings of H into M^n_p̄ such that ⟨Im(f_i) : 1 ≤ i ≤ m⟩ are pairwise
disjoint. By observation (1) above we have:

– For 1 ≤ i < j ≤ m, B^{M^n_p̄}(r, f_i(u0)) ∩ B^{M^n_p̄}(r, f_j(u0)) = ∅.
– For 1 ≤ i ≤ m, M^n_p̄ |= φ[f_i(u0)].

Hence f1(u0), ..., f_m(u0) exemplify ψ in M^n_p̄, so Pr[M^n_p̄ |= ψ] ≥ 1 − ζ, and as ζ
was arbitrary we have lim_{n→∞} Pr[M^n_p̄ |= ψ] = 1 and we are done.
Remark 25. Lemma 24 above gives a sufficient condition for the 0-1 law. If we are
only interested in the convergence law, then a weaker condition is sufficient; all
we need is that the probability of any local property holding in the l*-boundary
converges. Formally:
Assume that for all r ∈ N and r-local L-formulas φ(x), and for all 1 ≤ l ≤ l*,
both ⟨Pr[M^n_p̄ |= φ[l]] : n ∈ N⟩ and ⟨Pr[M^n_p̄ |= φ[n − l + 1]] : n ∈ N⟩ converge.
We now use Lemma 24 to get a sufficient condition on p̄ for the 0-1 law holding in M^n_p̄.
Our proof relies on the assumption that M^n_p̄ contains few cycles, and only those
that are "unavoidable". We start with a definition of such cycles:
Σ^∞_{l=1} p_l = ∞ and Σ^∞_{l=1} (p_l)^2 < ∞. Then M^n_p̄ satisfies the 0-1 law for L.
Proof. Let φ(x) be some r-local formula, and let j* be in {1, 2, ..., l*} ∪
{−1, −2, ..., −l*}. For n ∈ N let z*_n = z*(n, j*) equal j* if j* > 0 and n + j* + 1
if j* < 0 (so z*_n belongs to [1, l*] ∪ (n − l*, n]). We will show that with
probability approaching 1 as n → ∞ there exists some y* ∈ [n] such that
B^{M^n_p̄}(r, y*) ∩ ([1, l*] ∪ (n − l*, n]) = ∅ and M^n_p̄ |= φ[z*_n] ↔ φ[y*]. This will complete
the proof by Lemma 24. For simplicity of notation assume j* = 1, hence z*_n = 1
(the proof of the other cases is similar). We use the notation of the proof of Lemma 24.
In particular recall the definition of the set H and of an r-proper member of H.
Now if for two r-proper members of H, (l^1, x^1, U^1, H^1) and (l^2, x^2, U^2, H^2), we
have H^1 |= φ[x^1] and H^2 |= ¬φ[x^2], then by Claim 6 we are done. Otherwise all
r-proper members of H give the same value to φ[x], and without loss of generality
assume that if (l, x, U, H) ∈ H is r-proper then H |= φ[x] (the dual case
is identical). If lim_{n→∞} Pr[M^n_p̄ |= φ[1]] = 1 then again we are done by Claim 6. Hence
we may assume that:
For some ε > 0, for an unbounded set of n ∈ N, Pr[M^n_p̄ |= ¬φ[1]] ≥ ε.
In the construction below we use the following notation: 2 denotes the set {0, 1};
^k2 denotes the set of sequences of length k of members of 2, and if η belongs to
^k2 we write |η| = k; ^{≤k}2 denotes ∪_{0≤i≤k} ^i2, and similarly ^{<k}2; ⟨⟩ denotes the
empty sequence, and for η, η′ ∈ ^{≤k}2, ηˆη′ denotes the concatenation of η and η′.
Finally, for η ∈ ^k2 and k′ < k, η|k′ is the initial segment of length k′ of η.
Call ȳ a saturated tree of depth k in [n] if:
– ȳ = ⟨y_η ∈ [n] : η ∈ ^{≤k}2⟩.
– ȳ is without repetitions.
– {y_{⟨0⟩}, y_{⟨1⟩}} = {y_{⟨⟩} + l*, y_{⟨⟩} − l*}.
– If 0 < l < k and η ∈ ^l2 then {y_η + l*, y_η − l*} ⊆ {y_{ηˆ⟨0⟩}, y_{ηˆ⟨1⟩}, y_{η|l−1}}.
Let G be a graph with set of vertices [n], and i ∈ [n]. We say that ȳ is a
cycle-free saturated tree of depth k for i in G if:
Claim 7. For n ∈ N and G a graph on [n], denote by I*_k(G) the set ([1, l*] ∪ (n −
l*, n]) ∩ B^G(1, k). Let E^{n,k} be the event: "There exists a cycle-free saturated
forest of depth k for I*_k(G)". Then for each k ∈ N:

  lim_{n→∞} Pr[E^{n,k} holds in M^n_p̄] = 1.
Now as p̄ has no unavoidable cycles, let m_{12k} be as in Definition 26(7). Then the expected
number of cycles of length ≤ 12k starting in i = x0 is

  Σ_{k′≤12k, x̄=(x0,...,x_{k′}) a possible cycle} p(x̄)
    ≤ Σ_{0<l1,...,l_{6k}<n} (m_{12k})^{12k} · Π^{6k}_{i=1} (p_{l_i})^2
    ≤ (m_{12k})^{12k} · (Σ_{0<l<n} (p_l)^2)^{6k}.

But as Σ_{0<l<n} (p_l)^2 is bounded by Σ^∞_{l=1} (p_l)^2 := c* < ∞, if we take m =
(m_{12k})^{12k} · (c*)^{6k}/ζ then we have ⊗1 as desired.
Step 2. We show that there exists a positive lower bound on the probability that
no short cycle passes through a given edge of M^n_p̄. Formally: let n ∈ N and i, j ∈ [n] be
such that p_{|i−j|} > 0. Denote by E^2_{n,i,j} the event: "There does not exist a cycle
of length ≤ 6k containing the edge {i, j}". Then there exists some q2 > 0 such
that:

⊗2 For any n ∈ N and i, j ∈ [n] such that p_{|i−j|} > 0, Pr_{M^n_p̄}[E^2_{n,i,j} | i ∼ j] ≥ q2.
To see this, call a path x̄ = (x0, ..., x_{k′}) good for i, j ∈ [n] if x0 = j, x_{k′} = i,
x̄ does not contain the edge {i, j} and does not contain the same edge more
than once. Let E^{2′}_{n,i,j} be the event: "There does not exist a path good for i, j of
length < 6k". Note that for i, j ∈ [n] and G a graph on [n] such that G |= i ∼ j
we have: (i, j, x2, ..., x_{k′}) is a cycle in G iff (j, x2, ..., x_{k′}) is a path in G good for
i, j. Hence for such G we have: E^2_{n,i,j} holds in G iff E^{2′}_{n,i,j} holds in G. Since the
events i ∼ j and E^{2′}_{n,i,j} are independent in M^n_p̄ we conclude:

  Pr_{M^n_p̄}[E^2_{n,i,j} | i ∼ j] = Pr_{M^n_p̄}[E^{2′}_{n,i,j} | i ∼ j] = Pr_{M^n_p̄}[E^{2′}_{n,i,j}].
Next, recalling Definition 26(7), let m_k be as there. Since Σ_{l>0} (p_l)^2 < ∞, (p_l)^2
converges to 0 as l approaches infinity, and hence so does p_l. Hence for some
m0 ∈ N we have that l > m0 implies p_l < 1/2. Let m*_k := max{m_{6k}, m0}. We now
define, for a possible path x̄ = (x0, ..., x_{k′}), Large(x̄) = {0 ≤ r < k′ : |l^x̄_r| > m*_k}.
Note that as p̄ has no unavoidable cycles we have, for any possible cycle x̄ of length
≤ 6k, Large(x̄) ⊆ Sym(x̄), and |Large(x̄)| is even. We now make the following
claim: for each k* in the range 0 ≤ k* ≤ k/2, let E^{2,k*}_{n,i,j} be the event: "There
does not exist a path x̄ good for i, j of length < 6k with |Large(x̄)| = 2k*".
Then there exists a positive probability q_{2,k*} such that for any n ∈ N and
i, j ∈ [n] we have:

  Pr_{M^n_p̄}[E^{2,k*}_{n,i,j}] ≥ q_{2,k*}.
Then by taking q2 = Π_{0≤k*≤k/2} q_{2,k*} we will have ⊗2. Let us prove the claim.
For k* = 0 we have (recalling that no cycle consists only of edges of length l*):

  Pr_{M^n_p̄}[E^{2,0}_{n,i,j}] = Π_{k′≤6k, x̄=(i=x0,j=x1,...,x_{k′}) a possible cycle, |Large(x̄)|=0} Π^{k′−1}_{r=1} (1 − p_{|l^x̄_r|})
    ≥ (1 − max{p_l : 0 < l ≤ m*_k, l ≠ l*})^{6k·(m*_k)^{6k−1}}.

But as the last expression is positive and depends only on p̄ and k we are done.
For k* > 0 we have:

  Pr_{M^n_p̄}[E^{2,k*}_{n,i,j}] = Π_{k′≤6k, x̄=(i=x0,j=x1,...,x_{k′}) a possible cycle, |Large(x̄)|=2k*} Π^{k′−1}_{m=1} (1 − p_{|l^x̄_m|})

  = Π_{k′≤6k, x̄ a possible cycle, |Large(x̄)|=2k*, 0∈Large(x̄)} Π^{k′−1}_{m=1} (1 − p_{|l^x̄_m|})
    · Π_{k′≤6k, x̄ a possible cycle, |Large(x̄)|=2k*, 0∉Large(x̄)} Π^{k′−1}_{m=1} (1 − p_{|l^x̄_m|}).
and hence Π_{l1,...,l_{k*}>m*_k} (1 − Π^{k*}_{m=1} (p_{l_m})^2) > 0, and we have a bound as desired.
Similarly, the product on the third line is at least

  [Π_{l1,...,l_{k*−1}>m*_k} (1 − Π^{k*−1}_{m=1} (p_{l_m})^2) · 1/2]^{(m*_k)^{6k−2k*−1} · (6k)^{2k*}},
and let q3 = (q2)^{2l*+1}. We then have:
⊗3 For any n ∈ N and i, j ∈ [n] such that p_{|i−j|} > 0 and j + kl*, j − kl* ∈ [n],
Pr_{M^n_p̄}[E^3_{n,i,j} | i ∼ j] ≥ q3.

This follows immediately from ⊗2, and the fact that if i, i′, j, j′ all belong to
[n] then the probability Pr_{M^n_p̄}[E^2_{n,i,j} | E^2_{n,i′,j′}] is no smaller than the probability
Pr_{M^n_p̄}[E^2_{n,i,j}].
Step 4. For i, j ∈ [n] such that j + kl*, j − kl* ∈ [n], denote by E^4_{n,i,j} the event:
"E^3_{n,i,j} holds, and for x ∈ {j + rl* : r ∈ {−k, −k + 1, ..., k}} and y ∈ [n] \ {i} we
have x ∼ y ⇒ (|x − y| = l* ∨ |x − y| > m_{2k})". Then for some q4 > 0 we have:

⊗4 For any n ∈ N and i, j ∈ [n] such that p_{|i−j|} > 0 and j + kl*, j − kl* ∈ [n],
Pr_{M^n_p̄}[E^4_{n,i,j} | i ∼ j] ≥ q4.

To see this simply take q4 = q3 · (Π_{l∈{1,...,m_{2k}}\{l*}} (1 − p_l))^{2k+1}, and use ⊗3.
Step 5. For n ∈ N, S ⊆ [n], and i ∈ [n] let E^5_{n,S,i} be the event: "For some
j ∈ [n] \ S we have i ∼ j, |i − j| ≠ l* and E^4_{n,i,j}". Then for each δ > 0 and s ∈ N:

First let δ > 0 and s ∈ N be fixed. Second, for n ∈ N, S ⊆ [n] and i ∈ [n] denote by
J^{n,S}_i the set of all possible candidates for j, namely J^{n,S}_i := {j ∈ (kl*, n − kl*] \ S :
|i − j| ≠ l*}. For j ∈ J^{n,∅}_i let U_j := {j + rl* : r ∈ {−k, −k + 1, ..., k}}. For m ∈ N
and G a graph on [n], call j ∈ J^{n,S}_i a candidate of type (n, m, S, i) in G if each
j′ ∈ U_j belongs to at most m different cycles of length at most 6k in G.
Denote the set of all candidates of type (n, m, S, i) in G by J^{n,S}_i(G). Now let
X^{n,m}_i be the random variable on M^n_p̄ defined by:

  X^{n,m}_i(M^n_p̄) = Σ{p_{|i−j|} : j ∈ J^{n,S}_i(M^n_p̄)}.
[To see this use (1) and construct J^1 by adding the candidate with the
largest p_{|i−j|} that satisfies (a). Note that each new candidate excludes at
most m_{2k}(2l* + 1) others.]
3. Let j belong to J^1(G). Then the set {j′ ∈ J^1(G) : C_{j′}(G) ∩ C_j(G) ≠ ∅}
has size at most m**. [To see this use (2)(b) above, the fact that two cycles
of length ≤ 6k that intersect in an edge give a cycle of length ≤ 12k, and
similar trivial facts.]
4. From (3) we conclude that there exist J^2(G) ⊆ J^1(G) and an enumeration
j1, ..., j_{r̄} of J^2(G) such that:
(a) For any 1 ≤ r ≤ r̄ the sets C(j_r) and ∪_{1≤r′<r} C(j_{r′}) are disjoint.
Now for each j ∈ J^{n,S}_i let E*_j be the event: "i ∼ j and E^4_{n,i,j}". By ⊗4 we
have, for each j ∈ J^{n,S}_i, Pr_{M^n_p̄}[E*_j] ≥ q4 · p_{|i−j|}. Recall that we condition the
probability space M^n_p̄ on the event X^{n,m*}_i(M^n_p̄) ≥ R^{n,S}_i/2, and let j1, ..., j_{r̄} be
the enumeration of J^2(M^n_p̄) from (4) above. (Formally speaking, r̄ and each j_r
is a function of M^n_p̄.) We then have, for 1 ≤ r < r′ ≤ r̄, Pr_{M^n_p̄}[E*_{j_r} | E*_{j_{r′}}] ≥
Pr_{M^n_p̄}[E*_{j_r}], and Pr_{M^n_p̄}[E*_{j_r} | ¬E*_{j_{r′}}] ≥ Pr_{M^n_p̄}[E*_{j_r}]. To see this use (2)(a) and
(4)(a) above and the definition of C_j(G).
Let the random variables X and X′ be defined as follows. X is the number
of j ∈ J^2(M^n_p̄) such that E*_j holds in M^n_p̄. In other words, X is the sum of r̄
random variables Y1, ..., Y_{r̄}, where for each r in the range 1 ≤ r ≤ r̄, Y_r
equals 1 if E*_{j_r} holds, and 0 otherwise. X′ is the sum of r̄ independent random
variables Y′1, ..., Y′_{r̄}, where for each r in the range 1 ≤ r ≤ r̄, Y′_r equals 1 with
probability q4 · p_{|i−j_r|} and 0 with probability 1 − q4 · p_{|i−j_r|}. Then by the last
paragraph, for any 0 ≤ t ≤ r̄, Pr[X ≥ t] ≥ Pr[X′ ≥ t]. Hence, by Chebyshev's
inequality applied to X′,

  Pr_{M^n_p̄}[¬E^5_{n,S,i}] ≤ Pr_{M^n_p̄}[X = 0] ≤ Pr[X′ = 0] ≤ Var(X′)/Exp(X′)^2 ≤ 1/Exp(X′) ≤ δ

as desired.
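The last chain of inequalities uses only Chebyshev's inequality at t = Exp(X′) together with Var(X′) ≤ Exp(X′), which holds for any sum of independent Bernoulli variables. A quick numerical sanity check in Python (the probabilities chosen are arbitrary):

```python
import math

def bernoulli_sum_bounds(ps):
    """For X' a sum of independent Bernoulli(p) variables, return the exact
    Pr[X' = 0] together with the two upper bounds used above:
    Pr[X' = 0] <= Var(X')/Exp(X')^2 <= 1/Exp(X')."""
    e = sum(ps)
    var = sum(p * (1 - p) for p in ps)       # variance adds by independence
    p0 = math.prod(1 - p for p in ps)        # exact probability of X' = 0
    assert p0 <= var / e**2 + 1e-12          # Chebyshev at t = Exp(X')
    assert var / e**2 <= 1 / e + 1e-12       # Var <= Exp for Bernoulli sums
    return p0, var / e**2, 1 / e

p0, cheb, inv_e = bernoulli_sum_bounds([0.1] * 50)
# Here Exp(X') = 5, so already the crude bound 1/Exp(X') = 0.2 is nontrivial.
```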
Step 6. We turn to the construction of the cycle-free saturated forest. Let ε > 0,
and we will prove that for n ∈ N large enough we have Pr[E^{n,k} holds in M^n_p̄] ≥
1 − ε. Let δ = ε/(l* · 2^{k+2}) and s = 2l*((k + 2^k)(2l*k + 1)). Let n ∈ N be large
enough so that ⊗5 holds for n, k, δ and s. We now choose (formally, we show
that with probability at least 1 − ε such a choice exists), by induction on (i, η) ∈
I*_k(M^n_p̄) × ^{≤k}2 (ordered by the lexicographic order), y^i_η ∈ [n] such that:

Before we describe the choice of y^i_η, we need to define sets S^i_η ⊆ [n]. For a graph
G on [n] and i ∈ I*_k(G), let S*_i(G) be the set of vertices in the first (in some fixed
order) path of length ≤ k from 1 to i in G. Now let S*(G) = ∪_{i∈I*_k(G)} S*_i(G).
For (i, η) ∈ I*_k(M^n_p̄) × ^{≤k}2 and ⟨y^{i′}_{η′} ∈ [n] : (i′, η′) <_lex (i, η)⟩ define:

  S^i_η(G) = S*(G) ∪ ∪{[y^{i′}_{η′} − kl*, y^{i′}_{η′} + kl*] : (i′, η′) <_lex (i, η)}.

Note that indeed |S*(G)| ≤ s for all G. In the construction below, when we write
S^i_η we mean S^i_η(M^n_p̄), where ⟨y^{i′}_{η′} ∈ [n] : (i′, η′) <_lex (i, η)⟩ were already chosen.
Now the choice of y^i_η is as follows:

– If η = ⟨⟩: by ⊗5, with probability at least 1 − δ, E^5_{n,S^i_η,i} holds in M^n_p̄, and hence
the chosen vertex satisfies (1)-(4).
of a saturated tree follows. Lastly we need to show that (c) in the definition of
a saturated forest holds. To see this, note that if i1, i2 ∈ I*_k(M^n_p̄) then by the
definition of S^i_η(M^n_p̄) there exists a path of length ≤ 2k from i1 to i2 with all its
vertices in S^i_η(M^n_p̄). Now if x̄ is a path of length ≤ k from y^{i1}_{⟨⟩} to i2 and (y^{i1}_{⟨⟩}, i1)
is not an edge of x̄, then necessarily {y^{i1}_{⟨⟩}, i1} is included in some cycle of length
≤ 3k + 2. This is a contradiction to the choice of y^{i1}_{⟨⟩}. This completes the proof
of the claim.
By the assumption above and the claim we conclude that, for some large enough n ∈ N, there
exists a graph G = ([n], ∼) such that:
1. G |= ¬φ[1].
2. Pr[M^n_p̄ = G] > 0.
(Why? If dist_G(1, y^i_η) < r then |η| < r(i); see the construction of Step 2.)
Now define f(y) = f(y_j) + Σ_{0≤t<k} l_t. We have to show that f(y) is well defined.
Assume that both x̄^1 = (x0, ..., x_{k1}) and x̄^2 = (x0, ..., x_{k2}) are paths as above. Then
k1 = k2 and x̄ = (x^1_0, ..., x^1_{k1}, x^2_{k2−1}, ..., x^2_0) is a cycle of length k1 + k2 ≤ 2r. By (v)
in the definition of a saturated tree we know that for each s < s(j), |y_j − z^j_s| >
m_{2r}. Hence, as p̄ is without unavoidable cycles, we have for each s < s(j) and
0 ≤ t < k1 + k2: if |l^x̄_t| = |y_j − z^j_s| then t ∈ Sym(x̄) (see Definition 26(6,7)).
Now put, for w ∈ {1, 2} and s < s(j), m^+_w(s) := |{0 ≤ t < k_w : l^{x̄^w}_t = y_j − z^j_s}|
and similarly m^−_w(s) := |{0 ≤ t < k_w : −l^{x̄^w}_t = y_j − z^j_s}|. By the definition of x̄
we have m^+_1(s) − m^−_1(s) = m^+_2(s) − m^−_2(s). But from the definition of l_t(x̄) we
now get, as Σ_{0≤t<k1} l^{x̄^1}_t = Σ_{0≤t<k2} l^{x̄^2}_t, that Σ_{0≤t<k1} l_t(x̄^1) = Σ_{0≤t<k2} l_t(x̄^2), as
desired.
We now show that f|B_j is one-to-one. Let y^1 ≠ y^2 be in B_j. So for w ∈ {1, 2}
we have a path x̄^w = (x^w_0, ..., x^w_{k_w}) from y^w to y_j. As before, for s < s(j) denote
m^+_w(s) := |{0 ≤ t < k_w : l^{x̄^w}_t = y_j − z^j_s}| and similarly m^−_w(s). By the definition
of f|B_j we have

  f(y^1) − f(y^2) = y^1 − y^2 + Σ_{s<s(j)} [(m^+_1(s) − m^−_1(s)) − (m^+_2(s) − m^−_2(s))] · l(j, s).

Now if for each s < s(j), m^+_1(s) − m^−_1(s) = m^+_2(s) − m^−_2(s), then we are done
For each s < s(j) define m^+(s) and m^−(s) as above; hence we have f(y) = y_j +
Σ_{s<s(j)} (m^+(s) − m^−(s)) · l(j, s). Consider two cases. First, if (m^+(s) − m^−(s)) = 0
for each s < s(j), then f(y) = y. Hence f(y) ∉ f(B^0) = B^0 (by (β) above),
f(y) ∉ f(Y) (as f(Y) ∩ [n] = ∅) and f(y) ∉ f(∪_{j′<j} B_{j′}) (by (γ) and the
induction hypothesis). So f(y) ∉ F_j. Second, assume that for some s < s(j),
(m^+(s) − m^−(s)) ≠ 0. Then by ⊗ we have f(y) ∉ [n] and furthermore
f(y) ∉ F_j. In both cases the demands for f|B_j are met and we are done. After
finishing the construction for all j < j* we have f|B^1 such that:
•7 f|B^1 is one-to-one.
•8 f(B^1) is disjoint from f(B^0) ∪ f(Y).
•9 If y ∈ B^1 and dist_G(1, y) < r then f(y) + l*, f(y) − l* ∈ f(B^1). In fact
f(y + l*) = f(y) + l* and f(y − l*) = f(y) − l*. (By the construction of
Step 3.)
(◦) If y ∈ B and dist_G(1, y) < r then f(y) + l*, f(y) − l* ∈ f(B). Furthermore:
(◦◦) {y, f^{−1}(f(y) − l*)} and {y, f^{−1}(f(y) + l*)} are edges of G.
614 S. Shelah and M. Doron
For (◦◦) use: •2 with the definition of f|B^0, •5 with the fact that G |= i ∼ y^i_{⟨⟩},
•6 with the construction of Step 2, and •9.
We turn to the definition of (l, u0, U, H) and the isomorphism h : B → H. Let
l_min = min{f(b) : b ∈ B} and l_max = max{f(b) : b ∈ B}. Define:
– l = l_min + l_max + 1.
– u0 = l_min + 2.
– U = {z + l_min + 1 : z ∈ Im(f)}.
– For b ∈ B, h(b) = f(b) + l_min + 1.
– For u, v ∈ U, H |= u ∼ v iff G |= h^{−1}(u) ∼ h^{−1}(v).
As f was one-to-one so is h, and trivially it is onto U and maps 1 to u0. Also,
by the definition of H, h is a graph isomorphism. So it remains to show that
(l, u0, U, H) is r-proper. First, (∗)1 in the definition of proper is immediate from
the definition of H. Second, for (∗)2 in the definition of proper, let u ∈ U be
such that dist_H(u0, u) < r. Denote y := h^{−1}(u); then by the definition of H we
have dist_G(1, y) < r, hence by (◦), f(y) + l*, f(y) − l* ∈ f(B), and hence by the
definition of h and U, u + l*, u − l* ∈ U as desired. Lastly, to see (∗)3 let u, u′ ∈ U
and denote y = h^{−1}(u) and y′ = h^{−1}(u′). Assume |u − u′| = l*; then by (◦◦)
we have G |= y ∼ y′ and by the definition of H, H |= u ∼ u′. Now assume that
H |= u ∼ u′; then G |= y ∼ y′. Using observation (δ) above and rereading Steps 1-3
we see that |u′ − u| is either l*, |y − y′|, l_η for some η ∈ ^{<r}2 (see Step 2) or l(j, s)
for some j < j*, s < s(j) (see Step 3). In all cases we have p_{|u−u′|} > 0. Together
we have (∗)3 as desired. This completes the proof of Theorem 27.
References
1. Gaifman, H.: On local and nonlocal properties. In: Proceedings of the Herbrand
symposium (Marseilles, 1981). Stud. Logic Found. Math., vol. 107, pp. 105–135.
North-Holland, Amsterdam (1982)
2. Łuczak, T., Shelah, S.: Convergence in homogeneous random graphs. Random
Structures Algorithms 6(4), 371–391 (1995)
3. Shelah, S.: Hereditary convergence laws with successor (in preparation)
On Monadic Theories of Monadic Predicates
Wolfgang Thomas
1 Introduction
Over the past century, starting with Löwenheim [16] in 1915, monadic second-
order logic has been developed as a framework in which decision procedures
can be provided for interesting theories of high expressive power. In building
this rich domain of effective logic, two techniques were crucial. The first was
based on the correspondence between monadic second-order formulas and finite
automata. This “match made in heaven” (cf. Vardi [28]) was first established
for weak monadic second-order logic over the successor structure S1 = (N, +1)
by Büchi, Elgot, and Trakhtenbrot. Büchi [2] and Rabin [19] extended this
to the full monadic second-order theory of S1 and of the binary tree S2 =
({0, 1}∗, ·0, ·1). The logic-automata connection first led to the decidability of
MT(S1 ) and MT(S2 ), the monadic second-order theories of S1 and S2 , respec-
tively (or shorter: the “monadic theory” of these structures). The results were
extended to many further logical systems and led to new approaches in verifica-
tion, data base theory, and further areas of computer science.
A. Blass, N. Dershowitz, and W. Reisig (Eds.): Gurevich Festschrift, LNCS 6300, pp. 615–626, 2010.
c Springer-Verlag Berlin Heidelberg 2010
616 W. Thomas
The second technique, technically more demanding but more general in its
scope, is the “composition method” as developed by Shelah [24] (building on
earlier work by Ehrenfeucht, Fraı̈ssé, Läuchli, and others). The idea here is to
consider finite fragments of a theory and to compose such theory-fragments ac-
cording to the combination of models. The method has been applied successfully
over orderings, trees, and graphs. Over orderings, the “combination” is concate-
nation. Shelah’s work provided a deep analysis of monadic theories of orderings
where automata do not help (or at least are hard to imagine), for example over
dense orderings.
In both approaches, Yuri Gurevich has played a central role and contributed
most influential papers. For the automata theoretic approach, it might suffice to
recall his path-breaking work with Harrington [12] on the monadic second-order
theory of the binary tree. As an example of his papers involving the composition
method, we mention the work [13,14] which explains over which “short” orderings
(neither embedding ω1 nor its reverse) the monadic theory is decidable. For the
reader who wants to enter the field, Yuri’s survey Monadic second-order theories
[11] is still the first choice.
In the present paper, a very small mosaic piece is added to this rich picture.
We consider the expansions of the binary tree S2 by recursive monadic predicates
P . We study which complexity (on the scale of recursion theory) the monadic
second-order theory of such an expansion (S2 , P ) can have, and we compare the
weak and the strong monadic second-order theory of the structures (S2 , P ).
As a starting point we take the corresponding results on expansions of the
successor structure S1 by recursive predicates. We recall (in Sect. 2) that for
recursive P ⊆ N, the monadic theory of (S1 , P ) belongs to a low level of the
arithmetical hierarchy, namely to the class Δ^0_3. It is also known that for any
monadic predicate P , the unrestricted monadic theory of (S1 , P ) is decidable
iff the weak monadic theory is (where set quantification is restricted to finite
sets). In contrast, we show in Sect. 3 that for recursive P the monadic theory
of (S2, P), which in general is confined to the analytical class Δ^1_2, can be Π^1_1-hard.
In Sections 4 and 5 we prove that there is a predicate P such that the weak
monadic theory of (S2 , P ) is decidable but the full monadic theory is undecidable.
For the proofs, both the automata theoretic and the composition method are
useful.1
We assume that the reader is familiar with the basics of the subject. We use
standard terminology on monadic theories, automata, and recursion theory (see,
e.g., [10,11,21,27]).
1
The second result should be attributed to the late Andrei Muchnik; it is stated
in a densely written abstract Automata on infinite objects, monadic theories, and
complexity of the Dagstuhl seminar report [7] of 1992. This abstract, written jointly
by A. Muchnik and A.L. Semenov, lists – in a dozen lines – ten topics and results,
among them “an example of predicate on tree for which the weak monadic theory
is decidable and the monadic theory undecidable”. A manuscript with Muchnik’s
proof does not seem to exist. The talk itself, which was a memorable scientific event
appreciated by all who attended (among them the present author), dealt with a
different result, the “Muchnik tree iteration theorem”; see for example [1].
On Monadic Theories of Monadic Predicates 617
Hence it will be hard to show decidability of MT(S1 , P); for a detailed analysis
see [4]. On the other hand, no “natural” examples of predicates P are known
such that MT(S1 , P ) is undecidable. The known undecidability results rely on
predicates built for the purpose, as in Proposition 1 above.
The conversion of monadic formulas into automata provides nice examples of
predicates P where MT(S1 , P ) is decidable. We use the results of Büchi [2] and
McNaughton [17] which together yield a transformation from monadic formulas
to deterministic ω-automata: For each monadic second-order formula ϕ(X) in
the monadic second-order language of S1 = (N, +1) one can construct a deter-
ministic Muller automaton Aϕ such that for each predicate Q
S1 |= ϕ[Q] iff Aϕ accepts χQ .
We can use the left-hand side for a fixed predicate P , replacing in ϕ(X) each
occurrence of X by the predicate constant P . Then we have for each sentence ϕ
of the monadic second-order language of the structure (S1 , P ):
(S1 , P ) |= ϕ iff Aϕ accepts χP .
This reduces the decision problem for the theory MT(S1 , P ) to the following
acceptance problem AccP : Given a Muller automaton A over the input alphabet
{0, 1}, does A accept χP ?
This reduction can be exploited in a concrete way, regarding example predi-
cates P , and also in a general way, regarding the recursion theoretic complexity
of theories MT(S1 , P ).
Concrete examples of predicates P such that MT(S1 , P ) is decidable were first
proposed by Elgot and Rabin [9], namely, the set of factorial numbers, the set
of k-th powers and the set of powers of k, for each k > 1. The idea is to solve
the acceptance problem AccP as follows: A given automaton A accepts χP iff
A accepts a modified sequence χ where the distances between successive letters
1 are contracted below a certain length (a contracted 0-segment just should
induce the same state transformation as the original one and should cause the
automaton to visit the same states as the original one). In each of the cases
mentioned above (factorials, k-th powers, powers of k), the contracted sequence
χ turns out to be ultimately periodic (where phase and period depend on A).
So one can decide whether A accepts χ and hence whether it accepts χP . The
method has been extended to further predicates (see e.g. [8]), and criteria for
the decidability of MT(S1 , P ) have been developed in [23,4,22].
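The contraction idea is easy to illustrate for a deterministic word automaton: the map m ↦ (state transformation induced by the block 0^m) takes only finitely many values, hence is ultimately periodic, so a long 0-segment can be replaced by a shorter one inducing the same transformation. (The full method additionally requires the contracted segment to visit the same states; we omit that refinement here.) A toy Python sketch with a made-up three-state automaton:

```python
def zero_block_transform(delta, states, m):
    """State transformation induced by reading the block 0^m."""
    f = {s: s for s in states}
    for _ in range(m):
        f = {s: delta[(f[s], '0')] for s in states}
    return tuple(f[s] for s in sorted(states))

def contract(delta, states, m):
    """Smallest m' <= m with the same induced transformation as 0^m.
    Well-defined because m -> transformation is ultimately periodic."""
    target = zero_block_transform(delta, states, m)
    for mp in range(m + 1):
        if zero_block_transform(delta, states, mp) == target:
            return mp

# Hypothetical toy automaton counting 0s modulo 3.
delta = {(s, '0'): (s + 1) % 3 for s in range(3)}
assert contract(delta, range(3), 1000) == 1   # 1000 = 1 (mod 3)
```

For the Elgot–Rabin predicates the gaps grow so fast that after contraction the characteristic sequence becomes ultimately periodic, which is what makes the acceptance problem decidable.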
For the general aspect we analyze the acceptance problem AccP for a Muller
automaton A = (S, Σ, s0 , δ, F ) in more detail. As usual, we write S for the set
of states, Σ for the input alphabet, s0 for the initial state, δ for the transition
function from S × Σ to S, and F ⊆ 2S for the acceptance component; recall
that A accepts an input word α if the set of states visited infinitely often in the
unique run of A on α coincides with a set in F. Let us write δ(s0, α[0, j]) for
the state reached by A after processing the initial segment α(0) . . . α(j). Then,
taking α = χP, the automaton A accepts χP iff the following condition holds:
(∗)_{A,P}:  ∨_{F∈F} [ ∧_{s∈F} (∀i ∃j > i: δ(s0, χP[0, j]) = s) ∧ ∧_{s∈S\F} ¬(∀i ∃j > i: δ(s0, χP[0, j]) = s) ].
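When χP is ultimately periodic, say χP = u v^ω, condition (∗)A,P can be checked effectively: run the automaton through u, then iterate v until the state at a v-block boundary repeats; the states visited around the resulting loop are exactly those visited infinitely often. A self-contained Python sketch; the two-state example automaton is invented:

```python
def run_word(delta, s, w):
    """Run w from state s; return the end state and the states visited."""
    visited = set()
    for a in w:
        s = delta[(s, a)]
        visited.add(s)
    return s, visited

def muller_accepts(delta, s0, table, u, v):
    """Does the deterministic Muller automaton (delta, s0, acceptance table)
    accept u v^omega?  The inf-set must coincide with a set in the table."""
    s, _ = run_word(delta, s0, u)
    order = [s]                      # states at v-block boundaries
    while True:
        t, _ = run_word(delta, s, v)
        if t in order:               # boundary state repeats: loop found
            inf_states = set()
            cur = t
            for _ in range(len(order) - order.index(t)):
                cur, vis = run_word(delta, cur, v)
                inf_states |= vis
            return frozenset(inf_states) in table
        order.append(t)
        s = t

# Hypothetical automaton: state is the last letter read.
delta = {(q, a): int(a) for q in (0, 1) for a in '01'}
table = {frozenset({1}), frozenset({0, 1})}   # '1' occurs infinitely often
assert muller_accepts(delta, 0, table, '', '10') is True
assert muller_accepts(delta, 0, table, '1', '0') is False
```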
  MT(S1, P) ≤_tt P″.

Here ≤_tt is truth-table reducibility and P″ is the second jump of P. (In [25]
it is shown that the slightly sharper bounded truth-table reducibility does not
suffice.) We conclude the following fact, first noted in [3]:
Proposition 2 ([3]). For each recursive P ⊆ N, the theory MT(S1, P) belongs
to the class Δ^0_3 of the arithmetical hierarchy.
In particular, it is not possible to show the undecidability of a theory MT(S1 , P )
by a reduction of true first-order arithmetic to it.
In the same way as described above for theories MT(S1 , P ), the automata theo-
retic approach can be applied to study the complexity of the monadic theory of
an expansion (S2 , P ) of the binary tree. Here we identify a structure (S2 , P ) with
a {0, 1}-labelled tree tP which has label 1 at node u iff u ∈ P . We know from
Rabin’s Tree Theorem [19] that for each monadic sentence ϕ in the language of
(S2 , P ) one can construct a Rabin tree automaton Aϕ such that
(S2 , P ) |= ϕ iff Aϕ accepts tP .
For recursive P, the right-hand side is a Σ^1_2-statement of the form ∃X∀Y ψ(X, Y)
with a first-order formula ψ, namely, "there is an Aϕ-run on tP such that each
infinite path of this run satisfies the Rabin acceptance condition". Since Rabin
automata are closed under complement, the statement can also be phrased in
Π^1_2-form. This proves the first statement of the following result:
Theorem 1. For recursive P ⊆ {0, 1}*, the theory MT(S2, P) belongs to the
class Δ^1_2, and there is a recursive P ⊆ {0, 1}* such that MT(S2, P) is Π^1_1-hard.
For the proof of the second statement we have to find a recursive P such that
a known Π^1_1-complete set is reducible to MT(S2, P). As Π^1_1-complete set we
use a coding of finite-path trees (cf. [21, Ch. 16.3]). We work with the infinitely
branching tree Sω whose nodes are the finite sequences (n1, . . . , nk) of natural numbers.
The empty sequence is the root, and the nodes (n1, . . . , nk, i) are the successors
of (n1, . . . , nk). Paths in Sω are defined accordingly. We say that a subset S
of Sω defines a finite-path tree if S is closed under taking predecessors and if
it does not contain an infinite path. For a recursion theoretic treatment, we
use a computable bijective coding of the finite sequences over N by natural numbers.
² Although this Proposition is very close to Proposition 2, a result of [3], it was left as
an open problem in [3]. In a more general context an answer was then given in [26].
Let
or two labels 1 occur. (Let us call such a path associated to (n1, . . . , nk) "once
1-labelled", respectively "twice 1-labelled".) So, using P, we can easily express
in monadic logic, for any given e, whether fe is a characteristic function. The
function fe is the characteristic function of a finite-path tree if moreover the
nodes re 1^{n1+1}0 . . . 1^{nk+1}0 whose associated path is twice 1-labelled form a set
that is closed under prefixes (i.e., there is no prefix whose associated path
is only once 1-labelled), and if each path through re(1^+0)^ω eventually hits a
node outside the coded tree, i.e., a node whose associated path is only once 1-labelled.
All these conditions can be expressed by a monadic sentence ϕe. Hence
we have e ∈ FPT iff (S2, P) |= ϕe, as desired.
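The coding of a sequence (n1, . . . , nk) by the bit string 1^{n1+1}0 · · · 1^{nk+1}0 (with the prefix re dropped here) is easy to make explicit; the following is only our reading of the text's coding:

```python
def encode(seq):
    """(n1, ..., nk)  ->  1^(n1+1) 0 ... 1^(nk+1) 0   (prefix r_e omitted)."""
    return ''.join('1' * (n + 1) + '0' for n in seq)

def decode(word):
    """Inverse of encode on valid codes (each block is 1^+ followed by 0)."""
    assert word == '' or word.endswith('0')
    return tuple(len(block) - 1 for block in word.split('0')[:-1])

assert encode((2, 0, 1)) == '111010110'
assert decode('111010110') == (2, 0, 1)
assert decode(encode(())) == ()
```

Closure under prefixes of the coded set then corresponds to closure of the underlying tree under predecessors, which is what the sentence ϕe expresses.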
Clearly, each tree tn belongs to T0. We use the following lemma shown in [20]
(see also [27]):

Lemma 3. For each Büchi tree automaton A with < n states accepting tn one
can construct a regular tree t′n ∉ T0 which is again accepted by A.
Let us sketch the proof. Assume that the Büchi tree automaton A with < n
states and the set F of final states accepts tn . Then one can construct a regular
run ρ of A on tn (since tn is regular and accepted). We define a path in ρ as
follows: Pick a node u1 = 1^{k1} on the right-hand branch where ρ(u1 ) ∈ F . Pick
a node u2 = 1^{k1} 01^{k2} on the right-hand branch starting in 1^{k1} 0 where again
ρ(u2 ) ∈ F , and so on, until such a node un = 1^{k1} 01^{k2} . . . 1^{kn} with ρ(un ) ∈ F
is chosen. These nodes exist since on each path infinitely many visits to F
occur. Now tn (ui 0) = 1 for i = 1, . . . , n by definition of tn . Since A has < n
states, there are ui , uj with i < j such that ρ(ui ) = ρ(uj ); observe that between
these nodes a 1-labelled node of tn occurs (for example at ui 0). Repeating the
tn -segment determined by the path segment from ui (included) to uj (excluded)
indefinitely, we obtain a regular tree t′n which is accepted by A and which has a
path with infinitely many labels 1.
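The pigeonhole step of this proof, finding two of the n chosen F-nodes that carry the same state of the < n-state automaton, can be illustrated by a small sketch (hypothetical state labels, not the paper's construction):

```python
def find_repetition(states):
    """Given the run's states rho(u1), ..., rho(un) at the chosen F-nodes
    of an automaton with fewer than n states, return indices i < j with
    states[i] == states[j]; such a pair exists by the pigeonhole principle."""
    seen = {}  # state -> first position where it occurred
    for j, q in enumerate(states):
        if q in seen:
            return seen[q], j
        seen[q] = j
    raise ValueError("no repetition: needs more nodes than states")

# With 3 states and 4 chosen nodes, some state must repeat; the tree
# segment between positions i and j is the one that gets pumped.
i, j = find_repetition(["q0", "q2", "q1", "q2"])
assert (i, j) == (1, 3)
```

Repeating the segment between the two equal states yields a valid run again, which is why the pumped tree t′n is still accepted.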
A set of trees definable in weak monadic logic is easily seen to be recognized
by a Büchi tree automaton. So the lemma also shows that T0 is not definable in
weak monadic logic.
As a next step we now construct a tree sω from tω . First we pick, for each
m-type τ (m = 1, 2, . . .), a Büchi tree automaton Aτ that defines τ . Let nτ be
the number of states of Aτ . Define
Nm := max{nτ | τ is an m-type} + 1.
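The definition of Nm can be illustrated with a toy computation (the table of state counts nτ below is made up for illustration; the real values come from the automata Aτ):

```python
def bound_N(sizes):
    """N_m := max{ n_tau : tau an m-type } + 1, computed from a finite
    table mapping each m-type to the state count n_tau of its automaton
    A_tau (hypothetical toy values)."""
    return max(sizes.values()) + 1

# Three hypothetical m-types with automata of 3, 5 and 4 states:
assert bound_N({"tau1": 3, "tau2": 5, "tau3": 4}) == 6
```

The "+ 1" guarantees that Nm strictly exceeds every nτ, so Lemma 3 applies to each automaton Aτ.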
6 Conclusion
The study of the monadic theory of structures (S2 , P ) with monadic predicate
P seems far from finished. Let us list three open problems.
On Monadic Theories of Monadic Predicates 625
Acknowledgment
Many thanks are due to Nachum Dershowitz for his patience and help and to
Christof Löding and Alex Rabinovich for their comments.
References
1. Berwanger, D., Blumensath, A.: The monadic theory of tree-like structures. In:
Grädel, E., Thomas, W., Wilke, T. (eds.) Automata, Logics, and Infinite Games.
LNCS, vol. 2500, pp. 285–302. Springer, Heidelberg (2002)
2. Büchi, J.R.: On a decision method in restricted second-order arithmetic. In: Nagel,
E., et al. (eds.) Logic, Methodology, and Philosophy of Science: Proceedings of the
1960 International Congress, pp. 1–11. Stanford Univ. Press, Stanford (1962)
3. Büchi, J.R., Landweber, L.H.: Definability in the monadic second-order theory of
successor. J. Symb. Logic 34, 166–170 (1969)
4. Bateman, P.T., Jockusch, C.G., Woods, A.R.: Decidability and undecidability of
theories with a predicate for the primes. J. Symb. Logic 58, 672–687 (1993)
5. Carayol, A., Löding, C.: MSO on the infinite binary tree: Choice and order. In: Du-
parc, J., Henzinger, T.A. (eds.) CSL 2007. LNCS, vol. 4646, pp. 161–176. Springer,
Heidelberg (2007)
6. Carayol, A., Löding, C., Niwiński, D., Walukiewicz, I.: Choice functions and well-
orderings over the infinite binary tree. Central Europ. J. of Math. (to appear)
7. Compton, K., Pin, J.E., Thomas, W. (eds.): Automata Theory: Infinite Computa-
tions. Dagstuhl Seminar Report 9202 (1992)
8. Carton, O., Thomas, W.: The monadic theory of morphic infinite words and gen-
eralizations. Information and Computation 176, 51–76 (2002)
9. Elgot, C.C., Rabin, M.O.: Decidability and undecidability of extensions of second
(first) order theory of (generalized) successor. J. Symb. Logic 31, 169–181 (1966)
10. Grädel, E., Thomas, W., Wilke, T. (eds.): Automata, Logics, and Infinite Games.
LNCS, vol. 2500. Springer, Heidelberg (2002)
11. Gurevich, Y.: Monadic theories. In: Barwise, J., Feferman, S. (eds.) Model-
Theoretic Logics, pp. 479–506. Springer, Berlin (1985)
12. Gurevich, Y., Harrington, L.: Trees, automata, and games. In: Proc. 14th STOC,
pp. 60–65 (1982)
13. Gurevich, Y.: Modest theory of short chains. J. Symb. Logic 44, 481–490 (1979)
14. Gurevich, Y., Shelah, S.: Modest theory of short chains II. J. Symb. Logic 44,
491–502 (1979)
15. Gurevich, Y., Shelah, S.: Rabin’s uniformization problem. J. Symb. Logic 48, 1105–
1119 (1983)
16. Löwenheim, L.: Über Möglichkeiten im Relativkalkül. Math. Ann. 76, 447–470
(1915)
17. McNaughton, R.: Testing and generating infinite sequences by a finite automaton.
Inf. Contr. 9, 521–530 (1966)
18. Montanari, A., Puppis, G.: A contraction method to decide MSO theories of de-
terministic trees. In: Proc. 22nd IEEE Symposium on Logic in Computer Science
(LICS), pp. 141–150 (2007)
19. Rabin, M.O.: Decidability of second-order theories and automata on infinite trees.
Trans. Amer. Math. Soc. 141, 1–35 (1969)
20. Rabin, M.O.: Weakly definable relations and special automata. In: Bar-Hillel, Y.
(ed.) Math. Logic and Foundations of Set Theory, pp. 1–23. North-Holland, Ams-
terdam (1970)
21. Rogers, H.: The Theory of Recursive Functions and Effective Computability.
McGraw-Hill, New York (1967)
22. Rabinovich, A., Thomas, W.: Decidable theories of the ordering of natural numbers
with unary predicates. In: Ésik, Z. (ed.) CSL 2006. LNCS, vol. 4207, pp. 562–574.
Springer, Heidelberg (2006)
23. Semenov, A.: Decidability of monadic theories. In: Chytil, M.P., Koubek, V. (eds.)
MFCS 1984. LNCS, vol. 176, pp. 162–175. Springer, Heidelberg (1984)
24. Shelah, S.: The monadic theory of order. Ann. Math. 102, 379–419 (1975)
25. Thomas, W.: The theory of successor with an extra predicate. Math. Ann. 237,
121–132 (1978)
26. Thomas, W.: On the bounded monadic theory of well-ordered structures. J. Symb.
Logic 45, 334–338 (1980)
27. Thomas, W.: Languages, automata and logic. In: Rozenberg, G., Salomaa, A. (eds.)
Handbook of Formal Languages, vol. 3. Springer, New York (1997)
28. Vardi, M.Y.: Logic and Automata: A match made in heaven. In: Baeten, J.C.M.,
Lenstra, J.K., Parrow, J., Woeginger, G.J. (eds.) ICALP 2003. LNCS, vol. 2719,
pp. 64–65. Springer, Heidelberg (2003)